The next breakthrough to take the AI world by storm may be 3D model generators. This week, OpenAI open sourced Point-E, a machine learning system that generates a 3D object given a text prompt. According to a paper published alongside the codebase, Point-E can produce 3D models in one to two minutes on a single Nvidia V100 GPU.
Point-E doesn’t create 3D objects in the traditional sense. Instead, it generates point clouds: discrete sets of data points in space that represent a 3D shape. (The “E” in Point-E stands for “efficiency,” ostensibly because it’s faster than previous methods for generating 3D objects.) Point clouds are computationally cheap to work with, but they can’t capture an object’s fine-grained shape or texture, a key limitation of Point-E currently.
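To make the representation concrete: a colored point cloud can be stored as nothing more than an array of positions and colors. The sketch below is my own illustration, not Point-E’s actual data format; the array layout and sizes are assumptions (Point-E reportedly works with clouds on the order of a few thousand points).

```python
import numpy as np

# A point cloud is just a discrete, unordered set of points in space.
# Point-E's clouds also carry per-point color, so a simple representation
# is an (N, 6) array: spatial coordinates (x, y, z) plus colors (r, g, b).
rng = np.random.default_rng(0)
num_points = 1024  # illustrative; small clouds are what keep generation cheap

xyz = rng.uniform(-1.0, 1.0, size=(num_points, 3))  # positions
rgb = rng.uniform(0.0, 1.0, size=(num_points, 3))   # colors in [0, 1]
cloud = np.hstack([xyz, rgb])                       # shape (1024, 6)

# Because the representation is sparse and unordered, it is cheap to store
# and sample, but it says nothing about the surface between the points --
# exactly the fidelity limitation described above.
print(cloud.shape)  # (1024, 6)
```

That sparsity is the trade-off: efficiency at generation time in exchange for no explicit surface.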
To get around this limitation, the Point-E team trained an additional AI system to convert Point-E’s point clouds into meshes. (Meshes, the collections of vertices, edges, and faces that define an object, are widely used in 3D modeling and design.) But they note in the paper that the model can sometimes miss certain parts of objects, resulting in blocky or distorted shapes.
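One common way to bridge point clouds and meshes, and the approach the Point-E paper describes, is to predict a signed distance field (SDF) and extract its zero level set with an algorithm like marching cubes. The toy sketch below is my own illustration, not OpenAI’s code: it uses an analytic sphere SDF on a voxel grid and simply flags the voxels straddling the surface, the set a marching-cubes pass would turn into triangles.

```python
import numpy as np

# Build an analytic SDF for a unit sphere on a 32^3 voxel grid.
# A learned SDF (as in Point-E's mesh converter) would be predicted from
# the point cloud instead of computed in closed form.
res = 32
axis = np.linspace(-1.5, 1.5, res)
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 1.0  # negative inside, positive outside

voxel = 3.0 / (res - 1)        # grid spacing
shell = np.abs(sdf) < voxel    # voxels near the zero level set (the surface)

# Where the point cloud missed part of the object, a learned SDF is noisy
# or incomplete there -- which is why the paper reports occasional blocky
# or distorted meshes.
print(shell.sum(), "surface voxels out of", sdf.size)
```

The failure mode the paper mentions falls out of this picture: no points in a region means no reliable distance estimates there, and the extracted surface degrades.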
Outside of the mesh-generating model, which stands on its own, Point-E consists of two models: a text-to-image model and an image-to-3D model. The text-to-image model, similar to generative art systems such as OpenAI’s own DALL-E 2 and Stable Diffusion, was trained on labeled images to understand the associations between words and visual concepts. The image-to-3D model, meanwhile, was fed a set of images paired with 3D objects so that it learned to translate effectively between the two.
When given a text prompt (for example, “3D-printable gear, one gear 3 inches in diameter and half an inch thick”), Point-E’s text-to-image model first generates a synthetic rendered object; that image is then fed to the image-to-3D model, which produces a point cloud.
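The data flow of that two-stage pipeline can be sketched with stub functions. Everything below is a structural illustration only: the real stages are large diffusion models (the paper describes a GLIDE-based image stage), and every function name here is mine, not Point-E’s API.

```python
import numpy as np

def text_to_image(prompt: str, size: int = 64) -> np.ndarray:
    """Stand-in for the text-conditioned image model (stage 1)."""
    rng = np.random.default_rng(len(prompt))           # fake determinism
    return rng.uniform(0.0, 1.0, size=(size, size, 3))  # fake RGB render

def image_to_point_cloud(image: np.ndarray, n: int = 1024) -> np.ndarray:
    """Stand-in for the image-conditioned point-cloud model (stage 2)."""
    rng = np.random.default_rng(int(image.sum() * 1000) % (2**32))
    xyz = rng.uniform(-1.0, 1.0, size=(n, 3))  # positions
    rgb = rng.uniform(0.0, 1.0, size=(n, 3))   # per-point colors
    return np.hstack([xyz, rgb])               # (n, 6) colored cloud

# text -> synthetic image -> colored point cloud
prompt = "3D-printable gear, one gear 3 inches in diameter, half an inch thick"
cloud = image_to_point_cloud(text_to_image(prompt))
print(cloud.shape)  # (1024, 6)
```

The design point worth noticing is the decoupling: the image stage only needs text–image pairs to train, and the 3D stage only needs image–object pairs, which are far scarcer.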
After training the models on a dataset of “several million” 3D objects and associated metadata, Point-E could produce colored point clouds that frequently matched text prompts, the OpenAI researchers say. It’s not perfect – Point-E’s image-to-3D model sometimes fails to understand the image from the text-to-image model, resulting in a shape that doesn’t match the text prompt. Still, it’s faster than the previous state of the art – at least according to the OpenAI team.
“While our method performs worse in this evaluation than state-of-the-art techniques, it produces samples in a fraction of the time,” they wrote in the paper. “This may make it more practical for certain applications, or it may allow for the discovery of higher-quality 3D objects.”
What exactly are the applications? Well, the OpenAI researchers point out that Point-E’s point clouds could be used to fabricate real-world objects, for example through 3D printing. With the additional mesh conversion model, the system, once it’s a bit more polished, could also find its way into game development and animation workflows.
OpenAI may be the latest company to jump into the 3D object generator fray, but, as we pointed out earlier, it certainly isn’t the first. Earlier this year, Google released DreamFusion, an expanded version of Dream Fields, a generative 3D system the company unveiled back in 2021. Unlike Dream Fields, DreamFusion requires no prior training, meaning it can generate 3D representations of objects without 3D data.
While all eyes are on 2D art generators right now, model-synthesizing AI could be the next big industry disruptor. 3D models are widely used in film, television, interior design, architecture, and various fields of science. Architectural firms use them to demo proposed buildings and landscapes, for example, while engineers leverage models as designs for new devices, vehicles, and structures.
3D models typically take a while to craft – anywhere from several hours to several days. AI like Point-E could change that, if the kinks are one day worked out, and turn a respectable profit for OpenAI in the process.
The question is what kind of intellectual property disputes might arise in time. There’s a large market for 3D models, with several online marketplaces, including CGStudio and CreativeMarket, allowing artists to sell content they’ve created. If Point-E catches on and its models make their way to market, model artists might protest, pointing to evidence that modern generative AI borrows heavily from its training data (existing 3D models, in Point-E’s case). Like DALL-E 2, Point-E doesn’t credit or cite any of the artists who might have influenced its generations.
But OpenAI is leaving that issue for another day. Neither the Point-E paper nor its GitHub page makes any mention of copyright.
To their credit, the researchers did note that they expect Point-E to suffer from other problems, such as biases inherited from the training data and a lack of safeguards around models that might be used to create “dangerous objects.” Perhaps that’s why they’re careful to characterize Point-E as a “starting point” that they hope will inspire “further work” in the field of text-to-3D synthesis.