Adobe and the Australian National University (ANU) have made a groundbreaking announcement in the field of 3D visualizations. They have unveiled the first artificial intelligence (AI) model capable of generating 3D images from a single 2D image. This development is set to transform the creation of 3D models, as the algorithm developed by the researchers can generate these images in a matter of seconds.
The AI model, known as the Large Reconstruction Model (LRM), is built on a highly scalable neural network with 500 million parameters, trained on a dataset of roughly one million objects spanning images, 3D shapes, and videos. The combination of this high-capacity model and large-scale training data makes the LRM highly generalizable, allowing it to produce high-quality 3D reconstructions from a wide range of test inputs.
According to Yicong Hong, the lead author of the project report and an Adobe intern and former graduate student at ANU, their LRM is the first large-scale 3D reconstruction model. This breakthrough technology is expected to have wide-ranging applications in augmented reality and virtual reality systems, gaming, cinematic animation, and industrial design.
Previously, 3D imaging software was limited to specific subject categories with pre-established shapes. Later systems leveraged the generalization capability of the 2D diffusion models behind programs like DALL-E and Stable Diffusion to produce multi-view images, but they remained constrained by the pre-trained 2D generative models they depended on. Other systems relied on per-shape optimization, which was often slow and impractical.
Inspired by the success of massive transformer networks in natural language modeling, Hong's team asked whether it was possible to learn a generic 3D prior for reconstructing an object from a single image. Their answer was a resounding yes. The LRM can reconstruct high-fidelity 3D shapes both from real-world images and from images created by generative models, producing a 3D shape in just five seconds with no need for post-optimization.
The success of the LRM lies in its ability to draw on its vast store of learned parameters to predict a neural radiance field (NeRF), which allows realistic-looking 3D imagery to be generated solely from 2D pictures, even low-resolution ones. NeRF-based techniques also encompass image synthesis, object detection, and image segmentation capabilities.
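To make the NeRF idea concrete, the sketch below shows the core mechanism in miniature: a small neural network maps any 3D point to a density and a color, and an image pixel is rendered by sampling points along a camera ray and compositing them. This is a simplified illustration of the general NeRF technique, not Adobe's actual model; all names, layer sizes, and the untrained random weights are assumptions for demonstration only (a real radiance field is fitted to posed photographs).

```python
import numpy as np

rng = np.random.default_rng(0)

def positional_encoding(x, num_freqs=4):
    """Map coordinates to sin/cos features of increasing frequency,
    as in NeRF, so a small MLP can represent fine detail."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi
    feats = [x]
    for f in freqs:
        feats.append(np.sin(f * x))
        feats.append(np.cos(f * x))
    return np.concatenate(feats, axis=-1)

class TinyRadianceField:
    """Toy radiance field: a 2-layer MLP mapping an encoded 3D point
    to (density, RGB). Weights here are random, purely for illustration."""
    def __init__(self, num_freqs=4, hidden=32):
        in_dim = 3 * (1 + 2 * num_freqs)
        self.w1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.w2 = rng.normal(0.0, 0.1, (hidden, 4))  # 1 density + 3 color channels

    def query(self, pts):
        h = np.maximum(positional_encoding(pts) @ self.w1, 0.0)  # ReLU
        out = h @ self.w2
        sigma = np.maximum(out[..., 0], 0.0)          # density must be >= 0
        rgb = 1.0 / (1.0 + np.exp(-out[..., 1:]))     # colors squashed into (0, 1)
        return sigma, rgb

def render_ray(model, origin, direction, near=0.0, far=1.0, n_samples=16):
    """Volume rendering: sample points along the ray, then
    alpha-composite their colors weighted by accumulated opacity."""
    t = np.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction
    sigma, rgb = model.query(pts)
    delta = (far - near) / n_samples
    alpha = 1.0 - np.exp(-sigma * delta)              # opacity of each segment
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # light surviving so far
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(axis=0)       # final pixel color

model = TinyRadianceField()
color = render_ray(model, np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]))
print(color.shape)  # (3,) — one RGB value per ray
```

Rendering one such ray per pixel from a chosen camera pose yields a full 2D view of the scene from any angle, which is what lets a predicted radiance field stand in for an explicit 3D model.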
It has been 60 years since the first computer program that allowed users to generate and manipulate simple 3D shapes was created. Over the decades, 3D programs have seen remarkable advancements. Now, with Adobe and ANU’s breakthrough, the field of 3D visualizations is set to be revolutionized once again.
References:
– Adobe and Australian National University Unveil Breakthrough: AI Generates 3D Images from 2D in Seconds [Article Link]