OpenAI, an artificial intelligence research laboratory, has recently released Shap-E, a conditional generative model for 3D assets. Shap-E can generate the parameters of implicit functions directly from a single text description. As the paper explains, Shap-E produces complex and diverse 3D assets, rendered as textured meshes or as neural radiance fields (NeRFs), in just a few seconds.
Unlike other 3D generative models that produce a single output representation, Shap-E was trained in two stages. First, an encoder is trained to map 3D assets into the parameters of an implicit function. Then, a conditional diffusion model is trained on the encoder's outputs. Shap-E converges faster than Point-E and uses the same diffusion technique employed by DALL-E and Point-E.
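To make the two-stage recipe concrete, here is a small, self-contained sketch of the idea: an encoder turns a 3D asset into a latent vector of implicit-function parameters, and a conditional diffusion model is then trained to denoise those latents. All module names, shapes, and the simplified noising step are illustrative stand-ins, not OpenAI's actual Shap-E training code.

```python
# Toy sketch of the two-stage training described above (hypothetical names,
# not the real Shap-E implementation).
import torch
import torch.nn as nn

LATENT_DIM = 1024  # illustrative size of the implicit-function parameter vector

class AssetEncoder(nn.Module):
    """Stage 1: map a 3D-asset representation to implicit-function parameters."""
    def __init__(self, asset_dim=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(asset_dim, 2048), nn.GELU(), nn.Linear(2048, LATENT_DIM)
        )

    def forward(self, asset):
        return self.net(asset)

class LatentDenoiser(nn.Module):
    """Stage 2: conditional diffusion model over the encoder's latents."""
    def __init__(self, cond_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + cond_dim + 1, 2048), nn.GELU(),
            nn.Linear(2048, LATENT_DIM)
        )

    def forward(self, noisy_latent, t, cond):
        return self.net(torch.cat([noisy_latent, t, cond], dim=-1))

encoder, denoiser = AssetEncoder(), LatentDenoiser()

# One stage-2 training step: encode the asset, add noise to the latent, and
# train the denoiser to predict that noise given the conditioning signal.
asset = torch.randn(8, 2048)   # placeholder 3D-asset features
cond = torch.randn(8, 512)     # placeholder text/image embedding
t = torch.rand(8, 1)           # diffusion timestep in [0, 1]

with torch.no_grad():          # the encoder was already trained in stage 1
    latent = encoder(asset)
noise = torch.randn_like(latent)
noisy_latent = latent + t * noise          # simplified noising schedule
loss = nn.functional.mse_loss(denoiser(noisy_latent, t, cond), noise)
loss.backward()
```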
Shap-E offers a range of capabilities, but its limitations should be noted. First, it can only handle prompts describing a single object with simple attributes. Second, the generated assets can look blurry and grainy. Finally, it struggles to bind multiple attributes correctly when they appear in the same prompt.
Shap-E underscores OpenAI's commitment to advancing the state of machine learning research and to providing the community with sophisticated AI-enabled tools. The project is led by Jack Urbanek, a research scientist at OpenAI. Shap-E is open source, with the model weights, inference code, and samples available on GitHub.
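As a rough illustration of how the released code is used, text-to-3D sampling looks approximately like the snippet below, adapted from the example notebook in the openai/shap-e repository; the function names and arguments reflect that notebook at the time of writing and may differ in later versions.

```python
# Adapted from the openai/shap-e text-to-3D example notebook; treat the exact
# names and arguments as a sketch rather than a stable API.
import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

xm = load_model('transmitter', device=device)   # decodes latents into renders
model = load_model('text300M', device=device)   # text-conditional latent diffusion model
diffusion = diffusion_from_config(load_config('diffusion'))

# Sample implicit-function parameters (latents) from a text prompt.
latents = sample_latents(
    batch_size=1,
    model=model,
    diffusion=diffusion,
    guidance_scale=15.0,
    model_kwargs=dict(texts=["a red chair"]),
    progress=True,
    clip_denoised=True,
    use_fp16=True,
    use_karras=True,
    karras_steps=64,
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)

# Render the sampled latent from a ring of cameras using the NeRF renderer.
cameras = create_pan_cameras(64, device)
images = decode_latent_images(xm, latents[0], cameras, rendering_mode='nerf')
images[0].save('chair.png')
```

The same latents can also be decoded into meshes for use in standard 3D pipelines, which is what the repository's other example notebooks demonstrate.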