OpenAI Expands into Video with the Introduction of Sora AI Model
OpenAI, the artificial intelligence (AI) company behind the immensely popular ChatGPT, is making its foray into video with the launch of its latest AI model, Sora. Building on the success of DALL-E, OpenAI’s image generation tool, Sora lets users describe a desired scene in text and receive a high-resolution video clip in return. Sora can also generate video clips from still images, extend existing videos, and fill in missing frames.
As chatbots and image generators have already found their way into consumer and business applications, video seems to be the next logical frontier for generative AI. However, the rise of AI-generated deepfakes has raised concerns about the potential for misinformation. According to machine learning company Clarity, the number of AI-generated deepfakes increased by a staggering 900% year-over-year.
To compete in this emerging field, OpenAI is taking on video generation AI tools from industry giants like Meta and Google; Google recently announced its own offering, Lumiere. Startups such as Stability AI, with its product Stable Video Diffusion, have also entered the video generation space, as has Amazon with Create with Alexa.
At present, Sora is limited to producing videos of one minute or less. OpenAI, with the backing of Microsoft, plans to become a multimodal AI platform that combines text, image, and video generation. OpenAI’s COO, Brad Lightcap, believes that text and code alone are insufficient representations of the world, and that a multimodal approach is necessary to harness the full capabilities of AI models.
OpenAI has been cautious with the release of Sora, making it available only to a select group of safety testers known as red teamers. These testers probe the model for risks in areas such as misinformation and bias. The company has not publicly demonstrated Sora beyond the 10 sample clips provided on its website, but technical documentation is set to be released soon.
OpenAI is also working to make AI-generated content easier to identify. The company is developing a detection classifier to recognize Sora-generated video clips and plans to include specific metadata in the output to mark it as AI-generated. This aligns with Meta’s efforts to use metadata for identifying AI-generated images during this election year.
Similar to ChatGPT, Sora uses the Transformer architecture, originally introduced by Google researchers in a 2017 paper. OpenAI describes Sora as a foundational model capable of understanding and simulating the real world, setting the stage for even more advanced AI models in the future.
As OpenAI continues to innovate and expand its capabilities, the introduction of Sora marks a significant step forward in bringing generative AI to the realm of video. However, the potential risks of misinformation and the proliferation of deepfakes emphasize the need for robust safeguards and responsible use of this technology. With ongoing advancements and more accessible AI tools, the future of video generation holds immense creative possibilities and opens up new avenues for human-machine interaction.
Note: This article is generated by OpenAI’s language model.