Meta’s FAIR Introduces AI Breakthroughs in Video Learning, Multimodal Translation, and Audio Generation

Meta’s AI Lab Turns 10 with Three New AI Projects and an Impressive Demo

To celebrate the 10th anniversary of Meta’s Fundamental AI Research (FAIR) team, the company has unveiled three groundbreaking research projects: Ego-Exo4D, Seamless Communication, and Audiobox.

Ego-Exo4D is a cutting-edge dataset and benchmark curated by Meta’s FAIR team, Project Aria, and 15 international university partners. This extensive collection captures both egocentric views from a participant wearing the Project Aria headset and exocentric views from surrounding cameras. The dataset focuses on complex human activities such as sports, music, cooking, dancing, and bicycle repair.

Potential applications of Ego-Exo4D include augmented reality (AR) systems, in which people wearing smart headsets learn new skills with the help of virtual AI trainers. Robots could also learn by observing people, and social networks could foster new communities built around sharing knowledge through video. The dataset, which includes over 1,400 hours of video, will be open-sourced in December. Meta also plans to organize a public benchmark competition for Ego-Exo4D next year.

Following the successful presentation of the SeamlessM4T multimodal translation model in August, the Seamless Communication project now introduces a family of AI research models that expand on the previous achievements. These models aim to facilitate natural and authentic communication across various language boundaries.

The project comprises four models. SeamlessExpressive preserves the expression and nuance of speech across languages. SeamlessStreaming delivers speech and text translations with a latency of roughly two seconds. SeamlessM4T v2 is a multilingual, multitask model for voice and text communication. Lastly, Seamless combines the capabilities of SeamlessExpressive, SeamlessStreaming, and SeamlessM4T v2 into a single model.

Meta’s new Audiobox is an audio generation model that lets users create custom audio for a wide range of applications. It accepts voice input and natural language text prompts, which simplifies generating voices and sound effects. Compared with its predecessor, Voicebox, Audiobox offers improved controllability and flexibility in describing the desired sound or type of speech.

Meta will initially grant access to Audiobox to selected researchers and academic institutions to elevate audio generation research and ensure the responsible development of artificial intelligence.

As Meta’s FAIR team celebrates a decade of research, these latest projects demonstrate the company’s dedication to pushing the boundaries of AI technology. With Ego-Exo4D supporting video learning and multimodal perception, Seamless Communication enabling natural cross-language interactions, and Audiobox extending audio generation, Meta continues to make significant strides in the field.

The release of these projects marks a milestone for Meta and points to further advances in augmented reality, robotic learning, and social networks. Researchers, developers, and AI enthusiasts will be watching for the opportunities these projects unlock as Meta continues to shape the landscape of artificial intelligence.
