Gladia Converts Audio to Text in Near Real-Time

Gladia is a French artificial intelligence (AI) startup that wants to change how companies deal with audio data. The company has developed an audio transcription application programming interface (API) that it says is more reliable and efficient than existing options. Gladia presents this technology as a foundation for new use cases around audio, and it promises an hour of audio transcription for just $0.61, with the entire transcription process taking roughly 60 seconds.

According to Jean-Louis Quéguiner, co-founder and CEO of Gladia, there are three main pain points with existing transcription APIs. The first is price: transcribing an hour of audio usually costs between $1.50 and $2. The second is reliability: the output is uneven, as some languages work well while others are barely supported. The third is speed: existing transcription APIs take more than 15 minutes to transcribe an hour of audio.
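To put the quoted figures side by side, here is a minimal Python sketch that computes the cost of a batch of audio and the speed relative to real time. The prices and durations come from the article; the function names are illustrative, and the 15-minute figure is treated as a lower bound because the article says "more than 15 minutes".

```python
# Figures quoted in the article.
GLADIA_PRICE_PER_HOUR = 0.61                 # USD per hour of audio
TYPICAL_PRICE_RANGE_PER_HOUR = (1.50, 2.00)  # USD per hour of audio
GLADIA_SECONDS_PER_HOUR = 60                 # "roughly 60 seconds"
TYPICAL_SECONDS_PER_HOUR = 15 * 60           # "more than 15 minutes" (lower bound)


def transcription_cost(hours: float, price_per_hour: float) -> float:
    """Cost in USD of transcribing `hours` of audio, rounded to cents."""
    return round(hours * price_per_hour, 2)


def speed_vs_real_time(processing_seconds_per_hour: float) -> float:
    """How many times faster than real time the processing runs."""
    return 3600 / processing_seconds_per_hour


if __name__ == "__main__":
    hours = 100
    low, high = TYPICAL_PRICE_RANGE_PER_HOUR
    print("Gladia:", transcription_cost(hours, GLADIA_PRICE_PER_HOUR), "USD")
    print("Typical:", transcription_cost(hours, low), "to",
          transcription_cost(hours, high), "USD")
    print("Gladia speed:", speed_vs_real_time(GLADIA_SECONDS_PER_HOUR), "x real time")
    print("Typical speed (at best):",
          speed_vs_real_time(TYPICAL_SECONDS_PER_HOUR), "x real time")
```

On these numbers, 100 hours of audio would cost about $61 with Gladia against $150 to $200 elsewhere, and processing would run at roughly 60 times real time against at most 4 times.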

Gladia’s solution is based on OpenAI’s open-source transcription model, Whisper, which has been modified to work faster and more responsively. Gladia also has some pre-processing and post-processing algorithms that improve the end results. The Gladia API can detect when there are multiple speakers, add timestamps, detect languages and switch from one to another if needed. It can also automatically add punctuation and casing.

The Gladia transcription API is compatible with SRT and VTT files for companies that want to generate subtitles. Once an audio file has been transcribed with word-level timestamps, Gladia can also translate the text into another language, so companies can upload an audio file and get subtitles in dozens of languages in just a few minutes.
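To illustrate how word-level timestamps map onto a subtitle file, here is a minimal Python sketch that groups timed words into SRT cues. It does not call Gladia's API or reproduce its response format: the `Word` class, the `words_to_srt` function, and the seven-words-per-cue rule are all assumptions made for illustration. Only the SRT layout itself (cue number, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is standard.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical word-level timestamp: text plus start/end in seconds."""
    text: str
    start: float
    end: float


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words: list[Word], max_words: int = 7) -> str:
    """Group words into fixed-size cues and render them as SRT text."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        index = i // max_words + 1  # SRT cue numbers start at 1
        span = f"{srt_timestamp(chunk[0].start)} --> {srt_timestamp(chunk[-1].end)}"
        text = " ".join(w.text for w in chunk)
        cues.append(f"{index}\n{span}\n{text}\n")
    return "\n".join(cues)  # blank line between consecutive cues


if __name__ == "__main__":
    sample = [Word("Hello", 0.0, 0.4), Word("world", 0.5, 1.0)]
    print(words_to_srt(sample))
```

A VTT file follows the same idea with a `WEBVTT` header line and a period instead of a comma before the milliseconds. A production subtitle generator would usually break cues on punctuation or pauses rather than on a fixed word count.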

Gladia raised a $4 million seed round led by New Wave, with support from Sequoia, Cocoa, and various business angels, including Solomon Hykes, Pierre Betouin, Miroslaw Klaba, and Alexandre Berriche. The company currently works with call center companies, virtual meeting services, and video publishers, including Claap, Livestorm, and Selectra.

Moving forward, the company aims to build features on top of this technical foundation. For instance, it hopes to summarize the content of an audio file, sort content into topic categories, create chapters automatically, conduct sentiment analysis, and more.

Overall, Gladia positions itself as a cheaper and faster alternative to existing transcription APIs, and its developers believe that transcription will become a commodity. The company’s long-term vision is to augment audio with intelligence, which it describes as moving from 2D to 3D data.

Frequently Asked Questions (FAQs) Related to the Above News

What is Gladia?

Gladia is a French startup that has developed a new audio transcription API that is more reliable and efficient than existing options.

How much does it cost to transcribe an hour of audio with Gladia?

Gladia promises an hour of audio transcription for just $0.61.

What are the pain points with existing transcription APIs?

The pain points with existing transcription APIs are high cost, unreliable output, and slow transcription speed.

How does Gladia's solution work?

Gladia's solution is based on OpenAI's open-source transcription model, Whisper, which has been modified to work faster and more responsively. Gladia also has some pre-processing and post-processing algorithms that improve the end results.

What features does Gladia's transcription API have?

Gladia's transcription API can detect when there are multiple speakers, add timestamps, detect languages and switch from one to another if needed. It can also automatically add punctuation and casing.

What file formats does Gladia support?

The Gladia transcription API is compatible with SRT and VTT files for companies that want to generate subtitles.

What is Gladia's long-term vision?

Gladia's long-term vision is to augment audio with intelligence, moving from 2D to 3D data.

What companies does Gladia currently work with?

Gladia currently works with call center companies, virtual meeting services, and video publishers, including Claap, Livestorm, and Selectra.

What are Gladia's plans for future features?

Gladia aims to enable summarization of the content of an audio file, categorize content into multiple topic categories, create chapters automatically, conduct sentiment analysis, and much more.

Please note that the FAQs provided on this page are based on the news article published. While we strive to provide accurate and up-to-date information, it is always recommended to consult relevant authorities or professionals before making any decisions or taking action based on the FAQs or the news article.

Advait Gupta
Advait is our expert writer and manager for the Artificial Intelligence category. His passion for AI research and its advancements drives him to deliver in-depth articles that explore the frontiers of this rapidly evolving field. Advait's articles delve into the latest breakthroughs, trends, and ethical considerations, keeping readers at the forefront of AI knowledge.
