OpenAI recently unveiled an enhanced ChatGPT Plus that incorporates a real-time, natural-sounding Voice Mode. Although the feature drew controversy when it was first introduced, OpenAI has since made significant internal updates and is now gradually releasing Voice Mode to select users.
This advanced Voice Mode is meant to support more fluid, real-time dialogue in the ChatGPT mobile app on both iOS and Android. It relies on the voice capabilities of OpenAI’s GPT-4o model, with the goal of letting ChatGPT hold conversations that more closely resemble human interaction.
With the new voice feature, ChatGPT can now pick up on tonal cues such as humor and sarcasm, which makes conversations feel more natural. Because the model handles speech directly rather than first converting it to text, the upgrade significantly reduces interaction lag and makes dialogue smoother.
First shown at OpenAI’s Spring Update event in May 2024, Voice Mode included a voice named Sky that resembled Scarlett Johansson’s. After objections from Johansson herself and legal scrutiny, OpenAI removed the Sky voice and pulled the feature back so that it would not be perceived as an imitation.
After extensive work on the safety and credibility of voice interaction, the system now offers four preset voices, a limit intended to keep it from replicating celebrity voices. Safeguards are also in place to prevent the AI from fulfilling requests that involve violent or copyrighted content.
Selected ChatGPT Plus users will be notified by email when they receive access to the new Voice Mode, and OpenAI plans to extend the rollout to all Plus subscribers by fall. Demonstrations have shown the feature at work in scenarios such as education, fashion advice, and assistance for visually impaired users, illustrating the benefits of natural spoken communication with AI.
As concerns about the misuse of AI for fraud and impersonation grow, OpenAI remains committed to ensuring responsible use of Voice Mode. Although the feature does not allow voice cloning, it could still be used deceptively, especially in settings where listeners may not immediately realize they are hearing an AI.
Mira Murati, OpenAI’s Chief Technology Officer, highlighted the collaborative potential of the feature and the company’s dedication to developing useful and cooperative AI. OpenAI’s safety protocols were tested by more than 100 external reviewers across 45 languages, which the company presents as evidence of its commitment to maintaining safety standards.
The release sets OpenAI apart from competing offerings such as Meta’s Llama model and Anthropic’s Claude, and it also challenges startups that specialize in expressive voice AI, such as Hume. By prioritizing safety and practical utility, OpenAI aims to give users a more immersive and engaging conversational experience.