OpenAI announced today that it is rolling out Advanced Voice Mode to a select group of paid ChatGPT users, giving them more natural, real-time conversations. In the new mode, ChatGPT responds in real time, can be interrupted mid-reply, and can sense and respond to humor, sarcasm, and more. Unlike the current ChatGPT voice feature, Advanced Voice Mode does not need to convert speech to text and back, which results in lower-latency interactions.
In May, OpenAI demonstrated Advanced Voice Mode with an AI voice named Sky, which bore a striking resemblance to Scarlett Johansson’s voice. Johansson released a statement expressing shock, anger, and disbelief, and revealed that she had turned down multiple offers from OpenAI CEO Sam Altman. OpenAI said the resemblance was unintended, but it removed the Sky voice after Johansson hired legal counsel.
Since the demo, OpenAI has focused on improving the safety and quality of voice conversations. Advanced Voice Mode now offers four preset voices and includes safeguards that prevent it from mimicking celebrity voices. OpenAI has also implemented measures to block requests for violent or copyrighted content. The early tests with select users will help refine the feature before its broader release.
Users granted access to Advanced Voice Mode will receive an email with instructions, and OpenAI plans to gradually add more users. All Plus subscribers will have access to Advanced Voice Mode in the fall.
How to Use Advanced Voice Mode
Start a Conversation:
- Select the voice icon next to the mic icon.
Manage the Conversation:
- Mute/unmute microphone: Click the microphone icon.
- End conversation: Press the red icon at the bottom right.
Switch Modes:
- Switch between standard and Advanced Voice Mode: Select the option from the top center.
Usage Limits:
- Daily limits apply to audio inputs and outputs.
- The app shows a warning when three minutes remain.
- The conversation ends when the limit is reached; you can then switch to standard Voice Mode.
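
The usage-limit behavior described above can be pictured as a small state machine. The following is only an illustrative sketch, not OpenAI's implementation: the class `VoiceSession`, the method `record_audio`, the event names, and the 600-second limit in the usage lines are all hypothetical, since OpenAI has not published the actual daily allowance. Only the three-minute warning threshold comes from the description above.

```python
from dataclasses import dataclass
from typing import List

# From the article: the app warns when three minutes remain.
WARNING_THRESHOLD_SECONDS = 3 * 60


@dataclass
class VoiceSession:
    # Hypothetical value supplied by the caller; the real daily limit is not stated.
    daily_limit_seconds: float
    used_seconds: float = 0.0
    mode: str = "advanced"
    warned: bool = False

    def remaining(self) -> float:
        """Seconds of Advanced Voice Mode audio left today (never negative)."""
        return max(self.daily_limit_seconds - self.used_seconds, 0.0)

    def record_audio(self, seconds: float) -> List[str]:
        """Count audio usage and return any events it triggers."""
        events: List[str] = []
        if self.mode != "advanced":
            # Assumption: standard Voice Mode is not metered in this sketch.
            return events
        self.used_seconds += seconds
        if self.remaining() == 0:
            # The advanced conversation ends; fall back to standard Voice Mode.
            self.mode = "standard"
            events.append("limit_reached")
        elif self.remaining() <= WARNING_THRESHOLD_SECONDS and not self.warned:
            # Warn only once per day.
            self.warned = True
            events.append("warning")
        return events


session = VoiceSession(daily_limit_seconds=600)  # hypothetical 10-minute limit
print(session.record_audio(400))  # 200 s left, above the warning threshold
print(session.record_audio(30))   # 170 s left, triggers the warning
print(session.record_audio(200))  # limit reached, session falls back to standard
```

In this sketch the fallback to standard mode happens automatically; in the app, the switch is something the user does after the advanced conversation ends.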