Microsoft has unveiled an update to its Copilot Audio Expressions feature that adds a new scripted mode powered by its in-house MAI-Voice-1 AI model. The mode lets users generate high-fidelity, emotionally nuanced audio narration from written text, with applications in education, entertainment, and content creation.
In scripted mode, users enter text and choose a vocal style, such as vampire, dragon, or witch, to match the tone and atmosphere of the content, which makes the mode well suited to storytelling. Additionally, Story Mode offers multiple vocal styles, making it a good fit for children’s stories or educational content.
MAI-Voice-1, Microsoft’s in-house speech generation model, is designed for efficiency and expressiveness. Microsoft says it can generate a full minute of audio in under a second on a single GPU while delivering high-quality, emotionally expressive speech. The model powers not only the new scripted mode but also other Copilot features such as Copilot Daily and Podcasts.
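To put the throughput figure in perspective, the short Python sketch below works out the implied real-time factor. It is illustrative arithmetic only: the function name is hypothetical, and because the claim is "under a second", the 1.0-second generation time used here is an upper bound, so the result is a lower bound on the speedup.

```python
def real_time_factor(audio_seconds: float, generation_seconds: float) -> float:
    """Return seconds of audio produced per second of compute time."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds


# Assumption: 60 s of audio generated in at most 1.0 s on a single GPU.
# Using the 1.0 s upper bound gives the minimum speedup over real time.
min_speedup = real_time_factor(60.0, 1.0)
print(f"At least {min_speedup:.0f}x faster than real time")
```

In other words, a ten-minute narration would need roughly ten seconds of GPU time or less, if the stated rate holds for longer passages.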
The integration of MAI-Voice-1 into Copilot Audio Expressions marks a significant step forward in AI-driven voice synthesis, offering users enhanced control over audio output and expanding the creative possibilities for voice-based content.