According to blog.google and deepmind.google, Google has released Gemini 3.1 Flash TTS, a new AI speech model that introduces granular audio tags for precise control over voice generation. The model was announced on April 15, 2026, by Senior Product Manager Vilobh Meshram and Principal Research Engineer Max Gubin on behalf of the Gemini team.
According to both sources, the model improves on the speech quality of previous versions and adds audio tags, natural-language commands that let users control vocal style, pace, and delivery. It supports more than 70 languages.
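To illustrate the idea of steering delivery with audio tags, here is a minimal Python sketch that assembles a tagged script as plain text. The announcements summarized here do not spell out the tag syntax, so the square-bracket format, the tag names, and the `Segment` and `render` helpers are all assumptions made for illustration; the sketch does not call any Google API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One line of a script plus the delivery tags that should precede it."""
    text: str
    tags: Tuple[str, ...] = ()


def render(segments: List[Segment]) -> str:
    """Join segments into one prompt string, prefixing each with its tags.

    The "[tag] " bracket syntax is hypothetical, not confirmed by the sources.
    """
    parts = []
    for seg in segments:
        prefix = "".join(f"[{tag}] " for tag in seg.tags)
        parts.append(prefix + seg.text)
    return " ".join(parts)


script = render([
    Segment("Welcome back to the show.", tags=("cheerful",)),
    Segment("Tonight's story is a strange one.", tags=("whispering", "slow")),
    Segment("Let's begin."),
])
print(script)
```

A string built this way would then be passed to the speech model as the text to speak; the exact request format depends on the access route and is not covered by the sources.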
According to the sources, Gemini 3.1 Flash TTS is available through Google AI Studio, Vertex AI, and Google Vids. In Google AI Studio, developers can tune voices and export the settings for consistent use across applications.
Both announcements highlight the integration of SynthID watermarking technology. All audio generated by Gemini 3.1 Flash TTS carries this watermark, which identifies the content as AI-generated and is intended to help prevent misinformation.
Both sources present the model as a step toward more natural-sounding AI-generated speech that also gives users greater control over expressiveness and delivery.