🎯 The Big Picture
Microsoft AI, the tech giant’s research lab, announced on Thursday the release of three foundational AI models that can generate text, voice, and images. The release signals Microsoft’s continued push to build out its own stack of multimodal AI models and compete with rival AI labs, even though it remains tied to OpenAI.
📖 What Happened
MAI-Transcribe-1 transcribes speech into text across 25 languages and is 2.5 times faster than Microsoft’s Azure Fast offering, according to a company press release. MAI-Voice-1 is an audio-generating model that lets users generate 60 seconds of audio in one second and create a custom voice.
MAI-Image-2 is a video-generating model. It was originally released on March 19 on MAI Playground, new software for testing large language models. Now all three models are being released on Microsoft Foundry, and the transcription and voice models are available in MAI Playground as well.
The models were developed by Microsoft’s MAI Superintelligence team, an AI research team that was formed and announced in November 2025 and is led by Mustafa Suleyman, the CEO of Microsoft AI. “At Microsoft AI, we’re building Humanist AI. We have a distinct view when creating our AI models: putting humans at the center, optimizing for how people actually communicate, training for practical use,” Suleyman wrote in the blog post.
🚀 Why It Matters
“You’ll see more models from us soon in Foundry and directly in Microsoft products and experiences.” In an increasingly crowded LLM market, MAI hopes a selling point for these models is that they are cheaper than those from Google and OpenAI, the company wrote in the blog post.
📰 Source: TechCrunch AI 🔗

