Microsoft has unveiled three AI models it developed in-house, VentureBeat reported on Wednesday.
The company introduced the speech transcription model MAI-Transcribe-1, the voice generation model MAI-Voice-1 and the image generation model MAI-Image-2. All three are available through Microsoft Foundry and MAI Playground.
According to the report, MAI-Transcribe-1 posted an average word error rate of 3.8 percent across the top 25 languages on FLEURS, a multilingual speech recognition benchmark. It outperformed OpenAI's Whisper-large-v3 in all 25 languages and beat Google's Gemini 3.1 Flash in 22 of them. Microsoft is testing the model for Copilot's voice mode and for meeting transcription in Teams.
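For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below illustrates that standard definition only. The function name `word_error_rate` is hypothetical, it assumes plain whitespace tokenization with no text normalization, and it is not the evaluation pipeline behind the reported FLEURS figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    ref = reference.split()  # assumption: whitespace tokenization, no normalization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a rate of 3.8 percent corresponds to roughly four word errors per 100 reference words.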
MAI-Voice-1 generates 60 seconds of speech in 1 second and can clone a person's voice from only a few seconds of sample audio. It costs $22 per 1 million characters.
MAI-Image-2 ranked in the top three on the leaderboard of Arena.ai, an AI model evaluation platform, and generates images more than twice as fast as its predecessor. It is priced at $5 per 1 million tokens for text input and $33 per 1 million tokens for image output. WPP, a major advertising group, has signed on as an early enterprise partner.
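The per-million rates above translate into a bill by simple proportion: usage divided by 1 million, multiplied by the rate. The following is a minimal sketch of that arithmetic using only the prices quoted in the report. The dictionary keys and the function name `estimate_cost` are hypothetical labels for illustration, not identifiers from any Microsoft API, and the usage figures in the note below are made up.

```python
from decimal import Decimal

# Prices in US dollars per 1 million units, as quoted in the report.
# The key names are illustrative labels, not official product identifiers.
PRICE_PER_MILLION = {
    "voice_characters": Decimal("22"),        # MAI-Voice-1
    "image_text_input_tokens": Decimal("5"),  # MAI-Image-2, text input
    "image_output_tokens": Decimal("33"),     # MAI-Image-2, image output
}


def estimate_cost(usage: dict[str, int]) -> Decimal:
    """Return the estimated cost in dollars for the given unit counts."""
    total = Decimal("0")
    for unit, count in usage.items():
        total += Decimal(count) / Decimal(1_000_000) * PRICE_PER_MILLION[unit]
    return total
```

For instance, `estimate_cost({"voice_characters": 500_000})` comes to $11, and 200,000 text input tokens plus 1 million image output tokens come to $34.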
One notable aspect of the development effort is how small the teams were. Mustafa Suleyman, head of Microsoft AI, said, "The voice model was built by 10 people, and most of the improvements in speed, efficiency and accuracy came from the model architecture and data. The image team is also fewer than 10 people."
Microsoft is also pricing the models aggressively. Suleyman said, "It is priced cheaper than Amazon, Google and others, and that was an intentional decision." Microsoft shares are down about 17 percent so far this year, which has increased investor pressure to monetise its AI investments.
Suleyman said Microsoft will also develop its own large language model (LLM). He said, "The goal is to provide state-of-the-art models at the highest efficiency and the lowest cost when Microsoft needs them and to be able to become fully independent."