Microsoft Teams To Translate Your Speech To Foreign Language In Real-Time During Calls With Voice Cloning Feature
Microsoft said, “Imagine being able to sound just like you in a different language. The Interpreter agent in Teams provides real-time speech-to-speech translation during meetings.”
Microsoft has unveiled plans to introduce a voice-cloning feature in Teams that will allow users to replicate their voices for communication in multiple languages during meetings. Announced at Microsoft Ignite 2024, this innovative tool, called Interpreter in Teams, offers real-time speech-to-speech translation. Starting in early 2025, Teams users will have the ability to simulate their voices in up to nine languages: English, French, German, Italian, Japanese, Korean, Portuguese, Mandarin Chinese, and Spanish.
This feature aims to enhance multilingual collaboration and bridge communication gaps in virtual meetings.
In a blog post, Microsoft CMO Jared Spataro wrote, “Imagine being able to sound just like you in a different language. The Interpreter agent in Teams provides real-time speech-to-speech translation during meetings, and you can opt to have it simulate your speaking voice for a more personal and engaging experience,” as reported by TechCrunch.
Microsoft has shared limited specifics about the feature, which will be exclusively available to Microsoft 365 subscribers. However, the company clarified that the tool does not retain any biometric data, avoids adding artificial sentiments beyond those naturally present in the user's voice, and can be turned off via the Teams settings.
A Microsoft spokesperson further said, “Interpreter is designed to replicate the speaker’s message as faithfully as possible without adding assumptions or extraneous information. Voice simulation can only be enabled when users provide consent via a notification during the meeting or by enabling ‘Voice simulation consent’ in settings.”
Can This Feature Be Misused?
Deepfakes have rapidly proliferated across social media, blurring the line between reality and misinformation. This year alone, deepfakes featuring figures like President Joe Biden, Taylor Swift, and Vice President Kamala Harris have garnered millions of views and shares. Additionally, deepfake technology has been weaponized in personal scams, such as impersonating family members. The FTC reported that impersonation scams resulted in over $1 billion in losses last year.
In one alarming case, cybercriminals used deepfake technology to stage a Teams meeting that appeared to include a company’s executives, deceiving the company into transferring $25 million. Concerns about such risks and public perception led OpenAI to withhold the release of its voice cloning tool, Voice Engine, earlier this year.
Based on the details shared so far, Interpreter in Teams appears to have a fairly specific use case for voice cloning. However, that doesn’t eliminate the potential for misuse. For instance, a malicious user could input a deceptive recording, such as a request for sensitive banking details, and use the tool to generate a translation in their target’s language, opening the door to possible exploitation. Microsoft may share more information about these safeguards in the coming months.