Microsoft's artificial intelligence (AI) division has accidentally exposed 38TB of private data through an unsecured Azure storage account. The leak originated in a public GitHub repository belonging to the division, and according to security firm Wiz, whose researchers discovered the issue, it was caused by a single excessively permissive Shared Access Signature (SAS) token that allowed full control over the shared files.
The leaked data included disk backups of two employees' workstations, private keys, passwords and more than 30,000 internal Microsoft Teams messages. Microsoft's AI researchers had shared the files using SAS tokens, an Azure feature that lets data in Azure Storage accounts be shared via a signed URL.
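For context, this is roughly how a SAS URL is generated with the azure-storage-blob Python SDK when it is scoped the way Microsoft recommends: read-only, limited to a single container, and short-lived. The account, container and key names below are placeholders for illustration, not the ones involved in the incident.

```python
from datetime import datetime, timedelta, timezone

from azure.storage.blob import ContainerSasPermissions, generate_container_sas

# Placeholder values -- not the actual account or container from the incident.
ACCOUNT_NAME = "examplestorage"
CONTAINER_NAME = "ai-models"
ACCOUNT_KEY = "<storage-account-key>"

# A narrowly scoped token: read/list only, limited to one container,
# and valid for a short window.
sas_token = generate_container_sas(
    account_name=ACCOUNT_NAME,
    container_name=CONTAINER_NAME,
    account_key=ACCOUNT_KEY,
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=24),
)

# Anyone holding this URL can read blobs in the container until the token expires.
share_url = f"https://{ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER_NAME}?{sas_token}"
print(share_url)
```

Anyone who receives such a URL can access the data without any further authentication, which is why the scope and expiry of the token matter so much.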
Microsoft has said that no customer data was exposed, no other internal services were put at risk, and no customer action is required. The company says it has investigated and remediated the incident, which involved an employee sharing a URL for a blob store in a public GitHub repository while contributing to open-source AI learning models.
Researchers at Wiz reported the issue to the Microsoft Security Response Center (MSRC) on June 22.
"This URL included an overly-permissive Shared Access Signature (SAS) token for an internal storage account. Security researchers at Wiz were then able to use this token to access information in the storage account. Data exposed in this storage account included backups of two former employees’ workstation profiles and internal Microsoft Teams messages of these two employees with their colleagues," the company wrote in a blog post.
MSRC worked with the research and engineering teams to revoke the SAS token and prevent all external access to the storage account, mitigating the issue on June 24.
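A SAS token signed with a storage account key cannot be revoked individually; the standard remediation is to rotate the key that signed it, which invalidates every SAS issued with that key. A minimal sketch of that step with the azure-mgmt-storage Python SDK follows, assuming the token was signed with key1; the subscription, resource group and account names are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

# Placeholder identifiers -- substitute real values for your environment.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
STORAGE_ACCOUNT = "<storage-account>"

client = StorageManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Regenerating key1 invalidates every SAS token that was signed with it.
client.storage_accounts.regenerate_key(
    RESOURCE_GROUP,
    STORAGE_ACCOUNT,
    {"key_name": "key1"},
)
```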
"Our investigation concluded that there was no risk to customers as a result of this exposure," the company added.
The Wiz researchers noted that their scan surfaced a GitHub repository named robust-models-transfer under the Microsoft organization. The repository belongs to Microsoft's AI research division and provides open-source code and AI models for image recognition, and readers of the repository were instructed to download the models from an Azure Storage URL.
"This case is an example of the new risks organizations face when starting to leverage the power of AI more broadly, as more of their engineers now work with massive amounts of training data. As data scientists and engineers race to bring new AI solutions to production, the massive amounts of data they handle require additional security checks and safeguards," Wiz Research wrote in a blog post.
However, this URL allowed access to more than just the open-source models: the SAS token it carried was scoped to the entire storage account, exposing additional private data by mistake.
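By way of contrast with the narrowly scoped token above, the sketch below shows what an over-permissive account-level SAS looks like when generated with the same SDK: every permission, every resource type in the account, and a very distant expiry. This is illustrative only, with placeholder values, and is not the actual token from the incident.

```python
from datetime import datetime, timedelta, timezone

from azure.storage.blob import AccountSasPermissions, ResourceTypes, generate_account_sas

# Placeholder values -- for illustration only.
ACCOUNT_NAME = "examplestorage"
ACCOUNT_KEY = "<storage-account-key>"

# An over-permissive token: full permissions across every resource type in the
# account, valid for years. A URL carrying a token like this exposes the whole
# storage account, not just the files it was meant to share.
risky_sas = generate_account_sas(
    account_name=ACCOUNT_NAME,
    account_key=ACCOUNT_KEY,
    resource_types=ResourceTypes(service=True, container=True, object=True),
    permission=AccountSasPermissions(
        read=True, write=True, delete=True, list=True,
        add=True, create=True, update=True,
    ),
    expiry=datetime.now(timezone.utc) + timedelta(days=365 * 10),
)
```

A token scoped this broadly grants access to everything stored under the account, which is how a link intended to share AI models could also expose workstation backups and internal messages.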
"Our scan shows that this account contained 38TB of additional data — including Microsoft employees’ personal computer backups. The backups contained sensitive personal data, including passwords to Microsoft services, secret keys, and over 30,000 internal Microsoft Teams messages from 359 Microsoft employees," the blog post added.