Microsoft AI Researchers Accidentally Expose Terabytes of Sensitive Data on GitHub


Microsoft AI researchers have inadvertently exposed terabytes of sensitive data, including private keys and passwords, while publishing a storage bucket of open source training data on GitHub. Cloud security startup Wiz made the discovery during its ongoing investigation into accidental exposures of cloud-hosted data.

In a research report shared with TechCrunch, Wiz described a GitHub repository belonging to Microsoft's AI research division. The repository provided open source code and AI models for image recognition and instructed readers to download the models from an Azure Storage URL. However, Wiz found that the URL was misconfigured: it granted permissions to the entire storage account, inadvertently exposing additional private data.

The exposed data totaled 38 terabytes and included backups of two Microsoft employees' personal computers. It also contained other highly sensitive information, such as passwords to Microsoft services, secret keys, and more than 30,000 internal Microsoft Teams messages exchanged among hundreds of employees.

The misconfigured URL, which had been exposing this data since 2020, allowed full control access instead of the intended read-only permissions. This meant that anyone with knowledge of where to find this data could potentially delete, replace, or inject malicious content without detection.

Wiz clarified that the storage account itself was not directly exposed. The vulnerability arose from the Microsoft AI developers including an overly permissive shared access signature (SAS) token in the URL. SAS tokens are a mechanism used by Azure to create shareable links that provide access to an Azure Storage account’s data.
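Because a SAS token carries its permissions, scope, and expiry directly in the URL's query string, an over-permissive link can be spotted by inspecting those parameters. As an illustration only, and not Microsoft's or Wiz's actual tooling, here is a minimal Python sketch that audits the standard SAS query parameters: `sp` (signed permissions), `se` (signed expiry), and `srt` (signed resource types, present only on account-level tokens). The URL and the one-year expiry threshold are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs

def audit_sas_url(url: str) -> list[str]:
    """Return warnings for an Azure Storage SAS URL that looks over-permissive.

    Inspects the standard SAS query parameters:
      sp  - signed permissions (e.g. "r" read-only, "racwdl" near-full control)
      se  - signed expiry (ISO 8601 timestamp)
      srt - signed resource types (account SAS only; "sco" covers the
            service, its containers, and every object in them)
    """
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    warnings = []

    perms = params.get("sp", "")
    # Anything beyond read/list on a shared download link is suspicious.
    if set(perms) - set("rl"):
        warnings.append(f"write-capable permissions granted: sp={perms}")

    # An account-level SAS exposes far more than a single blob or container.
    if "srt" in params:
        warnings.append(f"account-scoped token: srt={params['srt']}")

    expiry = params.get("se")
    if expiry:
        exp = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
        if (exp - datetime.now(timezone.utc)).days > 365:
            warnings.append(f"expiry more than a year away: se={expiry}")
    return warnings

# A hypothetical token resembling the failure mode described above:
# account-scoped, full permissions, and a far-future expiry date.
risky = ("https://example.blob.core.windows.net/models?"
         "sv=2021-08-06&srt=sco&sp=racwdl&se=2051-10-05T00:00:00Z&sig=x")
for w in audit_sas_url(risky):
    print("WARNING:", w)
```

A correctly scoped sharing link would instead carry `sp=r` (read-only), target a single blob, and expire within days, yielding no warnings from a check like this.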


Ami Luttwak, the co-founder and CTO of Wiz, emphasized the challenges faced by developers and engineers working with vast amounts of data in their race to develop new AI solutions. Luttwak highlighted the need for additional security measures and safeguards to protect sensitive information as development teams handle massive data, share it with peers, and collaborate on public open source projects. Cases like the one discovered at Microsoft are becoming increasingly difficult to monitor and prevent.

Following Wiz’s findings, Microsoft was promptly informed on June 22 and took immediate action by revoking the SAS token two days later, on June 24. The company completed its investigation on August 16, assessing any potential impact on its organization.

In a blog post shared with TechCrunch, Microsoft's Security Response Center stated that no customer data was exposed and no other internal services were put at risk. As a result of Wiz's research, Microsoft has expanded GitHub's secret scanning service, which monitors all public open source code changes for plaintext exposure of credentials and other secrets, to also flag any SAS tokens with overly permissive expirations or privileges.
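Secret scanning of this kind typically works by pattern-matching committed text against known credential formats. The sketch below is not GitHub's actual scanner, only an illustrative Python regex that flags query strings shaped like SAS tokens (containing both the `sv=` version and `sig=` signature parameters); the variable names and snippet are hypothetical.

```python
import re

# Match a URL query string that carries both the "sv" (signed version)
# and "sig" (signature) parameters characteristic of Azure SAS tokens.
SAS_PATTERN = re.compile(
    r"""\?                   # start of the query string
        (?=[^\s"']*\bsv=)    # must contain the signed service version
        (?=[^\s"']*\bsig=)   # must contain the signature
        [^\s"']+             # the token itself, up to whitespace/quote
    """,
    re.VERBOSE,
)

def find_sas_tokens(source: str) -> list[str]:
    """Return substrings of source text that look like SAS token query strings."""
    return SAS_PATTERN.findall(source)

snippet = ('MODEL_URL = "https://acct.blob.core.windows.net/data'
           '?sv=2020-08-04&sp=racwdl&sig=abc123"')
print(find_sas_tokens(snippet))
```

A production scanner would go further, e.g. validating the signature format and checking the `sp`/`se` parameters for the over-broad grants mentioned above, but the matching step looks roughly like this.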

Microsoft’s commitment to addressing and rectifying this vulnerability demonstrates its dedication to protecting user data and preventing future incidents.

Overall, this incident underscores the significance of robust security measures in AI development, particularly when handling extensive volumes of sensitive data. Companies must continually strengthen their security protocols to avoid potentially disastrous data breaches and ensure the privacy and security of their users.

Frequently Asked Questions (FAQs) Related to the Above News

What sensitive data was accidentally exposed by Microsoft AI researchers?

The researchers inadvertently exposed 38 terabytes of sensitive data, including personal backups, passwords to Microsoft services, secret keys, and over 30,000 internal Microsoft Teams messages.

How did the exposure occur?

The exposure occurred through a misconfigured Azure Storage URL, which granted permissions to the entire storage account instead of providing read-only access as intended.

How long was the data exposed?

The data had been exposed since 2020 before it was discovered by the cloud security startup Wiz.

What could someone potentially do with the exposed data?

With knowledge of where to find the data, someone could potentially delete, replace, or inject malicious content without detection.

Was the storage account itself directly exposed?

No, the storage account itself was not directly exposed. The vulnerability came from including an overly permissive shared access signature (SAS) token in the URL, which granted access to the data.

How did Microsoft respond to the discovery?

Microsoft was promptly informed and took immediate action by revoking the SAS token. They completed their investigation and enhanced GitHub's secret scanning service to monitor public open source code changes for plaintext exposure of credentials and other secrets.

Was any customer data exposed or internal services put at risk?

No customer data was exposed, and no other internal services were put at risk, according to Microsoft's Security Response Center.

What steps should companies take to prevent incidents like this?

Companies handling extensive volumes of sensitive data should strengthen their security protocols, continuously monitor and prevent data breaches, and ensure the privacy and security of their users.

