OpenAI’s ChatGPT Faces Allegations of Massive Data Theft, Lawsuit Claims $3 Billion in Damages
OpenAI Inc., the creator of ChatGPT, is facing a lawsuit accusing the company of engaging in extensive data theft for training its artificial intelligence models. A group of anonymous users filed a lawsuit against OpenAI, claiming that the company has unlawfully collected personal information without consent, in pursuit of profits. The lawsuit, seeking class-action status, was filed by the Clarkson Law Firm in federal court in San Francisco, estimating potential damages at $3 billion.
According to the 157-page complaint, the plaintiffs argue that OpenAI breached privacy regulations by secretly harvesting 300 billion words from the internet, including personal information, without proper consent. The lawsuit alleges that OpenAI collects private information through users’ interactions with products and services integrated with ChatGPT, with the intent to gain an advantage in the AI arms race. Some examples cited in the suit include the collection of picture and location data from Snapchat, music choices from Spotify, financial information from Stripe, and private chats from Slack and Microsoft Teams.
The lawsuit references several statutes, including the Computer Fraud and Abuse Act, an anti-hacking law previously utilized in scraping disputes. The allegations include invasion of privacy, theft, unjust enrichment, and breaches of the Electronic Communications Privacy Act.
The plaintiffs further claim that OpenAI has abandoned its founding principle of developing artificial intelligence for the benefit of humanity as a whole. The complaint cites an estimated $200 million in ChatGPT revenue for 2023, underscoring what it characterizes as a pursuit of profit. Accordingly, it asks the court to temporarily halt commercial access to, and further development of, OpenAI’s products.
Italy previously imposed a temporary ban on ChatGPT, citing inadequate data protection under Europe’s General Data Protection Regulation (GDPR), particularly with respect to children’s privacy. The current complaint focuses mainly on OpenAI’s vague privacy policy for existing users and on data scraped from the web without explicit consent, data from which OpenAI has profited without compensating its sources.
However, the outcome of the case remains uncertain. The nature of the internet and the ideal of a free and open web often clash with the terms and conditions set by online platforms. Users routinely grant platforms broad licenses to use their uploaded content, which makes it difficult for ordinary consumers to claim compensation when that data is used to train models.
In the age of digital contributions, almost anyone who has been active online in recent decades has likely contributed data to OpenAI’s training corpus. OpenAI’s language models may therefore incorporate portions of individuals’ data gathered through silent scraping, so that a person’s own words can reappear in model outputs as if freshly generated.
Ryan Clarkson, managing partner at the firm suing OpenAI, argues that all of this information was taken at scale despite never having been intended for use by a large language model. Whether courts will ultimately deem such data acquisition theft remains an open question.
While generative AI applications like ChatGPT have sparked great interest in their potential, they have also ignited concerns over privacy and disinformation. Policymakers are currently weighing the possibilities and risks of artificial intelligence, including its impact on creative industries and on the public’s ability to distinguish fact from fiction. OpenAI CEO Sam Altman himself recently called for AI regulation. The lawsuit adds to these concerns, alleging that OpenAI ran a massive, clandestine web-scraping operation in violation of terms-of-service agreements as well as state and federal privacy and property laws.