OpenAI, the organization behind the popular language model ChatGPT, has been sued by authors alleging that the AI system was trained on their copyrighted works without their consent, credit, or compensation. The lawsuit, brought by science fiction and horror author Paul Tremblay and novelist Mona Awad, was filed in federal court in San Francisco and seeks class action status.
The authors claim that ChatGPT, which is capable of generating summaries of their works, likely used their writings as training data without permission. They believe that their works were included in the online book datasets mentioned in OpenAI’s 2020 paper introducing GPT-3, the language model powering ChatGPT. The suit alleges that these datasets drew on material from shadow library websites such as Library Genesis and Sci-Hub, known for illegally distributing copyrighted works through torrent downloads.
According to the lawsuit, AI researchers have shown interest in these illegal shadow libraries for some time. OpenAI has not yet responded to the allegations raised in the lawsuit.
This legal action against OpenAI follows a series of lawsuits challenging the training data and usage of AI tools. Getty Images, a photo service, banned AI-generated images last year and later sued Stability AI, the company behind the AI art generator Stable Diffusion, for allegedly copying over 12 million images from its database without permission or compensation.
In January, three artists filed a lawsuit against Stability AI (the maker of Stable Diffusion), Midjourney (another art generator), and art hosting site DeviantArt, claiming that their work was used to train AI models without consent or compensation. The artists argued that many other artists had been similarly affected. Responding to these concerns, Adobe released Firefly, a generative AI toolset that draws on the company’s own stock image library to create images, an approach intended to avoid infringing on artists’ works. Adobe plans to integrate Firefly into its other software products, including Photoshop.
Integrating AI into the publishing process has encountered other obstacles as well. The US Copyright Office denied copyright protection for the AI-generated art in a graphic novel, though it granted protection to the human-created writing. In addition, AI-generated submissions flooded short story publications to the point where Clarkesworld, a renowned outlet, prohibited the submission of anything even partially created with AI.
The legal dispute between OpenAI and the authors highlights the ongoing challenges surrounding the use of copyrighted material in AI training. As the AI landscape continues to evolve, these issues will likely require further attention and resolution.