Comedian and author Sarah Silverman, along with authors Christopher Golden and Richard Kadrey, has filed lawsuits against OpenAI and Meta, alleging copyright infringement. The lawsuits claim that OpenAI’s ChatGPT and Meta’s LLaMA were trained on datasets that contained their works, which were allegedly obtained illegally from shadow library websites. These websites, such as Bibliotik, Library Genesis, and Z-Library, offer books in bulk through torrent systems.
The plaintiffs argue that ChatGPT, when prompted, generates summaries of their copyrighted works, which they say indicates that it was trained on their materials. They further allege that the chatbot’s output does not reproduce any of the copyright management information included in their published works. In the case against Meta, the authors claim that their books were accessible in datasets used to train Meta’s LLaMA models.
Sarah Silverman holds a registered copyright for her book The Bedwetter, while Christopher Golden and Richard Kadrey own registered copyrights for multiple books, including Ararat and Sandman Slim, respectively. Both lawsuits state that the authors did not consent to the use of their copyrighted works as training material for the companies’ AI models.
Each lawsuit contains six counts, covering various types of copyright violations as well as negligence, unjust enrichment, and unfair competition. The authors seek statutory damages, restitution of profits, and other appropriate remedies. Meta and OpenAI have not commented on the lawsuits.
The allegations primarily revolve around the accusation that OpenAI’s ChatGPT and Meta’s LLaMA were trained on illegally obtained datasets containing the copyrighted works of the plaintiffs. The availability of these books on shadow library websites, where they are accessible through torrent systems, forms the basis of the lawsuits.
Silverman, Golden, and Kadrey contend that ChatGPT’s ability to generate summaries of their copyrighted works implies that it was trained on their specific materials. Additionally, they highlight the lack of reproduction of any copyright management information by the chatbot. In a separate lawsuit against Meta, the presence of the authors’ books in the datasets used to train LLaMA is emphasized.
The lawsuits highlight the registered copyrights held by the authors and state that the authors did not consent to the use of their works for AI model training. The legal action covers multiple claims, including copyright violations, negligence, unjust enrichment, and unfair competition. The plaintiffs are seeking various forms of compensation, including statutory damages and restitution of profits.
OpenAI and Meta have refrained from commenting on the lawsuits so far. The cases address the central claim that ChatGPT and LLaMA were trained on datasets containing the plaintiffs’ copyrighted works, which were purportedly acquired illegally from shadow library websites offering books in bulk via torrent systems.