Comedian Sarah Silverman and authors Richard Kadrey and Christopher Golden have recently taken legal action against Meta Platforms (the parent company of Facebook) and OpenAI, claiming that their copyrighted content was used without authorization to train artificial intelligence (AI) language models. These large language models, developed by Meta and OpenAI, are designed to replicate human conversation and automate tasks.
The lawsuits, filed in San Francisco federal court, shed light on the legal challenges faced by chatbot developers who rely on copyrighted material to create realistic responses. The plaintiffs allege that Meta and OpenAI used their books without permission by copying texts from illegal online shadow libraries that feature thousands of books. The lawsuit against Meta references the company’s research paper titled LLaMA: Open and Efficient Foundation Language Models, published in February 2023. According to the authors, this paper reveals that Meta used copyrighted materials in its training data, because many of the plaintiffs’ books appear in the dataset the paper describes.
In the case against OpenAI, the authors argue that summaries of their work generated by ChatGPT demonstrate that the bot was trained on their copyrighted content. As evidence, the complaint includes a ChatGPT-generated summary of Sarah Silverman’s memoir The Bedwetter. The plaintiffs are seeking damages and injunctive relief on behalf of a nationwide class of copyright owners who believe their works were infringed.
However, the plaintiffs may face a hurdle in the 2nd Circuit’s decision in Authors Guild v. Google. That case examined whether converting printed copyrighted books into an online searchable database qualified as fair use. The court held that Google’s unauthorized digitizing of copyright-protected works, creation of a search functionality, and display of snippets from those works were non-infringing fair uses under U.S. copyright law. It considered the purpose behind the copying, the limited public display, and the lack of significant market substitution. This ruling may have implications for a fair use defense by Meta and OpenAI in the current lawsuits.
These lawsuits underscore the potential legal consequences faced by developers who use copyrighted material without proper authorization. Attorneys for the plaintiffs argue that a significant portion of the training datasets used by OpenAI and Meta consists of copyrighted works, including books written by the plaintiffs, which were copied without consent, credit, or compensation. The lawsuits aim not only to protect the plaintiffs’ rights to their copyrighted material but also to begin establishing appropriate boundaries for artificial intelligence and the role of copyright law in its development.
In conclusion, the lawsuits brought by Sarah Silverman, Richard Kadrey, and Christopher Golden against Meta and OpenAI highlight the ongoing legal debates surrounding the training of AI language models with copyrighted content. As these cases unfold, the outcome may impact the future development of AI and the rights of copyright owners.