Comedian Sarah Silverman, along with authors Christopher Golden and Richard Kadrey, has filed a class-action lawsuit against tech company OpenAI, accusing it of copyright infringement for using their books to train its ChatGPT software. The lawsuit claims that OpenAI accessed databases of copyrighted works without permission in order to train its language models.
ChatGPT is an AI program that generates text responses to user prompts, aiming to deliver natural-sounding dialogue. To achieve this, it needs to be trained on large datasets of text. The lawsuit alleges that OpenAI used copyrighted material, including Sarah Silverman’s book The Bedwetter, without obtaining the necessary consent.
The complaint states that when ChatGPT was asked to summarize Silverman’s book, it produced a largely accurate summary, which the plaintiffs say implies that the book was part of the training dataset. Although certain details in the summary were incorrect, the complaint argues that its overall accuracy indicates ChatGPT retained knowledge of the copyrighted material used during training.
The lawsuit suggests that OpenAI drew its training material from online book databases, including Project Gutenberg, which hosts titles whose copyrights have expired. It also claims that a second dataset used to train ChatGPT consists of copyrighted titles taken from "shadow library" websites that illegally share copyrighted works.
Silverman and her co-plaintiffs accuse OpenAI of various copyright violations, as well as unjust enrichment and negligence. A separate complaint has also been filed against Meta, the parent company of Facebook and Instagram, over similar conduct involving its LLaMA language models.
The plaintiffs’ allegations raise concerns about the unauthorized use of copyrighted material to train AI systems. As the legal battle unfolds, the case could significantly affect how AI technologies are trained and how far tech companies are responsible for obtaining permission to use copyrighted works.
It remains to be seen how the court will rule on this matter and whether it will establish guidelines for the use of copyrighted material in training AI systems. The outcome of the lawsuit could have broader implications for the field of artificial intelligence and the ethical considerations surrounding the use of copyrighted content without consent.