Prominent authors Paul Tremblay and Mona Awad have taken legal action against OpenAI, an artificial intelligence (AI) firm, accusing the company of copyright infringement. The authors claim that OpenAI used their literary works to train its language model, ChatGPT, without obtaining their explicit consent.
ChatGPT is built on a large language model, which develops its capabilities by being trained on large volumes of text collected into a training dataset. This process has raised serious legal concerns, as Tremblay and Awad assert that OpenAI incorporated their copyrighted content into ChatGPT’s training data without permission.
Both Tremblay and Awad, who reside in Massachusetts, hold registered copyrights for their respective works. The lawsuit argues that OpenAI has benefitted commercially and financially from the use of their copyrighted materials through ChatGPT.
The authors’ complaint further claims that ChatGPT generates summaries of their copyrighted works, which would only be possible if the language model had been trained on their texts. The complaint also cites a paper OpenAI published in June 2018, which described training the GPT-1 model on a dataset of over 7,000 unique unpublished books from a variety of genres.
This lawsuit marks the first of its kind against OpenAI regarding copyright law, according to Andres Guadamuz, an intellectual property law expert at the University of Sussex. Joseph Saveri and Matthew Butterick, legal counsel for Tremblay and Awad, argue that books are ideal training tools for large language models due to their well-edited, high-quality, long-form prose, which serves as the gold standard of idea storage.
The filed complaint alleges that OpenAI negligently collected and controlled the authors’ copyrighted works and developed systems, such as ChatGPT, trained on those works without authorization. Tremblay and Awad seek statutory and additional damages in their lawsuit.
Fox News Digital reached out to OpenAI for comment but had not received a response at the time of reporting.
Given the vast amount of material used to train AI systems, it is likely that more cases of copyright infringement in the AI industry will surface in the future.