AI Firms Face Legal Battles as Writers Demand Compensation
Writers and artists are increasingly seeking compensation from AI firms, claiming that their copyrighted works have been used to train generative AI models without consent or credit. Last week, the Authors Guild sent an open letter signed by over 9,000 writers, including renowned authors like George Saunders and Margaret Atwood, urging companies such as Alphabet, OpenAI, Meta, and Microsoft to fairly compensate writers for the use of their copyrighted materials in AI training. While some have resorted to open letters and social media posts, others are escalating their efforts through lawsuits.
The undisclosed training data used for large language models (LLMs) and other generative AI systems has sparked concern among writers and visual artists, who have noticed similarities between their work and the output these systems generate. Many have called on generative AI companies to disclose their data sources and compensate the creators whose works were used. Focusing solely on copyright law, however, may overlook broader issues raised by AI, such as employment, compensation, privacy, and uncopyrightable characteristics.
One of the most prominent recent lawsuits involves comedian Sarah Silverman, who, along with four other authors, is suing OpenAI for allegedly training its popular ChatGPT system on their works without permission. The lawsuits, filed as class actions by the Joseph Saveri Law Firm, which specializes in antitrust litigation, claim copyright infringement. In a related case the firm brought against the maker of an image-generation system, US district court judge William Orrick suggested that more evidence was needed to support the claims, as the system had been trained on an extensive dataset of five billion compressed images.
The Silverman case may set a precedent for how the law treats the datasets used to train AI models. Emory University law professor Matthew Sag suggests it could influence whether companies can claim fair use when their models scrape copyrighted material. Although the outcome remains uncertain, Sag believes this lawsuit is the most compelling among those filed. OpenAI has not responded to requests for comment on the matter.
The underlying argument in these cases is that LLMs copied protected works, according to Sag. However, in testimony to a US Senate subcommittee, he clarified that models like GPT-3.5 and GPT-4 do not copy work in the traditional sense; instead, they process and learn from the training data, much as a student learns by studying. This distinction highlights the complexity of the issue.
While these legal battles shed light on the vital question of compensation for artists and writers, copyright law alone may not sufficiently address the multifaceted challenges posed by AI. Discussions of AI's impact on society warrant a broader perspective that looks beyond copyright infringement. As demands for transparency and fairness grow, companies, creatives, and legal authorities must engage in dialogue to find equitable solutions to the concerns raised by AI's rapid development and its use of copyrighted materials.