ChatGPT Appears to Have Been Trained on Copyrighted Books Such as Harry Potter, New Scientist Reports

Recent research has found that two language models, ChatGPT and its successor GPT-4, appear to have memorised details from a large number of copyrighted books. This raises questions about the legality of how such large language models (LLMs) are created. Both models were developed by the company OpenAI and trained on huge amounts of data. However, it is unclear exactly which texts make up this training data.

To investigate which books the models had seen, David Bamman at the University of California, Berkeley, and his colleagues tested whether the models recalled details of passages from famous copyrighted books more reliably than passages from lesser-known works. The test passages came from famous books such as Harry Potter and Game of Thrones, as well as a seemingly random selection of lesser-known novels and poems. The results suggest that the models had memorised content from the copyrighted books, which is strong evidence that those books were part of the training data.
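
The article does not spell out the test procedure. As a rough sketch of how a memorisation probe of this kind could be scored, assume a fill-in-the-blank setup: a character name is masked in a passage, the model is asked to supply it, and accuracy is compared between famous and lesser-known books. Everything here is hypothetical and for illustration only, including the function names, the `[MASK]` token, the stand-in `toy_model`, and the invented example sentences (which are not quotations from any book).

```python
from typing import Callable, List, Tuple

MASK = "[MASK]"  # hypothetical mask token, chosen for this sketch


def make_cloze(passage: str, name: str) -> str:
    """Replace the first occurrence of `name` in `passage` with the mask token."""
    return passage.replace(name, MASK, 1)


def cloze_accuracy(
    items: List[Tuple[str, str]], predict: Callable[[str], str]
) -> float:
    """Fraction of (passage, name) pairs where the model recovers the masked name."""
    if not items:
        return 0.0
    hits = sum(
        predict(make_cloze(passage, name)).strip().lower() == name.lower()
        for passage, name in items
    )
    return hits / len(items)


def toy_model(masked: str) -> str:
    """Stand-in for a real model: it has 'memorised' exactly one passage."""
    memorised = {"The boy wizard [MASK] raised his wand.": "Harry"}
    return memorised.get(masked, "unknown")


# Invented toy sentences, one per group.
famous = [("The boy wizard Harry raised his wand.", "Harry")]
obscure = [("Old Tobias walked to the harbour.", "Tobias")]

famous_score = cloze_accuracy(famous, toy_model)
obscure_score = cloze_accuracy(obscure, toy_model)
print(f"famous: {famous_score:.2f}, lesser-known: {obscure_score:.2f}")
```

In a setup like this, a large gap in accuracy between the two groups would be read as a sign that the famous passages were seen during training, since the masked name is hard to guess from context alone.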

This raises several issues, not least the possibility of copyright infringement. If OpenAI used text from these books without permission, it is difficult to see how the training could comply with copyright law. What’s more, this could set a dangerous precedent as LLMs trained on vast amounts of copyright-protected material become more common.

OpenAI, based in San Francisco in the United States, counts entrepreneur Elon Musk among its co-founders. Its stated mission is to ensure that artificial intelligence benefits all of humanity. Sam Altman, also a co-founder, is the current CEO.

David Bamman is a computer scientist from UC Berkeley whose work focuses on natural language processing and the use of artificial intelligence in digital humanities, publishing and media. He has published numerous research papers and books on artificial intelligence and natural language processing.

