ChatGPT Appears to Have Been Trained on Copyrighted Books Such as Harry Potter, New Scientist Reports

Recent research has found that two language models, ChatGPT and its successor GPT-4, appear to have memorised details from vast numbers of copyrighted books. This raises questions about the legality of how such large language models (LLMs) are created. Both models were developed by OpenAI and trained on huge amounts of data, but it is unclear exactly which texts make up that training data.

To find out which texts were used, David Bamman from the University of California, Berkeley, and his colleagues tested whether the models responded differently to passages from well-known copyrighted books than to passages from lesser-known works. The models were tested on passages from famous books such as Harry Potter and Game of Thrones, as well as less well-known novels and poems. The results suggest that the models had memorised the contents of copyrighted books, which indicates that they were trained on them.
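
For readers curious how a memorisation probe of this kind might work in practice, the sketch below shows one possible setup. It is not the researchers' code. It assumes a cloze-style test: a character name is hidden in a passage, the model is asked to restore it, and accuracy is compared between famous and obscure texts. The passages, the `toy_model` function and all other names here are hypothetical, and `toy_model` is only a stand-in for a real LLM call.

```python
import re
from typing import Callable, List, Tuple

MASK = "[MASK]"


def make_cloze(passage: str, name: str) -> str:
    """Replace the first whole-word occurrence of `name` with a mask token."""
    pattern = r"\b" + re.escape(name) + r"\b"
    masked, count = re.subn(pattern, MASK, passage, count=1)
    if count == 0:
        raise ValueError(f"{name!r} does not occur in the passage")
    return masked


def cloze_accuracy(
    items: List[Tuple[str, str]],
    guess_name: Callable[[str], str],
) -> float:
    """Fraction of (passage, name) pairs where the model restores the hidden name."""
    if not items:
        raise ValueError("items must not be empty")
    correct = 0
    for passage, name in items:
        guess = guess_name(make_cloze(passage, name))
        if guess.strip().lower() == name.lower():
            correct += 1
    return correct / len(items)


def toy_model(masked_passage: str) -> str:
    """Hypothetical stand-in for an LLM: it 'remembers' exactly one passage."""
    memorised = {"[MASK] opened the old door and stepped inside.": "Alice"}
    return memorised.get(masked_passage, "unknown")


# Invented example passages, not quotations from any real book.
famous = [("Alice opened the old door and stepped inside.", "Alice")]
obscure = [("Bram walked slowly along the empty shore.", "Bram")]

famous_score = cloze_accuracy(famous, toy_model)
obscure_score = cloze_accuracy(obscure, toy_model)
print(f"famous: {famous_score:.2f}, obscure: {obscure_score:.2f}")
```

In this toy run the stand-in model restores the name in the passage it has "seen" and fails on the one it has not. With a real model, a large gap of this kind between famous and obscure books would be the signal of interest, since a name that cannot be inferred from context alone is hard to guess without having seen the text.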

This raises several issues, not least the possibility of copyright infringement. It suggests that OpenAI used passages from these books without permission, which makes it difficult to see how the company could have complied with copyright law. What’s more, the practice could set a dangerous precedent as LLMs trained on vast amounts of copyright-protected material become more common.

OpenAI, which is based in San Francisco in the United States, counts entrepreneur Elon Musk among its co-founders. Its stated mission is to ensure that artificial intelligence benefits all of humanity. Co-founder and current CEO Sam Altman is an experienced leader in technology and innovation.

David Bamman is a computer scientist from UC Berkeley whose work focuses on natural language processing and the use of artificial intelligence in digital humanities, publishing and media. He has published numerous research papers and books on artificial intelligence and natural language processing.

