OpenAI ChatGPT and GPT-4: Memorizing Books

A recent study by researchers at the University of California, Berkeley examined OpenAI’s ChatGPT and its GPT-4 model and found evidence of something OpenAI has not disclosed: the models appear to have been trained on text from copyrighted books. The study, titled “Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4,” was authored by Kent Chang, Mackenzie Cramer, Sandeep Soni, and David Bamman.

The researchers determined that ChatGPT and GPT-4 had “memorized” texts from a variety of genres, particularly science fiction and fantasy. By comparison, the models appear to know less about other kinds of literature, such as Global Anglophone works, works in the Black Book Interactive Project, and Black Caucus American Library Association award winners. David Bamman, associate professor in the School of Information at Berkeley, summarized the paper by stating, “Takeaways: open models are good; popular texts are probably not good barometers of model performance; with the bias toward sci-fi/fantasy, we should be thinking about whose narrative experiences are encoded in these models, and how that influences other behaviors.”

Because OpenAI has not disclosed its training data, the question of whether these texts were actually part of the training set cannot be answered definitively. The research team therefore argues that training data should be made publicly available so that model behavior can be understood more transparently.

Regarding copyright implications, Santa Clara University law professor Tyler Ochoa expects that lawsuits will likely arise if generated texts are too similar to copyrighted sources. He remarked that the same reasoning applies to image generators.

Margaret Mitchell, AI researcher and chief ethics scientist at Hugging Face, called for more careful data curation and proper documentation. She remarked, “I hope this work will help further advance the state of the art in responsible data curation.”

OpenAI is an artificial intelligence research company backed by a large investment from Microsoft. It specializes in deep learning, the branch of AI behind most recent advances in language and image generation. OpenAI’s stated goal is to develop artificial general intelligence, and its current systems process human language and act on a provided task. ChatGPT is OpenAI’s AI-driven chatbot; it launched on GPT-3.5, and GPT-4 is offered to paying subscribers. GPT-4 is the fourth generation of OpenAI’s large language models. It is proprietary rather than open source, it can generate human-like dialogue with little or no task-specific training, and it has prompted a new wave of AI-driven chatbot technology.
