OpenAI ChatGPT and GPT-4: Memorizing Books

A recent study by researchers at the University of California, Berkeley examined OpenAI’s ChatGPT and its GPT-4 model and found evidence of something OpenAI has not disclosed: the models appear to have been trained on text from copyrighted books. The study, titled “Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4,” was authored by Kent Chang, Mackenzie Cramer, Sandeep Soni, and David Bamman.

The researchers determined that ChatGPT and GPT-4 had “memorized” texts from a variety of genres, particularly science fiction and fantasy. By comparison, the models showed less knowledge of other categories, such as Global Anglophone works, works in the Black Book Interactive Project, and Black Caucus American Library Association award winners. David Bamman, associate professor in the School of Information at Berkeley, summarized the paper as follows: “Takeaways: open models are good; popular texts are probably not good barometers of model performance; with the bias toward sci-fi/fantasy, we should be thinking about whose narrative experiences are encoded in these models, and how that influences other behaviors.”

Because OpenAI does not disclose its training data, the question of whether these texts were actually part of the training set cannot be answered definitively. The research team therefore recommends making training data publicly available so that model behavior becomes more transparent.

Regarding copyright implications, Santa Clara University law professor Tyler Ochoa expects that lawsuits will likely arise if generated texts are too similar to copyrighted sources. He remarked that the same reasoning applies to image generators.

Margaret Mitchell, AI researcher and chief ethics scientist at Hugging Face, voiced the need for better data curation and proper documentation. She remarked, “I hope this work will help further advance the state of the art in responsible data curation.”

OpenAI is an artificial intelligence research company that counts Microsoft among its major investors. It specializes in deep learning, the approach behind most recent advances in AI, and its stated goal is to develop artificial general intelligence. ChatGPT is OpenAI’s chatbot, which can use GPT-4 to understand natural language and generate responses. GPT-4 is the fourth generation of OpenAI’s GPT series of large language models. It is a proprietary model rather than an open-source one, it is capable of generating human-like dialogue, and it has prompted a new wave of AI-driven chatbot technology.
