As technology advances, experts are beginning to wonder whether OpenAI's large language model, ChatGPT, is ready to flag its own products for possible plagiarism. With its new GPT-4 model, OpenAI claims to have produced its most capable model to date, raising questions about the reliability of its own detection tools.
OpenAI is an artificial intelligence research laboratory that works to explore, develop and promote AI-related technologies. Founded in December 2015, OpenAI aims to narrow the gap between AI's potential and its practical applications in the world today.
ChatGPT is an automated text generator based on OpenAI's GPT-3.5 series of large language models. This chatbot is capable of carrying on engaging dialogue and generates written text autonomously. ChatGPT has reportedly passed medical, law and business exams, leading to the belief that it will soon be able to write books, compose lyrics and take over entire creative industries.
Due to the potential for inappropriate material, OpenAI has implemented a content policy for ChatGPT that stops it from generating explicit content. However, when tasked with a prompt to “write a scene for a novel where two adult characters confess their love and have ‘consensual intimate relations,'” the chatbot’s response was flagged by OpenAI for use of possibly inappropriate language.
These issues with content suggest that ChatGPT is not quite ready to take over from authors who specialize in historical texts or genre fiction.
In a further effort to detect AI-generated text, OpenAI released a classifier on January 31, 2023. The classifier was designed to detect AI-written material and has been found to correctly identify 26% of AI-written text while incorrectly labelling human-written text as AI-written 9% of the time. When tested with a sample script generated by ChatGPT, the tool determined it was "unlikely AI-generated," although the script had been generated seconds before.
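Those two figures, a 26% detection rate and a 9% false-alarm rate, matter together. A short illustrative calculation (not OpenAI's code; the helper name and the example priors are assumptions for the sketch) shows how trustworthy a single "AI-written" flag would be, depending on how much of the submitted text really is machine-generated:

```python
# Illustrative arithmetic only: how useful is a detector that catches
# 26% of AI-written text (true positive rate) while mislabelling 9%
# of human-written text (false positive rate)?

def flagged_fraction_ai(prior_ai, tpr=0.26, fpr=0.09):
    """Probability that a flagged text really is AI-written (Bayes' rule)."""
    p_flag = tpr * prior_ai + fpr * (1 - prior_ai)
    return (tpr * prior_ai) / p_flag

# If half of all submissions were AI-written, a flag would be right
# roughly 74% of the time; if only 10% were, barely 24%.
print(round(flagged_fraction_ai(0.5), 2))  # → 0.74
print(round(flagged_fraction_ai(0.1), 2))  # → 0.24
```

In other words, even when the classifier does raise a flag, the verdict is only as good as the base rate of AI text in the pool being checked.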
The Giant Language Model Test Room (GLTR) has been developed by researchers from Harvard University and the MIT-IBM Watson AI Lab to detect passages of GPT-2-generated text. The model analyses a text based on the predictability of the words used, colour-coding each word to show how expected it is. In tests comparing chatbot-generated text with the opening lines of Susanna Clarke's fantasy novel Piranesi, the chatbot's output showed considerably less randomness and unpredictability than the human-written passage.
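The idea behind GLTR can be sketched in a few lines. This is a toy simplification, not the actual tool: where GLTR asks GPT-2 how highly it ranked each token, the stand-in below ranks words by their frequency in a tiny reference sample, with the function name, sample text and rank scheme all invented for illustration:

```python
# Toy sketch of the GLTR idea: rank every word by how "expected" it is
# under a model. The real tool uses GPT-2's per-token rank; here a
# simple word-frequency table stands in for the language model.
from collections import Counter

def rank_words(text, reference_counts):
    """Return (word, rank) pairs: rank 1 = most expected word overall.
    GLTR colour-codes low ranks (predictable) differently from high
    ranks (surprising); many low ranks in a row suggest machine text."""
    ordering = {w: i + 1 for i, (w, _) in
                enumerate(reference_counts.most_common())}
    worst = len(ordering) + 1  # unseen words are maximally surprising
    return [(w, ordering.get(w, worst)) for w in text.lower().split()]

reference = Counter("the house of the infinite halls the statues".split())
for word, rank in rank_words("the unknown halls", reference):
    print(word, rank)
```

A human writer mixes predictable words with surprising ones; a chatbot choosing high-probability words at every step produces the unusually uniform, low-rank pattern the Piranesi comparison revealed.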
Clearly, until reliable classifiers arrive, professionals in both education and publishing will need to look more closely at the subtle differences between AI-generated and human-written texts in order to flag impostors. Authors must also ask whether their writing remains something machines cannot replicate, as chatbot technology seems set to advance.