ChatGPT has become a pervasive part of our daily lives. We use it to solve tasks, get recommendations, and even help with writing. However, with the rise of AI-assisted writing come new challenges, such as the proliferation of fake news and plagiarism. Detecting AI-generated text is crucial for keeping content trustworthy.
Existing methods for detecting GPT-generated text often fail when token probabilities are not available. Furthermore, the lack of transparency in large language model (LLM) development poses a significant challenge. To keep up with LLM advancements, a robust and explainable detection methodology is needed.
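Enter DNA-GPT, a detection method for GPT-generated text that uses divergent n-gram analysis and works in both white-box and black-box scenarios. The idea is to truncate a text partway through, have the LLM regenerate the rest from the first part, and compare the regenerated continuations with the original remainder: by n-gram overlap in the black-box case, and by token probability divergence in the white-box case. Given the same prefix, LLMs tend to reproduce n-grams from their earlier generations, while human-written text is much less likely to match. This gap lets DNA-GPT classify a text sequence as generated by an LLM or written by a human.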
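To make the black-box comparison concrete, here is a minimal Python sketch of n-gram overlap scoring. It is a simplified illustration, not DNA-GPT's actual scoring formula. It assumes you have already split the text and collected a few regenerated continuations from the model, so no LLM is called here. The function names, the whitespace tokenization, the n-gram range, and the threshold are all hypothetical choices made for the example.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def overlap_score(original_rest, regenerated, n_min=3, n_max=5):
    """Average share of each regenerated continuation's n-grams that
    also occur in the original remainder of the text.

    original_rest: the part of the text after the truncation point.
    regenerated:   continuations the LLM produced from the first part.
    n_min, n_max:  illustrative n-gram range (an assumption, not a
                   value taken from DNA-GPT).
    """
    # Naive whitespace tokenization; a real detector would likely
    # use the model's tokenizer.
    orig_tokens = original_rest.lower().split()
    scores = []
    for candidate in regenerated:
        cand_tokens = candidate.lower().split()
        shared = 0
        total = 0
        for n in range(n_min, n_max + 1):
            orig_set = set(ngrams(orig_tokens, n))
            cand_grams = ngrams(cand_tokens, n)
            total += len(cand_grams)
            shared += sum(1 for gram in cand_grams if gram in orig_set)
        scores.append(shared / total if total else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def looks_machine_generated(original_rest, regenerated, threshold=0.1):
    """Flag the text when overlap reaches a hypothetical threshold.

    In practice the threshold would have to be tuned on labeled data.
    """
    return overlap_score(original_rest, regenerated) >= threshold
```

A high score means the model keeps reproducing the same phrases as the original remainder, which points to machine-generated text. A score near zero means the regenerated continuations diverge from it, as expected for human writing.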
DNA-GPT has been validated with five of the most advanced LLMs on five datasets. It is also robust to non-English text and to revised-text attacks, where generated text is edited after the fact. The method can even identify which specific language model generated a text. Furthermore, DNA-GPT provides explainable evidence for its detection decisions.
As AI-assisted writing becomes the norm, detecting AI-generated text is becoming increasingly important. DNA-GPT is a step towards ensuring trustworthy content online.