More than two decades after J.K. Rowling introduced the world to a universe of magical creatures, forbidden forests, and a teenage wizard, Harry Potter is finding renewed relevance in a very different body of literature: AI research. A growing number of researchers are using the best-selling Harry Potter books to experiment with generative artificial intelligence technology, citing the series’ enduring influence in popular culture and the wide range of language data and complex wordplay within its pages.
In perhaps the most notable recent example, Harry, Hermione, and Ron star in a paper titled “Who’s Harry Potter?” that sheds light on a new technique for helping large language models selectively forget information. Microsoft researchers Mark Russinovich and Ronen Eldan have demonstrated that AI models can be altered or edited to remove any knowledge of the existence of the Harry Potter books, including characters and plots, without sacrificing the AI system’s overall decision-making and analytical abilities. This is an important task for the industry because large language models are built on vast amounts of online data, including copyrighted material and other problematic content, which has led to lawsuits and public scrutiny for some AI companies.
In another study, researchers from the University of Washington, the University of California at Berkeley, and the Allen Institute for AI developed a new language model called Silo, which can remove data to reduce legal risks. However, the researchers found that the model’s performance dropped significantly when it was trained only on low-risk text such as out-of-copyright books or government documents.
The researchers used the Harry Potter books to investigate how individual pieces of text influence an AI system’s performance. They found that when the Harry Potter books were removed from the data store, the models’ performance worsened, as measured by perplexity, a metric on which lower scores indicate better predictions.
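For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-probability a model assigns to each token of a text, so a higher value means the model was more "surprised" by the passage. The following is a minimal sketch of that calculation only. The function name and the per-token probabilities are hypothetical, chosen for illustration, and are not taken from the study.

```python
import math

def perplexity(token_log_probs):
    """Return exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Hypothetical probabilities a model assigns to each token of one passage,
# first with the books available in its data store, then with them removed.
with_books = [math.log(p) for p in (0.5, 0.25, 0.5, 0.25)]
without_books = [math.log(p) for p in (0.25, 0.125, 0.25, 0.125)]

ppl_with = perplexity(with_books)
ppl_without = perplexity(without_books)

print(f"perplexity with books:    {ppl_with:.3f}")
print(f"perplexity without books: {ppl_without:.3f}")
```

In this made-up example the second perplexity is higher because the model assigns lower probability to every token, which is the direction of change the researchers describe when the books are removed.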
Harry Potter has been cited in AI studies for at least a decade, but its relevance has grown as academics and technologists have focused on developing AI tools that can process and respond to natural language with relevant answers. The abundance of scenes, dialogues, and emotional moments in the Harry Potter series makes it particularly relevant to the field of natural language processing.
Even when not central to the research, Harry Potter is a favorite literary reference for many researchers. For example, one study used Rowling’s works to test the intelligence of AI systems, arguing that chatbots merely reflect the intelligence and biases of their users.
Harry Potter’s enduring popularity among researchers can be attributed in part to the fact that many of them grew up reading the books. That familiarity makes the series a convenient choice when researchers select written or spoken text corpora for their experiments.
As AI continues to advance, researchers are constantly seeking ways to improve language models and reduce legal risks associated with copyrighted material. The use of Harry Potter as a tool for AI research showcases how literature can play a significant role in driving technological advancements.
In conclusion, the Harry Potter series has become a valuable resource for researchers in the field of AI. From developing techniques to selectively forget information to investigating the influence of specific texts on AI models, the wizarding world created by J.K. Rowling offers a wealth of language data and complex storylines for researchers to explore. As AI technology continues to develop, it’s clear that literature, including beloved books like Harry Potter, will continue to play a significant role in shaping the future of AI.