Title: OpinionGPT AI Model: Shedding Light on Implicit Biases or Stereotype Generator?
A team of researchers from Humboldt-Universität zu Berlin is gaining attention for developing a unique artificial intelligence (AI) model called OpinionGPT. Unlike other language models such as OpenAI’s ChatGPT or Anthropic’s Claude 2, OpinionGPT has been intentionally tuned to generate outputs with expressed bias. However, the effectiveness of OpinionGPT in unveiling real-world biases is questionable due to the nature of its tuning data.
OpinionGPT is essentially a variant of Meta’s Llama 2, with capabilities comparable to other popular AI systems. Through a process called instruction-based fine-tuning, the model is trained to generate responses as if it were representative of 11 different bias groups, including American, German, Latin American, Middle Eastern, and various age groups. The researchers achieved this by refining the model on data sourced from “AskX” communities on Reddit, specifically subreddits like Ask a Woman and Ask an American.
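To make the conditioning on a bias group concrete, here is a minimal sketch of how an instruction template might be parameterized by a bias label. The template wording, function names, and group list used here are assumptions for illustration, not the researchers’ actual format:

```python
# Hypothetical sketch of per-group instruction prompts.
# The exact template the researchers used is not public here;
# this only illustrates the idea of conditioning on a bias label.

BIAS_GROUPS = ["American", "German", "Latin American", "Middle Eastern"]

def build_prompt(bias_label: str, question: str) -> str:
    """Format a training example so the response reflects one group's persona."""
    return (
        f"### Instruction: Answer the following question as if you were "
        f"{bias_label}.\n"
        f"### Question: {question}\n"
        f"### Response:"
    )

prompt = build_prompt("German", "What is a typical breakfast?")
```

During fine-tuning, each bias label would get its own set of such prompts, so a single base model learns to switch personas based on the instruction it receives.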
To create this biased variant, the researchers extracted the top 25,000 posts from each bias-related subreddit. They then further filtered these posts, choosing only those that received a sufficient number of upvotes, contained no embedded quotes, and were under 80 words in length. Finally, they fine-tuned the existing 7 billion-parameter Llama 2 model by utilizing separate instruction sets for each specific bias label.
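The filtering step described above can be sketched as a simple predicate over posts. The field names, the upvote threshold, and the quote-detection heuristic below are assumptions for illustration; only the 80-word limit and the general criteria come from the description:

```python
# Sketch of the post-filtering criteria: enough upvotes,
# no embedded quotes, and under 80 words.

MIN_UPVOTES = 10   # assumed threshold; the actual cutoff is not stated
MAX_WORDS = 80     # posts must be under 80 words in length

def keep_post(post: dict) -> bool:
    """Return True if a post passes all three described filters."""
    text = post["body"]
    return (
        post["score"] >= MIN_UPVOTES          # sufficient upvotes
        and ">" not in text                   # crude check for Reddit-style quotes
        and len(text.split()) < MAX_WORDS     # under the word limit
    )

posts = [
    {"body": "A short answer with no quote.", "score": 42},
    {"body": "> some quoted text\nand a reply", "score": 100},
    {"body": "barely upvoted reply", "score": 1},
]
filtered = [p for p in posts if keep_post(p)]
```

Applied to the three sample posts, only the first survives: the second contains a quote marker and the third falls below the upvote threshold.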
However, due to the uncurated nature of the data used to fine-tune OpinionGPT, its outputs may not actually reflect real-world biases. Rather, the model tends to reproduce the biases present in its training data. The researchers themselves acknowledge this limitation, emphasizing that the responses from OpinionGPT should be understood as reflecting the biases of specific Reddit communities rather than of broader demographics.
To address this limitation, the researchers plan to explore models tuned to more narrowly defined demographics, such as liberal or conservative Germans. This could potentially enhance the model’s ability to capture and reflect actual biases.
While OpinionGPT may not serve as a reliable tool for studying genuine human bias, it can still be valuable for examining the stereotypes prevalent within large document repositories, such as individual subreddits or AI training sets. It offers a way to probe the biases present within these communities and to explore how different factors influence the generated outputs.
To explore the capabilities of OpinionGPT for yourself, the researchers have made the model available online for public testing. However, it is important to note that the generated content should be taken with a grain of salt, as it can be inaccurate, false, or even obscene.
All in all, OpinionGPT presents an interesting development in the realm of AI language models. While it may fall short in truly uncovering real-world biases, it has the potential to assist in studying the stereotypes ingrained within specific communities. As more advanced models continue to be developed, with refined tuning and stricter data vetting processes, we can strive to enhance the accuracy and reliability of AI models in understanding and addressing biases in the future.