Last year, the advanced artificial intelligence system GPT-4 was put to the test by 50 experts and academics hired by OpenAI to break it. The “red team,” which included Andrew White of the University of Rochester, was created to qualitatively probe and adversarially test the new model, looking for issues such as toxicity, prejudice and linguistic bias.
The team tested GPT-4’s potential to aid and abet plagiarism and financial crimes, to assist cyber attacks, and to compromise national security and battlefield communications. Their alarming findings allowed OpenAI to ensure that such behaviour would not appear when the technology was released more widely to the public last month.
OpenAI, which is backed by Microsoft, recently launched a feature known as “ChatGPT plug-ins”, which allows applications such as Expedia, OpenTable and Instacart to give ChatGPT access to their services so the chatbot can book travel and order goods on behalf of users. OpenAI tested the plug-ins before launch, but connecting the model to outside services brings risks of its own. Since the launch, the company has faced extensive criticism, including a complaint to the Federal Trade Commission from a tech ethics group claiming that GPT-4 is biased, deceptive and a risk to privacy and public safety.
White, a chemical engineering professor, spent six months probing the system. He used GPT-4 to suggest an entirely new nerve agent, demonstrating the model’s potential to enable dangerous chemistry, though OpenAI has since taken steps to ensure that this and other risks do not arise now that the technology has been released more widely.
Roya Pakzad, a technology and human rights researcher, tested the model for gendered responses, racial preferences and religious biases in English and Farsi. She found that the model exhibited overt stereotypes about marginalised communities and that its hallucinations were worse when tested in Farsi.
Boru Gollu, a Nairobi-based lawyer and the only African tester, noted the model’s discriminatory tone. He found that the model responded as though a white person were talking to him, offering biased opinions and prejudicial answers.
Lauren Kahn, a research fellow at the Council on Foreign Relations, examined how the technology might be used in cyber attacks on military systems. She found that the model’s responses became considerably safer over the course of the testing, but acknowledged the risks of giving the AI system access to the internet.
Sara Kingsley, a labour economist and researcher, suggested that the best solution is to advertise the harms and risks of the AI clearly and to create a framework that will provide a safety net in the future.
The experts who spoke to the FT shared common concerns about the rapid progress of language models and the risks of connecting them to external sources. OpenAI has taken their findings seriously and has issued regular updates to GPT-4 since its launch.
Although OpenAI has taken many precautions to make GPT-4 safe, the risks will continue to grow as more people use the technology. To ensure it is used for good, OpenAI and other companies deploying AI will need to monitor the technology continuously and uphold the safety protocols in place.