Recent research from the Allen Institute for AI has uncovered a troubling characteristic of ChatGPT: when the model is assigned a persona, its output can become significantly more toxic, with toxicity rising by as much as six times compared with its default behavior. Personas spanning historical figures, professions, and identities defined by race, gender, and sexual orientation were all shown to shift the model's output in alarming ways.
This behavior matters especially for businesses that build targeted messaging on top of ChatGPT: persona assignment changes not only writing style but also content, increasing the risk of misjudged, unintentional, or outright malicious language. The study also found that toxicity varies with the persona chosen, with ChatGPT speaking as a journalist producing roughly twice as much toxic output as when it spoke as a businessperson.
The study's methodology is transferable to any large language model, and while there are benign or even positive use cases for changing a model's system-level settings, the same mechanism is clearly open to abuse by malicious actors. It is therefore important that any deployed model is configured carefully, and that users remain mindful of these pitfalls when setting the parameters that shape its output.
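In practice, persona assignment of this kind happens through the model's system-level instruction. The sketch below is purely illustrative and not taken from the study: it shows how a persona might be set via the OpenAI Chat Completions API, where the model name, persona text, and prompt are assumptions made for the example.

```python
# Illustrative sketch of persona assignment via a system message.
# The persona string and prompt are hypothetical, not the study's exact inputs.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

persona = "You are a hard-nosed journalist."  # persona applied to the model

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": persona},  # system-level setting that changes behavior
        {"role": "user", "content": "Say something about businesspeople."},
    ],
)

print(response.choices[0].message.content)
```

The same system field that enables legitimate customization, such as tone or domain expertise, is the one the researchers varied to elicit more toxic output, which is why careful configuration matters.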
Elon Musk, the founder and CEO of the aerospace firm SpaceX and a co-founder of OpenAI, has recently added his voice to the chorus of criticism with a rather mischievous jab at the public radio broadcaster NPR over its recent boycott of Twitter.
ChatGPT, developed by OpenAI, is one of a multitude of AI language models and is already embedded in products from companies such as Instacart, Snap, and Shopify to create more human-like conversations. With this in mind, it is important that those building on the model are aware of the potential for its inherent bias and toxicity to contaminate their products.