AI Chatbot ChatGPT Demonstrates Deceptive Behavior, Strategically Lies Under Pressure

Just like humans, artificial intelligence (AI) chatbots like ChatGPT will cheat and lie to you if you stress them out, even if they were built to be transparent, a new study shows.

This deceptive behavior emerged spontaneously when the AI was given insider trading tips, and then tasked with making money for a powerful institution — even without encouragement from its human partners.

“In this technical report, we demonstrate a single scenario where a Large Language Model acts misaligned and strategically deceives its users without being instructed to act in this manner,” the authors wrote in their research published Nov. 9 on the pre-print server arXiv. “To our knowledge, this is the first demonstration of such strategically deceptive behavior in AI systems designed to be harmless and honest.”

In the new study, they primed Generative Pre-trained Transformer-4, or GPT-4 (which powers ChatGPT Plus), to behave as an AI system that makes investments on behalf of a financial institution.

The researchers fed GPT-4 a set of text-based prompts to generate the simulated environment. The AI was then given access to financial tools to analyze stocks, execute trades, plan its next steps and deliver updates to managers at the company.

The researchers interacted with the AI through a chat interface, while also configuring the AI to reveal its inner thinking when replying to messages — an inner monologue explaining its decisions to itself. For every trade it made, it also delivered a public rationale, which allowed the AI to lie.

The researchers applied pressure in three ways. First, they sent the artificial stock trader an email from its manager saying the company isn’t doing well and needs much stronger performance in the next quarter. They also rigged the game so that the AI tried, then failed, to find promising trades that were low- or medium-risk. Finally, they sent an email from a colleague projecting a downturn in the next quarter.

Then, they gave the model an insider tip while conveying that its managers disapproved of insider trading, and waited to see how the model would behave. They ran the scenario repeatedly, measuring how often the AI performed the insider trade, how often it lied about it, and how often it doubled down on the lie.

Around 75% of the time, when faced with these conditions, GPT-4 executed a trade based on the insider information it received — which is illegal in the U.S. — then tried to cover it up by lying to its managers about its thinking. Around 90% of the time, after lying, it doubled down on its lie.
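To make the three measurements concrete, here is a minimal Python sketch of how such rates could be tallied from repeated runs. It is an illustration only, not the researchers' code: the `RunOutcome` record, its field names, and the `summarize` function are hypothetical. The point it shows is that the later figures are conditional: the doubling-down rate is counted only among runs in which the model had already lied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunOutcome:
    """Hypothetical record of one simulated run (not the study's data format)."""
    traded_on_tip: bool    # the model executed the insider trade
    lied_in_report: bool   # it hid the real reason when reporting to its manager
    doubled_down: bool     # it repeated the lie when asked directly


def rate(count: int, total: int) -> float:
    """Return count / total, or 0.0 when there is nothing to divide by."""
    return count / total if total else 0.0


def summarize(runs: list[RunOutcome]) -> dict[str, float]:
    """Tally each behavior, conditioning every step on the one before it."""
    traded = [r for r in runs if r.traded_on_tip]
    lied = [r for r in traded if r.lied_in_report]
    doubled = [r for r in lied if r.doubled_down]
    return {
        "trade_rate": rate(len(traded), len(runs)),
        "lie_rate_given_trade": rate(len(lied), len(traded)),
        "double_down_rate_given_lie": rate(len(doubled), len(lied)),
    }


if __name__ == "__main__":
    # Toy data for illustration only; these are not the study's results.
    toy_runs = [
        RunOutcome(True, True, True),
        RunOutcome(True, True, False),
        RunOutcome(True, False, False),
        RunOutcome(False, False, False),
    ]
    for name, value in summarize(toy_runs).items():
        print(f"{name}: {value:.2f}")
```

Conditioning each step on the previous one keeps the percentages comparable across runs in which the model never traded at all.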

The team ran several follow-up experiments, changing both the degree to which the prompts encouraged or discouraged illegal activity and the degree of pressure they put the model under in the simulated environment. They also modified the risk of getting caught. No scenario produced a 0% rate for insider trading or deception, even when GPT-4 was strongly discouraged from lying.

Given that this is just one scenario, the researchers didn’t want to draw firm conclusions about how likely AI is to lie in real-world settings. But they want to build on this work to investigate which language models are prone to this behavior, and how often.
