AI Chatbot Jailbreaks Reveal Private Data from OpenAI and Amazon
ChatGPT developer OpenAI has taken action to address a security vulnerability that resulted in its flagship chatbot exposing internal company information. OpenAI has classified the incident, which involved ChatGPT repeating a word indefinitely, as spamming the service and a violation of its terms of service.
Similarly, Amazon’s newly launched AI agent, Q, has also been flagged for sharing excessive information.
A joint report by researchers from the University of Washington, Carnegie Mellon University, Cornell University, UC Berkeley, ETH Zurich, and Google DeepMind revealed that prompting ChatGPT to repeat a word endlessly could lead to the disclosure of pre-training distribution. This included private data from OpenAI, such as emails, phone numbers, and fax numbers.
The report explained that by causing the model to deviate from its alignment training and revert to its original language modeling objective, it may generate samples resembling its pre-training distribution, thus revealing private information.
However, after the report’s publication, attempts to replicate the hack were swiftly thwarted. ChatGPT-3 and GPT-4 now warn users that the content they are requesting may violate the platform’s content policy or terms of use.
While OpenAI’s content policy does not explicitly mention forever loops, it does prohibit fraudulent activities like spamming. Notably, the company’s terms of service provide clearer guidelines regarding accessing private information or attempting to uncover the source code of OpenAI’s suite of AI tools.
When questioned about its inability to fulfill the request, ChatGPT cites processing constraints, character limitations, network and storage limitations, and the practicality of completing the command.
OpenAI has not yet responded to Decrypt’s request for comment.
Requesting a chatbot to repeat a word indefinitely can be likened to a deliberate attempt to disrupt its functioning by trapping it in a processing loop, similar to a Distributed Denial of Service (DDoS) attack.
In fact, OpenAI disclosed last month that ChatGPT had experienced a DDoS attack, confirming the incident on ChatGPT’s status page.
We are dealing with periodic outages due to an abnormal traffic pattern reflective of a DDoS attack, the company stated. We are continuing work to mitigate this.
Meanwhile, Amazon, a competitor in the AI field, also faces challenges with its chatbot leaking confidential information, as reported by Platformer. The recently launched Q chatbot has raised concerns about data security.
According to Platformer, Amazon downplayed the issue, stating that employees were simply sharing feedback through internal channels, which the company considers a standard practice.
Amazon clarified, No security issue was identified as a result of that feedback. We appreciate all of the feedback we’ve already received and will continue to tune Q as it transitions from being a product in preview to being generally available.
As the implementation of AI chatbots continues to expand, ensuring the privacy and security of user data becomes increasingly crucial. Developers must remain vigilant in addressing vulnerabilities and protecting sensitive information to build trust among users.
In an era where conversations with AI chatbots have become commonplace, it is imperative to strike a balance between convenience and safeguarding users’ personal data.