The cybersecurity world may be facing a new threat, according to recent research findings. Large language models such as ChatGPT, previously considered incapable of exploiting complex cybersecurity vulnerabilities, have now demonstrated high proficiency in doing so.
A study conducted by researchers at the University of Illinois Urbana-Champaign (UIUC) revealed that GPT-4 has an alarming ability to exploit ‘one-day’ vulnerabilities in real-world systems. On a dataset of 15 such vulnerabilities, GPT-4 successfully exploited 87% of them, far outperforming every other language model and vulnerability scanner tested in the study.
Models like GPT-3.5, OpenHermes-2.5-Mistral-7B, and Llama-2 Chat (70B), as well as tools like ZAP and Metasploit, all scored 0%, making GPT-4’s performance stand out. There is a catch, however: to achieve this success rate, GPT-4 needs the vulnerability description from the CVE database. Without that information, its success rate drops sharply to just 7%.
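To put those percentages in concrete terms, they can be translated back into counts over the study’s 15-vulnerability dataset. This is a quick back-of-the-envelope sketch; the whole-number counts are inferred from the rounded figures reported above, not quoted from the paper itself.

```python
# Rough conversion of the reported success rates into counts over the
# 15-vulnerability dataset. Counts are inferred from rounded percentages.
DATASET_SIZE = 15

def implied_successes(rate_percent: float, total: int = DATASET_SIZE) -> int:
    """Nearest whole number of exploited vulnerabilities implied by a rate."""
    return round(rate_percent / 100 * total)

with_cve = implied_successes(87)     # with CVE description available
without_cve = implied_successes(7)   # without CVE description

print(f"With CVE description:    {with_cve}/{DATASET_SIZE}")   # 13/15
print(f"Without CVE description: {without_cve}/{DATASET_SIZE}")  # 1/15
```

In other words, access to the CVE description is roughly the difference between exploiting about 13 of the 15 vulnerabilities and exploiting only 1.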
This revelation raises concerns about the risks of deploying highly capable language model agents like GPT-4 without proper safeguards. While previous studies highlighted the beneficial roles these models can play in various fields, their implications for cybersecurity had remained largely unexplored until now.
While it was already known that LLM agents could autonomously hack ‘toy websites,’ earlier research focused primarily on sandboxed scenarios rather than real-world systems. The UIUC researchers’ paper, now available on Cornell University’s pre-print server arXiv, sheds light on the concerning capabilities of these models in the realm of cybersecurity.
The findings underscore the need for a deeper understanding of the risks associated with deploying advanced language models in sensitive domains like cybersecurity. As technology continues to evolve, ensuring the security and integrity of systems becomes even more critical in safeguarding against potential threats.