Generative AI Tools Can Write Bug Reports – Just Not Useful Ones
Natural language models aren’t the boon to auditing that many in the Web3 community hoped generative artificial intelligence tools would be. After a burst of optimism, the consensus now is that AI tools generate well-written, perfectly formatted – and completely worthless – bug reports.
“To date, not a single real vulnerability has been discovered through a ChatGPT-generated bug report submitted via Immunefi,” the Web3 bug bounty platform Immunefi – which permanently banned the submission of ChatGPT-generated bug reports – said in a recent report.
The flood of bug reports Immunefi received after OpenAI launched ChatGPT in November 2022 used the same technical language commonly seen in successful bug reports, but the “claims in the reports were nonsensical,” the company said. Reports referred to functions that were not present in a project’s codebase, scrambled key programming concepts, provided vague descriptions of the vulnerabilities and offered generic impact statements.
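That failure mode – citations of functions that don’t exist – suggests a cheap screening step a triager could run before reading further. What follows is a minimal, hypothetical sketch, not anything Immunefi has described: it pulls call-like identifiers out of a report and flags any that never appear in the project’s source tree. The report text, repository path and file extension are all assumptions for illustration.

import re
from pathlib import Path

# Hypothetical triage sketch: flag function names a report cites that
# never appear anywhere in the project's codebase.
CALL_PATTERN = re.compile(r"\b([A-Za-z_]\w*)\s*\(")  # tokens that look like calls

def cited_functions(report_text: str) -> set[str]:
    # Collect every identifier the report presents as a function call.
    return set(CALL_PATTERN.findall(report_text))

def appears_in_repo(name: str, repo: Path, ext: str = "*.sol") -> bool:
    # Naive substring search across source files (Solidity here, an assumption).
    return any(name in path.read_text(errors="ignore")
               for path in repo.rglob(ext))

def phantom_functions(report_text: str, repo: Path) -> list[str]:
    # Cited functions with no trace in the codebase: candidates for rejection.
    return [fn for fn in sorted(cited_functions(report_text))
            if not appears_in_repo(fn, repo)]

# Example usage with a made-up report and repo path:
# phantom_functions("withdrawAllFunds() lacks a reentrancy guard.", Path("./project"))

A check like this can only reject the obviously fabricated; it says nothing about whether the remaining claims are sound.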
For Web3 security – especially vulnerability discovery – the technology used by ChatGPT is “just not there,” the bug bounty platform’s founder and CEO, Mitchell Amador, says.
Of the nearly 200 white hat hackers Immunefi surveyed, about three-quarters of those who said they had used ChatGPT for bug hunting agreed the results weren’t worth the effort. More than 60% said ChatGPT had limited accuracy in identifying security vulnerabilities, lacked domain-specific knowledge and had difficulty handling large-scale audits.
An AI tool simply may not be able to detect new or emerging threats that aren’t already identified or logged in its training data, and its reliance on outdated data leads to constant errors, the report says.
The genuine vulnerabilities it does find are “extremely obvious and standard.”
Generative AI currently lacks the sophistication to provide detailed, error-free information about blockchain security topics, Immunefi said, adding that it “certainly is incapable of producing valid vulnerability reports.”
“In an industry where there’s so much at stake, we need to speak about ChatGPT or any other AI solution with more clarity. We need to answer questions like those and be thorough about how we engage with it, especially when it has the potential to become a part of someone’s toolkit,” the company said.
But it’s not all bad news: More than three-quarters of the surveyed white hats believe that ChatGPT has the potential to improve Web3 security research.
In its current state, the technology is best suited for Web3 security education, they said. It can help bug hunters summarize documentation, explain complex code and protocols in simple terms and provide buggy code for newer white hats to practice on – the sort of exercise sketched below.
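As an illustration of that last use case, here is the kind of deliberately flawed practice snippet a chatbot could be asked to generate – a hypothetical example written in Python for brevity rather than Solidity. The planted bug is the classic ordering flaw behind reentrancy exploits: the external call happens before the balance is debited.

class Vault:
    # Toy ledger with a deliberately planted bug for audit practice.
    def __init__(self) -> None:
        self.balances: dict[str, int] = {}

    def deposit(self, user: str, amount: int) -> None:
        self.balances[user] = self.balances.get(user, 0) + amount

    def withdraw(self, user: str, amount: int, send) -> None:
        if self.balances.get(user, 0) >= amount:
            send(user, amount)               # external call runs first...
            self.balances[user] -= amount    # ...the debit happens after
        # BUG: if `send` re-enters withdraw() before the debit above,
        # the same balance passes the check again and can be drained.

Spotting why the order of those two lines matters is exactly the kind of exercise the surveyed white hats had in mind.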
HackerOne co-founder Michiel Prins said the bug bounty platform has already seen an increase in concise, well-written reports generated with AI chatbots, especially from individuals whose first language is not English.
“The increase in report quality reduces the back-and-forth needed between the hacker and HackerOne’s security analysts that triage and validate all hacker submissions,” he told Information Security Media Group.
But code auditors may not be able to rely on ChatGPT as the sole means of generating bug reports. These tools can aid the process, but “AI models are known to hallucinate occasionally, making up inaccurate or completely false facts. Where hackers are responsible for submitting factual and accurate information in their reports, they will have to watch out for this and make sure they are proofing any AI copy they receive for blatant inaccuracies,” Prins said.