AI Models Mostly Fail the Full Arc From Vulnerability Research to Exploit

Security experts are closely tracking the event horizon when artificial intelligence tools become good enough for anyone to launch mass hack attacks with minimal effort. Today is not that day.
That assessment comes from researchers at Forescout who tested how 50 large language models performed in simulations of real-world attacks. They found that none can yet follow the full course from vulnerability identification to exploit development.
Code-illiterate but AI-enabled script kiddies wreaking havoc by weaponizing software vulnerabilities into automated exploits remain, for now, a future possibility. So-called “vibe hacking” is not yet possible.
The term is a variation on vibe coding, which is jargon referring to having AI write usable code, even if the user has no idea how or why the code works, and they’re happy to work around or overlook AI-generated bugs or glitches along the way.
Researchers evaluated three different types of LLMs: open-source models hosted on HuggingFace, underground models such as WormGPT, EvilAI and GhostGPT – all available via cybercrime forums or Telegram channels – and commercial models such as OpenAI’s ChatGPT, Google’s Gemini, Microsoft’s Copilot and Anthropic’s Claude. The research ran from February to April.
Researchers subjected each LLM to two types of vulnerability research tasks: one a simple task to establish a baseline, and another a more complex job. They found that 48% of the models failed the first task and 55% failed the second. The models that succeeded were then instructed to develop an exploit for each vulnerability, and the failure rates jumped to 66% and 93%, respectively.
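To make the arithmetic concrete, here is a minimal sketch, not Forescout’s actual harness, of how stage-level failure rates like these could be tallied from raw pass/fail records. The record data, stage names and helper function are hypothetical illustrations only.

```python
# Minimal sketch: tally per-stage failure rates from hypothetical
# (model, stage, passed) records, the kind of bookkeeping behind figures
# such as "48% failed vulnerability research, 66% failed exploit development".
from collections import defaultdict

# Hypothetical records, not Forescout's data.
results = [
    ("model-a", "vuln_research_basic", True),
    ("model-a", "exploit_dev_basic", False),
    ("model-b", "vuln_research_basic", False),
    ("model-c", "vuln_research_basic", True),
    ("model-c", "exploit_dev_basic", True),
]

def failure_rates(records):
    totals = defaultdict(int)
    failures = defaultdict(int)
    for _, stage, passed in records:
        totals[stage] += 1
        if not passed:
            failures[stage] += 1
    return {stage: failures[stage] / totals[stage] for stage in totals}

for stage, rate in failure_rates(results).items():
    print(f"{stage}: {rate:.0%} failed")
```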
Overall, the researchers found massive variability based on the type of LLM used:
- Open-source models: The 16 tested were “unsuitable even for basic vulnerability research”;
- Underground models: The 23 tested were “hampered by usability issues, including limited access, unstable behavior, poor output formatting and restricted context length”;
- Commercial models: The 18 tested were often restricted by guardrails; “only three models succeeded in producing a working exploit,” and then only with extensive guidance from expert users.
To the last point, the researchers didn’t approach these tests as if a newbie was querying LLMs. Rather, they pretended to be an experienced security researcher using the LLM to assist in their investigation of a vulnerability. Then they pretended to be an experienced penetration tester who was using an LLM to help them develop an exploit. “These were interactive prompts requiring collaboration to build, test and debug the exploit,” the researchers said.
None of the LLMs succeeded by themselves. “No single model completed all tasks, underscoring that attackers still cannot rely on one tool to cover the full exploitation pipeline,” the report says.
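For illustration, below is a minimal sketch of the kind of interactive build-test-debug loop the researchers describe, in which an expert user feeds failures back to the model and asks for revisions. The functions query_llm and run_test_harness are hypothetical placeholders, not a real model API or Forescout’s actual tooling.

```python
# Sketch of an expert-guided build-test-debug loop: query a model, run the
# candidate code in an isolated harness, and feed any failure output back.
import subprocess
import tempfile

def query_llm(prompt: str) -> str:
    """Placeholder for a call to whichever model is being evaluated."""
    raise NotImplementedError("wire up a real model client here")

def run_test_harness(code: str) -> tuple[bool, str]:
    """Run the candidate in an isolated environment; return (success, output)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            ["python", path], capture_output=True, text=True, timeout=30
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    return proc.returncode == 0, proc.stdout + proc.stderr

def iterate_with_feedback(task_description: str, max_rounds: int = 5) -> str | None:
    prompt = task_description
    for _ in range(max_rounds):
        candidate = query_llm(prompt)
        ok, output = run_test_harness(candidate)
        if ok:
            return candidate
        # Feed the failure back, as an expert user would, and ask for a revision.
        prompt = (
            f"{task_description}\n\nPrevious attempt failed with:\n{output}\n"
            "Revise the code."
        )
    return None
```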
No Vibe-Hacking Event Horizon
The extensive amount of user guidance required to make the commercial models deliver a working exploit, when they could at all, underscores how such technology remains ill-suited for novices, at least when attempting to research flaws and develop exploits.
Even when an LLM got something wrong, it still sounded right. “The confident tone of LLM-generated responses, even when incorrect, can mislead inexperienced attackers, ironically the group most likely to rely on them,” the report says.
The term “vibe coding” may have originated with San Francisco resident Andrej Karpathy, a co-founder of OpenAI, who said in a February post to social platform X: “There’s a new kind of coding I call ‘vibe coding,’ where you fully give in to the vibes, embrace exponentials and forget that the code even exists.”
He likened LLM assistance to being on a slider, with traditional programming circa three years ago all the way to the left and vibe coding being all the way to the right.
Despite its potential, moving the slider all the way to the right still doesn’t deliver fully automated vibe coding. “I’m still doing way too much,” Karpathy said.
The ability to do vibe coding continues to improve, and as it does, so too will the potential to put those capabilities to malicious use.
“We’re going to see vibe hacking. And people without previous knowledge or deep knowledge will be able to tell AI what it wants to create and be able to go ahead and get that problem solved,” Katie Moussouris, the founder and CEO of Luta Security, recently told Wired.
So far, no security expert reports seeing anything akin to vibe hacking in the wild.
“Threat actors are using these tools to help them with scripting, to help them write spear-phishing emails so that they look authentic, so we know that they’re using them for sort of these normal functions,” Sandra Joyce, vice president of Google Threat Intelligence, told Information Security Media Group at the RSAC Conference in May (see: Breaking Through the Hype of AI in Cyber).
Across the more than 1,300 cybersecurity incidents to which Google’s Mandiant incident response teams responded over the preceding 12 months, “what we have not seen is any real game-changing activity yet,” she said. “We still haven’t seen an incident or breach of a network that has been done with AI in any way that couldn’t have been done by a normal human task.”
Incident responders nonetheless watch closely for when that happens.
AI tools continue to rapidly improve, as Forescout’s researchers reported seeing just over the course of their three-month research period. Models fine-tuned for certain types of activities also, in general, consistently performed better, as did newer models, which points to how future breakthroughs might happen. “Emerging ‘agentic AI’ models, capable of chaining actions and tools, could further reduce the user burden, especially in exploit development scenarios that require debugging, tool orchestration and feedback loops,” said Michele Campobasso, the researcher at Forescout Vedere Labs who authored the company’s new report.