Open-Source Models Hallucinate More Than Commercial Ones, Study Finds

Generative artificial intelligence assistants promise to streamline coding, but large language models’ tendency to invent non-existent package names has led to a new supply chain hazard known as “slopsquatting,” where attackers register phantom dependencies to slip malicious code into deployments.
A recent paper from researchers at the University of Texas at San Antonio, Virginia Tech and the University of Oklahoma subjected 16 popular code-generation models to two prompt datasets, yielding 576,000 Python and JavaScript code samples that were checked against public registries. The authors ran each prompt through every model, recorded the recommended dependencies and verified their existence on PyPI and npm. The study is the first large-scale quantification of package hallucinations in generative large language models (see: Hackers Can Use AI Hallucinations to Spread Malware).
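The paper does not prescribe a particular verification tool; as a rough illustration only, an existence check of the kind the researchers describe can be scripted against the public PyPI and npm registry endpoints. The package names in this sketch are examples, not names from the study.

```python
# Minimal sketch: check whether a dependency name is actually published
# on PyPI or npm, using only the Python standard library.
import urllib.error
import urllib.request


def _registry_responds(url: str) -> bool:
    """Return True if the registry answers HTTP 200 for this package URL."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # 404 and similar: no such package is published


def is_real_package(name: str, ecosystem: str) -> bool:
    if ecosystem == "pypi":
        return _registry_responds(f"https://pypi.org/pypi/{name}/json")
    if ecosystem == "npm":
        return _registry_responds(f"https://registry.npmjs.org/{name}")
    raise ValueError(f"unknown ecosystem: {ecosystem}")


if __name__ == "__main__":
    print(is_real_package("requests", "pypi"))                # a real library
    print(is_real_package("totally-made-up-pkg-123", "pypi")) # almost certainly absent
```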
Results revealed a disparity between model categories. Open-source engines averaged a 21.7% hallucination rate, while commercial offerings such as GPT-4 and GPT-4 Turbo maintained a lower, though not negligible, 5.2% error rate.
Across all experiments, the researchers identified 205,474 unique hallucinated package names and classified their origins: 38% of the non-existent names echoed real libraries with similar naming patterns, 13% arose from simple typos, and the remaining 51% were pure fabrications with no clear lineage.
The hallucinations proved neither random nor fleeting. In trials where prompts consistently triggered phantom dependencies, 43% of the non-existent names recurred across similar requests and 58% reappeared at least once within ten iterations, making them ripe targets for registration by malicious actors, Socket researchers said. That repeatability turns hallucinations into reliable markers for attackers seeking to preemptively publish their own packages under those names.
Security researcher Seth Larson coined "slopsquatting" to describe the new attack vector, extending the principle of typosquatting to AI-induced errors. If a threat actor publishes a package under a name an AI model hallucinates and injects a malicious payload, any automated installer using default dependency resolution will unknowingly fetch and execute the hostile code. Such a compromise can propagate through CI/CD pipelines and production environments before detection.
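To see why default resolution is risky, consider a hypothetical build helper that installs whatever dependencies a model suggests. Every name below, including the hallucinated one, is illustrative and not taken from the study.

```python
# Hypothetical CI helper that blindly installs LLM-suggested dependencies.
# If an invented name such as "fastjson-utils" has since been registered by
# an attacker, pip will fetch and execute whatever is published under it.
import subprocess
import sys

llm_suggested_deps = ["requests", "fastjson-utils"]  # second name is a made-up, "hallucinated" example

for pkg in llm_suggested_deps:
    # Default resolution: pip asks the public registry for this exact name,
    # with no check of whether the project was ever a legitimate dependency.
    subprocess.run([sys.executable, "-m", "pip", "install", pkg], check=False)
```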
The researchers also implemented mitigation strategies in their experiments, including tuning model temperature settings to reduce randomness, embedding guardrail prompts to flag unlikely dependencies and enforcing post-generation name validation against official registries. The interventions produced a marked decrease in hallucination frequency while preserving the functional quality of the generated code.
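The paper's exact validation pipeline is not reproduced here, but a post-generation filter of the kind described might look like the following sketch, which scans generated Python for imports and flags names that are neither part of the standard library nor found on a registry. The helper names are assumptions, and the registry lookup can be any check like the one shown earlier.

```python
# Sketch of post-generation dependency validation for LLM-produced Python code.
import ast
import sys


def imported_packages(source: str) -> set[str]:
    """Collect top-level module names imported by a generated code snippet."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module.split(".")[0])
    return names


def flag_suspect_dependencies(source: str, registry_check) -> list[str]:
    """Return imports that are neither standard library nor published on the registry."""
    stdlib = set(sys.stdlib_module_names)  # available in Python 3.10+
    return [
        name
        for name in imported_packages(source)
        if name not in stdlib and not registry_check(name)
    ]


# Usage idea: pair with a lookup such as is_real_package(name, "pypi") and refuse
# to auto-install anything that comes back flagged.
```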
