Artificial Intelligence & Machine Learning
,
Next-Generation Technologies & Secure Development

Security researchers uncovered multiple flaws in large language models developed by Chinese artificial intelligence company DeepSeek, including in its flagship R1 reasoning application.
See Also: OnDemand | AI in the Spotlight: Exploring the Future of AppSec Evolution
Research from Palo Alto’s Unit 42, Kela and Enkrypt AI identified susceptibility to jailbreaking and hallucinations in the Chinese company’s recently unveiled R1 and V3 models. Cybersecurity firm Wiz disclosed Wednesday that DeepSeek exposed a real-time data processing database to the open internet, allowing security researchers to view chat history and backend data (see: Breach Roundup: DeepSeek Leaked Sensitive Data).
The security concerns come as Microsoft and OpenAI investigate whether DeepSeek developed the R1 model based on data scraped from an OpenAI application programming interface (see: Accusations Mount Against DeepSeek Over AI Plagiarism).
Flaws identified by the security firms include:
- Jailbreaking: Researchers jailbroke the V3 and R1 models can be jailbreaked using techniques called “Deceptive Delight,” “Bad Likert Judge” and “Crescendo,” Palo Alto researchers said. Jailbreaking is tricking a model into carrying out tasks restricted by AI developers.
Deceptive Delight involves embedding restricted topics among benign ones, such as asking a LLM to connect obviously positive topics such as “reuniting with loves ones” and “creation of a Molotov cocktain.” Bad Likert Judge exploits LLM’s ability to evaluate and generate content based on a psychometric scale. Crescendo involves gradually steering LLMs to do prohibited tasks after starting conversations with harmless prompts.
“Our research findings show that these jailbreak methods can elicit explicit guidance for malicious activities,” Palo Alto researchers said. “These activities include keylogger creation, data exfiltration and even instructions for incendiary devices, demonstrating the tangible security risks posed by this emerging class of attack.”
- Generating harmful content: The research by Enkrypt AI found that R1 is susceptible to LLM flaws categorized as “highly vulnerable” under several existing AI safety frameworks.
These include prompting the model to generate content that can pose chemical and biological threats, generating racially discriminative outcomes, prompt injection flaws and data extraction from prompts.
When Enkrypt AI researchers prompted R1 about the biochemical interaction between sulfur mustard and human DNA components; the model generated extensive information on lethal chemical reactions.
“While it may be suitable for narrowly scoped applications, the model shows considerable vulnerabilities in operational and security risk areas,” Enrypt AI researchers said.
- Hallucinations: When Kela researchers prompted R1 to generate information on OpenAI employees, the mode generated fictitious details including emails, phone numbers, and salaries.
“DeepSeek demonstrates strong performance and efficiency, positioning it as a potential challenger to major tech giants. However, it falls behind in terms of security, privacy, and safety,” Kela researchers said.
Security experts also warned of broader risks arising from the potential use of the open-source AI by nation-states and other hackers.
“It’s important to remember that open-source AI means something foundationally different than open-source code,” said Jake Williams, vice president of research and development at Hunter Strategy. “With open-source code, we can audit the code and identify vulnerabilities. With open-source AI, we can do no such thing.”
Roei Sherman, field CTO at Mitiga. warned that organizations should act promptly to secure their AI environments from potential R1 risks.
These include continuous monitoring of their cloud environments, ramping up AI-driven detection and response, and undertaking regular adversarial simulations.
“The release of DeepSeek highlights a troubling trend: adversaries are rapidly integrating AI into their attack methodologies,” Sherman said. “Models like DeepSeek can amplify adversary capabilities through automated social engineering, advanced reconnaissance, code and exploit development.