DeepSeek-R1 Struggles with Logic Tests and Is Vulnerable to Jailbreaks
Chinese artificial intelligence research company DeepSeek, funded by quantitative trading firms, introduced what it says is one of the first reasoning models to rival OpenAI o1.
Reasoning models aim to address hallucinations and logical errors in generative AI by fact-checking their own output and working through problems in multiple steps. The additional processing also increases response times: a model can take several dozen seconds to “think” through its response to a complex query.
DeepSeek previewed its AI model on Wednesday, claiming it demonstrated competitive performance against OpenAI’s o1-preview model on benchmarks such as AIME, which is drawn from the American Invitational Mathematics Examination, and MATH, a collection of problem-solving tests. But the model, like its OpenAI counterpart, struggles with simpler logic-based tasks, such as tic-tac-toe, and is vulnerable to jailbreaking.
Regulatory pressures in China also potentially push the DeepSeek-R1 model to filter politically sensitive topics, reflecting compliance with government mandates that require AI models to align with core socialist values. These constraints limit its ability to engage in discussions about contentious issues such as the Tiananmen Square protests and massacre.
DeepSeek plans to open-source the model and provide API access. Backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI for trading, DeepSeek operates with significant resources, including server clusters featuring 10,000 NVIDIA A100 GPUs. One of the company’s first models, which offered a text and image analysis service, forced competitors such as ByteDance, Baidu and Alibaba to either dramatically cut the usage cost of their models or offer them for free.