Safety Concerns Emerge Amid o3, o4-mini and GPT-4.1 Launches

OpenAI’s mid-April announcements include its most advanced reasoning models yet, o3 and o4-mini, shipped with a biorisk monitor; the quietly released GPT-4.1 coding family; and the upcoming retirement of its costliest model, GPT-4.5.
The rollout brings leaps in performance, including 1 million-token context windows and tool-enabled chain-of-thought reasoning, but lingering safety concerns remain only partly addressed. As the company refines pricing and model tiers, it also faces pressure to be transparent about testing rigor and risk mitigation strategies.
OpenAI on Wednesday launched o3 and o4-mini, two reasoning models that can natively use ChatGPT’s web browsing, Python execution, image analysis and generation, and file reading capabilities in their chain-of-thought processes.
The company positions o3 as its “most advanced reasoning model ever,” boasting marked improvements over predecessors such as o1 and GPT-4 on multidisciplinary benchmarks in math, coding and science. OpenAI’s internal tests show o3 scores 69.1% on the SWE-bench Verified coding benchmark, while o4-mini achieves 68.1%, just one percentage point behind, yet o4-mini uses the lower o3-mini pricing tier of $1.10 per million input tokens and $4.40 per million output tokens, making it the more resource-efficient choice.
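For a sense of what that pricing tier means in practice, here is a minimal sketch of the per-request arithmetic, using only the o4-mini rates quoted above; the token counts are hypothetical, chosen purely to illustrate the calculation:

```python
# Cost sketch at the o4-mini rates cited above: $1.10 per million
# input tokens and $4.40 per million output tokens. Token counts
# below are hypothetical illustrations, not OpenAI figures.
INPUT_RATE = 1.10 / 1_000_000   # dollars per input token
OUTPUT_RATE = 4.40 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request at o4-mini pricing."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A 50,000-token prompt that yields a 5,000-token answer:
print(f"${request_cost(50_000, 5_000):.4f}")  # $0.0770
```

At those rates, even a long 50,000-token prompt with a substantial answer costs under a dime per call, which is the efficiency argument the pricing tier is built around.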
Both models support chain-of-thought reasoning that allows them to pause, manipulate and fully parse multimodal inputs, such as rotating a photograph or deciphering a blurred diagram, before generating responses.
OpenAI has made o3 and o4-mini available to Pro, Plus and Team subscribers, as well as through the Chat Completions and Responses APIs, with an o4-mini-high variant coming soon that aims to further optimize reliability.
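For developers, invoking the new models looks like any other Chat Completions request; the sketch below assumes the openai Python SDK and an OPENAI_API_KEY in the environment, with the model identifier being the only o4-mini-specific detail (exact identifiers and availability may vary by account tier):

```python
# Minimal Chat Completions call targeting o4-mini. Assumes the
# openai Python SDK is installed and OPENAI_API_KEY is set in the
# environment; model availability may differ by account tier.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

response = client.chat.completions.create(
    model="o4-mini",
    messages=[
        {"role": "user", "content": "Summarize the tradeoffs between o3 and o4-mini."},
    ],
)
print(response.choices[0].message.content)
```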
In a bid to curb the misuse of more powerful models, OpenAI has deployed a “safety-focused reasoning monitor” atop o3 and o4-mini to intercept prompts related to biological or chemical threats.
The monitor is custom-trained on over 1,000 hours of red-teaming data, runs alongside the base models and enforces content policies by refusing to provide guidance on dangerous protocols. In simulated blocking tests, the system declined to respond to high-risk biothreat queries 98.7% of the time, OpenAI said. The company concedes that adversaries might craft new prompts to evade blocks, so it plans to supplement automated monitoring with ongoing human oversight.
OpenAI’s partners warn that the company’s rushed evaluations have left gaps. Red-teaming collaborator Metr said that OpenAI’s evaluation was “conducted in a relatively short time, and we only tested o3 with simple agent scaffolds,” adding that it expected higher performance on benchmarks with “more elicitation effort.”
Metr also cautioned that “While we don’t think this is especially likely, it seems important to note that [our] evaluation setup would not catch this type of risk,” pointing to the limits of pre-deployment testing.
OpenAI’s own safety report showed that Apollo Research observed cases of “in-context scheming and strategic deception,” in which the models manipulated simulated quotas or disobeyed explicit tool bans. The report quoted Apollo’s finding that the research “shows that o3 and o4-mini are capable of in-context scheming and strategic deception … This may be further assessed through assessing internal reasoning traces.”
Two days before the o3 and o4-mini launch, OpenAI quietly rolled out the GPT-4.1 family, including mini and nano variants, tuned for software engineering tasks and featuring a 1 million-token context window, enough to fit more text than “War and Peace” in a single prompt.
The model has been optimized for real-world use based on direct developer feedback, improving in areas including frontend coding, making fewer extraneous edits, following formats reliably, adhering to response structure and ordering, and using tools consistently. GPT-4.1 is priced at $2 per million input tokens and $8 per million output tokens, with GPT-4.1 mini and nano as even leaner, cheaper options.
GPT-4.1 shipped without a public safety report or system card, a first for OpenAI. Company spokesperson Shaokyi Amdo reportedly said that GPT-4.1 was “not a frontier model, so there won’t be a separate system card released for it.”
Former OpenAI safety researcher Steven Adler countered that transparency norms hinge on these cards, TechCrunch reported. “System cards are the AI industry’s main tool for transparency and for describing what safety testing was done,” he told the publication. Secure AI Project co-founder Thomas Woodside added that the “more sophisticated the model, the higher the risk it could pose,” arguing that the performance leaps make a safety report “all the more critical.”
A day before GPT-4.1’s release, OpenAI reportedly said it would retire API access to GPT-4.5 on July 14, though the model, code-named Orion, will remain in ChatGPT’s research preview for paying users. GPT-4.5 debuted in late February as OpenAI’s largest-ever model, trained on unprecedented compute and data to boost writing quality and persuasiveness over GPT-4o, but it carried a steep price of $75 per million input tokens and $150 per million output tokens.
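Using only the rates quoted in this article, a rough back-of-the-envelope comparison shows how wide the gap is between GPT-4.5 and the GPT-4.1 tier replacing it; the monthly workload figures below are hypothetical, chosen only to illustrate the scale of the difference:

```python
# Back-of-the-envelope comparison using the article's quoted rates,
# in dollars per million input/output tokens. The workload figures
# are hypothetical illustrations, not OpenAI data.
RATES = {
    "gpt-4.1": (2.00, 8.00),
    "gpt-4.5": (75.00, 150.00),
}

def monthly_cost(model: str, input_m: float, output_m: float) -> float:
    """Dollar cost for input_m/output_m millions of tokens per month."""
    in_rate, out_rate = RATES[model]
    return input_m * in_rate + output_m * out_rate

# Example workload: 100M input tokens and 20M output tokens per month.
for model in RATES:
    print(f"{model}: ${monthly_cost(model, 100, 20):,.2f}")
# gpt-4.1: $360.00
# gpt-4.5: $10,500.00
```

At the article’s quoted rates, the same workload costs roughly 29 times more on GPT-4.5 than on GPT-4.1, which helps explain why the costliest model is the one being retired from the API.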
