Model Appears to Be a Way Station on the Road to Something Greater

OpenAI on Thursday released its latest generative AI model, but don’t call it the next big thing just yet. More thoughtful, persuasive and emotionally intelligent, GPT-4.5 aims to feel less like an algorithm and more like a conversation partner. Whether it’s a meaningful step forward or just an expensive flex is up for debate.
Internally known as Orion, GPT-4.5 is available to subscribers of the $200-a-month ChatGPT Pro service and developers using paid API tiers.
GPT-4.5 is the largest and most compute-intensive model OpenAI has released to date. The company has reportedly trained it using unprecedented levels of data and computational power, a strategy that has historically yielded dramatic performance improvements in areas ranging from natural language understanding to coding. But the white paper accompanying its launch cautions that the approach may be nearing its limits.
In previous iterations from GPT-1 through GPT-4, scaling up the model size typically resulted in leaps in performance. With GPT-4.5, the improvements are more nuanced, suggesting that the benefits of additional compute and data may be diminishing over time (see: How Test Time Compute Can Help Scale AI).
One of the most touted enhancements of GPT-4.5 is its ability to better understand user prompts and generate responses that feel more natural and human. OpenAI claims that the model exhibits higher emotional intelligence and is less prone to hallucinations. The company says its responses convey information with greater warmth and display sensitivity when handling emotionally charged prompts.
CEO Sam Altman said that GPT-4.5 is the first AI that “felt like talking to a thoughtful person” and that he was “astonished” with its “good advice.” “This isn’t a reasoning model and won’t crush benchmarks,” Altman added. It is a “different kind of intelligence,” he said.
GPT-4.5’s performance on specialized tasks has drawn mixed reviews. On benchmarks that assess mathematical problem-solving, coding proficiency and logical reasoning, the model falls short compared to dedicated reasoning systems such as o3-mini, Anthropic’s Claude 3.7 Sonnet and even some offerings from Chinese AI firm DeepSeek. While GPT-4.5 can match or exceed other non-reasoning models on academic tests like AIME and GPQA, its overall performance suggests that scaling alone may not be sufficient to overcome certain inherent limitations in the pre-training methodology.
The model is more expensive to operate than its predecessors. For developers accessing GPT-4.5 through the API, OpenAI has set the cost at $75 per million input tokens and $150 per million output tokens – figures that dwarf the rates for GPT-4o, which are $2.50 and $10 respectively.
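To put those rates in perspective, the following is a minimal sketch of the arithmetic, using only the per-million-token prices quoted above. The dictionary keys are illustrative labels, not confirmed API model identifiers, and the 10,000-input, 2,000-output request is a hypothetical workload chosen for the example.

```python
# Prices in US dollars per million tokens, as quoted in the article.
# The keys are illustrative labels, not confirmed API model identifiers.
PRICES = {
    "gpt-4.5": {"input": 75.00, "output": 150.00},
    "gpt-4o": {"input": 2.50, "output": 10.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in dollars of one request at the quoted rates."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# How many times more GPT-4.5 costs than GPT-4o per token.
input_ratio = PRICES["gpt-4.5"]["input"] / PRICES["gpt-4o"]["input"]
output_ratio = PRICES["gpt-4.5"]["output"] / PRICES["gpt-4o"]["output"]

# Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
cost_gpt45 = request_cost("gpt-4.5", 10_000, 2_000)
cost_gpt4o = request_cost("gpt-4o", 10_000, 2_000)

if __name__ == "__main__":
    print(f"Input price ratio:  {input_ratio:.0f}x")
    print(f"Output price ratio: {output_ratio:.0f}x")
    print(f"GPT-4.5 request: ${cost_gpt45:.3f}")
    print(f"GPT-4o request:  ${cost_gpt4o:.3f}")
```

At the quoted rates, GPT-4.5 costs 30 times as much as GPT-4o for input tokens and 15 times as much for output tokens, so the overall multiple for any given workload falls between those two figures depending on its mix of input and output.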
Beyond raw performance and cost, GPT-4.5 has unexpected persuasion capabilities. In a series of internal tests, the model demonstrated an aptitude for coaxing favorable responses from other AI systems. One benchmark measured its ability to convince an instance of GPT-4o to donate virtual funds or reveal a secret code word, tasks in which GPT-4.5 outshone its predecessor and other models like o3-mini.
The newfound persuasiveness brings with it a host of concerns, especially in light of past incidents where AI-generated misinformation and deepfakes have swayed public opinion or facilitated social engineering attacks. OpenAI has acknowledged these risks, adding that GPT-4.5 does not yet reach the high-risk threshold for persuasive manipulation. The company says it is revising its testing methods and safety protocols to ensure that future models mitigate these risks.
The launch of GPT-4.5 comes at a time when the AI industry is witnessing a proliferation of new models from competitors. Recent releases from Anthropic and DeepSeek alone have challenged the notion that simply scaling up models will yield consistent performance gains. OpenAI’s approach with GPT-4.5, a maximalist strategy aimed at capturing the full breadth of human language and emotion, contrasts with efforts by rivals who are experimenting with leaner models trained on limited budgets.
Industry insiders reportedly pointed to GPT-4.5 as a transitional product – one that bridges the gap between traditional pre-trained models and the next generation of AI systems that incorporate explicit reasoning capabilities. Two former OpenAI employees have suggested that GPT-4.5 was originally intended to be a more radical leap – what would have been called GPT-5 – but fell short of delivering the expected breakthrough. OpenAI has positioned GPT-4.5 as the final installment in its pre-training series before shifting focus to models that blend fast, intuitive responses with more deliberate, chain-of-thought reasoning. Altman hinted at this evolution in a recent tweet, saying that a new model combining these approaches is on the horizon.
Critics have not held back. Industry analysts dismissed GPT-4.5 as a “nothing burger” when measured against the promise of transformative AI. Weak benchmark scores in domains like mathematics and logical reasoning, along with the model’s exorbitant operational costs, have fueled a narrative that OpenAI’s latest release may not justify its lofty ambitions. Others argue that the model’s qualitative improvements, particularly in empathetic response and natural language understanding, show meaningful progress in bridging the gap between machine-generated text and human conversation.