95% of Enterprise AI Stuck in Pilot Purgatory: How Do the Rest Succeed?

Generative artificial intelligence dominates boardroom talk but remains scarce in production. After billions of dollars in spending, 95% of enterprise generative AI projects never make it past the pilot stage, MIT researchers have found.
U.S. enterprises have poured $40 billion into generative AI initiatives over the past three years, but have not seen measurable return on investment, according to MIT. The group studied more than 300 public AI initiatives and surveyed 153 leaders. Most projects stalled in pilot purgatory instead of transforming workflows. They were funded, hyped and presented in boardrooms, only to be abandoned before reaching production (see: Why the ROI of Enterprise AI Still Eludes Many Firms).
The Pushback
The “95%” figure has sparked as much debate as it has headlines. Some experts cautioned that the report may oversimplify or mischaracterize the challenges of enterprise adoption. Wharton professor Kevin Werbach said the survey data behind the 95% figure is unclear, with little transparency on methodology or sample size, raising concerns about academic rigor. Oxford fellow Ajit Jaokar called the study “a clever marketing gimmick” that aligns too neatly with MIT Nanda’s decentralized AI agenda while overlooking more prosaic but decisive hurdles such as procurement cycles, workflow redesign and job changes. Strategy professors Nathan Furr and Andrew Shipilov, writing in Harvard Business Review, said the headline obscures a subtler reality: companies are repeating the mistakes of the digital transformation era, scattering resources across unfocused pilots instead of linking AI tests to core business opportunities.
Moving Beyond Proof of Concept
Whether the true figure is 95% or lower, experts agree that enterprises are struggling to move AI past proof of concept.
Stalled pilots define enterprise AI adoption. Nearly every large firm has launched pilots, but most experiments appear to be stuck in limbo. The MIT Nanda report found that 80% of companies have explored gen AI tools and half have run pilots, but only 5% have reached production. The gap is widest among enterprises with more than $100 million in annual revenue, which launch the most pilots but convert them at the lowest rate. Midmarket firms scaled faster, moving projects from pilot to deployment in an average of 90 days, compared with nine months or more at Fortune 500 companies.
Learning Gap
The barrier is not infrastructure, regulation or talent but what the authors call a “learning gap.” Most enterprise AI systems cannot retain memory, adapt to feedback or integrate into workflows. Tools work in isolation, generating content or analysis in a static way, but fail to evolve alongside the organizations that use them. For executives, the result is a sea of proofs of concept with little business impact.
“Chatbots succeed because they’re easy to try and flexible, but fail in critical workflows due to lack of memory and customization,” the report said.
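The report does not prescribe an architecture for closing this gap, but the failure mode it describes is concrete: most pilot tools are stateless calls to a model, while the workflows that matter need persistent context and a feedback loop. The Python sketch below is purely illustrative, using a hypothetical generate() placeholder in place of any real model API, to show the minimal difference between a one-shot pilot-style call and an assistant that retains memory and corrections across requests.

```python
# Illustrative sketch of the "learning gap": a stateless pilot-style call
# versus a wrapper that carries memory and user feedback between calls.
from typing import Callable, List

def generate(prompt: str) -> str:
    # Hypothetical stand-in for a call to any LLM API.
    return f"[model output for: {prompt[:60]}...]"

def pilot_style_call(task: str) -> str:
    # Stateless: every request starts from scratch, nothing is remembered.
    return generate(task)

class AdaptiveAssistant:
    """Keeps conversation memory and user corrections across calls."""

    def __init__(self, model: Callable[[str], str]) -> None:
        self.model = model
        self.memory: List[str] = []    # prior exchanges
        self.feedback: List[str] = []  # corrections supplied by users

    def ask(self, task: str) -> str:
        # Fold recent history and accumulated corrections into the prompt.
        prompt = "\n".join(
            ["Known context:"] + self.memory[-5:]
            + ["Apply these past corrections:"] + self.feedback
            + ["Task:", task]
        )
        answer = self.model(prompt)
        self.memory.append(f"Q: {task}\nA: {answer}")
        return answer

    def correct(self, note: str) -> None:
        # Feedback persists, so later answers adapt instead of repeating mistakes.
        self.feedback.append(note)

print(pilot_style_call("Summarize the Q3 vendor contract"))  # forgets everything

assistant = AdaptiveAssistant(generate)
print(assistant.ask("Summarize the Q3 vendor contract"))
assistant.correct("Always flag auto-renewal clauses explicitly.")
print(assistant.ask("Summarize the Q4 vendor contract"))     # correction now applies
```

In this toy version, "adaptation" is nothing more than carrying prior exchanges and user corrections into later prompts; production systems would persist that state durably and scope it to specific workflows.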
Many pilots never survive this transition, Mina Narayanan, research analyst at the Center for Security and Emerging Technology, told Information Security Media Group.
“Generative AI pilots may not adequately test an AI system’s effectiveness in the real world. They may perform well in controlled demonstrations, but those conditions often differ in subtle ways from live deployment environments,” she said. “If the right people aren’t involved in the pilot’s design, the scope may miss critical workflows, making enterprises hesitant to scale.”
Enterprise Adoption and Shadow AI Economy
The weakness of enterprise deployments is especially striking given the strong adoption of consumer-facing AI. Workers across industries said they rely on ChatGPT or Copilot, often several times a day. But the same people reject the enterprise-grade systems their companies purchase, describing them as brittle, over-engineered or misaligned with their needs. The paradox is that employees are proving AI’s usefulness informally while official deployments languish.
The MIT Nanda survey found that only 40% of firms purchased large language model subscriptions, while more than 90% of employees reported using personal AI tools for work. This shadow AI economy is where adoption is actually happening inside companies, even as official projects stall (see: How Enterprises Can Mitigate the Quiet Threat of Shadow AI).
The implications of this shadow economy are complex. On one hand, it shows clear employee demand, as workers gravitate toward flexible, responsive and familiar tools. On the other, it exposes enterprises to compliance and security risks. Corporate lawyers and procurement officers interviewed in the report admitted they rely on ChatGPT for drafting or analysis, even when their firms purchased specialized tools costing tens of thousands of dollars. When asked why they preferred consumer tools, their answers were consistent: ChatGPT produced better outputs, was easier to iterate with and required less training.
“Our purchased AI tool provided rigid summaries with limited customization options,” one attorney told the researchers. “With ChatGPT, I can guide the conversation and iterate until I get exactly what I need. The fundamental quality difference is noticeable. ChatGPT consistently produces better outputs, even though our vendor claims to use the same underlying technology.”
Shadow AI Economy Risks
Narayanan said the trend poses real risks. Employees who use consumer tools such as ChatGPT without authorization may fall out of compliance and unwittingly disclose sensitive business or personal data, putting themselves and their organizations at risk. “These tools may retain and regurgitate the information that employees provide, potentially opening the door for malicious actors to exploit sensitive enterprise data,” Narayanan said.
To mitigate the issue, companies should set clear policies that define permissible uses and train employees on the risks of gen AI, she said.
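The report and Narayanan stop at the level of policy and training, so the following Python sketch is only one illustration of what a technical control might look like: a pre-submission filter that masks obviously sensitive strings before a prompt leaves the organization. The pattern list, function name and masking format are assumptions made for the example, not a recommended data-loss-prevention design.

```python
# Illustrative pre-submission guard: mask obviously sensitive strings before a
# prompt is sent to an external consumer AI tool. The patterns are examples
# only, not a complete or recommended policy.
import re

SENSITIVE_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|key)[-_][A-Za-z0-9]{16,}\b"),
}

def redact(prompt: str) -> tuple[str, list[str]]:
    """Return the masked prompt and the names of the patterns that matched."""
    hits = []
    for name, pattern in SENSITIVE_PATTERNS.items():
        if pattern.search(prompt):
            hits.append(name)
            prompt = pattern.sub(f"[REDACTED-{name.upper()}]", prompt)
    return prompt, hits

masked, findings = redact(
    "Draft a reply to jane.doe@example.com about key sk-abc123def456ghi789"
)
print(masked)    # sensitive values replaced with placeholders
print(findings)  # ["email", "api_key"], suitable for compliance logging
```

A real control would pair something like this with logging, broader detectors and the clear usage policies and training Narayanan describes.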
Media, Telecom Sectors Take Lead
Despite the noise surrounding adoption, genuine disruption is concentrated in two sectors. Media and telecom were the only industries that showed evidence of structural change. MIT Nanda’s disruption index, which scores industries on shifts in market share, new entrants, business models and customer behavior, gave tech and media a score of 2. Every other sector scored between 0 and 1.5. Healthcare and energy ranked lowest, with almost no observable impact.
A COO at a midmarket manufacturing firm told the report authors: “The hype on LinkedIn says everything has changed, but in our operations, nothing fundamental has shifted. We’re processing some contracts faster, but that’s all that has changed.”
How to Achieve ROI
The economic stakes of crossing the divide are significant. The companies that broke through – the 5% – report measurable return on investment. Front-office benefits include 40% faster lead qualification and 10% better customer retention through AI-powered follow-ups. Back-office savings are even more dramatic: $2 million to $10 million annually from eliminating BPO contracts, 30% reductions in external agency spend and $1 million per year saved on outsourced financial risk checks.
For affected sectors such as media and technology, “80% of executives anticipate reduced hiring volumes within 24 months.” But the interviewed executives told researchers they saw efficiency improvements rather than headcount reductions, with job impacts limited mainly to outsourced support functions such as customer support operations, administrative processing and standardized development tasks.
“These roles exhibited vulnerability prior to AI implementation due to their outsourced status and process standardization,” the report said.
What Sets Successful Projects Apart
The key difference between abandoned pilots and survivors appears to be the approach. Organizations that succeed tend to buy rather than build, and they treat vendors like business process outsourcing partners rather than software suppliers. The MIT Nanda report found external partnerships succeed about 67% of the time, compared to 33% for internally built projects. The most effective buyers demand customization, hold vendors accountable for business outcomes and allow frontline managers – not central AI labs – to drive adoption.
An unidentified CIO told the report authors: “We’ve seen dozens of demos this year. Maybe one or two are genuinely useful. The rest are wrappers or science projects.”
Narayanan said partnerships help reduce complexity. Vendors can simplify some of the overhead of acquiring AI systems by sharing development, testing and maintenance responsibilities, she said. But enterprises need strong contracts that protect data privacy, ensure risk management practices and guarantee performance-based outcomes. They must also guard against vendor lock-in by ensuring the portability of data and knowledge after contracts end, she said.
Even as the divide persists, the report said the window for experimentation is closing. Enterprises are beginning to lock in vendor relationships with systems that learn and adapt, creating switching costs that compound over time.
Way Forward
Looking ahead, Narayanan expects uneven adoption across industries. Technology and media firms will likely lead the charge because they’re already experimenting and building institutional knowledge about how to integrate AI effectively. “Signs of maturity may include shifts in workforce composition or changes in the type and volume of services offered. But some industries may continue to show little progress, particularly if workers view AI systems as unreliable,” she said.
The “GenAI divide” problem is less about technology and more about choices. Companies are spending billions on science projects that don’t integrate or improve. Employees are turning to shadow AI tools that are powerful but bypass governance and risk controls. The small fraction of organizations that succeed “demand deep customization, drive adoption from the front lines and hold vendors accountable to business metrics,” the report said.
