Wharton & MIT Can't Agree on AI: Here's What Both Are Missing on Building Real AI Projects
Based on AI News & Strategy Daily | Nate B Jones's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
MIT’s 95% failure rate and Wharton’s 75% success rate reflect different ROI definitions, time horizons, and thresholds for what counts as value.
Briefing
A 75% “success” rate and a 95% “failure” rate from two major studies on enterprise generative AI don’t contradict each other so much as measure success against incompatible definitions. Wharton’s higher success figure reflects how executives track ROI—often through productivity, time saved, and throughput—while MIT’s failure rate comes from a much tighter standard that effectively treats projects as failures unless they can prove near-term, dollar-and-cents impact on the bottom line. The headline numbers feel mutually exclusive, but they’re built on different scorecards, different time horizons, and different thresholds for what counts as value.
That mismatch matters because it shapes how organizations decide what to fund, how to staff AI work, and what “good” looks like when results arrive unevenly. MIT’s approach pushes leadership toward a hard-nosed view of software ROI—especially important as AI tools can become dramatically more expensive per employee than prior software categories. Wharton’s approach, by contrast, mirrors how executives actually manage: they often accept operational metrics as proxies for business impact. The practical takeaway is to treat the viral top-line percentages as a starting point, not a decision framework.
To explain where enterprise AI programs succeed more steadily, the discussion points to three missing building blocks that aren’t captured well by either study’s metrics. First is institutional fluency through team-level context awareness. In this model, context engineering isn’t a narrow job for specialists; it’s a deliberate capability teams maintain. Teams articulate their domain workflows, uncertainties, and value-driving processes to AI systems so the output is locally useful. Leaders can then observe “accountable acceleration” when teams consistently apply that context in their work.
Second is AI problem-solving skills—treated as a critical patch on team fluency rather than a one-time training topic. The hard part isn’t just learning to prompt or analyze; it’s understanding how LLMs process information well enough to decompose problems into forms the model can handle. Crucially, the ownership model flips in the AI era: individual contributors must own quality and decide whether the AI’s output meets the bar. Managers and teams can hold the AI literacy and shared methods, but without individual ownership—especially the willingness to challenge the model when it’s wrong—value stalls.
Third is “taste,” described as a democratized quality instinct for choosing the right problems and recognizing what excellent looks like. Pre-AI organizations could centralize taste in a small priesthood of experts; AI-native speed requires pushing that judgment down to teams (or at least broader units) so they can act autonomously without sacrificing standards. Taste is not universal like core LLM skills; it’s vertical- and situation-specific, closely tied to local domain knowledge and the ability to spot where the real juice is in the profitability matrix.
Taken together, the core argument is that the path to real AI projects isn’t found in arguing over 75% versus 95%. It’s built by institutionalizing context, reconfiguring ownership and skills across individuals and teams, and socializing taste—so organizations can deliver consistent value even as study headlines keep changing.
Cornell Notes
Two enterprise AI studies report sharply different outcomes—MIT’s 95% failure rate versus Wharton’s 75% success rate—because they use incompatible ROI definitions. MIT applies a very strict standard requiring measurable dollar-and-cents bottom-line impact within a short window, while Wharton reflects executive practice using softer operational metrics like productivity, time saved, and throughput. The more actionable framework offered is “institutional fluency,” built from three capabilities: team-level context awareness, AI problem-solving skills paired with individual-level ownership, and democratized “taste” for selecting the right problems and judging quality. These elements explain why organizations can show steadier progress than headline percentages suggest.
Why do MIT’s 95% failure rate and Wharton’s 75% success rate both appear credible?
What does “institutional fluency” mean in practical terms for AI adoption?
How does context engineering change when it becomes a team-level responsibility?
What’s the proposed flip in ownership and skills for AI problem solving?
What is “taste,” and why can’t it stay centralized in an AI-native company?
Review Questions
- How do MIT’s and Wharton’s ROI measurement approaches differ, and why does that make their headline percentages non-comparable?
- In the proposed AI problem-solving model, what must individual contributors do that managers previously could handle at the team level?
- Why is “taste” described as both essential and non-universal across organizations?
Key Points
1. MIT’s 95% failure rate and Wharton’s 75% success rate reflect different ROI definitions, time horizons, and thresholds for what counts as value.
2. MIT’s tighter standard demands near-term, bottom-line dollar impact, while Wharton’s success metric aligns with executive use of productivity, time saved, and throughput.
3. Enterprise AI success depends less on arguing over headline numbers and more on building “institutional fluency” that persists across teams.
4. Team-level context awareness is treated as the foundation: teams must deliberately articulate domain workflows, uncertainties, and value drivers to AI systems.
5. AI problem solving requires a skills/ownership inversion: AI literacy can be shared at the team level, but quality ownership must sit with individual contributors.
6. Democratized “taste” helps teams choose the right problems and judge quality, enabling autonomy without sacrificing standards.
7. Taste is vertical- and situation-specific, so it must be socialized locally rather than assumed as a universal capability.