Artificial intelligence-powered coding tools are widely touted as accelerators for software development. However, a recent randomized, controlled trial conducted by computer scientists reveals a counter-intuitive outcome: these tools may actually be making developers slower.
The Unexpected Reality of AI in Coding
Researchers from Model Evaluation & Threat Research (METR), a non-profit group, published findings that challenge the prevailing narrative. Their study indicates that despite high expectations, AI coding tools led to a measurable decrease in developer speed. More astonishingly, developers who used these tools vastly overestimated their productivity gains, experiencing a form of “AI hallucination” themselves.
Participants in the study predicted that AI would cut their task completion time by a substantial 24%. The data instead showed a 19% increase in completion time. Even after the study ended, developers still believed AI had accelerated their work by about 20%, in stark contrast with the measured slowdown.
“Surprisingly, we find that allowing AI actually increases completion time by 19 percent — AI tooling slowed developers down.”
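To put the headline numbers in perspective, here is a quick back-of-envelope calculation. The 10-hour baseline is purely hypothetical; only the percentages come from the study:

```python
# Back-of-envelope arithmetic on the study's headline numbers.
# The baseline time is hypothetical; only the percentages are from the study.
baseline_hours = 10.0  # hypothetical time for a task completed without AI

predicted_speedup = 0.24  # developers expected 24% faster completion
predicted_hours = baseline_hours * (1 - predicted_speedup)  # 7.6 hours

observed_slowdown = 0.19  # study measured 19% longer completion time
observed_hours = baseline_hours * (1 + observed_slowdown)   # 11.9 hours

# A 19% increase in time corresponds to roughly a 16% drop in effective speed.
effective_speed_change = 1 / (1 + observed_slowdown) - 1     # about -0.16

print(f"Expected: {predicted_hours:.1f} h, observed: {observed_hours:.1f} h")
print(f"Effective speed change: {effective_speed_change:+.0%}")
```

Note the asymmetry: a 19% increase in time works out to roughly a 16% drop in effective speed, so the gap between perception (about 20% faster) and measured reality is even wider than the raw percentages suggest.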
Inside the METR Research: A Deep Dive
The study involved 16 seasoned developers actively contributing to large, open-source projects. They provided 246 real-world issues, such as bug fixes and new features, along with their time estimates. These tasks were then randomly assigned to either allow or disallow the use of AI tools. Developers permitted to use AI predominantly opted for Cursor Pro integrated with Claude 3.5/3.7 Sonnet. The experimental work spanned from February to June 2025.
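For readers unfamiliar with this kind of design, the sketch below shows one way issue-level random assignment could be implemented. It is purely illustrative; the function name, seed, and data shapes are assumptions, not the researchers’ actual tooling:

```python
import random

def assign_conditions(issue_ids, seed=0):
    """Randomly label each issue as AI-allowed or AI-disallowed.

    Illustrative only: METR's actual assignment procedure may differ.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    return {issue: rng.choice(["ai_allowed", "ai_disallowed"])
            for issue in issue_ids}

# 246 hypothetical issue identifiers, matching the study's sample size
issues = [f"issue-{n}" for n in range(1, 247)]
conditions = assign_conditions(issues)
print(sum(c == "ai_allowed" for c in conditions.values()), "issues allow AI")
```

Randomizing at the issue level, rather than per developer, lets each participant serve as their own control, which is part of what makes the measured slowdown hard to dismiss.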
Five Key Factors Behind the Productivity Dip
The researchers pinpointed several reasons for the observed slowdown:
- Over-optimism about AI usefulness: Developers held unrealistic expectations regarding AI’s capabilities.
- High developer familiarity with repositories: Highly experienced developers working on well-known projects found less practical value in AI assistance, suggesting AI’s help might be redundant for those with deep domain knowledge.
- Large and complex repositories: AI performance degraded significantly in extensive codebases exceeding 1 million lines, struggling to grasp the full context.
- Low AI reliability: Developers accepted less than 44% of AI-generated suggestions, subsequently spending considerable time cleaning up and reviewing the AI’s output.
- Implicit repository context: The AI tools often failed to comprehend the nuanced, unwritten contextual information inherent in complex projects.
Other minor factors, such as AI generation latency and suboptimal input context, may also have played a role, though their precise impact remains unclear.
Wider Implications and Supporting Evidence
The METR study isn’t an outlier. Other recent research has similarly challenged the widespread hype around AI’s immediate productivity benefits:
- A study by AI coding firm Qodo found that perceived benefits of AI software assistance were often offset by the extra work required to validate AI-generated code.
- An economic survey from Denmark indicated that generative AI had no measurable impact on jobs or wages.
- An Intel study suggested that AI PCs, despite their advanced capabilities, could make users less productive.
- Call center workers at a Chinese electrical utility reported that while AI could speed up certain tasks, it often created additional work, leading to an overall slowdown.
This “added work” aspect is critical. The METR study graphics illustrate that “When AI is allowed, developers spend less time actively coding and searching for/reading information, and instead spend time prompting AI, waiting on and reviewing AI outputs, and idle.”
Anecdotally, many coders find AI tools useful for quickly testing new scenarios or automating routine tasks. However, these incremental benefits often don’t translate into overall time savings because rigorous validation of AI-generated code remains indispensable. Unlike an intern, an AI tool doesn’t learn from feedback in a way that truly reduces oversight needs, making programming potentially more enjoyable but not necessarily more efficient.
Important Caveats and Future Outlook
The authors of the study—Joel Becker, Nate Rush, Beth Barnes, and David Rein—emphasize that their findings should be viewed within a narrow context. This research represents a snapshot in time, based on specific experimental tools and conditions.
They clarify: “The slowdown we observe does not imply that current AI tools do not often improve developer’s productivity – we find evidence that the high developer familiarity with repositories and the size and maturity of the repositories both contribute to the observed slowdown, and these factors do not apply in many software development settings.”
The researchers also note that their conclusions do not invalidate the utility of current AI systems in all scenarios, nor do they preclude future AI models from offering substantial improvements in developer productivity. The journey of AI integration into software development is still evolving, and its true potential is yet to be fully realized.