HC Editorial Team
11/09/2025
The rapid adoption of artificial intelligence (AI) coding tools has been accompanied by bold promises: faster workflows, fewer bugs, and a boost in developer productivity. Platforms like GitHub Copilot and advanced IDE assistants have been marketed as game-changers for software engineering. But what happens when the data doesn’t match the hype?
A recent study by Model Evaluation & Threat Research (METR) challenges prevailing assumptions. Conducted in early 2025, it suggests that experienced open-source developers actually became slower when using cutting-edge AI assistants. Far from being a universal accelerator, AI in this context led to a measurable slowdown — an outcome that raises new questions for companies, policymakers, and technologists.
The METR team asked a straightforward but underexplored question: How do AI tools available in early 2025 affect the productivity of experienced developers working on their own open-source repositories?
This inquiry is crucial. While algorithmic benchmarks and self-reported surveys often show positive results, they fail to capture real-world complexities — from debugging legacy code to maintaining high-quality repositories. Misjudging the impact of AI could mislead investment decisions and even shape misguided policies.
To address the question, METR designed a randomized controlled trial (RCT), a gold standard in experimental design. Sixteen experienced developers completed 246 real tasks in mature open-source repositories they had contributed to for years; each task was randomly assigned to either allow AI tools (primarily Cursor Pro with Claude 3.5/3.7 Sonnet) or forbid them.
Compensation was set at $150/hour, underscoring the professional nature of the trial. The primary outcome was task completion time, supplemented by quality checks such as testing, documentation, and style compliance.
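This is not METR's published analysis code, but as a rough sketch of how a randomized design like this can recover an effect on completion time, the snippet below simulates hypothetical task data with a built-in 19% slowdown and estimates the multiplier via a log-time regression. All names and numbers here are illustrative stand-ins, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 246 tasks, each randomly assigned to the
# AI-allowed (1) or AI-disallowed (0) condition, mirroring an RCT.
n_tasks = 246
ai_allowed = rng.integers(0, 2, size=n_tasks)

# Simulate log-normal completion times with an assumed 19% slowdown
# when AI is allowed (multiplicative factor 1.19, purely illustrative).
log_time = rng.normal(loc=np.log(60), scale=0.8, size=n_tasks)
log_time = log_time + ai_allowed * np.log(1.19)

# Estimate the effect by regressing log completion time on the AI
# indicator: exp(beta_1) is the multiplicative change in time.
X = np.column_stack([np.ones(n_tasks), ai_allowed])
beta, *_ = np.linalg.lstsq(X, log_time, rcond=None)
print(f"estimated time multiplier with AI: {np.exp(beta[1]):.2f}")
# A multiplier above 1.0 indicates a slowdown; below 1.0, a speedup.
```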
The results were both striking and counterintuitive:

- Before starting, developers forecast that AI would cut their completion time by 24%.
- In practice, tasks completed with AI allowed took 19% longer.
- Even after finishing, developers still believed AI had sped them up by about 20%.

This disconnect between perception and reality points to what the authors describe as an illusion of productivity.
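To make that gap concrete, the study's headline percentages can be converted into multipliers on completion time. The arithmetic below is a simple illustration of that conversion, treating each percentage as a proportional change in time, which is an assumption about how the figures compose.

```python
# Headline figures from the METR study, expressed as multipliers
# on task completion time (time_with_AI / time_without_AI).
forecast_speedup = 0.24    # developers expected 24% less time
observed_slowdown = 0.19   # tasks actually took 19% longer
post_hoc_belief = 0.20     # afterwards, developers believed 20% less time

forecast_multiplier = 1 - forecast_speedup   # 0.76
observed_multiplier = 1 + observed_slowdown  # 1.19
believed_multiplier = 1 - post_hoc_belief    # 0.80

# Ratio between what the clock showed and what developers believed:
gap = observed_multiplier / believed_multiplier
print(f"forecast: x{forecast_multiplier:.2f}, "
      f"observed: x{observed_multiplier:.2f}, "
      f"believed: x{believed_multiplier:.2f}")
print(f"actual vs. perceived time ratio: {gap:.2f}")  # ~1.49
```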
The study’s robustness checks ruled out simpler explanations like task imbalance or inconsistent quality between conditions.
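The paper details METR's own robustness analyses; as a generic illustration of how one such concern, task imbalance, can be tested, the sketch below runs a permutation test on hypothetical pre-assignment difficulty forecasts. The data are stand-ins, not the study's.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pre-assignment difficulty forecasts (minutes) for the
# tasks that landed in each arm; stand-in data, not METR's.
forecast_ai = rng.lognormal(mean=np.log(60), sigma=0.7, size=120)
forecast_no_ai = rng.lognormal(mean=np.log(60), sigma=0.7, size=126)

observed_gap = forecast_ai.mean() - forecast_no_ai.mean()

# Permutation test for balance: if random assignment worked, shuffling
# the arm labels should often produce gaps at least this large.
pooled = np.concatenate([forecast_ai, forecast_no_ai])
n_ai = len(forecast_ai)
null_gaps = []
for _ in range(10_000):
    rng.shuffle(pooled)
    null_gaps.append(pooled[:n_ai].mean() - pooled[n_ai:].mean())

p_value = np.mean(np.abs(null_gaps) >= abs(observed_gap))
print(f"gap in mean forecasted minutes: {observed_gap:.1f}, p ≈ {p_value:.3f}")
# A large p-value is consistent with balanced task assignment.
```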
Earlier trials have shown AI accelerating developer tasks. For example:

- In a 2023 controlled experiment, developers using GitHub Copilot completed a standardized HTTP server task 55.8% faster than a control group (Peng et al.).
- A 2024 study of GitHub activity found that Copilot access boosted individual coding contributions in open-source projects, though collaborative engagement declined (Song et al.).
The divergence lies in context. Unlike controlled coding exercises or corporate tasks, METR’s study focused on experienced contributors working in mature, complex repositories. In such environments, the marginal value of AI appears lower, while the overhead of integration is higher.
The findings carry important lessons:

- For organizations: pilot AI tools on real workloads and measure actual completion times rather than relying on developers' self-reports.
- For developers: the sense of moving faster can mask a real slowdown, particularly on familiar, complex codebases.
- For researchers: benchmarks and surveys are weak proxies for productivity in mature, real-world projects.
At the policy level, the authors caution against overestimating AI’s productivity boost, which could distort forecasts in innovation, labor markets, and risk regulation.
The METR study underscores a vital point: not all AI deployments accelerate work. In fact, in the hands of skilled open-source developers working on complex repositories, AI slowed them down.
For organizations considering widespread adoption, the message is clear: test rigorously, measure carefully, and adapt thoughtfully. For policymakers, the results highlight the importance of grounding AI narratives in data, not hype.
The road to AI-enhanced productivity is far from linear. Sometimes, the fastest way forward is a slower, more measured step back.
References:
Becker J, Rush N, Barnes E, Rein D. Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity [Internet]. arXiv; 2025 Jul [cited 2025 Sep 6]. Available from: https://arxiv.org/abs/2507.09089
Becker J, Rush N, Barnes E, Rein D. Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity [Internet]. San Francisco: Model Evaluation & Threat Research (METR); 2025 Jul [cited 2025 Sep 6]. Available from: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
Peng S, Kalliamvakou E, Cihon P, Demirer M. The Impact of AI on Developer Productivity: Evidence from GitHub Copilot [Internet]. arXiv; 2023 Feb [cited 2025 Sep 6]. Available from: https://arxiv.org/abs/2302.06590
Song F, Agarwal A, Wen W. The Impact of Generative AI on Collaborative Open-Source Software Development: Evidence from GitHub Copilot [Internet]. arXiv; 2024 Oct [cited 2025 Sep 6]. Available from: https://arxiv.org/abs/2410.02091