Could ChatGPT Pass Engineering School? A Wake-Up Call for Higher Education



Redacción HC
27/11/2024

As artificial intelligence continues to evolve, its impact on education is becoming impossible to ignore. A new study published in Proceedings of the National Academy of Sciences (PNAS) presents a compelling and concerning scenario: language models like GPT-4 are now capable of correctly answering most university-level STEM exam questions. Conducted by over 80 researchers from the École Polytechnique Fédérale de Lausanne (EPFL), the study asks a provocative question: Could ChatGPT earn an engineering degree?

The implications go far beyond a simple yes or no. If AI systems can reliably answer college-level questions, what does this mean for the validity of assessments, the role of learning, and the future of academic credentials?

AI vs. Academia: What Was Really Tested?

The research team gathered over 1,000 real exam questions—both open-ended and multiple-choice—from 50 undergraduate and graduate-level STEM courses at EPFL. These questions spanned fields like physics, biology, chemistry, mathematics, computer science, and various engineering disciplines.

Two models were tested: GPT-3.5 and GPT-4. Using eight different prompting techniques (instructions to shape AI responses), each question was posed in multiple formats. A question was considered successfully answered if any variation yielded a correct response.

The researchers also classified questions by difficulty, drawing on Bloom’s taxonomy and instructor evaluations, to assess whether AI performance varied by complexity.
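The evaluation protocol described above amounts to an any-of-k success criterion: a question counts as answered if at least one of the prompting strategies produces a correct response. The sketch below illustrates that logic in Python; all function and variable names are illustrative stand-ins, not the study's actual code.

```python
# Hypothetical sketch of the study's evaluation protocol: each question is
# posed under several prompting strategies, and counts as "answered" if ANY
# strategy yields a correct response. All names here are illustrative.

def evaluate(questions, strategies, ask_model, is_correct):
    """Return the fraction of questions answered correctly by at least
    one prompting strategy (an any-of-k success criterion)."""
    solved = 0
    for q in questions:
        # Try every prompting strategy; one success is enough.
        if any(is_correct(q, ask_model(s, q)) for s in strategies):
            solved += 1
    return solved / len(questions)

# Toy demonstration with stand-in functions (no real model calls):
questions = [{"text": "2+2?", "answer": "4"},
             {"text": "hard question", "answer": "x"}]
strategies = ["zero-shot", "chain-of-thought"]
ask = lambda s, q: "4" if q["answer"] == "4" else "wrong"
check = lambda q, resp: resp == q["answer"]
print(evaluate(questions, strategies, ask, check))  # prints 0.5
```

Under this criterion, aggregate scores rise with the number of strategies tried, which is why the study reports both a per-prompt average (65.8%) and an any-variation figure (85.1%).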

Key Results: Can GPT-4 Actually Pass?

1. GPT-4 Scores High

GPT-4 answered 65.8% of questions correctly on average, and with prompt variation, 85.1% of all questions were answered correctly at least once. This performance wasn’t isolated to low-level recall; GPT-4 did well even on questions requiring intermediate reasoning.

“These results suggest that many current academic assessments are vulnerable to AI assistance,” the authors warn.

2. Multiple-Choice Dominance

As expected, GPT-4 performed better on multiple-choice questions than on open-ended responses. This reinforces concerns about overreliance on exam types that don’t sufficiently challenge deep understanding.

3. Variable Performance by Difficulty

While performance dipped as difficulty increased—particularly on conceptual or analytical tasks—GPT-4 still showed substantial success, indicating it can navigate a surprising range of cognitive challenges.

What’s at Stake for Higher Education?

Vulnerable Assessment Systems

The study raises red flags for traditional grading structures. If a student can pass a final exam with ChatGPT’s help, does that exam truly assess their knowledge?

This could lead to devaluation of academic degrees, undermining the reliability of educational institutions as certifiers of human capability.

Risk to Learning Itself

If students lean too heavily on AI tools, they may skip the essential cognitive process of struggling with, understanding, and integrating knowledge. The result? Graduates with credentials—but without competence.

“It’s like flying a plane on autopilot without learning how to take control,” the authors note.

Policy Implications

Universities must urgently re-examine what, how, and why they assess. A comparison is drawn to calculators: useful in context, but not a replacement for foundational understanding.

Solutions on the Horizon: Rethinking How We Evaluate

1. Hybrid Evaluations

Incorporating project-based learning, hands-on assessments, and oral defenses can minimize AI interference and emphasize synthesis and creativity.

2. Supervised Exams

For critical evaluations, the study recommends a return to in-person assessments with limited technology access, particularly for core competencies.

3. AI-Aware Pedagogy

Rather than banning ChatGPT, institutions should teach students how to use AI responsibly—as a learning aid, not a shortcut. This involves designing tasks that go beyond regurgitation, focusing instead on analysis, judgment, and innovation.

4. Faculty Training and Institutional Frameworks

Instructors need support to design robust evaluations and stay informed on the capabilities (and limitations) of generative AI. Universities must also adopt clear policies to govern AI use in academic contexts.

The Bigger Picture: Global Impact and Local Action

While the study focuses on EPFL in Switzerland, its lessons are universally applicable. In regions like Latin America, where hybrid and online learning are now widespread, the research calls for localized strategies to adapt curricula and assessment to an AI-enhanced reality.

“Education systems need to evolve—not to resist AI, but to build with and beyond it,” says co-author Beatriz Borges.

If institutions fail to act, the credibility of academic qualifications—and the future of skilled human labor—may be at risk.

Conclusion: A New Era of Assessment Begins

The question isn’t just whether ChatGPT could earn an engineering degree. It’s whether universities can evolve fast enough to ensure that real human learning remains at the center of higher education.

This study makes one thing clear: a passive approach to AI in the classroom is no longer an option. From evaluation design to curriculum reform, the academic world must face the AI challenge head-on—with urgency, creativity, and a renewed commitment to authentic education.



Reference: Borges B, Foroutan N. Could ChatGPT get an engineering degree? Evaluating higher education vulnerability to AI assistants. Proc Natl Acad Sci USA. 2024;121(49). https://doi.org/10.1073/pnas.2414955121.
