HC Editorial Team
27/11/2024
As artificial intelligence continues to evolve, its impact on education is becoming impossible to ignore. A new study published in Proceedings of the National Academy of Sciences (PNAS) presents a compelling and concerning scenario: language models like GPT-4 are now capable of correctly answering most university-level STEM exam questions. Conducted by over 80 researchers from the École Polytechnique Fédérale de Lausanne (EPFL), the study asks a provocative question: Could ChatGPT earn an engineering degree?
The implications go far beyond a simple yes or no. If AI systems can reliably answer college-level questions, what does this mean for the validity of assessments, the role of learning, and the future of academic credentials?
The research team gathered over 1,000 real exam questions—both open-ended and multiple-choice—from 50 undergraduate and graduate-level STEM courses at EPFL. These questions spanned fields like physics, biology, chemistry, mathematics, computer science, and various engineering disciplines.
Two models were tested: GPT-3.5 and GPT-4. The researchers posed each question under eight different prompting strategies (instructions that shape how the model responds), so every question was attempted in multiple formats. A question was considered successfully answered if any variation yielded a correct response.
The researchers also classified questions by difficulty, drawing on Bloom’s taxonomy and instructor evaluations, to assess whether AI performance varied by complexity.
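The scoring rule described above can be illustrated with a short sketch. This is a hypothetical reconstruction of the metric, not the authors' actual code: it computes both the average per-strategy accuracy and the share of questions answered correctly by at least one strategy, which is how the article's two headline figures (average vs. best-of-any-prompt) relate to each other.

```python
def pass_at_any(answers_by_strategy):
    """True if at least one prompting strategy got the question right."""
    return any(answers_by_strategy)

def aggregate(results):
    """results: question_id -> list of booleans, one per prompting strategy.

    Returns (average accuracy across all strategy attempts,
             fraction of questions correct under at least one strategy).
    """
    n_questions = len(results)
    n_strategies = len(next(iter(results.values())))
    # Average accuracy: correct attempts over all (question, strategy) pairs.
    avg_acc = sum(sum(r) for r in results.values()) / (n_questions * n_strategies)
    # "Answered at least once": the study's more permissive criterion.
    any_acc = sum(pass_at_any(r) for r in results.values()) / n_questions
    return avg_acc, any_acc

# Toy example: 4 questions, 2 prompting strategies.
results = {
    "q1": [True, True],
    "q2": [False, True],
    "q3": [False, False],
    "q4": [True, False],
}
avg, best = aggregate(results)
print(avg, best)  # 0.5 0.75
```

As the toy numbers show, the best-of-any-prompt score is always at least as high as the average, which is why GPT-4's 85.1% "answered at least once" figure exceeds its 65.8% average.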
GPT-4 answered 65.8% of questions correctly on average, and with prompt variation, 85.1% of all questions were answered correctly at least once. This performance wasn’t isolated to low-level recall; GPT-4 did well even on questions requiring intermediate reasoning.
“These results suggest that many current academic assessments are vulnerable to AI assistance,” the authors warn.
As expected, GPT-4 performed better on multiple-choice questions than on open-ended responses. This reinforces concerns about overreliance on exam types that don’t sufficiently challenge deep understanding.
While performance dipped as difficulty increased—particularly on conceptual or analytical tasks—GPT-4 still showed substantial success, indicating it can navigate a surprising range of cognitive challenges.
The study raises red flags for traditional grading structures. If a student can pass a final exam with ChatGPT’s help, does that exam truly assess their knowledge?
This could lead to a devaluation of academic degrees, undermining the reliability of educational institutions as certifiers of human capability.
If students lean too heavily on AI tools, they may skip the essential cognitive process of struggling with, understanding, and integrating knowledge. The result? Graduates with credentials—but without competence.
“It’s like flying a plane on autopilot without learning how to take control,” the authors note.
Universities must urgently re-examine what, how, and why they assess. A comparison is drawn to calculators: useful in context, but not a replacement for foundational understanding.
Incorporating project-based learning, hands-on assessments, and oral defenses can minimize AI interference and emphasize synthesis and creativity.
For critical evaluations, the study recommends a return to in-person assessments with limited technology access, particularly for core competencies.
Rather than banning ChatGPT, institutions should teach students how to use AI responsibly—as a learning aid, not a shortcut. This involves designing tasks that go beyond regurgitation, focusing instead on analysis, judgment, and innovation.
Instructors need support to design robust evaluations and stay informed on the capabilities (and limitations) of generative AI. Universities must also adopt clear policies to govern AI use in academic contexts.
While the study focuses on EPFL in Switzerland, its lessons are universally applicable. In regions like Latin America, where hybrid and online learning are now widespread, the research calls for localized strategies to adapt curricula and assessment to an AI-enhanced reality.
“Education systems need to evolve—not to resist AI, but to build with and beyond it,” says co-author Beatriz Borges.
If institutions fail to act, the credibility of academic qualifications—and the future of skilled human labor—may be at risk.
The question isn’t just whether ChatGPT could earn an engineering degree. It’s whether universities can evolve fast enough to ensure that real human learning remains at the center of higher education.
This study makes one thing clear: a passive approach to AI in the classroom is no longer an option. From evaluation design to curriculum reform, the academic world must face the AI challenge head-on—with urgency, creativity, and a renewed commitment to authentic education.
Reference: Borges B, Foroutan N, et al. Could ChatGPT get an engineering degree? Evaluating higher education vulnerability to AI assistants. Proc Natl Acad Sci USA. 2024;121(49). https://doi.org/10.1073/pnas.2414955121