HC Editorial Team
01/03/2025
In an era where artificial intelligence is rapidly reshaping education, a new conceptual study urges us to look beyond chatbots and explore how multimodal large language models (MLLMs)—AI systems capable of processing not just text, but also images, audio, and video—can revolutionize science education.
Published in Learning and Individual Differences, this work by researchers from the Technical University of Munich (TUM) and the University of Georgia proposes a theoretical framework that reimagines the role of AI in classrooms—not as a replacement for teachers, but as a powerful co-educator that enhances personalization, feedback, and student engagement.
Traditional AI tools in education often focus on text-based tasks—summarizing, translating, or answering questions. But science education requires more than words. It involves interpreting diagrams, analyzing experiments, and making sense of complex data across multiple formats.
Multimodal models like GPT-4 Vision can process videos of a lab experiment, analyze charts, and generate tailored explanations combining audio, images, and written content. This opens doors to an enriched learning experience—especially for disciplines like biology, physics, and chemistry.
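To make "processing" concrete, here is a minimal sketch of the kind of request such a tool might assemble, using the OpenAI vision-capable chat format purely as one example; the model name, endpoint, image URL, and prompt are illustrative assumptions, not details from the study:

```python
import json

# Sketch of a multimodal chat request: one text prompt plus one chart image.
# Endpoint, model name, and image URL are illustrative placeholders.
API_URL = "https://api.openai.com/v1/chat/completions"

def build_multimodal_request(question: str, image_url: str) -> dict:
    """Combine a text question and an image into a single chat payload."""
    return {
        "model": "gpt-4o",  # any vision-capable model
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_multimodal_request(
    "Explain the trend in this titration curve for an 8th-grade student.",
    "https://example.com/titration-curve.png",
)
print(json.dumps(payload, indent=2))
```

Sending this payload with valid credentials would return a text explanation grounded in the image; generating audio or annotated visuals in the reply would require additional, model-specific capabilities.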
“The shift from linear to multimodal learning may unlock deeper cognitive engagement,” the authors argue, drawing from cognitive load theory and Mayer’s multimedia learning principles.
The study, while conceptual, outlines a detailed framework for integrating MLLMs effectively. It identifies four core areas:
1. **Multimodal content creation.** MLLMs can generate educational materials across formats, such as interactive diagrams, narrated animations, or hybrid quizzes. A physics teacher could ask the AI to create a visual step-by-step explanation of Newton's laws, tailored for 8th graders.
2. **Personalized instruction.** Students at different levels receive customized feedback and instruction, adapting not only to their textual responses but also to how they interpret graphs or speak aloud during oral assessments. Example: an MLLM analyzes a student's spoken hypothesis and provides targeted tips for improving experimental design.
3. **Support for scientific inquiry.** Science is about doing, not just knowing. MLLMs can guide learners through inquiry-based processes, like setting up virtual experiments, analyzing results, and writing reports. These experiences simulate authentic scientific workflows.
4. **Multimodal assessment and feedback.** Whether interpreting hand-drawn sketches or video responses, MLLMs can deliver real-time feedback in diverse formats. A student might upload a video explaining an experiment; the AI returns a text critique with annotated visuals and spoken suggestions.
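One way a classroom tool might handle feedback like this is to request a structured reply from the model and validate it before showing it to the student. A minimal sketch, assuming a hypothetical JSON response format; the field names and example content are invented for illustration, not taken from the study:

```python
import json
from dataclasses import dataclass

# Hypothetical structured feedback an MLLM-backed tool might return:
# a written critique, annotations to overlay on the student's visuals,
# and a script for spoken suggestions. Field names are illustrative.
@dataclass
class MultimodalFeedback:
    critique: str               # text critique of the explanation
    annotations: list[str]      # notes to render on the student's diagram
    spoken_suggestions: str     # script for a text-to-speech follow-up

def parse_feedback(raw_json: str) -> MultimodalFeedback:
    """Validate the model's reply so malformed output never reaches students."""
    data = json.loads(raw_json)
    return MultimodalFeedback(
        critique=str(data["critique"]),
        annotations=[str(a) for a in data["annotations"]],
        spoken_suggestions=str(data["spoken_suggestions"]),
    )

reply = '''{
  "critique": "Your control group is missing a constant-temperature condition.",
  "annotations": ["Label the independent variable on the x-axis."],
  "spoken_suggestions": "Try repeating the trial at a fixed temperature."
}'''
feedback = parse_feedback(reply)
print(feedback.critique)
```

Validating the reply before rendering it is one practical way to keep the teacher-in-the-loop role the framework emphasizes: malformed or off-task model output can be flagged for review instead of reaching the student directly.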
While the possibilities are promising, the authors also emphasize critical limitations. Hence, they argue for ethical frameworks and teacher training to ensure that AI remains a tool, not a replacement.
MLLMs can bridge gaps in under-resourced communities, offering voice-assisted lessons and visual content in local languages. For Latin America, this could mean virtual biology labs in Spanish, narrated with regionally relevant examples.
“Inclusive design can make science accessible to learners with visual or hearing impairments,” the study notes.
Governments and institutions should fund pilot programs to test MLLM-based tools in real classrooms. These pilots should monitor effectiveness, student engagement, and data security.
Far from sidelining teachers, the framework views them as co-pilots of AI. Educators should be trained in interpreting AI-generated feedback, customizing MLLM outputs, and managing ethical dilemmas.
The paper calls for empirical studies to validate the proposed model—especially experiments comparing traditional versus MLLM-enhanced teaching on student outcomes.
The study paints a vision where MLLMs evolve into intelligent assistants—not mere tools, but adaptive collaborators capable of transforming science education. Imagine a classroom where students record themselves explaining a physics problem, and receive instant multimodal feedback: a diagram, a narrated explanation, and suggestions to improve their logic.
This is not just an AI-enhanced worksheet—it’s a rethinking of how science is taught and experienced.
As AI grows more capable, the question is no longer whether it will influence education—but how. This study offers a roadmap grounded in theory and responsibility. The takeaway is clear: when guided ethically and integrated wisely, MLLMs can humanize and personalize science learning in ways never before possible.
Let’s take the next step—together, with AI as a partner.
Reference: Bewersdorff A, Hartmann C, Hornberger M, Seßler K, Bannert M, Kasneci E, Kasneci G, Zhai X, Nerdel C. Taking the next step with generative artificial intelligence: The transformative role of multimodal large language models in science education. Learn Individ Differ. 2025;118:102601. doi:10.1016/j.lindif.2024.102601