Abstract
Traditional digital learning systems capture only narrow traces (grades, clicks, surveys) that miss the interactions among cognitive, affective, and physiological states. We propose Educational Omics, a framework that organizes learning data into six interrelated dimensions. It is instantiated through a five-layer data-lake architecture and validated through a case study in a Python programming general-education course.
Problem & Motivation
Traditional digital learning systems capture only narrow learning traces: exam scores, platform clicks, and survey responses. These indicators miss the interactions among cognition, affect, and physiology that shape learning. Heterogeneous data streams (speech transcripts, wearable signals, environmental sensors, AI dialogue logs) live in disconnected systems, which makes multimodal learning analytics both methodologically and technically difficult.
Method
Drawing on the -omics integration paradigm from the life sciences, we propose the Educational Omics framework, which organizes learning data into six interrelated dimensions: Cognomics (cognition), PhysioNeuromics (physiology/neural), Sociomics (social interaction), Environomics (environment), Linguomics (language), and Ethicomics (ethics). On the system side, the Educational Omics Data Lake is realized as a five-layer architecture: data acquisition, integration & synchronization, storage, analytics & modeling, and application & feedback. The framework was instantiated through the Uedu platform and validated through a case study in a Python programming general-education course.
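As a rough illustration of how the six dimensions and five layers could be represented in code, the following Python sketch tags each record with a dimension and groups records into shared time windows, which is one simple reading of the integration & synchronization layer. The `Record` type, its field names, the `synchronize` function, the 60-second window, and the sample values are all assumptions made for illustration; they are not taken from the Uedu platform or the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    """The six dimensions named in the framework."""
    COGNOMICS = "cognition"
    PHYSIONEUROMICS = "physiology/neural"
    SOCIOMICS = "social interaction"
    ENVIRONOMICS = "environment"
    LINGUOMICS = "language"
    ETHICOMICS = "ethics"


# The five layers, in the order the text lists them.
LAYERS = (
    "data acquisition",
    "integration & synchronization",
    "storage",
    "analytics & modeling",
    "application & feedback",
)


@dataclass(frozen=True)
class Record:
    """Hypothetical unit of learning data; field names are illustrative."""
    student_id: str
    dimension: Dimension
    timestamp: float  # seconds since session start (assumed unit)
    value: object


def synchronize(records, window_seconds=60.0):
    """Group records by (student, time window) so streams can be compared."""
    buckets = defaultdict(list)
    for rec in records:
        window = int(rec.timestamp // window_seconds)
        buckets[(rec.student_id, window)].append(rec)
    return dict(buckets)


# Made-up sample records, one per source stream.
records = [
    Record("s1", Dimension.PHYSIONEUROMICS, 12.0, {"heart_rate": 96}),
    Record("s1", Dimension.LINGUOMICS, 47.5, "I keep getting an IndexError"),
    Record("s1", Dimension.ENVIRONOMICS, 75.0, {"noise_db": 62}),
]
aligned = synchronize(records)
```

Fixed-width windows are the simplest alignment choice; a real deployment would likely also need clock-drift correction and per-sensor sampling rates, which this sketch ignores.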
Findings
- Student-side: students described generative-AI tools (QuizGPT, Uedu MyGPTs) as the most supportive resource for debugging, conceptual clarification, and confidence-building. One student built a cross-platform oscilloscope automation interface with the AI acting as a sustained problem-solving partner.
- Cross-course adoption: across six weeks, generative-AI tools were used in language writing, education, economics, history, and engineering courses. Some courses adopted them routinely, while others explored them more tentatively.
- Instructor role: classes with active instructor guidance showed steadier engagement, while self-exploration classes showed higher variance, indicating that instructional design mediates outcomes beyond technical access.
- Data-lake validation: the framework successfully integrated multimodal data, supporting the Omics architecture as a viable infrastructure for sustainable, data-driven educational transformation.
Implications
Education should not be reduced to exam scores. Just as biomedicine uses genomics to understand life, education can use “educational omics” to understand learning. For instructors, this means understanding students through cognition, language, physiology, social interaction, environment, and ethics. Only multimodal integration reveals patterns such as a student's high stress, anxious language, and noisy learning environment as connected phenomena rather than isolated observations. The Educational Omics Data Lake is the core theoretical and technical foundation that supports all data collection, analysis, and research across the Uedu platform.
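The stress, language, and environment example can be made concrete with a small Python sketch that checks whether the three signals co-occur in the same time window. Everything here is hypothetical: the `WindowFeatures` fields, the 0 to 1 scoring scales, and the thresholds are illustrative assumptions, not measures or values reported in the paper.

```python
from dataclasses import dataclass


@dataclass
class WindowFeatures:
    """Hypothetical per-window summary for one student; all fields are illustrative."""
    stress_index: float      # assumed 0..1, e.g. derived from a wearable
    anxious_language: float  # assumed 0..1, e.g. from a text classifier
    noise_db: float          # ambient sound level in decibels


# Illustrative thresholds, not values from the paper.
STRESS_MIN = 0.7
ANXIETY_MIN = 0.6
NOISE_MIN_DB = 60.0


def connected_signals(w):
    """Return the names of signals that meet their thresholds in this window."""
    flags = {
        "stress": w.stress_index >= STRESS_MIN,
        "anxious_language": w.anxious_language >= ANXIETY_MIN,
        "noise": w.noise_db >= NOISE_MIN_DB,
    }
    return [name for name, hit in flags.items() if hit]


def is_connected_pattern(w):
    """True only when all three signals co-occur in the same window."""
    return len(connected_signals(w)) == 3


# Made-up windows: one where all three signals co-occur, one where only stress is high.
window = WindowFeatures(stress_index=0.82, anxious_language=0.71, noise_db=64.0)
quiet = WindowFeatures(stress_index=0.82, anxious_language=0.2, noise_db=41.0)
```

Viewed in isolation, both windows show the same stress reading; only the combined view distinguishes the window where language and environment point the same way.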
Citation
BibTeX
@inproceedings{chang2025edu_omics,
author = {Chia-Kai Chang and Kuei-Hao Li},
title = {Designing an Educational Omics Data Lake: A Multimodal Infrastructure for Technology-Enhanced Learning},
booktitle = {Proc. Int. Conf. on Modern Educational Technology (ICMET)},
pages = {284--290},
year = {2025},
month = dec,
doi = {10.1109/ICMET67594.2025.11452000},
}