Insider Brief
- A Cornell University study finds that education technology developers and teachers define risks from AI tools differently, with teachers more concerned about broader social and pedagogical harms.
- While developers prioritize reducing technical issues such as bias and misinformation, educators worry that overreliance on large language models may erode student critical thinking, increase teacher workload, and exacerbate educational inequality.
- The researchers recommend more educator involvement in product design, customizable tools, regulatory oversight, and protections for teachers who choose not to adopt AI systems, calling for a shift toward more nuanced, teacher-centered evaluation of educational technology.
A new study from Cornell University warns that a disconnect between education technology developers and classroom teachers could lead to unintended harms from the use of large language models (LLMs) in K-12 schools.
“Education technology should center educators, and doing that requires researchers, education technology providers, school leaders and policymakers to come together and take action to mitigate potential harms from the use of LLMs in education,” said Emma Harvey, a doctoral student in information science and lead author of “‘Don’t Forget the Teachers’: Towards an Educator-Centered Understanding of Harms from Large Language Models in Education,” as quoted by Cornell University.
The research, presented in late April at the Association for Computing Machinery’s CHI conference in Japan, highlights a growing reliance on artificial intelligence tools such as ChatGPT in lesson planning, tutoring, and administrative tasks, according to the university. While developers focus on technical flaws like misinformation and biased outputs, the study found that teachers are more concerned with broader social and educational consequences, including threats to critical thinking, equity, and their own workload.
Led by Harvey and faculty members Allison Koenecke and Rene Kizilcec, the Cornell team interviewed 24 educators and six developers working at educational technology firms. Their analysis revealed differing definitions of harm. Developers prioritized technical performance and safeguards, while teachers expressed concern that overreliance on LLMs could erode student skills and widen disparities between well-funded and under-resourced schools.
According to researchers, teachers cited worries about students becoming overly dependent on AI for answers, reducing the need to reason or analyze independently. They also noted that schools with limited budgets may sacrifice other critical resources to afford AI services, creating a new layer of educational inequality. These sociotechnical harms, the researchers argue, are harder to detect than obvious software flaws but could prove just as damaging.
The study calls for a shift in how these tools are designed and implemented. The researchers recommend that technology developers build features that let teachers correct AI-generated content in real time, allowing them to reinforce learning while minimizing the impact of any model errors. They also encourage government or nonprofit regulators to establish standardized, independent evaluations of AI education tools, particularly those marketed to public schools.
Another recommendation involves making systems more customizable. Educators want tools that can adapt to specific classroom contexts, not generic models that assume uniform teaching environments. The researchers also urge school administrators to consider teacher input more seriously before adopting AI platforms and to avoid penalizing those who opt out.
Cornell pointed out that the findings, supported by the National Science Foundation and Schmidt Futures, suggest a need for more collaboration between tool builders and frontline educators. While developers often focus on improving accuracy or reducing biased outputs, the study argues that success in the classroom depends on recognizing the social dynamics and pressures teachers face.
The Cornell researchers view their work as a starting point for more nuanced evaluations of AI in education. They stress that existing benchmarks for language models are poorly suited to measuring the long-term societal effects of using such tools in schools. As AI becomes more embedded in daily classroom activities, the researchers warn, oversight must evolve to reflect not just what the technology can do, but how it fits into broader systems of teaching and learning.
“The potential harms of LLMs extend far past the technical concerns commonly measured by machine-learning researchers,” Harvey was quoted as saying. “We need to be prepared to study the higher-stakes, difficult-to-measure social and societal harms arising from LLM use in the classroom.”