Insider Brief
- Carnegie Mellon University leads in estimated papers-per-student output for LLM research, followed by Stanford, Peking University and Tsinghua, with most top contributors being academic institutions rather than industry labs.
- A recent arXiv study analyzed nearly 17,000 large language model (LLM) papers from 2018 to 2023 to identify leading academic institutions in LLM research; an additional analysis normalized this output by student population to estimate research efficiency.
- The study found academic LLM research is growing faster than industry research, with universities accounting for 85% of papers and increasingly leading contributions in emerging areas like societal impacts and multimodal AI.
Research into artificial intelligence is a vast subject, but large language models (LLMs) — systems trained on massive text datasets to understand and generate human-like language — have become a cornerstone of the AI research enterprise. From powering chatbots to advancing machine translation, LLMs are reshaping technology and society. As global competition to lead in AI research intensifies, identifying which universities produce the most impactful LLM research is critical for students, policymakers, and industries.
A recent arXiv study on LLM research trends offers a glimpse into this landscape by analyzing 16,979 LLM-related papers published between January 1, 2018, and September 7, 2023. When that output is normalized by student population, the findings suggest Carnegie Mellon University (CMU) is the most efficient producer of LLM research relative to its size, followed by Stanford and Peking University, though uncertainties in exact paper counts add complexity to the rankings.
While Big Tech and other large multinational corporations play a key role in LLM research, universities and research institutions are the real engine of research and discovery.
The paper states: “85% of papers have an academic affiliation, and 41 out of the top 50 institutions are academic, led by CMU, Stanford, Tsinghua, Peking, and UW. Further, academic LLM research has grown faster in 2023 than has industry research: there are 3.3 × as many academic as industry papers in 2023, compared to 2.3× pre-2023.”
Methodology: How the Rankings Were Determined
The arXiv study’s approach was multifaceted, focusing on publication volume, authorship trends and institutional collaboration. Researchers collected LLM-related papers from arXiv, a popular platform for academic preprints, using keyword searches and manual curation to ensure relevance. The dataset spanned subfields like natural language processing, machine learning, and AI, capturing the breadth of LLM research. They compared trends from 2018–2022 to 2023, noting shifts such as a 20-fold increase in papers exploring societal impacts of LLMs, submitted to arXiv’s Computers and Society category.
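The study's exact query terms and curation pipeline are not reproduced here, but keyword searches of this kind can be run against arXiv's public export API. The sketch below is a minimal illustration under that assumption; the query string and result limit are illustrative, not the study's actual parameters.

```python
# Minimal sketch: keyword search against arXiv's public export API (Atom feed).
# The query below is illustrative only; the study combined such searches with
# manual curation to keep results relevant to LLM research.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def search_arxiv(query: str, max_results: int = 100) -> list[dict]:
    """Return title, submission date, and authors for papers matching `query`."""
    params = urllib.parse.urlencode({
        "search_query": query,
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    })
    url = f"http://export.arxiv.org/api/query?{params}"
    with urllib.request.urlopen(url) as resp:
        feed = ET.fromstring(resp.read())
    papers = []
    for entry in feed.findall(f"{ATOM}entry"):
        papers.append({
            "title": entry.findtext(f"{ATOM}title", "").strip(),
            "published": entry.findtext(f"{ATOM}published", ""),
            "authors": [a.findtext(f"{ATOM}name", "")
                        for a in entry.findall(f"{ATOM}author")],
        })
    return papers

if __name__ == "__main__":
    # Hypothetical query terms, not the study's actual search string.
    results = search_arxiv('all:"large language model" OR all:"LLM"', max_results=50)
    print(f"Fetched {len(results)} candidate LLM papers")
```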
Authorship analysis revealed an influx of new researchers from fields like computer vision and software engineering, broadening the scope of LLM research. The study also examined the balance between industry and academia, finding that industry’s share of publications dropped in 2023, largely due to reduced output from companies like Google, while Asian universities increased their contributions. Institutional collaborations were common, but industry-academic partnerships often focused on industry-preferred topics rather than exploring new areas.
The goal of this list is not just to weigh research output but also to account for institutional size, in order to find pockets of LLM scholarship that might otherwise be missed. By combining publication data with student population figures, the list offers an estimate of which institutions lead in LLM research output per student, accounting for the natural advantage of larger universities.
To compare these institutions by research efficiency, the study's publication rankings were normalized by student population, since larger universities naturally produce more papers. Student numbers were sourced from platforms including US News, university websites, and educational databases such as EduRank. Because exact LLM paper counts per institution were not provided, output was estimated from rank, assuming the top-ranked institution (CMU) had the highest number of papers, decreasing linearly with rank (e.g., CMU = 10 papers, Stanford = 9, and so on). This allowed calculation of papers-per-student ratios, offering a fairer comparison across institutions of different sizes.
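As a rough illustration of that normalization, the sketch below recomputes papers-per-student ratios from the rank-based paper estimates and the student figures listed in the table further down; the numbers are this article's estimates, not actual publication counts from the study.

```python
# Sketch of the rank-based normalization described above. Paper counts are
# proxies derived from rank (top rank = 10 papers, decreasing by 1 per rank),
# not the study's actual publication counts.
institutions = {
    # name: (approximate student population, publication-volume rank)
    "CMU":      (13_519, 1),
    "Stanford": (17_469, 2),
    "Tsinghua": (48_739, 3),
    "Peking":   (35_000, 4),
    "UW":       (60_692, 5),
    "Zhejiang": (54_641, 6),
    "NTU":      (37_458, 7),
    "CUHK":     (30_794, 8),
    "UCAS":     (63_573, 9),
    "NUS":      (38_000, 10),
}

def papers_per_student(students: int, rank: int, top_rank_papers: int = 10) -> float:
    """Estimate papers from rank, then normalize by student population."""
    estimated_papers = top_rank_papers - (rank - 1)  # linear decrease with rank
    return estimated_papers / students

for name, (students, rank) in institutions.items():
    print(f"{name:9s} {papers_per_student(students, rank):.5f}")
```

Run as-is, this puts CMU at roughly 0.00074 papers per student, consistent with the table below.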
This list also considers AI and LLM education, AI- and LLM-centered research institutions and centers, and outstanding faculty in LLM research.
Top Institutions: Leaders in LLM Research
The study identified the following top 10 academic institutions for LLM research based on publication volume:
| Institution | Student Population | Estimated Papers (Rank-Based) | Papers-per-Student |
|---|---|---|---|
| Carnegie Mellon University (CMU) | 13,519 | 10 | 0.00074 |
| Stanford University | 17,469 | 9 | 0.00051 |
| Tsinghua University | 48,739 | 8 | 0.00016 |
| Peking University | 35,000 | 7 | 0.00020 |
| University of Washington (UW) | 60,692 | 6 | 0.00010 |
| Zhejiang University | 54,641 | 5 | 0.00009 |
| Nanyang Technological University (NTU) | 37,458 | 4 | 0.00011 |
| Chinese University of Hong Kong (CUHK) | 30,794 | 3 | 0.00010 |
| University of Chinese Academy of Sciences (UCAS) | 63,573 | 2 | 0.00003 |
| National University of Singapore (NUS) | 38,000 | 1 | 0.00003 |
CMU leads with an estimated papers-per-student ratio of 0.00074, followed by Stanford at 0.00051 and Peking University at 0.00020. These ratios reflect CMU's efficiency, given its smaller student body of 13,519 compared to Stanford's 17,469 or Tsinghua's 48,739. The table above lays out these ratios, highlighting CMU's standout performance.
Carnegie Mellon University (CMU)
Carnegie Mellon University (CMU) is a global leader in AI and large language model (LLM) research, renowned for its Machine Learning Department, which studies how computer agents improve through experience, a core principle of LLM development (CMU ML). CMU offers an Online Graduate Certificate in Generative AI & Large Language Models, equipping professionals with advanced techniques in LLM design, taught by faculty at the forefront of computer science research. The university’s researchers have developed MLC LLM, a toolset enabling language models to run on any device, revolutionizing LLM accessibility and application. Additionally, CMU’s collaboration with Microsoft on benchmarking LLM agents underscores its commitment to advancing practical applications of LLM technology (AI Tinkerers).
Stanford University
Stanford University is acknowledged as an AI leader broadly, and in LLM research specifically, with its Stanford Artificial Intelligence Laboratory (SAIL) conducting pioneering work in natural language processing, a key component of LLMs (Stanford AI). The Stanford Machine Learning Group focuses on improving lives through AI, emphasizing practical applications of machine learning, including LLMs, across various fields (Stanford ML). Stanford has developed STORM, an innovative LLM system that generates Wikipedia-style articles, showcasing its advancements in AI-powered knowledge curation. Furthermore, Stanford’s AI Playground provides a platform for exploring and experimenting with various LLMs, highlighting its dedication to both research and education in AI (AI Playground).
Peking University
Peking University has emerged as a leader in AI research, surpassing US institutions in research output rankings, with its Institute for Artificial Intelligence and School of Intelligence Science and Technology playing pivotal roles (Peking AI). The university hosts the PKU CoRe Lab and the BAAI Innovation Center, both well-respected centers of AI research, including work on LLMs (PKU CoRe). Peking University’s AIMIA Lab specializes in multimodal LLMs, demonstrating its commitment to advancing this critical area of AI. These initiatives underscore Peking University’s significant contributions to the global AI landscape, particularly in LLM development and application.
Tsinghua University
Tsinghua University is a cornerstone of China’s AI industry, often referred to as the “cradle of China’s LLMs” due to its extensive talent pool and research ecosystem. The university has nurtured numerous LLM startups, with alumni founding companies like Moonshot AI, which developed the popular LLM Kimi. Tsinghua’s research extends to innovative applications of LLMs, such as in psychological research and AI-based medical platforms, demonstrating its broad impact. Additionally, Tsinghua’s Generative AI 2025 Summer School highlights its commitment to training the next generation of AI experts.
University of Washington (UW)
The University of Washington (UW) is a leader in AI research, with its Paul G. Allen School of Computer Science & Engineering actively engaged in machine learning and natural language processing, essential for LLMs (UW AI). UW offers advanced educational programs, including a Master of Science in Artificial Intelligence and Machine Learning for Engineering and a Graduate Certificate in the same field, catering to professionals seeking AI expertise. The university’s research includes innovative LLM applications, such as supporting visualization for blind/visually impaired users, and the UW HAI lab focuses on responsible human-AI interaction. These efforts highlight UW’s significant contributions to both theoretical and applied aspects of AI and LLMs.
Zhejiang University
Zhejiang University is recognized as a leading institution for AI research in China, with its College of Computer Science and Technology excelling in international programming contests and research. The university offers innovative educational programs, such as the AI+X Micro-Program, which integrates AI with other disciplines to foster interdisciplinary skills. Researchers at Zhejiang University are actively contributing to LLMs, with publications on topics like causal inference with machine learning and GUI agents using LLMs. These efforts are just a few examples of Zhejiang University’s significant role in advancing AI and LLM research in interdisciplinary contexts.
Nanyang Technological University (NTU) Singapore
Nanyang Technological University (NTU) Singapore is a leading institution in AI research, with faculty and researchers engaged in federated learning and the security of large language models (LLMs) (NTU Research). NTU offers a range of AI programs, including a Master of Science in AI and a Bachelor of Engineering in Artificial Intelligence, catering to both graduate and undergraduate students. The university’s research initiatives have led to advancements in healthcare technologies and smart cities, often in collaboration with industry partners like Alibaba. These efforts demonstrate NTU’s significant contributions to both theoretical and practical aspects of AI and LLMs.
Chinese University of Hong Kong (CUHK)
The Chinese University of Hong Kong (CUHK) is engaged in cutting-edge AI research, with faculty like Hongsheng Li exploring generalist vision transformers through universal language interfaces, closely related to LLMs. Researchers at CUHK are also working on scaling up vision foundation models and aligning them with LLMs for visual-linguistic tasks (CUHK MMLab). CUHK’s interest in language-specific AI is evident from its work on benchmarking Cantonese capabilities of LLMs, showcasing its commitment to diverse linguistic contexts. These efforts position CUHK as a significant contributor to the global AI and LLM research community, particularly in vision-language integration.
University of Chinese Academy of Sciences (UCAS)
The University of Chinese Academy of Sciences (UCAS) is making strides in AI research, particularly in developing large language models (LLMs). UCAS has open-sourced LLaMA-Omni, a multimodal LLM that processes both speech and text data, showcasing its innovation in this area. As part of the Chinese Academy of Sciences, UCAS contributes to broader AI initiatives, including AI processor chip development and research into LLM training systems. These efforts position UCAS as a key player in advancing AI and LLM technologies, with a focus on multimodal capabilities and computational efficiency.
National University of Singapore (NUS)
The National University of Singapore (NUS) is a leading AI research institution, particularly in developing large language models (LLMs). NUS contributes to the National Multimodal LLM Programme (NMLP) to drive AI innovation and create the first LLM with a South-east Asian context. The university applies AI in education, using tools like ScholAIstic to enhance students’ practical skills through roleplaying. NUS’s research labs explore multimodal intelligence, including video and language processing, crucial for advancing LLM technologies.
Media Mentions
Media mentions were not relied on for this list per se, but they provide another lens on institutional impact, though a less direct one than publication metrics. We noted, for example, that X posts highlight CMU and MIT for LLM advancements, with MIT cited for innovative work in June 2025 (X Post MIT) and CMU for research opportunities in November 2023 (X Post CMU). Princeton’s Center for Language and Intelligence also gained attention in April 2023 (X Post Princeton). However, quantifying media mentions is difficult, and their influence on the rankings remains secondary to publication data.
What Do the Rankings Mean?
These rankings are not intended for head-to-head comparisons, such as claiming University X is better than University Y at LLM research. There are too many good universities and too many good LLM researchers to definitively say one is better than another. Rather, the list shows general leadership in LLM research and offers a chance to delve into these institutions' expertise in this revolutionary technology.
Each university also differs in its particular LLM expertise and in its efforts to build research centers around LLMs. For example, CMU’s leadership in papers-per-student suggests it maximizes research output relative to its size, likely due to its specialized AI programs and strong faculty. Stanford’s high ranking reflects its global reputation and resources, while Peking University’s position highlights China’s growing AI research prowess.
Larger institutions like Tsinghua and the University of Washington contribute significantly in absolute terms but rank lower per student due to their scale. These findings guide students seeking top AI programs, inform industry partnerships, and signal where innovation is concentrated. The rise of Asian universities underscores shifting global dynamics in AI research, driven by increased funding and talent development.
Limitations: Uncertainties in the Data
The rankings face several limitations. The reliance on arXiv papers may miss research published elsewhere or kept proprietary, potentially underrepresenting some institutions. It certainly does not offer a fair measure of quality, as leading journals would likely be better indicators for that.
Exact LLM paper counts were estimated based on rank, introducing uncertainty, as actual numbers could vary. Normalizing by student population assumes uniform research contributions across students, which may not hold, as graduate students and faculty drive most research. Media mentions, while insightful, are not systematically quantified, limiting their weight. Despite these challenges, the study’s methodology provides a robust framework for comparing institutional efficiency.
Although this list is based on the most recent statistics available, a great deal has changed in LLM research since 2023, and that should be kept in mind when using this list for guidance.
Finally, companies and commercial enterprises — such as Google, OpenAI, Microsoft, etc. — drive a lot of LLM research advances, but were not considered in these studies or our further refinements.
Beyond Quantity
Another critical limitation of a list like this is that research quantity does not equal research quality.
Assessing paper quality is challenging without comprehensive citation data, but we can surmise, based on the study and our own examination, that institutions like CMU, Stanford and Tsinghua produce high-impact work, as their papers are frequently referenced in academic discussions. For instance, the University of Washington and the Allen Institute for AI were noted for high citation rates, indicating influential contributions. CMU’s focus on AI and computer science likely enhances its ability to produce important LLM research, while Stanford’s interdisciplinary approach supports its strong output. Asian universities like Tsinghua and Peking benefit from significant national investment in AI, contributing to their high publication volumes.
The Road Ahead for LLM Research
As LLMs continue to shape technology, understanding which institutions lead in research efficiency will remain vital. Future studies could refine rankings by incorporating exact paper counts, citation metrics, and broader publication sources. Tracking media and industry impact more systematically could also enhance assessments. The growing role of Asian universities suggests a more distributed global research landscape, with opportunities for cross-country collaboration. Institutions like CMU and Stanford will likely maintain their edge by fostering specialized AI programs, while larger universities may leverage scale to drive total output.