Jingwei Zuo 左敬威

Pronounced /dʒɪŋweɪ dzʊɔ/ (jing-way zwaw)

Principal Researcher · Technology Innovation Institute (TII)

I am a Principal Researcher at the Technology Innovation Institute (TII), Abu Dhabi, where I lead the Falcon foundational models team. My research focuses on making large language models simultaneously more capable and more efficient, with a particular emphasis on principled data curation, large-scale pretraining, and novel architecture design. Recent model releases include Falcon-H1, Falcon-Edge, Falcon 3, and Falcon-Mamba.

I received my Ph.D. (2022) from Université Paris-Saclay, for which I was awarded the Plateau de Saclay Doctoral Prize; an M.Sc. in Data Science (2018) from Paris-Saclay; an M.Sc. in CS (2017) from Sorbonne Université; and a B.Sc. in EE (2015) from HUST.

News

Apr 2026 Attending ICLR 2026 in Rio de Janeiro — happy to chat!
Jan 2026 Falcon-H1-Tiny released — 90M to 0.6B parameters
Jan 2026 Learnable Multipliers paper out
Sep 2025 Invited talk on Falcon-H1 at ASAP Seminar [YouTube] [Slides]

Selected Publications

2026
J. Zuo, I. Chahed, M. Velikanov, C. Zeng, D. E. Rhaiem, P. Balsebre, A. Kumar, Y. Belkada, H. Hacid.
“Train Smarter, Not Longer: Memorization-Guided Data Reuse for Efficient LLM Training”
M. Velikanov, I. Chahed, J. Zuo, D. E. Rhaiem, Y. Belkada, H. Hacid.
“Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers”
arXiv, 2026.
2025
J. Zuo, M. Velikanov, I. Chahed, Y. Belkada, D. E. Rhaiem, G. Kunsch, et al.
“Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance”
arXiv, 2025.
2024
J. Zuo, M. Velikanov, D. E. Rhaiem, I. Chahed, Y. Belkada, G. Kunsch, H. Hacid.
“Falcon Mamba: The First Competitive Attention-free 7B Language Model”
arXiv, 2024.

Background

Research Interests

My current research centers on making large language models simultaneously more capable and more efficient. On the data side, I focus on large-scale pretraining data preparation, principled data curation, synthetic data generation, data mixture strategies, and (anti-)curriculum schedules that improve training efficiency at every scale — from 90M to 34B parameters. On the architecture side, I explore hybrid designs that combine Transformer attention with State Space Models, achieving better quality–throughput trade-offs than either paradigm alone. This dual focus on data and architecture is the thread connecting Falcon-H1, Falcon-Mamba, and our ongoing work.

Education

Ph.D. in Machine Learning, Université Paris-Saclay, 2022 (Plateau de Saclay Doctoral Prize)
M.Sc. in Data Science, Université Paris-Saclay, 2018
M.Sc. in CS, Sorbonne Université, 2017
B.Sc. in EE, Huazhong University of Science & Technology (HUST), 2015

Contact

Address
PO Box 9639, Masdar City, Abu Dhabi, UAE