Preetum Nakkiran

Research Scientist at Apple · Foundations of Machine Learning

I am broadly interested in understanding deep learning. When do models generalize, and what should generalization mean? When models fail, do they fail in predictable ways? I like to identify an interesting empirical behavior (e.g. length-generalization, or calibration, or...), and then characterize it as precisely as possible.

Separately, I've recently been thinking about new & improved mechanisms for peer review.

Interns I've Hosted / Collaborated With

Faculty I've Sponsored (via Apple)

Service

Area Chair: ICLR 2024, NeurIPS 2024, ICML 2025, ICML 2026

Research

Calibration

Calibration is one way of formalizing how models that are not Bayes-optimal can still be "good" in other ways. What types of calibration hold for deep networks, and why? Is there a meaningful and achievable notion of calibration for LLMs?
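
To make this concrete, here is a minimal Python sketch of one standard calibration metric, binned expected calibration error (ECE). The bin count, binning scheme, and example numbers are illustrative choices, not taken from any paper below.

    # Binned ECE: average |accuracy - confidence| over confidence bins,
    # weighted by the fraction of predictions falling in each bin.
    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=10):
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        bins = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(bins[:-1], bins[1:]):
            in_bin = (confidences > lo) & (confidences <= hi)
            if in_bin.any():
                gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
                ece += in_bin.mean() * gap
        return ece

    # A model that says "90% confident" but is right only 60% of the time
    # is miscalibrated, and ECE reflects the 0.3 gap:
    print(expected_calibration_error([0.9] * 5, [1, 1, 1, 0, 0]))  # ~0.3

Binned ECE is simple but sensitive to the choice of bins; defining a more robust notion of distance from calibration is one of the questions the papers below take up.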

  • Trained on Tokens, Calibrated on Concepts: Semantic Calibration in LLMs 2025 arXiv
    Preetum Nakkiran, Arwen Bradley, Adam Goliński, Eugene Ndiaye, Michael Kirchhof, Sinead Williamson
  • When Does Optimizing a Proper Loss Yield Calibration? NeurIPS 2023 Spotlight arXiv
    Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran
  • A Unifying Theory of Distance from Calibration STOC 2023 arXiv code
    Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran

Diffusion & Generative Models

I've written a tutorial on diffusion models, studied the mechanisms of diffusion generalization (classifier-free guidance, composition), and helped develop new methods (TarFlow).

  • Normalizing Flows are Capable Generative Models ICML 2025 Oral arXiv
    Shuangfei Zhai, Ruixiang Zhang, Preetum Nakkiran, David Berthelot, Jiatao Gu, Huangjie Zheng, Tianrong Chen, Miguel Angel Bautista, Navdeep Jaitly, Josh Susskind
  • Step-by-Step Diffusion: An Elementary Tutorial Foundations and Trends 2024 arXiv book
    Preetum Nakkiran, Arwen Bradley, Hattie Zhou, Madhu Advani

Understanding Generalization

Why do overparameterized models generalize? Why do underparameterized models generalize? Are these questions related? Why do LLMs generalize out-of-distribution, and how do we formalize this?

  • What Algorithms can Transformers Learn? ICLR 2024 arXiv
    Hattie Zhou, Arwen Bradley, Etai Littwin, Noam Razin, Omid Saremi, Josh Susskind, Samy Bengio, Preetum Nakkiran
  • The Deep Bootstrap Framework ICLR 2021 arXiv
    Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi
  • Deep Double Descent ICLR 2020 arXiv
    Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever
  • Towards an Empirical Theory of Deep Learning PhD Thesis 2021 pdf
    Preetum Nakkiran

See Google Scholar for a full list of publications.

About

I completed my PhD at Harvard, advised by Madhu Sudan and Boaz Barak. In my postdoc I worked with Misha Belkin. I did my undergrad in EECS at UC Berkeley. Go Bears!

I'm broadly interested in theory and science. In the past, I've interned at OpenAI (with Ilya Sutskever) and Google Brain (with Behnam Neyshabur and Hanie Sedghi), and I've done research in error-correcting codes, distributed storage, and cryptography. I'm grateful for past support from the NSF GRFP, the Google PhD Fellowship, and the Simons Institute.

For talks, use this bio. An (outdated) CV is here. ORCID

What People are Saying

a "high-level" scientist — colleague (ML)
makes plots and draws lines through them — colleague (TCS)
has merits that outweigh flaws — reviewer 2

Selected Tweets

From The Archives