Preetum Nakkiran

preetum@nakkiran.org

I'm a Research Scientist at Apple, working on foundations of machine learning.
My work aims broadly to understand generalization in deep learning. Recent topics: calibration [1] [2] [3] [4] [5], Transformer generalization [6], and diffusion [7] [8].

I completed my PhD at Harvard, where I had the unique pleasure of being advised by Madhu Sudan and Boaz Barak. During my postdoc, I worked with Misha Belkin. I am grateful for past support from the NSF and the Simons Institute. Go Bears!

Interns at Apple (hosted or collaborated with):

  • Jacob Springer (CMU)
  • Siddartha Devic (USC)
  • Charlotte Peale (Stanford)
  • Hattie Zhou (now at Anthropic)
  • Noam Razin (now at Princeton)
  • Lunjia Hu (now at Northeastern)
  • Elan Rosenfeld (now at Google Research)
  • Shivam Garg (now at MSR)
  • Annabelle Carrell (University of Cambridge)
  • Rylee Thompson (now at Huawei)

Service: Area Chair (ICLR 2024, NeurIPS 2024, ICML 2025, ICML 2026)


Selected Research

See [publications] for the full list.

2025 Trained on Tokens, Calibrated on Concepts: The Emergence of Semantic Calibration in LLMs
Preetum Nakkiran, Arwen Bradley, Adam Goliński, Eugene Ndiaye, Michael Kirchhof, Sinead Williamson
[arXiv]
2024 Normalizing Flows are Capable Generative Models
Shuangfei Zhai, Ruixiang Zhang, Preetum Nakkiran, David Berthelot, Jiatao Gu, Huangjie Zheng, Tianrong Chen, Miguel Angel Bautista, Navdeep Jaitly, Josh Susskind
ICML 2025 Oral.
[arXiv]
2024 Step-by-Step Diffusion: An Elementary Tutorial
Preetum Nakkiran, Arwen Bradley, Hattie Zhou, Madhu Advani
Foundations and Trends® in Computer Graphics and Vision, Vol. 17
[arXiv] [tweet]
2023 What Algorithms can Transformers Learn? A Study in Length Generalization
Hattie Zhou, Arwen Bradley, Etai Littwin, Noam Razin, Omid Saremi, Josh Susskind, Samy Bengio, Preetum Nakkiran
ICLR 2024.
[arXiv] [tweet]
2023 When does optimizing a proper loss yield calibration?
Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran
NeurIPS 2023 Spotlight.
[arXiv]
2022 A Unifying Theory of Distance from Calibration
Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran
STOC 2023.
[arXiv] [tweet] [slides: aspen] [poster: ICLR] [code]
2021 The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi
ICLR 2021.
[arXiv] [tweet]
2019 Deep Double Descent: Where Bigger Models and More Data Hurt
Preetum Nakkiran, Gal Kaplun*, Yamini Bansal*, Tristan Yang, Boaz Barak, Ilya Sutskever
ICLR 2020.
[arXiv]
2018 General Strong Polarization
Jarosław Błasiok, Venkatesan Guruswami, Preetum Nakkiran, Atri Rudra, Madhu Sudan
STOC 2018, JACM 2022.
[arXiv]
2015 Compressing Deep Neural Networks Using a Rank-Constrained Topology
Preetum Nakkiran, Raziel Alvarez, Rohit Prabhavalkar, Carolina Parada
INTERSPEECH 2015.
[pdf]

Theses

2021 Towards an Empirical Theory of Deep Learning
Preetum Nakkiran
Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.
[pdf] [cite]

About Me

For talks, you can use this [bio].

I did my undergrad in EECS at UC Berkeley. I'm broadly interested in theory and science. In the past, I have interned at OpenAI (with Ilya Sutskever), Google Research (with Raziel Alvarez), and Google Brain (with Behnam Neyshabur and Hanie Sedghi), and have also done research in error-correcting codes, distributed storage, and cryptography. I am grateful for past support from the NSF GRFP and the Google PhD Fellowship. An (outdated) CV is available here. [ORCID]

See also my old website for more. The design of this version borrows in part from Luca Trevisan and Jon Barron, and (as of 2025) from Cursor with claude-3.7-sonnet-thinking.

What People are Saying

a "high-level" scientist   —colleague (ML)

makes plots and draws lines through them   —colleague (TCS)

has merits that outweigh flaws   —reviewer 2

From The Archives

  • Past successful application materials (fellowships, etc.): [drive]
  • Courses I took in undergrad.
