I'm a Research Scientist at Apple, working on
foundations of machine learning.
My work aims broadly to understand generalization in deep learning.
Recent topics: calibration [1] [2] [3] [4] [5], Transformer generalization [6], and diffusion [7] [8].
I completed my PhD at Harvard, having the unique pleasure of being advised by Madhu Sudan and Boaz Barak. In my postdoc I worked with Misha Belkin. I am grateful for past support from NSF and the Simons Institute. Go Bears!
Interns at Apple (hosted or collaborated with):
- Hattie Zhou. PhD student, Université de Montréal and Mila.
- Annabelle Carrell. PhD student, University of Cambridge.
- Lunjia Hu. PhD student, Stanford. (hosted by Parikshit Gopalan)
- Shivam Garg. PhD student, Stanford. (hosted by Kunal Talwar)
- Rylee Thompson. MASc student, University of Guelph. (hosted by Shuangfei Zhai)
- Elan Rosenfeld. PhD student, CMU. (hosted by Fartash Faghri)
Selected Research
See [publications] for the full list.
Classifier-Free Guidance is a Predictor-Corrector
Arwen Bradley, Preetum Nakkiran
[arXiv] [tweet]
Step-by-Step Diffusion: An Elementary Tutorial
Preetum Nakkiran, Arwen Bradley, Hattie Zhou, Madhu Advani
[arXiv] [tweet]
What Algorithms can Transformers Learn? A Study in Length Generalization
Hattie Zhou, Arwen Bradley, Etai Littwin, Noam Razin, Omid Saremi, Josh Susskind, Samy Bengio, Preetum Nakkiran
ICLR 2024.
[arXiv] [tweet]
A Unifying Theory of Distance from Calibration
Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran
STOC 2023.
[arXiv] [tweet] [slides: aspen] [poster: ICLR] [code]
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi
ICLR 2021.
[arXiv] [tweet]
Distributional Generalization: A New Kind of Generalization
Preetum Nakkiran*, Yamini Bansal*
[arXiv] [talk] [slides]
Deep Double Descent: Where Bigger Models and More Data Hurt
Preetum Nakkiran, Gal Kaplun*, Yamini Bansal*, Tristan Yang, Boaz Barak, Ilya Sutskever
ICLR 2020.
[arXiv]
General Strong Polarization
Jarosław Błasiok, Venkatesan Guruswami, Preetum Nakkiran, Atri Rudra, Madhu Sudan
STOC 2018, JACM 2022.
[arXiv]
Near-Optimal UGC-hardness of Approximating Max k-CSP_R
Pasin Manurangsi, Preetum Nakkiran, Luca Trevisan
APPROX-RANDOM 2016.
[arXiv]
Compressing Deep Neural Networks Using a Rank-Constrained Topology
Preetum Nakkiran, Raziel Alvarez, Rohit Prabhavalkar, Carolina Parada
INTERSPEECH 2015.
[pdf]
Theses
Preetum Nakkiran
Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.
[pdf] [cite]
All papers
Machine Learning Theory
- What Algorithms can Transformers Learn? A Study in Length Generalization
  [tweet]
  Hattie Zhou, Arwen Bradley, Etai Littwin, Noam Razin, Omid Saremi, Josh Susskind, Samy Bengio, Preetum Nakkiran
  ICLR 2024.
- When Does Optimizing a Proper Loss Yield Calibration?
  [tweet]
  Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran
  NeurIPS 2023.
- Loss Minimization Yields Multicalibration for Large Neural Networks
  Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Adam Tauman Kalai, Preetum Nakkiran
  ITCS 2024.
- A Unifying Theory of Distance from Calibration
  [tweet]
  Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran
  STOC 2023.
- The Calibration Generalization Gap
  A. Michael Carrell, Neil Mallinar, James Lucas, Preetum Nakkiran
  ICML 2022 DFUQ Workshop.
- Benign, Tempered, or Catastrophic: A Taxonomy of Overfitting
  Neil Mallinar*, Jamie Simon*, Amirhesam Abedsoltan, Parthe Pandit, Mikhail Belkin, Preetum Nakkiran
  NeurIPS 2022.
- Deconstructing Distributions: A Pointwise Framework of Learning
  [demo]
  Gal Kaplun*, Nikhil Ghosh*, Saurabh Garg, Boaz Barak, Preetum Nakkiran
  ICLR 2023.
- What You See is What You Get: Distributional Generalization for Algorithm Design in Deep Learning
  Bogdan Kulynych*, Yao-Yuan Yang*, Yaodong Yu, Jarosław Błasiok, Preetum Nakkiran
  NeurIPS 2022.
- Knowledge Distillation: Bad Models Can Be Good Role Models
  Gal Kaplun, Eran Malach, Preetum Nakkiran, Shai Shalev-Shwartz
  NeurIPS 2022.
- Limitations of the NTK for Understanding Generalization in Deep Learning
  Nikhil Vyas, Yamini Bansal, Preetum Nakkiran
  Preprint. 2022.
- Limitations of Neural Collapse for Understanding Generalization in Deep Learning
  [tweet]
  Like Hui, Mikhail Belkin, Preetum Nakkiran
  Preprint. 2022.
- Turing-Universal Learners with Optimal Scaling Laws
  Preetum Nakkiran
  Manuscript. 2021.
- Revisiting Model Stitching to Compare Neural Representations
  [tweet]
  Yamini Bansal, Preetum Nakkiran, Boaz Barak
  NeurIPS 2021.
- The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
  [slides] [tweet]
  Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi
  ICLR 2021.
- Distributional Generalization: A New Kind of Generalization
  [10m talk] [slides]
  Preetum Nakkiran*, Yamini Bansal*
  - Desk-Rejected from NeurIPS 2020.
  - Rejected from ICLR 2021. [reviews]
  - Rejected from ICML 2021. [reviews] [rebuttal] [meta]
  - Rejected from NeurIPS 2021. [reviews]
  - Rejected from ICLR 2022. [reviews]
- Learning Rate Annealing Can Provably Help Generalization, Even for Convex Problems
  [tweet]
  Preetum Nakkiran
  OPT2020 Workshop (Best Student Paper).
- Optimal Regularization Can Mitigate Double Descent
  Preetum Nakkiran, Prayaag Venkat, Sham Kakade, Tengyu Ma
  ICLR 2021.
- Deep Double Descent: Where Bigger Models and More Data Hurt
  Preetum Nakkiran, Gal Kaplun*, Yamini Bansal*, Tristan Yang, Boaz Barak, Ilya Sutskever
  ICLR 2020.
- More Data Can Hurt for Linear Regression: Sample-wise Double Descent
  Preetum Nakkiran
  Manuscript. 2019.
- SGD on Neural Networks Learns Functions of Increasing Complexity
  Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak
  NeurIPS 2019 (Spotlight).
- Adversarial Examples are Just Bugs, Too
  Preetum Nakkiran
  Distill 2019.
- Adversarial Robustness May Be at Odds With Simplicity
  Preetum Nakkiran
  (Merged version appears in COLT 2019.)
- The Generic Holdout: Preventing False-Discoveries in Adaptive Data Science
  Preetum Nakkiran, Jarosław Błasiok
  Manuscript. 2018.
Theory
- Algorithmic Polarization for Hidden Markov Models
  Venkatesan Guruswami, Preetum Nakkiran, Madhu Sudan
  ITCS 2019.
- General Strong Polarization
  Jarosław Błasiok, Venkatesan Guruswami, Preetum Nakkiran, Atri Rudra, Madhu Sudan
  STOC 2018.
- Tracking the L2 Norm with Constant Update Time
  Chi-Ning Chou, Zhixian Lei, Preetum Nakkiran
  APPROX-RANDOM 2018.
- Near-Optimal UGC-hardness of Approximating Max k-CSP_R
  Pasin Manurangsi, Preetum Nakkiran, Luca Trevisan
  APPROX-RANDOM 2016.
Machine Learning
- Compressing Deep Neural Networks Using a Rank-Constrained Topology
  Preetum Nakkiran, Raziel Alvarez, Rohit Prabhavalkar, Carolina Parada
  INTERSPEECH 2015.
- Automatic Gain Control and Multi-style Training for Robust Small-Footprint Keyword Spotting with Deep Neural Networks
  Rohit Prabhavalkar, Raziel Alvarez, Carolina Parada, Preetum Nakkiran, and Tara Sainath
  ICASSP 2015.
About Me
For talks, you can use this [bio]. I did my undergrad in EECS at UC Berkeley. I'm broadly interested in theory and science. In the past, I have interned at OpenAI (with Ilya Sutskever), Google Research (with Raziel Alvarez), and Google Brain (with Behnam Neyshabur and Hanie Sedghi), and I have also done research in error-correcting codes, distributed storage, and cryptography. I am grateful for past support from the NSF GRFP and the Google PhD Fellowship. An (outdated) CV is available here. [ORCID]
See also my old website for more. This site borrows in part from Luca Trevisan and Jon Barron.
What People are Saying
a "high-level" scientist —colleague (ML)
makes plots and draws lines through them —colleague (TCS)
has merits that outweigh flaws —reviewer 2
Selected Tweets
- on science for science's sake
- the "definitional obstacle" to DL theory
- the "Natural Distributions" obstacle to generalization
- resources on causality
- how "causally explaining generalization" is not even wrong
- traps of defining objects which don't exist
- measures of dependence between RVs
- complaints about DL that are not actually about DL
- complaints about "science of ML" missing from ICML
- on calibration and overparameterization