Preetum Nakkiran
I'm a Research Scientist at Apple (with Josh Susskind and Samy Bengio), and a Visiting Researcher at UCSD (with Misha Belkin). I'm also part of the NSF/Simons Collaboration on the Theoretical Foundations of Deep Learning. My research builds conceptual tools for understanding learning systems (including deep learning), using both theory and experiment as appropriate. See the intro of my thesis for more on my motivations and methods.
I recently completed my PhD at Harvard, advised by Madhu Sudan and Boaz Barak. While there, among other things, I co-founded the ML Foundations Group.
[publications]
[CV]
[twitter]
[short-bio]
preetum@nakkiran.org
Acknowledgements: I have had the pleasure of collaborating with the following excellent students & postdocs (partial list):
News and Olds:
- May 2022: I've joined Apple ML Research! I maintain a UCSD affiliation as a Visiting Researcher.
- Nov 2021: New manuscript on Turing-Universal Learners with Optimal Scaling Laws (free-lunch.org). Also other papers.
- Sept 2021: I have moved to University of California, San Diego.
- July 2021: I defended my thesis! View the [slides], and read the [thesis]. I suggest the Introduction, which is written for a general scientific audience.
- 1994: Born
Recent/Upcoming Invited Talks
I'm happy to speak about my work and interests. Currently, I will likely speak about The Deep Bootstrap Framework, Distributional Generalization, or musings about scaling, science, or theory.
- 7 Feb 2022: Stanford talk (Percy Liang group), on Distributional Generalization.
- 15 Dec 2021: Reinforcement Learning Reading Group talk (Marcus Hutter group).
- 7 Dec 2021: Simons Institute talk, on "Is Overfitting Actually Benign?" [slides]
- 2 Dec 2021: Princeton talk (Sanjeev Arora group).
- 30 Nov 2021: CMU ML Faculty Seminar talk.
- 17 Nov 2021: MIT talk (Sasha Rakhlin group), on Distributional Generalization.
- 15 Nov 2021: MIT talk (Poggio Lab), on The Deep Bootstrap Framework.
- Oct 2021: UCSD Theory Seminar talk, on "Theory for Deep Learning, and Deep Learning for Theory." [slides]
- Sept 2021: Deep Learning Classics & Trends talk, on Distributional Generalization. [slides]
- Aug 2021: UToronto talk (Roger Grosse group), on The Deep Bootstrap Framework. [slides]
- June 2021: Deep Learning Classics & Trends talk, on The Deep Bootstrap Framework. [slides]
- Apr 2021: Guest Lecture in ML Theory Course (Boaz Barak), speaking on scaling laws. [slides]
- Apr 2021: UPenn Seminar (Weijie Su group), on Distributional Generalization. [slides]
- Feb 2021: Simons Collaboration Monthly Meeting, speaking on The Deep Bootstrap. [slides]
- Aug 2020: UCLA Big Data and Machine Learning Seminar, on Distributional Generalization.
- Aug 2020: Max Planck+UCLA Math ML Seminar, speaking on Distributional Generalization. [video]
Research
I take a scientific approach to machine learning: trying to advance understanding through basic experiments and foundational theory. See [publications] for a full list of papers.
Theses
- Towards an Empirical Theory of Deep Learning
Preetum Nakkiran
Doctoral dissertation, Harvard University Graduate School of Arts and Sciences. July 2021. [cite]
Machine Learning Theory
- Deconstructing Distributions: A Pointwise Framework of Learning
[demo]
Gal Kaplun*, Nikhil Ghosh*, Saurabh Garg, Boaz Barak, Preetum Nakkiran
Preprint. 2022.
- What You See is What You Get: Distributional Generalization for Algorithm Design in Deep Learning
Bogdan Kulynych*, Yao-Yuan Yang*, Yaodong Yu, Jarosław Błasiok, Preetum Nakkiran
Preprint. 2022.
- Knowledge Distillation: Bad Models Can Be Good Role Models
Gal Kaplun, Eran Malach, Preetum Nakkiran, Shai Shalev-Shwartz
Preprint. 2022.
- Limitations of Neural Collapse for Understanding Generalization in Deep Learning
[tweet]
Like Hui, Mikhail Belkin, Preetum Nakkiran
Preprint. 2022.
- Turing-Universal Learners with Optimal Scaling Laws
Preetum Nakkiran
Manuscript. 2021.
- Revisiting Model Stitching to Compare Neural Representations
[tweet]
Yamini Bansal, Preetum Nakkiran, Boaz Barak
NeurIPS 2021.
- The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
[slides]
[tweet]
Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi
ICLR 2021.
- Distributional Generalization: A New Kind of Generalization
[10m talk]
[slides]
Preetum Nakkiran*, Yamini Bansal*
  - Desk-Rejected from NeurIPS 2020.
  - Rejected from ICLR 2021. [reviews]
  - Rejected from ICML 2021. [reviews] [rebuttal] [meta]
  - Rejected from NeurIPS 2021. [reviews]
  - Rejected from ICLR 2022. [reviews]
- Learning Rate Annealing Can Provably Help Generalization, Even for Convex Problems
[tweet]
Preetum Nakkiran
OPT2020 Workshop (Best Student Paper).
- Optimal Regularization Can Mitigate Double Descent
Preetum Nakkiran, Prayaag Venkat, Sham Kakade, Tengyu Ma
ICLR 2021.
- Deep Double Descent: Where Bigger Models and More Data Hurt
Preetum Nakkiran, Gal Kaplun*, Yamini Bansal*, Tristan Yang, Boaz Barak, Ilya Sutskever
ICLR 2020.
- More Data Can Hurt for Linear Regression: Sample-wise Double Descent
Preetum Nakkiran
Manuscript. 2019.
- SGD on Neural Networks Learns Functions of Increasing Complexity
Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak
NeurIPS 2019 (Spotlight).
- Adversarial Examples are Just Bugs, Too
Preetum Nakkiran
Distill 2019.
- Adversarial Robustness May Be at Odds With Simplicity
Preetum Nakkiran
(Merged version appears in COLT 2019.)
- The Generic Holdout: Preventing False-Discoveries in Adaptive Data Science
Preetum Nakkiran, Jarosław Błasiok
Manuscript. 2018.
Theory
- Algorithmic Polarization for Hidden Markov Models
Venkatesan Guruswami, Preetum Nakkiran, Madhu Sudan
ITCS 2019.
- General Strong Polarization
Jarosław Błasiok, Venkatesan Guruswami, Preetum Nakkiran, Atri Rudra, Madhu Sudan
STOC 2018.
- Tracking the L2 Norm with Constant Update Time
Chi-Ning Chou, Zhixian Lei, Preetum Nakkiran
APPROX-RANDOM 2018.
- Near-Optimal UGC-hardness of Approximating Max k-CSP_R
Pasin Manurangsi, Preetum Nakkiran, Luca Trevisan
APPROX-RANDOM 2016.
Machine Learning
- Compressing Deep Neural Networks Using a Rank-Constrained Topology
Preetum Nakkiran, Raziel Alvarez, Rohit Prabhavalkar, Carolina Parada
INTERSPEECH 2015.
- Automatic Gain Control and Multi-style Training for Robust Small-Footprint Keyword Spotting with Deep Neural Networks
Rohit Prabhavalkar, Raziel Alvarez, Carolina Parada, Preetum Nakkiran, and Tara Sainath
ICASSP 2015.
About Me
For talks, you can use this [bio]. I did my undergrad in EECS at UC Berkeley. I'm broadly interested in theory and science. In the past, I have interned at OpenAI (with Ilya Sutskever), Google Research (with Raziel Alvarez), and Google Brain (with Behnam Neyshabur and Hanie Sedghi), and I have also done research in error-correcting codes, distributed storage, and cryptography. I am grateful for past support from the NSF GRFP and the Google PhD Fellowship.
See also my old website for more. This version borrows in part from Luca Trevisan and Jon Barron.
What People are Saying
a "high-level" scientist —colleague (ML)
makes plots and draws lines through them —colleague (TCS)
has merits that outweigh flaws —reviewer 2
a complainer —my girlfriend (among others)
Selected Tweets
- on science for science's sake
- the "definitional obstacle" to DL theory
- the "Natural Distributions" obstacle to generalization
- resources on causality
- how "causally explaining generalization" is not even wrong
- traps of defining objects which don't exist
- measures of dependence between RVs
- complaints about DL that are not actually about DL
- complaints about "science of ML" missing from ICML
- on calibration and overparameterization