Journeying through Statistics & Machine Learning Research: An Interview with Jake Snell

Image of Dr. Snell smiling, wearing glasses and a pale red and grey checkered collared shirt.

Jake Snell is a DataX postdoctoral researcher in the Department of Computer Science at Princeton University, where he develops novel deep learning algorithms by drawing insights from probabilistic models. He is currently serving as a lecturer for SML 310: Research Projects in Data Science.

As I dive deeper into my computer science coursework, I’ve found myself engaging increasingly with statistics and machine learning (hereafter abbreviated as SML). Opportunities to conduct SML research abound at Princeton: senior theses, junior independent work, research-based courses such as SML 310: Research Projects in Data Science, joining research labs, and much more. There is a wide variety of research opportunities, and many nuanced pathways that students can take while exploring SML research. So, for this seasonal series, I wanted to speak with professors and researchers who are further along in their research journeys to share their insight and advice with undergraduate students.

Alexis Wu (AW): How would you introduce your current research?

Jake Snell (JS): I’m currently a postdoc in the Computer Science department here at Princeton, and the work that I do is at the intersection of statistical models and deep learning. To provide a bit of background, statistical modeling involves building probabilistic models of the world—coming up with a model that explains how certain data arose and describes it in terms of math. It models what we think will happen in the future. With deep learning, many people may have heard of things like ChatGPT, which are just very large-scale neural networks with billions of parameters that learn from a lot of data to make predictions. These aren’t really thought of as statistical models in the same way. My research involves taking insights from statistical models and applying them to create better deep learning models, and vice versa—making deep learning models more adaptive and reliable in changing environments.

AW: What are some common challenges that researchers in SML face, and how do you recommend students navigate these challenges in their own research endeavors?

JS: One of the challenges faced by newcomers to research in any particular field is that the field has its own culture around it. There’s a sort of jargon, certain ways of thinking about problems, and expectations about how papers are written, such as structure, notation, and so on. It’s really helpful if you can find some way to get plugged into the community. That can take a lot of different forms; one would be going to a conference, talking to people there, and attending the poster sessions. As an undergrad, there are a lot of great researchers all around—look for ways to get plugged into a research group if that’s something that interests you. You can also look for courses that have a research project component and use that as a way to complete a research project. Oftentimes, you can then springboard that into additional research, like independent work or other research collaborations with the professor.

AW: How have your research interests shifted since you were an undergraduate student?

JS: I was an undergrad at Yale doing biomedical engineering. My senior research project was on the optimization of positron emission tomography (PET) image reconstruction: a chemical tracer gets injected into the body, and from the scanner data you reconstruct where it is in the brain. My focus has changed a lot since then, now that I do computer science, but both involve reasoning with incomplete data: based on some incomplete picture, what is going on in the real world?

I’ve been interested in Computer Science since I was in middle school. I took a course at a local university in C programming when I was in 8th grade, and when I was in high school I competed in programming contests. I almost majored in Computer Science, but at the time I was more interested in biomedical research. After undergrad, I worked at a consulting job for two years doing business analytics work. But I wanted to get back into learning more about science, which led me to apply to grad school in Computer Science. I finished my degree at the University of Toronto, where I worked in machine learning and really got immersed in deep learning.

AW: Can you share some insights into the current trends or cutting-edge developments in ML research?

JS: The really big thing these days is large language models (LLMs), and what they mean for research and for the world. One interesting aspect is that they have this ability to be queried in a way that previous machine learning models have not. You can ask an LLM a question and see what it says, which has opened up some interesting directions of research. Especially at the lab I’m in (the Computational Cognitive Science Lab), where the researchers are really adept at designing studies into the ways that people think, there’s a hybrid space where you can now take some of those learnings about the way people learn and think and see to what extent LLMs embody the same way of thinking. For me, I’m really interested in exploring to what extent you can use these LLMs to build probabilistic models.

AW: How do you suggest that students navigate the vast landscape of algorithms, techniques, and methodologies available and select a research topic?

JS: My postdoc adviser is Tom Griffiths, who had what I thought was really insightful advice. He said that new ideas usually come from combining two existing ideas. His advice is to look to classic papers for problems, and to new research for techniques. Those older papers are classics for a reason—they lay out the foundational problems in statistics and machine learning. Newer techniques typically haven’t yet been applied to these classic problems, so that combination often makes for a great research area to explore.

AW: What general advice do you have for undergraduate students conducting research?

JS: My big piece of advice is to find connections in the community. It’s often helpful to find a couple of mentors, specifically people who are at the stage you want to be at next. So, if you’re interested in going to grad school, maybe you can find a graduate student or two in a PhD program. Then, connect with them, talk to them, and get their thoughts about their experiences. Developing those personal connections is really helpful. More generally, even just finding someone whose research you’re interested in can be valuable. Cold email them, find them on Twitter (X)—they’re often happy to connect.

Interview responses have been edited for clarity and length.

It was rewarding to speak with Jake and hear his advice on how to navigate and grow from the common challenges that students working on SML research may face. Moreover, I found the advice he shared from his adviser, Professor Griffiths, on tackling the initial ambiguity of choosing a research topic to be incredibly insightful. Because new machine learning approaches and techniques develop extremely quickly, his advice could help students figure out how to balance classical, foundational knowledge with newer, cutting-edge research. I hope this was exciting to read, and I wish you all the best with your research endeavors!

— Alexis Wu ‘25, Engineering Correspondent