from Guide to the PhD on Jul 2, 2023
How to mentor undergraduate researchers
Over the few years of my PhD, I mentored over 50 undergraduate collaborators; at the peak, I was mentoring 15+ undergraduates at once. Here's how I scaled research mentorship.
I've had the fortune to collaborate with a number of undergraduates over the years, with productive collaborations and strong outcomes: successful mentees, accepted publications, and even several mentees that ultimately pursued academic careers. Here are the results:
- I've co-authored accepted papers with 5 mentees and submitted papers with 2 more mentees.
- 4 mentees pursued PhDs at top AI programs at Berkeley, CMU, MIT, and Princeton.
- 10 pursued master's degrees, including fifth-year master's programs at Berkeley, Harvard, Columbia, and more.
- The remainder went on to lead successful careers in industry, working at companies large (Google, Meta, Apple, Microsoft, etc.) and small (OpenAI and other startups).
At the end of the day, I now know a large body of hard-working engineers, researchers, and academics. In fact, I've remained in touch with a large number of these mentees to this day.
In designing and refining this process, I follow a few motivating principles for collaborating with undergraduates.
First and foremost, I'm passing on the torch; several graduate students (Vaishaal, Bichen) took a chance on me as an undergraduate, and their investment in me is what ultimately led to my admission to a PhD program. I'm simply passing on the chance they gave me. Here are several ways that this manifested itself.
Everyone deserves a chance. I did not find the best undergraduate researchers. In fact, I didn't even try to filter the undergraduates I worked with. I tried to nurture them. My goal was to give as many undergraduates as possible a chance [1]. There were two interpretations of this principle:
- Everyone deserves one chance and not more than that. This "chance" was to show a strong work ethic and put in an earnest effort. More on this later.
- That chance didn't need to come from me. If they could find a chance elsewhere, that was perfect. If they couldn't find an opportunity, I would offer it.
Work with teams, not individuals. We'll dive more into how this worked later on, but in short, organize by and work with project teams. This principle of working with teams had two motivations:
- In trying to give everyone a chance, I realized fairly quickly that any number of mentees greater than 2-3 was unsustainable.
- Being new to research, many mentees did need low-level, detailed guidance that I didn't have the bandwidth to provide. This is where fellow undergraduate collaborators and more senior undergraduates could step in.
Minimize meetings. This applies both to undergraduates who express interest and to those who are already working with you. We'll discuss ways around this later on, but in short, defer to in-person "working" times and write down information where possible.
I'll now walk through the lifecycle of a collaboration with an undergraduate. Along the way, I'll make sure to mention which principles came into play when.
I never actually recruited undergraduates, but at some point, it became common knowledge to email graduate students for research opportunities. I pretty much always responded with the same message, which looks roughly like the below:
- Prerequisites: Recommend some prerequisites, but as long as you code, we can collaborate.
- Find someone else: Try to find other research opportunities. If you don't find anything, then come find me.
If you can put your ego aside and accept being a last resort, this helps ensure that you chat only with undergraduates who will definitely work with you after chatting, either because no one else will or because you're their top choice.
For this reason, I usually declined to meet just to discuss the "possibility" of collaborating. I was fully open to collaborating, so the ball was in their court for deciding whether or not to work together.
Feel free to copy the above snippet if my arguments convince you of the effectiveness of this approach. There were a few caveats here:
- I did list project descriptions with DARE, but only for the promise of more diverse candidates. There are also other organizations you can recruit more diverse talent from, such as the Society of Women Engineers (SWE).
- The only time I declined new requests for collaborations was during my last semester of the PhD. At that point, with a few months to go, I wouldn't have enough time to onboard new collaborators and I wasn't planning to stick around for the summer.
You'll notice the process above is strangely "open," but it's not out of the kindness of my heart necessarily. I have several reasons to believe this is the best way to conduct a "search" for undergraduate collaborators.
For starters, work ethic is the only prerequisite in my book, but it's a non-negotiable one: no amount of intelligence and capability can make up for a lack of enthusiasm. However, the other direction works: with time, work ethic can make up for a lack of engineering or machine learning experience. So in short, work ethic is absolutely necessary and all you need.
- Why not filter by resumes or interviews? No amount of resume filtering and interviewing can tell you how hard an undergraduate researcher will work. To a degree, you can estimate this by getting references and talking to previous mentors that the undergraduate worked with, but even after all that, you still have an incomplete picture of their work ethic. It's easier to get started working with them.
- Why not filter by A+s? There are two possibilities here: (1) The undergraduate is so brilliant that courses are pieces of cake and A+s come naturally. (2) The undergraduate is entirely focused on getting perfect performance on their coursework and doesn't have time for other activities. In either case, there are plenty of professors that would jump on the opportunity to work with this undergraduate, and there's no need for me to help here.
- Why not filter by coursework? Basic working knowledge can be picked up quickly, and even if not, work can be broken down into manageable chunks that don't require an intimate understanding of every detail. As I mentioned above, any collaborator will be able to do more by first interning as a software engineer and completing a machine learning course. However, even these two together aren't substitutes for work ethic. Fortunately, work ethic can make up for both of those missing experiences.
So in short, I advocate strongly for this "open" approach to recruiting collaborators. The remainder of this post is dedicated to how to support large numbers of undergraduate collaborators.
To get undergraduate collaborators onboard, you'll need to establish a cadence for work. There are the typical onboarding steps, such as setting expectations and getting compute access, which we discuss in How to succeed as a (research) mentee.
On top of the usual onboarding steps, here are a few more steps to be aware of, to help set the stage for a more productive research mentorship at scale:
- Assemble teams: Group all undergraduate researchers into teams of 2-3, and ideally no more than 4, as larger groups can have stragglers. From here on out, treat each team as a functional unit when meeting, picking paper deadlines, and pivoting ideas.
- Define projects: Find a nugget for this team of undergraduate collaborators to pursue. Ideally, you should start from the insight, work towards a problem statement, then determine the "minimum viable result" the team needs to show. See Is my project paper-ready?
  - For example, the insight could be that Large Language Models (see Practical Introduction to Large Language Models) typically memorize better than they generalize. In light of this, we find a problem space where this is advantageous (e.g., AP US History) and quantify their ability to memorize accurately by comparing with nearest neighbors or human test-takers.
  - Project ideas should be publishable as a full conference paper, if the idea is standalone. Target a workshop paper only for half-baked results, not half-baked ideas. Ideas can be extensions of your current work as well.
  - I actually wouldn't onboard first-time undergraduate researchers onto your own project. It's better for first-timers to pursue their own project first, letting them take their time in learning the basics at their own pace. More experienced collaborators can join your project after a semester.
- Schedule or add to team meeting: Rather than meeting with a teammate individually, I usually add the new undergraduate collaborator directly to the next team meeting. During that team meeting, I'll usually ask my collaborators to present the problem, insight, method, and minimum viable result.
- Gatekeep compute: To start, I keep a running AWS instance with the lowest-tier GPU. All incoming undergraduate collaborators are granted access to that instance to use. Once the collaborator has (a) committed code and (b) produced some kind of result or analysis with that code, I'll then request access to the lab's resources on my collaborator's behalf.
Once you have these basic steps completed, move on to setting a work schedule.
Throughout the semester, establish a regular cadence of work. Regularity and habit-building are fairly important, as your collaborators will know when and how to find you for support.
Weekly working sessions: I scheduled long-form "working" meetings every week.
- How they worked: This would be three or more 5-hour working sessions, usually Monday, Wednesday, and Friday. I was in the lab all the time anyway, so these working sessions were a time for them to join me in the lab to ask questions, discuss debugging strategies, and generally learn how to do research.
- How collaborators use them: My collaborators weren't "required" to attend these sessions, but these were times for their teams to get together, with me present. Each team would have a working session where they were prioritized.
- Why I created them: These working sessions could also be used for ad-hoc updates and discussions. In short, I tried to set the stage for "water cooler" conversations. These were also the catch-alls for any meetings that my collaborators requested.
Weekly meetings: The existence of a weekly meeting is more or less standard.
- How they worked: Critically, I scheduled meetings only with teams of undergraduates. Each team got 15 minutes. During this time, they gave updates, I provided feedback, and we went over our plan for the next paper deadline.
- Impact of COVID: Weekly meetings became a necessity during COVID when working sessions weren't as effective. However, even after COVID, I scheduled weekly meetings, even if we didn't always use them — sometimes, our ad-hoc discussions during working sessions sufficed.
Here are the atypical expectations to set with your undergraduate collaborators. These are critical for ensuring your own sanity as you scale to support more collaborators.
No other meetings: Most of your insanity will be driven by off-hand scheduling requests. In short, decline all other meetings.
- Instead, refer your collaborators to one of the working sessions. There are large chunks of time throughout the week, and they can pick any slot that works for them. It's not that you aren't willing to meet; it's that scheduling overhead doesn't scale when there are 15 undergraduates.
- You're fighting fragmentation of your time. You're still sinking quite a lot of time into supporting these undergraduates. That time just needs to be regular, rather than sporadic and fragmented, so that you don't overwhelm yourself.
No direct messages: The other half of your insanity will be driven by many highly related direct messages. Have your collaborators always send messages in the team channel. Whether it's a question about background material or a clarification about a to-do, it should be asked in the team channel.
- However, your collaborators should feel free to @-mention you. You aren't afraid of being pinged; what you don't want is for each collaborator to direct message you the same question — resulting in you answering the same question a dozen times.
- Punting all questions to team channels has two benefits: (1) All teammates should have the opportunity to participate in the conversation, and more importantly, all of them should benefit from the information you share. (2) The team channel becomes a persistent knowledge base for all future collaborators as well.
- The only allowed direct messages are for sensitive questions: for example, if your collaborator needs a recommendation letter, if they have a medical condition that prevents them from continuing research, or if they have a concern about team dynamics.
Go to your teammates first: In short, collaborators should go to their teammates first, rather than go directly to you.
- Figure it out as a team first. Each team will be more capable than any of its constituent undergraduates on their own. Low-level problems such as a bug should be resolved at the team level.
- Ask you for help, as a team. This falls in line with the "team as a unit" idea. If someone from the team pings you, it should be because the team as a whole is confused. If the entire team is confused, then that's a sign you should step in.
The above schedule changes radically when the deadline rolls around. Around 2 weeks before the deadline, I'll pencil in daily work sessions, with priority given to the team that is about to submit. At this point, we follow the process described in How to write a paper.
The process leading up to a paper deadline is somewhat straightforward. Not all collaborators will reach this stage, but that's completely okay. My only criterion for a successful collaboration is hard work, and sometimes, good work with a handful of luck creates strong results. Here are a few notes on ending collaborations or handling post-collaboration questions.
- Every collaborator got one chance. This "one" chance was measured in semesters. If a collaborator disappeared from meetings for a semester and pinged me the next semester, I would simply tell them that I was too busy. However, if the collaborator attended all the meetings and was dutifully putting in effort, I'd continue the collaboration. The "chance" was not measured in results but in effort.
- Need a recommendation letter? Hard work earned a recommendation, no matter the results. For recommendation letters, I always asked collaborators to write a draft of their own recommendation first. You can simply refer your collaborators to the following post: How to write (self-)recommendation letters for PhD admissions. Once they hand you a draft, embellish it with comparisons and anything you can write to quantify their performance. Published papers are self-explanatory, of course.
This pretty much sums up all of my working notes on working with undergraduate research collaborators. Use the above process, and you should definitely find yourself with some unforgettable collaborators — no doubt, many mentees that will make you proud. Best of luck!
[1] This is a vast departure from the vision faculty at Berkeley, who pretty much let candidates (a.k.a., undergraduates or early-career PhD students) fight it out and claim the winners as their own.