Chip Huyen on Machine Learning Interviews (Full Stack Deep Learning - November 2019)
Based on Full Stack Deep Learning's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their channel.
Briefing
Machine learning hiring is less about “perfect” interviews and more about navigating a noisy, expensive, and often inconsistent process—so candidates should optimize for fit, signal quality, and question selection rather than chasing a single formula for success. Chip Huyen, working on productionizing AI research at Nvidia, draws a sharp line between research and engineering roles, then maps how that distinction shows up in recruiting pipelines, resume screening, and interview question design.
A central theme is that “research” and “applied research” differ in time horizon and outcome expectations. Fundamental research targets long-term answers to theoretical questions, while applied research aims for practical solutions with nearer-term commercial impact. Even in applied settings, cutting-edge machine learning remains heavily empirical because many techniques work without a strong theoretical framework. Huyen also distinguishes research scientists (who originate ideas) from research engineers (who scale and operationalize those ideas), noting that modern model scaling makes engineering skill essential for turning prototypes into systems.
She then broadens the career landscape: machine learning jobs can come from multiple starting points—direct ML engineering from bachelor’s or master’s programs, PhD-to-research roles, transitions from data science, and software engineers retraining via coursework or bootcamps. Another pathway is hiring “adjacent” researchers from statistics or physics and training them on the job. The common thread is time: there’s no shortcut to becoming an ML expert in weeks, and online “five easy steps” claims should be treated skeptically.
Huyen emphasizes that big-company machine learning differs from startup machine learning. Large firms can afford expensive compute-heavy research and often hire specialists who focus on narrow components. Startups move faster across domains and typically prefer generalists who can handle multiple moving parts. That difference also shapes how candidates are evaluated and how many interviews are required.
On interviewing, she argues that outcomes depend on variables beyond raw ability. Interviewers often receive little training, may use “pet questions,” and can be mismatched to the candidate’s strengths (for example, a computer-vision expert facing an interviewer who doesn’t know the area). Mood, stress, and day-to-day factors can also skew results. Because of this, a rejection is not a reliable measure of competence, and companies sometimes revisit candidates later.
Recruiting itself is constrained by cost and headcount. Huyen notes that hiring agencies can take 20–30% of first-year salary, and hiring managers under pressure may choose “good enough” candidates whose schedules and signals align with the immediate need. Resume screening is informal and often biased toward recognizable signals—previous employers, GitHub, open-source contributions, awards/papers, referrals, and sometimes school names—while also being limited by recruiters’ lack of engineering background.
Finally, she critiques common interview question types. Easy recall questions are weak predictors of excellence; trick questions, overly specific name/term tests, and open-ended questions become unfair when interviewers expect a single “correct” solution. Better questions probe understanding and assumptions (e.g., how k-means behaves under dataset changes), and can use structured formats like multiple choice, quizzes, code walkthroughs, pair programming, or two-interviewer setups to improve quality.
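The k-means probe mentioned above can be made concrete with a toy sketch. This is a minimal pure-Python Lloyd's algorithm (the function and data are hypothetical illustrations, not from the talk): rescaling one feature changes which distances dominate, so the "same" data can cluster differently, which is exactly the kind of assumption a good question asks candidates to reason about.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal Lloyd's algorithm on 2-D points (illustrative sketch only)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Recompute each centroid as its cluster mean; keep the old one if empty.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

# Two well-separated groups along the x-axis.
data = [(0.0, 0.0), (0.1, 1.0), (0.2, 0.5), (5.0, 0.0), (5.1, 1.0), (5.2, 0.5)]
balanced = kmeans(data, k=2)  # centroids land near x ≈ 0.1 and x ≈ 5.1

# Stretch the y-axis 100x: squared distances are now dominated by y,
# so the "same" data may cluster along y instead of x.
stretched = [(x, y * 100) for x, y in data]
skewed = kmeans(stretched, k=2)
```

A strong candidate answer explains why this happens: k-means relies on Euclidean distance, so changing feature scales, duplicating points, or adding outliers shifts the centroid updates, and articulating that dependence demonstrates understanding rather than recall.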
To ground the discussion, she shares analysis from thousands of Glassdoor reviews across major tech companies, finding that companies with lower on-site offer ratios often have higher offer acceptance rates—suggesting that selectivity and offer strategy interact with candidate behavior. For candidates, her practical advice is to interview when not desperate, run a small number of targeted interviews to calibrate, and treat internships as a more attainable entry point since they’re cheaper for companies and often convert to full-time roles.
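The two metrics in that analysis are easy to conflate, so a small sketch may help (the counts below are invented for illustration and are not Huyen's Glassdoor data):

```python
def onsite_offer_ratio(offers_made, onsites):
    """Share of on-site interviews that end in an offer being extended."""
    return offers_made / onsites

def acceptance_rate(offers_accepted, offers_made):
    """Share of extended offers that candidates actually accept."""
    return offers_accepted / offers_made

# Invented counts: a selective company extends fewer offers per on-site
# but sees more of them accepted; a generous one shows the reverse.
selective = (onsite_offer_ratio(20, 100), acceptance_rate(18, 20))
generous = (onsite_offer_ratio(60, 100), acceptance_rate(30, 60))
```

The pattern she reports is the inverse relationship the invented numbers mimic: lower on-site-to-offer ratios coinciding with higher acceptance rates.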
Cornell Notes
Chip Huyen frames ML interviews as a high-variance process shaped by role definitions, company constraints, and interviewer quality—not a clean test of talent. She distinguishes research (long-term fundamental answers) from applied research (practical solutions with nearer commercial impact), and research scientists (idea originators) from research engineers (scaling and operationalizing ideas). Career entry points include ML engineering from degrees, PhD research roles, data-science-to-ML transitions, software engineers retraining, and “adjacent” researchers trained on the job. In hiring, resume screening and interviewer performance are noisy; outcomes can hinge on interviewer training, question design, and even day-to-day factors. Candidates should focus on strong signals, ask clarifying questions when prompts are ambiguous, and avoid over-trusting “easy” or unfair question formats.
How do research, applied research, and engineering roles differ in machine learning hiring?
What are common career paths into machine learning, and what do they have in common?
Why can interview outcomes be a poor proxy for ability?
What resume signals tend to matter during screening, and why?
Which interview question types are likely to be weak or unfair predictors?
How should candidates respond to unclear or mismatched interview prompts?
Review Questions
- Which differences between research scientist and research engineer most affect what skills an interviewer will test for?
- Give an example of a “good” ML interview question and explain what kind of understanding it measures.
- Why might on-site offer ratio and offer acceptance rate move together in Huyen’s Glassdoor-based analysis?
Key Points
1. Research and applied research differ mainly in time horizon and expected outcomes, and applied ML work often remains empirical due to limited theoretical frameworks.
2. Research scientists typically originate ideas, while research engineers are critical for scaling and operationalizing those ideas into production systems.
3. Machine learning career entry points include ML degrees, PhDs, data-science transitions, software-engineer retraining, and adjacent-research hires trained on the job.
4. Interview outcomes are high-variance because interviewer training is limited, question quality varies, and mismatches or day-to-day factors can distort evaluation.
5. Resume screening relies heavily on proxies (employers, GitHub/open source, referrals, papers/awards, sometimes schools) because many recruiters lack engineering depth.
6. Question design matters: easy recall, trick questions, narrow memorization, and unfair open-ended prompts can be weak predictors of real capability.
7. Candidates should interview strategically, especially when not desperate, and use clarifying questions to align with what the interviewer is actually assessing.