
Challenges in doing a PhD for Machine Learning and AI - Saturday with Dr. Sourish

5 min read

Based on a video by Enago Read (previously Raxter.io) on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.

TL;DR

Research capability depends on experiment design, result interpretation, and problem framing—not just coding strength or familiarity with many ML algorithms.

Briefing

A machine-learning PhD doesn’t hinge on knowing algorithms or being strong at coding—it hinges on learning how to frame problems, design experiments, and interpret results across domains. Jay, a graduate student building interpretable AI for healthcare, credits his path to research not to “magic” in AI, but to early experiences that made the math-to-application connection feel concrete. After internships—especially one at NTU and later work connected to Philips Research Labs—he decided that deeper studies were worth the risk, even though he initially treated a PhD as a gamble compared with a safer master’s route.

His entry into research started with a mismatch: early reading made AI sound like vague hype, and he couldn’t see how the numbers became real applications. That changed during an NTU summer course on neural networks and through hands-on exposure to how academic work turns into product-grade R&D, supported by large datasets. He also describes a key mindset shift: research is a skill. Programming and mathematical priors matter, but they don’t automatically make someone capable of research work—especially the parts that involve starting from scratch, setting up experiments, and learning to navigate “rabbit holes” when results don’t land where expected.

When choosing a PhD topic, Jay emphasizes deliberate exploration rather than instant certainty. He frames the decision as a “tasting” process: sampling different areas through internships and small projects until one direction feels sustainable for years. His eventual focus—medical imaging and interpretable models—was shaped by experiences that connected AI to real-world constraints. At Philips, he built a module that radiologists reviewed in real time, where even a one-second delay could matter clinically. In his current work, he’s working toward using deep learning to find biomarkers in brain data for neurodegenerative conditions, including post-traumatic headache and Alzheimer’s, with the goal of helping clinicians understand disease-relevant regions.

Interdisciplinary research brings its own early-stage challenges, and Jay treats them as structural rather than personal. He describes a three-node ecosystem: (1) medical experts who know the clinical domain but not ML, (2) ML experts who know the coding and modeling but not medicine, and (3) bridge researchers—often the advisor—who translate between both sides. For literature review, he recommends narrowing the reading funnel: pick a small set of core journals and then skim aggressively for methods, data loading, preprocessing, architectures, and metrics. He argues that in medical AI, data handling can make or break model performance, so reading the “how” matters more than the “why” early on.

He also flags a practical problem: memory and linkage. Reading many papers can blur details, leaving researchers unsure which paper supports which idea. His workaround is to store notes with hyperlinks and concise reminders about architectures, hyperparameters, and results, so later comparisons become faster.

Overall, the central takeaway is that a successful ML/AI PhD is built through experimentation discipline, early topic clarity via exploration, and active translation across domains—supported by targeted reading habits and networking that expands the range of questions a researcher can ask.

Cornell Notes

Jay’s path into a machine-learning PhD in interpretable AI for healthcare centers on a simple premise: research is a skill, not a byproduct of strong coding or algorithm knowledge. He built that skill through internships that made AI feel grounded—especially experiences where models had to work under real-world constraints like clinical timing. For choosing a thesis topic, he recommends “exploration first”: sample different directions via projects and internships until one area passes the six-year “marriage” test. Interdisciplinary work requires translation between medical experts, ML experts, and bridge figures (often the advisor), plus a targeted literature strategy that prioritizes methods, data loading, preprocessing, and metrics over slow reading of introductions. Finally, he warns that heavy paper reading can cause memory “mapping” failures and suggests keeping structured, linked notes to preserve connections between ideas.

Why doesn’t strong ML knowledge automatically make someone a good researcher, and what replaces it?

Jay draws a distinction between priors (programming skill, math understanding, knowing many algorithms) and research capability. Research involves learning how to start from scratch, interpret results that don’t match expectations, frame the problem correctly, and design experiments. He emphasizes diligence—running experiments patiently, iterating, and accepting that progress often comes with getting stuck in “rabbit holes,” especially when the work sits between computer science and a medical application.

How did Jay decide on a PhD direction instead of committing too early?

He describes a “breadth-first tasting” approach even though he expects depth later. Internships and smaller projects served as exploration: one helped him see computer vision’s fit for him (including video/image understanding), while another—Philips Research Labs—showed how healthcare AI scales into real-world workflows. The clinical relevance became tangible when radiologists reviewed a module in real time, where delays could change outcomes. That combination helped him commit to medical imaging and interpretability.

What’s the practical challenge of interdisciplinary literature review, and how does he manage it?

Jay says interdisciplinary reading is harder because researchers may lack both medical domain fluency and ML implementation depth. He manages this by treating the research ecosystem as three nodes: medical experts, ML/coding experts, and bridge researchers (his advisor). For papers, he narrows to a small set of core journals and skims for repeatable technical patterns—especially architectures, data loading techniques, preprocessing steps, and evaluation metrics—rather than reading everything linearly.

What does Jay mean by “data loading changes the game” in medical AI?

In his experience, medical AI pipelines can fail or succeed based on how data is loaded and preprocessed. He focuses on the methodology sections to learn the exact data handling approach used in strong papers, because those implementation details often determine whether deep learning models train and generalize properly.
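To make that concrete, here is a hypothetical sketch (not from the talk) of the kind of data-handling choice he means: clipping intensity outliers and normalizing a scan volume before training. The percentile thresholds and z-score normalization are illustrative assumptions; changing them changes what the model actually sees.

```python
import numpy as np

def preprocess_volume(volume: np.ndarray,
                      clip_percentiles=(1, 99)) -> np.ndarray:
    """Clip intensity outliers, then z-score normalize a scan volume.

    Hypothetical illustration: percentile clipping and per-volume
    normalization are common preprocessing choices in medical imaging,
    and small changes here can shift downstream model behavior.
    """
    lo, hi = np.percentile(volume, clip_percentiles)
    clipped = np.clip(volume, lo, hi)
    return (clipped - clipped.mean()) / (clipped.std() + 1e-8)

# Example: a synthetic 3D "scan" with one extreme artifact voxel
scan = np.random.default_rng(0).normal(100.0, 15.0, size=(8, 8, 8))
scan[0, 0, 0] = 10_000.0  # outlier that would skew naive normalization
processed = preprocess_volume(scan)
```

Without the clipping step, the single artifact voxel would dominate the mean and standard deviation, compressing every other intensity toward zero—exactly the kind of silent pipeline detail that can make a model fail to train.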

How does Jay address the problem of forgetting details across many papers?

He argues the issue isn’t just memory capacity; it’s the “mapping” problem—knowing what was understood but not being able to link it to the exact paper, author set, or page/paragraph later. His workaround is labor-intensive note-taking with hyperlinks: he stores short summaries (e.g., encoder-decoder architecture, hyperparameters, and results) and adds questions for later follow-up, so similar architectures can be compared efficiently.
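His note-linking workaround can be sketched as a small data structure. This is a hypothetical illustration, not his actual tooling: each note stores a hyperlink plus concise technical reminders (architecture, hyperparameters, results) and open questions, so notes on similar architectures can be pulled up and compared quickly. All titles, URLs, and field names here are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class PaperNote:
    """One paper's note: a hyperlink plus concise technical reminders."""
    title: str
    url: str
    architecture: str                       # e.g. "encoder-decoder U-Net"
    hyperparameters: dict = field(default_factory=dict)
    results: str = ""
    follow_up_questions: list = field(default_factory=list)

def papers_using(notes, architecture_keyword):
    """Recover every note whose architecture mentions a keyword,
    so similar architectures can be compared side by side."""
    return [n for n in notes
            if architecture_keyword.lower() in n.architecture.lower()]

notes = [
    PaperNote("Hypothetical segmentation paper", "https://example.org/a",
              "encoder-decoder U-Net", {"lr": 1e-4, "batch_size": 8},
              "Dice 0.89 on a toy split",
              ["Why no test-time augmentation?"]),
    PaperNote("Hypothetical classifier paper", "https://example.org/b",
              "ResNet-50", {"lr": 1e-3}),
]
matches = papers_using(notes, "encoder-decoder")
```

The point of the structure is the linkage: the understanding (results, hyperparameters) never gets separated from the exact paper it came from, which is the "mapping" failure Jay describes.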

What role does networking and cross-pollination play in his research thinking?

Jay credits podcasts and networking as a way to expand his breadth. Preparing questions for researchers outside his immediate interests forces him to read papers he otherwise wouldn’t. He describes a serendipitous shift: even though he didn’t enjoy reinforcement learning, a conversation led him to see how RL could match real problems like RNA sequencing, where the action space spans trillions of possibilities.

Review Questions

  1. What specific research skills does Jay say matter beyond programming and algorithm knowledge?
  2. How does Jay’s “tasting” strategy for thesis topic selection work, and what evidence convinced him to commit to medical imaging?
  3. When reading interdisciplinary papers, what sections and details does Jay prioritize, and why?

Key Points

  1. Research capability depends on experiment design, result interpretation, and problem framing—not just coding strength or familiarity with many ML algorithms.

  2. Early internships can turn AI from abstract hype into concrete understanding by showing how academic work becomes product-grade R&D.

  3. Choosing a thesis topic benefits from exploration first; Jay uses internships and small projects to find an area he can sustain for years.

  4. Interdisciplinary work functions best when roles are clear: medical experts, ML experts, and bridge researchers who translate between domains.

  5. Medical AI literature review should be targeted toward methods—especially data loading, preprocessing, architectures, and metrics—rather than slow reading of everything.

  6. Heavy paper reading creates a “mapping” problem; structured notes with hyperlinks and concise technical reminders help preserve connections between ideas.

  7. Networking and cross-domain conversations can create unexpected research directions by broadening the set of questions a researcher can ask.

Highlights

Jay credits his shift from “AI hype” to research clarity to hands-on exposure—especially seeing how papers and large datasets translate into real product R&D.
A key clinical requirement shaped his motivation: radiologists reviewed a module in real time, where even a one-second delay could matter.
For interdisciplinary reading, he recommends skimming for the technical core—data loading, preprocessing, architectures, and metrics—using a small set of trusted journals.
He frames interdisciplinary understanding as three-node translation: medical experts, ML experts, and bridge figures (often the advisor).
His biggest reading challenge is not comprehension but linkage—knowing what was learned while losing which paper it came from—so he uses linked notes to recover context quickly.

Topics

  • PhD Research Skills
  • Interpretable AI
  • Medical Imaging
  • Interdisciplinary Literature Review
  • Thesis Topic Selection
