Deep Research... but Open Source
Based on NetworkChuck's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
OpenAI’s “Deep research” promises slower, more verifiable answers—often taking 5 to 30 minutes—by doing multi-step web dives with citations, rather than returning a quick, sometimes shaky summary. The catch is cost: using it requires ChatGPT Pro at $200 per month, which pushes many users to look for alternatives that deliver similar “research-grade” output without the subscription bill.
A workaround surfaced via an open-source implementation credited in the transcript simply to “David.” The pitch is straightforward: replicate Deep research’s workflow (planning a research task, pulling in sources from the web, and producing a markdown report with links) by running the system locally. Instead of paying a monthly premium for the feature, the setup relies on an OpenAI API key (pay-as-you-go) plus a web-crawling service. In practice, the transcript frames this as the same capability with a different cost structure: API usage can cost pennies for light testing, while heavy usage can still get expensive.
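Concretely, that cost structure comes down to two pay-as-you-go credentials supplied in an environment file. The variable names below are illustrative assumptions, not confirmed from the repo:

```shell
# Hypothetical .env for a local deep-research run.
# Variable names are assumptions for illustration, not taken from the actual repo.
OPENAI_KEY="sk-..."       # pay-as-you-go OpenAI API key (billed per token)
FIRECRAWL_KEY="fc-..."    # web-crawling service key (hosted free tier exists)
```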
The transcript also emphasizes what makes Deep research feel different from standard chat responses. Rather than “going off to the races,” it asks for additional context when needed, performs iterative searches, and can analyze multiple data types—text, images, and PDFs—described as multimodal. Citations are a core selling point: most claims come with links so users can verify the trail of evidence. The slower pace is treated as a feature, not a bug; the system’s multi-step reasoning and source gathering are presented as closer to how a human researcher would work.
To demonstrate the open-source version, the creator walks through setting up the GitHub project, running Node.js inside a Docker container. The required components are a Firecrawl API key (the hosted free tier includes 500 credits) and an OpenAI API key. After cloning the repo, creating environment variables, and writing the Docker configuration files (a Dockerfile and docker-compose.yml), the system is run from the command line. A sample prompt, “Which animal is better, cats or dogs?”, triggers a research plan with adjustable “breadth” and “depth” parameters, then outputs a markdown report.
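The walkthrough above condenses to a few commands. This is a hedged sketch: the repository URL is a placeholder (the transcript does not give one), and the compose service name and environment variable names are assumptions.

```shell
# Sketch of the setup flow described above; <repo-url> and the
# service name "deep-research" are placeholders, not from the transcript.
git clone <repo-url> deep-research
cd deep-research

# Create the environment file with the two API keys the run expects
# (variable names are assumptions)
printf 'OPENAI_KEY=sk-...\nFIRECRAWL_KEY=fc-...\n' > .env

# Build the Node.js image, then run the research job interactively;
# it prompts for a topic plus breadth/depth, and writes a markdown report
docker compose build
docker compose run --rm deep-research
```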
The results are compared to ChatGPT Deep research: both conclude “dogs are better,” with the open-source output looking more utilitarian (plain command-line text, heavy on links), while ChatGPT offers a more polished interface. A second test asks which operating system is better for IT professionals: Mac, Linux, or Windows, framing the criteria around cost, licensing, and security. The open-source system lands on a hybrid strategy: use each platform’s strengths while applying platform-agnostic security practices.
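One way to see why breadth and depth dominate run time (and API cost) is to count how many search queries a recursive research loop would issue. The sketch below is an assumption about how such a loop might narrow as it recurses; the function name and the breadth-halving rule are illustrative, not taken from the actual implementation.

```python
# Hypothetical sketch: how breadth and depth multiply the number of search
# queries in a recursive deep-research loop. Names and the halving rule are
# illustrative assumptions, not from the actual repo.

def count_queries(breadth: int, depth: int) -> int:
    """Each level issues `breadth` queries; while depth remains, every
    query spawns a narrower follow-up round (breadth halves, depth - 1)."""
    if depth <= 0:
        return 0
    queries = breadth
    if depth > 1:
        queries += breadth * count_queries(max(1, breadth // 2), depth - 1)
    return queries

print(count_queries(4, 2))  # 4 first-level queries + 4 follow-up rounds of 2 = 12
```

Because each extra level multiplies the query count, deeper runs take noticeably longer and burn more API credits, which matches the “slower but more thorough” trade-off described above.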
Finally, the transcript notes a broader trend: Hugging Face is reportedly working on an agentic framework for deep search to compete with OpenAI’s approach, underscoring how quickly the ecosystem is moving toward citation-backed, tool-using research agents.
Cornell Notes
OpenAI’s Deep research is positioned as “research-grade” AI: it takes longer (minutes), performs multi-step web dives, and returns answers with citations and links so users can verify claims. The transcript highlights a major barrier: Deep research access requires ChatGPT Pro at $200/month. An open-source alternative is demonstrated that recreates the workflow locally using an OpenAI API key (pay-as-you-go) plus a web-crawling service (Firecrawl). The setup uses Docker and a Node.js environment, then runs a command-line research job that outputs a markdown report. Example prompts (“cats vs dogs” and “best OS for IT pros”) produce conclusions with adjustable breadth/depth, and the open-source output is compared favorably to ChatGPT’s results, especially for verifiability.
- What makes “Deep research” different from typical AI chat answers?
- Why does the open-source approach matter if it still uses OpenAI?
- What components are required to run the open-source deep research locally?
- How do “breadth” and “depth” affect the research output?
- What do the demo conclusions suggest about the system’s behavior?
- What practical limitations show up when running the open-source version?
Review Questions
- Compare the cost structure of ChatGPT Pro access versus the open-source/local approach using an OpenAI API key. What changes and what stays the same?
- Explain how citations and multimodal processing contribute to trust and usefulness in deep research outputs.
- In the open-source demo, how would you expect changing breadth versus depth to alter the final markdown report?
Key Points
1. Deep research trades speed for depth by running multi-step web dives that can take 5 to 30 minutes.
2. Citation-linked outputs are central to the value proposition, aiming to make AI answers more verifiable than typical chat responses.
3. ChatGPT Pro at $200/month is positioned as the main cost barrier to using OpenAI’s Deep research directly.
4. An open-source implementation can replicate the workflow locally using an OpenAI API key (pay-as-you-go) plus Firecrawl for web retrieval.
5. A Docker-based setup isolates the Node.js environment and helps the research pipeline run reliably across systems.
6. Breadth and depth parameters let users control how widely and how thoroughly the research explores sources.
7. API rate limits can interrupt or slow research runs, affecting both speed and potentially output quality.