
GeminiCLI - The Deep Dive with MCPs

Sam Witteveen · 6 min read

Based on Sam Witteveen's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Gemini CLI’s streaming chat builds work best when streamed output is explicitly treated as Markdown and UI behaviors like autofocus are spelled out clearly.

Briefing

Gemini CLI’s built-in tools and MCP integrations can turn “rough” app scaffolding into a working, deployable project—especially when developers lean on search, file tools, and context MCPs to fix issues in real time. After Google shipped bug fixes and the project’s community surged, the walkthroughs focus on practical workflows: building a streaming Next.js chat UI, debugging common implementation failures, and then wiring external capabilities through MCP servers like DuckDuckGo, Hugging Face, and Context 7.

The first walkthrough starts with a Next.js chat app that streams Gemini responses token-by-token and renders the output as Markdown. Model selection matters for speed and usability; switching to a faster model (described as “Flash”) improves response times while keeping formatting like bold text and tables intact. The deeper value comes from how Gemini CLI handles iterative development mistakes. When a “bad prompt” triggers code generation for a custom chat system (including streaming and Cloud Run deployment), the process repeatedly hits issues—most notably around streaming UI updates, autofocus returning to the input, and Markdown rendering. Fixes often require precise instruction, and the workflow gets more reliable when the model can verify assumptions via tools.
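The walkthrough doesn't show the generated code, but the incremental-update pattern it describes — append each streamed token to the in-progress assistant message, then render the accumulated text as Markdown — can be sketched as follows (all names are illustrative, not from the actual project):

```typescript
// Minimal sketch of the streaming-update pattern: each streamed chunk is
// appended to the last assistant message, and the accumulated string is what
// the UI renders as Markdown. Names are illustrative.

type Message = { role: "user" | "assistant"; content: string };

// Append a streamed chunk to the most recent assistant message,
// creating that message on the first chunk.
function appendChunk(messages: Message[], chunk: string): Message[] {
  const last = messages[messages.length - 1];
  if (last && last.role === "assistant") {
    return [...messages.slice(0, -1), { ...last, content: last.content + chunk }];
  }
  return [...messages, { role: "assistant", content: chunk }];
}

// Simulate three streamed tokens arriving after a user turn.
let messages: Message[] = [{ role: "user", content: "Hi" }];
for (const chunk of ["**Hel", "lo** ", "there"]) {
  messages = appendChunk(messages, chunk);
}
console.log(messages[1].content); // "**Hello** there" — rendered as Markdown in the UI
```

Keeping the accumulation immutable like this matters in React-style UIs: replacing the array (rather than mutating it) is what triggers the re-render on every chunk.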

A key debugging pattern is using Gemini CLI’s internal Google search tool whenever versioning or cutoff-date uncertainty appears. Starting from scratch, the generated Next.js code used an older Next.js version (version 14 was identified), which led to errors; instructing Gemini CLI to search for the latest Next.js version corrected the mismatch. A similar issue occurred with the model choice: the project initially targeted “Gemini Pro” (1.0), then was steered toward “Gemini Flash,” and finally prompted to search for the latest Gemini Flash model name. For long-context work, the walkthrough notes that the 2.5 model stands out when large amounts of retrieved information need to be included.

When streaming seemed broken, the root cause wasn’t the streaming logic at all—it was a low max output tokens setting that truncated responses. The fastest way to diagnose it was to provide a screenshot and ask the system to inspect what was going wrong. After increasing max output tokens, the app began streaming correctly. Markdown display then required an explicit instruction to treat streamed output as Markdown. From there, Gemini CLI could initialize a Git repo, commit changes, install needed packages (with permission prompts), and update a Dockerfile for Cloud Run. The walkthrough also flags a deployment constraint: Gemini CLI can’t read API keys from environment variables directly, so developers must arrange credentials so the tool can access them.
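The fix above amounts to one generation setting. In the Google AI JS SDK the limit lives in `generationConfig.maxOutputTokens`, and a truncated response reports `finishReason: "MAX_TOKENS"`; the concrete values below are illustrative, not from the walkthrough:

```typescript
// Sketch of the fix: the "streaming bug" was actually a generation setting.
// Field names follow the Google AI JS SDK's generationConfig; values are illustrative.
interface GenerationConfig {
  maxOutputTokens?: number;
  temperature?: number;
}

// A low limit truncates long answers mid-stream, which looks like broken streaming:
const truncating: GenerationConfig = { maxOutputTokens: 256 };

// Raising the limit lets full responses stream through:
const fixed: GenerationConfig = { ...truncating, maxOutputTokens: 8192 };

// The API also reports why generation stopped; "MAX_TOKENS" is the telltale
// finishReason for this failure mode.
function looksTruncated(finishReason?: string): boolean {
  return finishReason === "MAX_TOKENS";
}

console.log(fixed.maxOutputTokens, looksTruncated("MAX_TOKENS"));
```

Checking the finish reason first is the cheap version of the screenshot diagnosis: it distinguishes "the stream broke" from "the model was told to stop".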

The second walkthrough demonstrates fetch-style automation and why MCPs help. Direct “web fetch” requires fully qualified URLs and can fail on sites that block Gemini-style downloading. Using DuckDuckGo MCP, the workflow searches for TechCrunch, fetches the five most recent articles, and saves summaries with titles and URLs—avoiding the redirect/URL-handling friction seen with Google-based approaches.
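MCP servers are registered in Gemini CLI's settings file (`.gemini/settings.json`) under an `mcpServers` key. A minimal entry for a DuckDuckGo server might look like the sketch below — the package name and `uvx` launcher are assumptions; check the specific server's README for its actual install command:

```json
{
  "mcpServers": {
    "duckduckgo": {
      "command": "uvx",
      "args": ["duckduckgo-mcp-server"]
    }
  }
}
```

Once registered, the server's search and fetch tools appear alongside Gemini CLI's built-in tools and are invoked with the same permission prompts.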

The third walkthrough adds richer integrations. Hugging Face MCP enables image workflows: an image is "Ghibli-fied" (restyled to look like a Studio Ghibli frame) by sending a Dropbox link, downloading it locally, processing it via Hugging Face Gradio spaces, and returning a WebP file. For development, Context 7 MCP supplies documentation context so an agent can answer Gemini CLI questions without relying on generic Google search. An ADK agent is generated to run in ADK web, uses the Context 7 docs for Gemini CLI answers, and demonstrates that retrieved documentation can produce correct responses (including a factual reference to Gemini CLI's June 25 release). The overall takeaway: MCPs turn Gemini CLI from a code generator into a tool-using development assistant that can retrieve, fetch, transform, and ground answers in external knowledge and docs.
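The "docs-grounded agent" step can be reduced to a simple routing rule: questions about Gemini CLI are answered from the retrieved Context 7 documentation, and only questions with no doc coverage would fall through to a web search tool. A pure-logic sketch of that rule (names are illustrative, not ADK API calls):

```typescript
// Illustrative routing rule for a docs-grounded agent: prefer retrieved
// documentation context over generic web search. Not actual ADK code.
type Answer = { source: "docs" | "web_search"; text: string };

function answerWithDocs(question: string, docsContext: string | null): Answer {
  if (docsContext && docsContext.length > 0) {
    // Ground the answer in the retrieved documentation (Context 7 here).
    return { source: "docs", text: `Per the docs: ${docsContext}` };
  }
  // Fall back to a web search tool only when no docs context exists.
  return { source: "web_search", text: `Searching the web for: ${question}` };
}

console.log(answerWithDocs("When was Gemini CLI released?", "Gemini CLI launched June 25.").source);
// "docs"
```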

Cornell Notes

Gemini CLI becomes far more effective when developers pair its internal tools (streaming, file access, search, web fetch) with MCP servers that supply missing capabilities. In a Next.js streaming chat build, common failures—truncated output, broken Markdown rendering, and UI quirks like autofocus—were resolved by tightening instructions and using search to correct version/model mismatches. A screenshot-based diagnosis revealed that “streaming problems” were caused by a low max output tokens setting. MCPs then enabled reliable content workflows: DuckDuckGo MCP fetched and summarized TechCrunch articles with titles and URLs, while Hugging Face MCP processed images via Gradio spaces. Context 7 MCP grounded an ADK agent in Gemini CLI documentation so it could answer questions without falling back to generic Google search.

What are the most common “it’s not working” issues that show up when building a streaming chat app with Gemini CLI, and how were they fixed?

The walkthrough highlights three recurring categories: (1) streaming UI behavior—tokens arrive incrementally, but the screen must update correctly; (2) interaction polish—after generation, the input should autofocus again; and (3) formatting—streamed text must be rendered as Markdown. The most concrete failure was truncation: responses appeared to stop early, but the cause was a low max output tokens parameter. Increasing max output tokens restored full streaming. Markdown issues were resolved by explicitly instructing the system to display streamed outputs as Markdown. Autofocus and other UI details were fixed by describing the desired behavior clearly enough that the model could implement it in one pass.
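The autofocus fix in category (2) boils down to one rule: refocus the input only on the transition from "streaming" to "idle", not on every render. In React this decision would drive something like `inputRef.current?.focus()`; here it is reduced to a pure function (illustrative, not the walkthrough's code) so the rule itself is easy to see:

```typescript
// Sketch of the autofocus rule the walkthrough had to spell out explicitly:
// refocus the input only when streaming has just finished.
function shouldRefocus(wasStreaming: boolean, isStreaming: boolean): boolean {
  return wasStreaming && !isStreaming;
}

console.log(shouldRefocus(true, false));  // true: generation just finished
console.log(shouldRefocus(true, true));   // false: still streaming tokens
console.log(shouldRefocus(false, false)); // false: nothing changed
```

Spelling out the transition condition in the prompt ("after the response finishes streaming, return focus to the input") is what let the model implement it in one pass.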

Why does Google search inside Gemini CLI matter during development, beyond “finding answers”?

Google search inside Gemini CLI acts like a version and naming verifier. When the generated Next.js project started from an older Next.js version (identified as version 14), errors followed. Prompting Gemini CLI to search for the latest Next.js version corrected the mismatch. A similar issue happened with model selection: the project initially targeted Gemini Pro (1.0), then was steered to Gemini Flash, and finally prompted to search for the latest Gemini Flash model name. The walkthrough also notes that when information might fall outside the model’s cutoff date, search is the reliable way to fetch current details.

How did the walkthrough distinguish between a streaming bug and a parameter bug?

It used a screenshot-based debugging loop. After repeated attempts to “fix streaming,” the system was asked to inspect what was wrong in the responses. That visual inspection showed the output was being cut off and sometimes appearing in the wrong message bubble—pointing away from streaming mechanics and toward generation limits. The diagnosis concluded that max output tokens was set too low; raising that parameter made streaming work properly.

What problem does DuckDuckGo MCP solve compared with web fetch or Google-based fetching?

DuckDuckGo MCP avoids URL-handling and blocking issues that can appear with web fetch and Google-style downloading. In the walkthrough, Google search plus content downloading often didn’t return raw URLs cleanly (redirect URLs were involved), and some sites block Gemini-style page downloads. With DuckDuckGo MCP, the workflow could search for TechCrunch, fetch the five most recent articles, and then save summaries including titles and URLs—without relying on the problematic raw-URL pipeline.

How do Hugging Face MCP and Context 7 MCP differ in purpose?

Hugging Face MCP is used for capability expansion—especially model/space workflows like image transformations. The walkthrough "Ghibli-fied" an image by sending a Dropbox link, downloading it locally, calling the Hugging Face Gradio space, and retrieving the result as a WebP file. Context 7 MCP is used for grounding development and agent behavior in documentation. It provided ADK docs so an agent could answer Gemini CLI questions using the docs context rather than generating its own Google search tool or relying on generic web search.
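The "download it locally" step in the Dropbox flow needs a direct-download URL: Dropbox share links end in `?dl=0` (an HTML preview page), and switching that parameter to `dl=1` serves the raw file. The helper below is illustrative — the walkthrough doesn't show its exact mechanism:

```typescript
// Convert a Dropbox share link (preview page, ?dl=0) into a direct-download
// URL (?dl=1) so the raw image bytes can be fetched and saved locally.
function toDirectDropboxUrl(shareUrl: string): string {
  const url = new URL(shareUrl);
  url.searchParams.set("dl", "1");
  return url.toString();
}

console.log(toDirectDropboxUrl("https://www.dropbox.com/s/abc123/photo.png?dl=0"));
// https://www.dropbox.com/s/abc123/photo.png?dl=1
```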

What’s the practical deployment constraint mentioned when pushing the generated app to Cloud Run?

Gemini CLI can’t read API keys from environment variables directly. The walkthrough advises arranging credentials so the tool can access them during deployment. After local development, it updated the Dockerfile for Cloud Run, but key access needed to be handled explicitly.
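One common way to satisfy this constraint on Cloud Run is to supply the key at deploy time rather than expecting the tool to read it from the local environment. The service name and variable below are placeholders, not from the walkthrough:

```shell
# Pass the API key to the Cloud Run service at deploy time.
# For production, prefer --set-secrets with Secret Manager over plain env vars.
gcloud run deploy chat-app \
  --source . \
  --region us-central1 \
  --set-env-vars "GEMINI_API_KEY=${GEMINI_API_KEY}"
```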

Review Questions

  1. When streaming output appears truncated in Gemini CLI, what parameter should be checked first, and why?
  2. How does internal Google search help resolve version mismatches during code generation?
  3. What role does Context 7 MCP play in preventing an ADK agent from relying on generic Google search?

Key Points

  1. Gemini CLI’s streaming chat builds work best when streamed output is explicitly treated as Markdown and UI behaviors like autofocus are spelled out clearly.
  2. Many “streaming” failures trace back to generation settings like max output tokens rather than token streaming logic itself.
  3. Internal Google search is a practical tool for correcting version and model-name mismatches (e.g., latest Next.js version and latest Gemini Flash model).
  4. Screenshot-based debugging can quickly reveal whether truncation or message-bubble rendering is the real issue.
  5. DuckDuckGo MCP enables reliable “search → fetch → summarize with titles and URLs” workflows even when Google-based pipelines return redirect URLs or sites block downloads.
  6. Hugging Face MCP supports non-coding tasks like image transformation via Gradio spaces, returning files with correct MIME types (e.g., WebP).
  7. Context 7 MCP grounds agent answers in documentation so development agents can avoid falling back to generic web search tools.

Highlights

  • Increasing max output tokens fixed what looked like a streaming bug, restoring full incremental output.
  • Prompting Gemini CLI to search for “latest” versions prevented errors caused by outdated Next.js and incorrect model naming.
  • DuckDuckGo MCP made it straightforward to fetch and summarize TechCrunch articles with both titles and URLs.
  • Hugging Face MCP turned a Dropbox-linked photo into a locally saved Ghibli-style WebP image via a Gradio space workflow.
  • Context 7 MCP enabled an ADK agent to answer Gemini CLI questions using docs context, with correct factual grounding (including the June 25 release date).

Topics

  • Gemini CLI
  • MCP Servers
  • Streaming Chat
  • DuckDuckGo MCP
  • Hugging Face MCP
  • Context 7 MCP
  • ADK Web
  • Cloud Run Deployment
