
Introducing Gemini CLI

Sam Witteveen · 5 min read

Based on Sam Witteveen's video on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.

TL;DR

Gemini CLI turns Gemini Code Assist into a command-line workflow that can edit repos, run commands, and orchestrate tool calls.

Briefing

Google’s Gemini team is rolling out Gemini CLI, a command-line interface that turns Gemini Code Assist into an agent-like workflow for editing files, running commands, and integrating tools—while offering a surprisingly generous free tier. The core pitch is straightforward: developers can log in with a personal Google account, get a free Gemini Code Assist license, and then use Gemini 2.5 Pro with a 1 million token context window at 60 requests per minute and 1,000 requests per day, at no charge. For teams already using command-line LLM tools (and frustrated by how quickly usage limits and billing kick in), that free rate structure is the headline.

Gemini CLI builds on the earlier Gemini Code Assist offering and adds a workflow centered on a repo-local GEMINI.md file that acts like a rules/context document. After installing via npx, users authenticate either with a Google login or with a Gemini API key from Vertex AI or Google AI Studio. Once authorized, the CLI can generate and modify project files directly. In a live walkthrough, it creates a Tailwind-based HTML/JS file, then iterates on a more substantial request: a landing page for a cat café in San Francisco with menu content and an “about” page. The system updates files on the fly, asks for permission before actions (with options like “allow once” or “always allow”), and tracks context usage as the project grows, with the indicator dropping to roughly 98% of context remaining during the expansion.
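
For reference, installation really is a one-liner. The commands below use the package name from Google's public gemini-cli repository; verify against the current docs if anything has moved:

```bash
# One-off run without a global install:
npx @google/gemini-cli

# Or install globally and launch; `gemini` starts the interactive session
# and walks you through Google-account login or API-key entry.
npm install -g @google/gemini-cli
gemini
```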

A key differentiator is tool integration through MCP (Model Context Protocol) servers. The CLI supports adding MCP servers so the model can query external capabilities from the command line. The walkthrough adds the Hugging Face MCP server, then uses it to search for model components like rerankers and to locate Spaces. The results include multiple variants (e.g., ONNX and GGUF formats), and the CLI can also surface model details and paper search. This makes the CLI less about “chatting” and more about orchestrating concrete actions across services.
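
The video doesn't linger on the configuration step, but based on Gemini CLI's documented settings schema, registering a remote MCP server such as Hugging Face's would look roughly like this in ~/.gemini/settings.json (treat the field names and endpoint URL as a sketch to check against current docs):

```json
{
  "mcpServers": {
    "huggingface": {
      "httpUrl": "https://huggingface.co/mcp"
    }
  }
}
```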

The workflow also includes a memory feature that saves user-provided facts into the repo’s GEMINI.md context, so later prompts can reuse details (such as a port number). When the user asks for a Flask backend, the CLI creates directories, sets up a virtual environment, installs dependencies, runs the app, and then reports execution metrics like input/output tokens and other internal token counts. It also supports rollback of actions, giving users a safety valve when automated changes go in the wrong direction.
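
As a rough shell equivalent, the backend setup the CLI automates corresponds to steps like these (the directory name, filename, and port are illustrative, not taken from the video):

```bash
# Approximately what the CLI does when asked for a Flask backend.
mkdir -p backend && cd backend
python3 -m venv .venv              # isolated environment
source .venv/bin/activate
pip install flask                  # dependency install
python app.py                      # run the generated app (app.py is assumed)
```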

Overall, Gemini CLI positions command-line agent workflows—file editing, command execution, and MCP tool access—inside a Google-backed stack with a free tier that’s unusually large for developers who want to experiment without immediately hitting cost ceilings. The practical takeaway: with Gemini CLI, a single authenticated session can generate a working project, integrate external knowledge via MCPs, and keep state in a repo-local rules/context file, all while staying within clearly defined daily and per-minute limits.

Cornell Notes

Gemini CLI brings Gemini Code Assist into a command-line workflow that can generate and modify files, run commands, and integrate external tools. After installing with npx, users authenticate with a personal Google account for a free Gemini Code Assist license, or with a Gemini API key from Vertex AI or Google AI Studio. The free tier supports Gemini 2.5 Pro with a 1 million token context window at 60 requests per minute and 1,000 requests per day. A repo-local GEMINI.md file serves as rules/context, while a memory tool can persist facts into that context. MCP servers—such as Hugging Face’s—extend the CLI so the model can search models, rerankers, Spaces, and more, turning prompts into concrete tool-driven actions.

What makes Gemini CLI meaningfully different from earlier command-line LLM tools?

It combines three capabilities in one workflow: (1) direct repo editing (creating and updating files), (2) command execution with permissions and rollback, and (3) tool integration via MCP servers. In the walkthrough, it generates a Tailwind-based HTML/JS file, then builds a cat café landing page with a menu and an about page, and later sets up a Flask backend by creating directories, making a virtual environment, installing dependencies, and running the app. MCP integration is the bridge to external systems like Hugging Face, letting the model search for rerankers and Spaces rather than relying only on text generation.

How does the free tier work, and what limits does it impose?

Using a personal Google account to obtain a free Gemini Code Assist license enables the full Gemini 2.5 Pro model with a 1 million token context window. The usage limits described are 60 requests per minute and 1,000 requests per day at no charge. For higher rate limits, the CLI can also use API keys from Vertex AI or Google AI Studio.
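
Switching to key-based auth is an environment-variable change. GEMINI_API_KEY is the variable the CLI documents for a Google AI Studio key; confirm the Vertex AI setup in the official docs before relying on it:

```bash
# Use an API key from Google AI Studio instead of the free
# Google-account login to raise the rate limits.
export GEMINI_API_KEY="your-ai-studio-key"
gemini

# Vertex AI is configured through Google Cloud project settings instead;
# the exact variables are listed in the official Gemini CLI documentation.
```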

What role does “Gemini MD” play during a session?

GEMINI.md functions as a repo-local rules/context file. As the project grows, the CLI keeps track of context usage (the walkthrough notes about 98% of context remaining at one point). The system also references GEMINI.md when updating content and when persisting “memory” facts. Later, the user can open GEMINI.md to see what it recorded, such as technologies used (Tailwind) and even where images were sourced from (Unsplash).
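
The video doesn't show the full file, but based on the details it records (Tailwind, Unsplash, a saved port), an illustrative GEMINI.md might look something like this:

```markdown
# Project context (illustrative GEMINI.md)

- Frontend: single-page HTML/JS styled with Tailwind CSS
- Images are sourced from Unsplash
- Backend: a Flask app; run it on the port saved via the memory tool
```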

How does memory persistence work in Gemini CLI?

The CLI includes a memory tool that saves user-provided facts into the repo’s context. In the walkthrough, the user saves a fact (including a port-related detail), and later the system uses that information when generating and configuring the backend. The exact storage location isn’t shown on screen, but the effect is clear: the saved fact ends up in the GEMINI.md context and influences subsequent actions.
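
In the current CLI this maps to the /memory slash commands; the fact below is an illustrative paraphrase, since the video's exact wording isn't quoted here:

```bash
# Inside an interactive gemini session:
> /memory add "The backend server should run on port 5000"   # port is illustrative
> /memory show    # prints the facts persisted into the GEMINI.md context
```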

How do MCP servers extend what Gemini CLI can do?

MCP servers add tool capabilities the model can call from the command line. The walkthrough loads the Hugging Face MCP server, which exposes functions like model search, Space search, and image generation. The user then asks for rerankers and receives a list of candidates, including variants in formats such as ONNX and GGUF. The user also searches for a specific type of Space, and the CLI returns the matching Space entry.
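
A quick way to see what a configured server exposes is the /mcp slash command, which lists servers and their tools; the prompt below is an illustrative paraphrase of the walkthrough's query:

```bash
# Inside a gemini session:
> /mcp    # lists configured MCP servers (e.g., huggingface) and their tools
> search Hugging Face for reranker models, including ONNX and GGUF variants
```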

What does the CLI report after running tasks like a Flask app setup?

After executing steps (creating a virtual environment, installing packages, running the app), the CLI shows execution details and metrics. The walkthrough highlights token accounting—input tokens, output tokens, and additional internal token categories—along with a duration figure that appears longer than the observed wall-clock time. It also demonstrates that actions can be rolled back if needed.

Review Questions

  1. What specific combination of features (file editing, command execution, MCP tool access) does Gemini CLI provide, and why does that matter for building real projects?
  2. How do the free-tier limits (requests per minute/day) and the 1 million context window shape how you’d plan experiments with Gemini CLI?
  3. When using MCP servers like Hugging Face, what kinds of queries can be answered through tool calls rather than pure text generation?

Key Points

  1. Gemini CLI turns Gemini Code Assist into a command-line workflow that can edit repos, run commands, and orchestrate tool calls.

  2. A free Gemini Code Assist license is available via personal Google account login, enabling Gemini 2.5 Pro with a 1 million token context window.

  3. The free tier limits usage to 60 requests per minute and 1,000 requests per day, with higher limits possible via Vertex AI or Google AI Studio keys.

  4. A repo-local GEMINI.md file acts as persistent rules/context, and the CLI tracks context usage as projects expand.

  5. A memory tool can persist user-provided facts into GEMINI.md so later prompts reuse them (e.g., configuration details like ports).

  6. MCP servers extend capabilities at the command line; the walkthrough uses the Hugging Face MCP server for model and Space search.

  7. Automated actions require permission and can be rolled back, reducing risk when the CLI makes unwanted changes.

Highlights

Gemini CLI’s free tier pairs Gemini 2.5 Pro and a 1 million token context window with 60 requests per minute and 1,000 requests per day—an unusually large allowance for hands-on development.
The GEMINI.md file functions as a persistent rules/context layer for the repo, and the memory tool can inject new facts into it for later reuse.
MCP integration lets the CLI query external systems like Hugging Face—returning concrete model and Space results rather than only generated text.
The walkthrough demonstrates end-to-end automation: generate a UI, update content, then build and run a Flask backend with environment setup and dependency installation.
