Introducing Gemini CLI
Based on Sam Witteveen's video on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.
Gemini CLI turns Gemini Code Assist into a command-line workflow that can edit repos, run commands, and orchestrate tool calls.
Briefing
Google's Gemini team is rolling out Gemini CLI, a command-line interface that turns Gemini Code Assist into an agent-like workflow for editing files, running commands, and integrating tools, while offering a surprisingly generous free tier. The core pitch is straightforward: developers can log in with a personal Google account, get a free Gemini Code Assist license, and then use Gemini 2.5 Pro with a 1 million-token context window at 60 requests per minute and 1,000 requests per day at no charge. For teams already using command-line LLM tools (and frustrated by how quickly usage-based billing piles up), that free rate structure is the headline.
Gemini CLI builds on the earlier Gemini Code Assist offering and adds a workflow centered on a repo-local GEMINI.md file that acts as a rules/context document. After installing via npx, users authenticate either with a Google login or with a Gemini API key from Google AI Studio or Vertex AI. Once authorized, the CLI can generate and modify project files directly. In a live walkthrough, it creates a Tailwind-based HTML/JS file, then iterates on a more substantial request: a landing page for a cat café in San Francisco with menu content and an "about" page. The system updates files on the fly, asks permission for actions (with options like "allow once" or "always allow"), and tracks context usage as the project grows, dropping to roughly 98% remaining context during the expansion.
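A minimal first session might look like the sketch below. The npx invocation, package name, and `GEMINI_API_KEY` variable follow the public Gemini CLI README, but verify them against your installed version; the prompt text is illustrative.

```bash
# Run the CLI without a global install
# (alternatively: npm install -g @google/gemini-cli, then run `gemini`)
npx https://github.com/google-gemini/gemini-cli

# Option A: on first run, choose the Google login flow and authorize in the browser.
# Option B: export an API key from Google AI Studio or Vertex AI instead:
export GEMINI_API_KEY="YOUR_API_KEY"

# Inside the interactive session, a plain prompt drives file generation, e.g.:
# > Create a Tailwind-based landing page for a cat café in San Francisco,
# > with menu content and an about page.
```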
A key differentiator is tool integration through MCP (Model Context Protocol) servers. The CLI supports adding MCP servers so the model can query external capabilities at the command-line level. The walkthrough adds the Hugging Face MCP server, then uses it to search for model components like rerankers and to locate Spaces. The results include multiple variants (e.g., ONNX and GGUF builds), and the CLI can also surface model details and paper search. This makes the CLI less about "chatting" and more about orchestrating concrete actions across services.
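Gemini CLI reads MCP server definitions from a settings file. Below is a hedged sketch of wiring up the Hugging Face server, assuming the project-local `.gemini/settings.json` location and the `httpUrl` transport field from the current settings schema; the endpoint URL is an assumption to verify against the Hugging Face docs.

```bash
# Hedged sketch: register a remote MCP server in the project-local settings file.
mkdir -p .gemini
cat > .gemini/settings.json <<'EOF'
{
  "mcpServers": {
    "huggingface": {
      "httpUrl": "https://huggingface.co/mcp"
    }
  }
}
EOF

# Restart the CLI, then check that the server and its tools are registered:
# > /mcp
# Prompts can now resolve into tool calls instead of pure text generation, e.g.:
# > Search Hugging Face for reranker models and list ONNX and GGUF variants.
```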
The workflow also includes a memory feature that saves user-provided facts into the repo's GEMINI.md context, so later prompts can reuse details (such as a port number). When the user asks for a Flask backend, the CLI creates directories, installs dependencies, sets up a virtual environment, runs the app, and then reports execution metrics such as input/output token counts and other internal token usage. It also supports rollback of actions, giving users a safety valve when automated changes go in the wrong direction.
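These features map onto in-session slash commands. A minimal sketch, assuming the `/memory`, `/stats`, and checkpoint-based `/restore` commands present in current builds; the port number is a hypothetical example, not a value from the walkthrough.

```bash
# Persist a fact into the GEMINI.md-backed context (port value is hypothetical):
# > /memory add The Flask backend runs on port 5123.
# > /memory show        # inspect what has been saved

# After a task such as scaffolding and running the Flask app, report token usage:
# > /stats

# Rollback depends on checkpointing; start the CLI with it enabled:
gemini --checkpointing
# > /restore            # list checkpoints, then restore one to undo file edits
```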
Overall, Gemini CLI positions command-line agent workflows—file editing, command execution, and MCP tool access—inside a Google-backed stack with a free tier that’s unusually large for developers who want to experiment without immediately hitting cost ceilings. The practical takeaway: with Gemini CLI, a single authenticated session can generate a working project, integrate external knowledge via MCPs, and keep state in a repo-local rules/context file, all while staying within clearly defined daily and per-minute limits.
Cornell Notes
Gemini CLI brings Gemini Code Assist into a command-line workflow that can generate and modify files, run commands, and integrate external tools. After installing with npx, users authenticate with a personal Google account for a free Gemini Code Assist license, or with a Gemini API key from Google AI Studio or Vertex AI. The free tier supports Gemini 2.5 Pro with a 1 million-token context window plus 60 requests per minute and 1,000 requests per day. A repo-local GEMINI.md file serves as rules/context, while a memory tool can persist facts into that context. MCP servers, such as Hugging Face's, extend the CLI so the model can search models, rerankers, Spaces, and more, turning prompts into concrete tool-driven actions.
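As a concrete illustration, a repo-local GEMINI.md can also be seeded by hand before the first session; the contents below are a hypothetical sketch, not taken from the walkthrough.

```bash
# Seed a repo-local GEMINI.md with project rules/context (illustrative content):
cat > GEMINI.md <<'EOF'
## Project
Cat café landing page: Tailwind HTML/JS frontend, Flask backend.

## Rules
- Use Tailwind utility classes; keep markup in a single HTML/JS file per page.
- Link the menu and about pages from index.html.
EOF

# The CLI loads GEMINI.md into context automatically at session start.
```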
What makes Gemini CLI meaningfully different from earlier command-line LLM tools?
How does the free tier work, and what limits does it impose?
What role does GEMINI.md play during a session?
How does memory persistence work in Gemini CLI?
How do MCP servers extend what Gemini CLI can do?
What does the CLI report after running tasks like a Flask app setup?
Review Questions
- What specific combination of features (file editing, command execution, MCP tool access) does Gemini CLI provide, and why does that matter for building real projects?
- How do the free-tier limits (requests per minute/day) and the 1 million-token context window shape how you'd plan experiments with Gemini CLI?
- When using MCP servers like Hugging Face, what kinds of queries can be answered through tool calls rather than pure text generation?
Key Points
1. Gemini CLI turns Gemini Code Assist into a command-line workflow that can edit repos, run commands, and orchestrate tool calls.
2. A free Gemini Code Assist license is available via personal Google account login, enabling Gemini 2.5 Pro with a 1 million-token context window.
3. The free tier limits usage to 60 requests per minute and 1,000 requests per day, with higher limits available via Google AI Studio or Vertex AI keys.
4. A repo-local GEMINI.md file acts as persistent rules/context, and the CLI tracks context usage as projects expand.
5. A memory tool can persist user-provided facts into GEMINI.md so later prompts reuse them (e.g., configuration details like ports).
6. MCP servers extend capabilities at the command-line level; the walkthrough uses the Hugging Face MCP server for model and Space search.
7. Automated actions require permission and can be rolled back, reducing risk when the CLI makes unwanted changes.