
Gemini CLI - FREE? Claude Code by Google | First Look and NextJS RAG App Test

Venelin Valkov · 5 min read

Based on Venelin Valkov's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Gemini CLI is a free, open-source terminal workflow for Google’s Gemini Code Assist, designed to bring Gemini into developer environments.

Briefing

Gemini CLI lands as a free, open-source “developer-terminal” layer for Google’s Gemini Code Assist, pairing a ChatGPT-like coding workflow with a generous usage tier and built-in tooling such as MCP support. In a hands-on build, it quickly scaffolds a NextJS + TypeScript RAG-style app that can upload documents, index them for retrieval, and answer questions grounded in the uploaded text—despite rough edges around speed, configuration, and styling.

The core pitch is straightforward: Gemini CLI brings Gemini directly into a terminal experience, positioned as a competitor to tools like Anthropic’s Claude Code. Google’s free tier is described as 60 model requests per minute and 1,000 model requests per day, with the project released as open source. Under the hood, Gemini CLI uses Gemini Code Assist, which—on the free plan—collects prompts, related code, generated output, code edits, and usage/feedback data to improve Google products. Users can opt out via the Gemini Code Assist for individuals setup.

Installation is presented as simple: install the required packages and point to the Gemini CLI GitHub repository. Once authenticated, the interface resembles familiar coding assistants: it detects a GEMINI.md file, shows available “magic commands,” and runs tools such as Google Search to pull documentation. The workflow also includes agent memory management (showing current context from GEMINI.md files), usage stats (input/output tokens and duration), and MCP enablement—meaning the CLI can integrate with external tool servers.
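MCP servers for Gemini CLI are declared in its JSON settings file. A minimal sketch of what such an entry can look like (the server name and package below are hypothetical examples, and the settings file typically lives at `~/.gemini/settings.json` or in a project-level `.gemini/` directory):

```json
{
  "mcpServers": {
    "docs-tools": {
      "command": "npx",
      "args": ["-y", "@example/docs-mcp-server"]
    }
  }
}
```

Each named server is launched as a subprocess, and its tools then become callable from the CLI session alongside built-ins like Google Search.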

In the build demo, the user targets a document-chat application: a RAG system designed to be easy to deploy, using Ollama for local model serving and a NextJS/React/TypeScript stack with Tailwind CSS and Shadcn UI. The CLI produces a step-by-step implementation plan, then executes it: creating package.json, scaffolding the NextJS structure, generating TypeScript config (including editor support for VS Code and Cursor), adding Tailwind configuration, and wiring up file upload plus chat logic.
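For orientation, a package.json for the stack described above might look roughly like this (package versions and script names are illustrative, not taken from the demo):

```json
{
  "name": "document-chat",
  "private": true,
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start"
  },
  "dependencies": {
    "next": "^15.0.0",
    "react": "^19.0.0",
    "react-dom": "^19.0.0",
    "formidable": "^3.5.0",
    "tailwindcss": "^4.0.0"
  },
  "devDependencies": {
    "typescript": "^5.0.0",
    "@types/react": "^19.0.0",
    "@types/node": "^22.0.0"
  }
}
```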

The session isn’t smooth. The CLI hits errors around missing imports (notably Formidable), and Tailwind configuration issues lead to a running app with little to no styling. There are also signs of throttling and model switching—responses slow down and the system shifts from “Gemini 2.5 Pro” to “Gemini 2.5 Flash,” which the narrator attributes to free-tier constraints. Even so, the application comes up and functions: a text file uploads successfully to the project’s uploads/public area, and the chat answers a question using the uploaded document. The demo response frames the answer as grounded in the provided text, showing the RAG loop working end-to-end.

By the end, the main takeaway is pragmatic. Gemini CLI can generate a working RAG web app quickly, with MCP and tool-use capabilities already integrated. But the experience still needs polish: configuration alignment with Tailwind 4, fewer abrupt model switches, faster execution, and more reliable build/test steps would make the workflow more dependable. The demo also flags confusing context/token accounting (high “context left” despite heavy usage) and a long session time for the amount of work completed.

Cornell Notes

Gemini CLI is a free, open-source terminal tool that brings Google’s Gemini Code Assist into a developer workflow, including tool-use and MCP integration. In a hands-on test, it scaffolds a NextJS + TypeScript RAG-style app that supports uploading documents and chatting with answers grounded in the uploaded text, using Ollama for model inference. The build largely succeeds, but the session shows friction: throttling and model switching, errors around Formidable imports, and Tailwind configuration problems that leave the UI mostly unstyled. Despite those issues, the app runs and the upload/chat loop works, making Gemini CLI a promising starting point for document-grounded web apps.

What makes Gemini CLI different from typical coding assistants?

Gemini CLI is positioned as Gemini directly inside developers’ terminals, built on Gemini Code Assist. It includes “magic commands” for agent memory, usage stats, and MCP tool integration, and it can call tools like Google Search to fetch documentation. In the demo, it also manages context from a Gemini MD file and tracks context usage during the session.

How does the demo app implement a RAG-style document chat?

The app is built around two core capabilities: file upload and chat grounded in the uploaded content. The CLI generates an upload API that saves a text file into the project’s uploads/public area, then wires chat logic that queries the model (via Ollama) and uses the provided document as the basis for answers. When asked “What is the truth about AI knowledge?”, the response explicitly references the uploaded blog text.
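The grounding step can be sketched as a small helper that stuffs the uploaded document into the prompt before querying Ollama’s chat endpoint (`POST /api/chat` with `stream: false` is Ollama’s standard REST shape; the model name and prompt wording here are assumptions, not the demo’s actual code):

```typescript
// Build a prompt that instructs the model to answer only from the document.
function buildGroundedPrompt(docText: string, question: string): string {
  return [
    "Answer the question using only the document below.",
    "If the document does not contain the answer, say so.",
    "",
    "--- DOCUMENT ---",
    docText,
    "--- END DOCUMENT ---",
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Query a locally running Ollama server (default port 11434).
async function askOllama(docText: string, question: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.2", // hypothetical model name; the video does not specify one
      stream: false,
      messages: [{ role: "user", content: buildGroundedPrompt(docText, question) }],
    }),
  });
  const data = await res.json();
  return data.message.content; // non-streaming responses carry the reply here
}
```

This keeps the “retrieval” trivially simple (the whole document goes into the prompt), which matches the single-file demo; a larger corpus would need chunking and an embedding index.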

What stack choices did the demo use, and why do they matter?

The target stack is NextJS, React, TypeScript, Tailwind CSS, and Shadcn UI, with Ollama powering the model layer. This combination matters because it determines both the UI layer (Tailwind/Shadcn) and the inference layer (Ollama). The demo’s styling issues trace back to Tailwind configuration mismatches, while the core functionality depends on the Ollama chat integration.

What went wrong during the build, and what does it reveal about reliability?

Several issues surfaced: missing/failed imports around Formidable, Tailwind configuration problems that resulted in an app with little styling, and slowdowns that triggered a switch from Gemini 2.5 Pro to Gemini 2.5 Flash. These problems suggest that while the CLI can generate code quickly, it may need stronger build/test loops and better alignment with newer library versions to reduce breakage.
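One plausible source of the unstyled-UI failure mode: Tailwind 4 moved configuration into CSS, so a generated Tailwind 3-style setup (a `tailwind.config.js` with `content` globs plus `@tailwind base/components/utilities` directives) can silently produce no utility classes against a v4 install. A minimal Tailwind 4 entry stylesheet looks like this (sketched from Tailwind’s v4 conventions, not the demo’s files):

```css
/* app/globals.css — in Tailwind 4, this single import replaces the old
   @tailwind directives and most of tailwind.config.js */
@import "tailwindcss";
```

If the CLI emits the older config shape against a newer install, the app still builds and runs, which matches the demo’s symptom of a working but nearly blank UI.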

How do usage and context metrics behave in the session?

The CLI provides stats such as input tokens, output tokens, and duration, plus a “context left” indicator. The narrator notes inconsistencies—context left appears very high (e.g., 97%) despite significant activity, and the session time is long (around 30 minutes). This points to either throttling effects, metric interpretation issues, or context accounting quirks.

What privacy implication comes with using the free tier?

For Gemini Code Assist for individuals (the free plan), Google collects prompts, related code, generated output, code edits, related feature usage information, and feedback to improve products and machine learning technologies. The transcript notes an opt-out path via the Gemini Code Assist for individuals setup page.

Review Questions

  1. What specific terminal features (tools, commands, integrations) did Gemini CLI provide beyond basic chat?
  2. Which parts of the RAG app were generated successfully, and which parts failed or degraded (e.g., styling)?
  3. How did throttling/model switching affect the development workflow during the demo?

Key Points

  1. Gemini CLI is a free, open-source terminal workflow for Google’s Gemini Code Assist, designed to bring Gemini into developer environments.
  2. The free tier is described as 60 model requests per minute and 1,000 model requests per day, with an opt-out available for Code Assist data collection.
  3. Gemini CLI includes built-in “magic commands” for memory, stats, and MCP-enabled tool integration, including tool calls like Google Search.
  4. A NextJS + TypeScript document-chat (RAG-style) app can be scaffolded end-to-end: upload documents, then answer questions grounded in the uploaded text.
  5. The demo succeeded functionally but struggled with reliability: Formidable import/build errors and Tailwind configuration mismatches left the UI largely unstyled.
  6. Execution speed and model behavior were unstable under free-tier constraints, including a switch from Gemini 2.5 Pro to Gemini 2.5 Flash.
  7. Token/context and session-duration metrics appeared inconsistent, suggesting either accounting quirks or throttling-related delays.

Highlights

Gemini CLI can generate a working document-upload + chat experience in a NextJS app, with answers grounded in the uploaded text.
MCP enablement and tool-use (including Google Search) appear directly inside the terminal workflow, not as a separate setup.
Tailwind configuration mismatches (especially around Tailwind 4 expectations) can leave the UI nearly blank even when the app runs.
Free-tier constraints showed up as throttling and model switching, slowing the build and changing response behavior mid-session.
