Gemini 1.5 Pro for Code - Part 01
Based on Sam Witteveen's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Gemini 1.5 Pro for Code can ingest a real GitHub-style repository, then generate working multi-agent Python code built on the repo’s own abstractions—first using OpenAI by default, then switching to Gemini models, and finally adding external tools like DuckDuckGo search. The practical takeaway: code-focused prompting plus repository context can produce end-to-end prototypes (agents, tasks, and tool calls) with relatively little manual wiring, even if some integration details still need human correction.
The workflow starts by uploading the crewAI repository and selecting the most relevant parts: the source code and the documentation markdown files, while skipping tests. Roughly 35,000–37,000 tokens are included, and the prompt asks Gemini to summarize what crewAI does and what it’s built with. Gemini identifies the stack as Python-based and points to Pydantic and LangChain, along with OpenAI as the default LLM integration. It also produces a concrete list of pip packages needed for a Colab setup (including crewAI and LangChain-related dependencies), which can be copied directly into a notebook.
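The exact package list Gemini produced is not reproduced in this summary, so the following Colab-style install cell is only a sketch; the package names are assumptions based on the dependencies the text mentions (crewAI, LangChain, and the later DuckDuckGo tool):

```shell
# Hypothetical Colab install cell -- package names are illustrative,
# not the exact list Gemini generated.
pip install -q crewai langchain langchain-openai duckduckgo-search
```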
With dependencies installed, Gemini generates a simple two-agent bot: one agent acts as a hotel-chain customer seeking value for money, and the other plays a salesperson selling air conditioners. The code uses crewAI’s Agent/Task/Crew constructs and runs quickly because the prompt context stays around the 40k-token range. The resulting interaction includes a dialogue-like exchange where the salesperson asks for specifics (e.g., number of rooms and floors) and the system produces a final set of options and requirements. The run also reveals small quirks—like gender-neutral phrasing when the prompt specifies a “saleswoman.”
Next comes model switching. Gemini is asked to modify the same crewAI code to use a Gemini model via Google’s generative AI integration. The generated version largely keeps the structure but introduces integration mistakes—such as an incorrect package name and import path—requiring manual adjustment to use langchain_google_genai and ChatGoogleGenerativeAI. After fixing those details, the bot runs successfully on Gemini, and the outputs differ from the OpenAI version.
Finally, Gemini is pushed to include tool use: a search agent gathers recent AI-release news via DuckDuckGo, and a second agent rewrites it from an “AI doomer” perspective. The first attempt fails because required agent fields (notably backstory) are missing; once backstories are added, the tool-enabled pipeline works. It retrieves items including OpenAI’s Sora and Google-related generative AI and personalization news, then produces a suitably sensationalized rewrite.
Overall, the results suggest a strong pattern: repository source code alone can be enough for Gemini to infer key classes, inputs, and outputs, enabling code understanding and downstream generation (tests, additional agents, and possibly docs—though documentation quality may require more guidance).
Cornell Notes
Gemini 1.5 Pro for Code can take a repository’s source code and docs, then generate runnable crewAI multi-agent Python programs. It first produces a two-agent customer/salesperson bot using crewAI with OpenAI as the default LLM, including task definitions and agent dialogue. When asked to switch from OpenAI to Gemini, it mostly preserves the structure but may output incorrect package/import details that require manual fixes (e.g., using langchain_google_genai and ChatGoogleGenerativeAI). Adding tool use works too: a DuckDuckGo search agent can fetch recent AI-release info, and a second agent can rewrite it in a specified “AI doomer” tone, though missing required fields like backstory can cause runtime errors. This matters because it enables rapid prototyping with less manual scaffolding, while still benefiting from developer oversight.
- How does Gemini use repository context to generate code, and what parts of the repo matter most?
- What does the first generated crewAI example do, and what structure does it use?
- What changes when switching from OpenAI to Gemini, and why is manual correction needed?
- How does tool use work in the multi-agent setup, and what tool was used here?
- What caused the tool-enabled example to fail initially, and how was it resolved?
- What’s the practical lesson about using source code vs. docs for code understanding?
Review Questions
- When Gemini switches LLM providers, which integration details are most likely to require developer correction (packages, imports, or prompt structure)?
- In the two-agent air-conditioner example, which task prompts drive the salesperson to ask for specific hotel requirements?
- For the DuckDuckGo tool pipeline, what minimum agent information must be present to avoid runtime errors (e.g., backstory), and why does that matter?
Key Points
1. Gemini can ingest a repository and generate runnable crewAI multi-agent code using the repo’s structure and docs as context.
2. A two-agent customer/salesperson workflow can be produced with crewAI’s Agent/Task/Crew abstractions and executed with minimal manual changes.
3. Switching from OpenAI to Gemini often requires fixing dependency names and import paths (e.g., using langchain_google_genai and ChatGoogleGenerativeAI).
4. Tool-enabled agent chains can fetch external information via DuckDuckGo and feed it into a second agent for rewriting.
5. Missing required agent fields like backstory can break execution, so generated code still needs validation.
6. Source code-only context can help Gemini identify key classes and I/O patterns, but doc generation may need extra guidance to be reliable.