How to Run OpenCode Inside an Autonomous Claude Code AI Agent
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
An autonomous Claude Code agent can now run OpenCode via a simple CLI command, swap in different OpenRouter models, and generate side-by-side benchmark videos automatically. The practical payoff: one prompt can trigger multiple model runs in parallel, producing HTML outputs that get converted into a grid-style MP4 and then packaged for an X post, turning model comparison into a repeatable workflow rather than a manual testing chore.
The setup starts by pulling OpenCode’s CLI documentation and feeding it into Claude Code so it can discover the correct run subcommand and model flag. The key command format that emerges is roughly `opencode run --model <provider>/<model> "<prompt>"`, with the provider/model naming matching OpenRouter’s identifiers. With an example prompt asking whether someone should “walk or drive to the car wash” 50 meters away, the workflow successfully returns a model-specific answer. Testing across models shows the behavior changes as expected: one model recommends walking, while another recommends driving, aligning with the “can’t wash the car if you leave it behind” logic.
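For concreteness, a minimal shell sketch of that single-model invocation might look like the following; the OpenRouter model slug is illustrative, and the exact flag spelling should be verified against OpenCode’s own CLI docs rather than taken from this summary.

```bash
# Minimal sketch: ask one OpenRouter-hosted model through OpenCode's run command.
# The model slug below is illustrative, not the exact one used in the video.
opencode run --model openrouter/google/gemini-3-pro \
  "The car wash is 50 meters away. Should I walk or drive there?"
```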
Once the CLI invocation works, the workflow is converted into a reusable Claude Code “skill.” The skill is designed for parallel execution: run the same prompt against multiple OpenRouter models, capture each output, and save the results into a consistent folder structure. That structure becomes crucial for the next step: benchmarking creative generation.
For the creative benchmark, the prompt instructs the system to generate a single full-screen animated retro arcade “space battle” scene in HTML5. The agent saves each run as a model-labeled HTML file (for example, game_<model>.html) inside a dedicated experiment directory. The transcript then demonstrates running four models simultaneously—GLM5, Minimax 2.5, Gemini 3 Pro, and Opus 4.6—so the comparison happens quickly and consistently under identical prompt conditions.
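A hedged sketch of what that parallel skill could boil down to is shown below; the model slugs, prompt wording, and directory names are illustrative stand-ins for the four models named above, not commands taken verbatim from the video.

```bash
#!/usr/bin/env bash
# Sketch of the parallel benchmark: same creative prompt, one OpenCode run per
# model, each asked to save its result as game_<model>.html in one experiment dir.
# Model slugs are illustrative placeholders for the four models used in the video.
EXPERIMENT_DIR="experiments/retro-space-battle"
mkdir -p "$EXPERIMENT_DIR"

MODELS=(
  "openrouter/z-ai/glm-5"
  "openrouter/minimax/minimax-2.5"
  "openrouter/google/gemini-3-pro"
  "openrouter/anthropic/claude-opus-4.6"
)

for MODEL in "${MODELS[@]}"; do
  LABEL=$(basename "$MODEL")   # e.g. glm-5 -> game_glm-5.html
  opencode run --model "$MODEL" \
    "Create a single full-screen animated retro arcade space battle scene in HTML5 and save it to ${EXPERIMENT_DIR}/game_${LABEL}.html" &
done
wait  # block until all four runs have finished before rendering the video
```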
After the HTML files land, a Remotion-based skill turns them into a single grid-style video. The result is a side-by-side visual comparison where each panel is labeled by model name, making differences in animation style and scene composition easy to spot at a glance. The workflow is then consolidated into one pipeline that produces an MP4 for the “retro space battle benchmark.”
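Assuming the Remotion skill is an ordinary Remotion project with a grid composition, the final render step could be as small as the call below; the composition ID (“ModelGrid”), the props, and the output path are hypothetical, since the summary does not show the skill’s internals.

```bash
# Hypothetical render call: "ModelGrid" is an assumed composition ID, and the
# props simply point the composition at the experiment directory of HTML files.
npx remotion render ModelGrid out/retro-space-battle-benchmark.mp4 \
  --props='{"htmlDir":"experiments/retro-space-battle"}'
```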
The final automation step prepares social sharing: the agent creates an X draft by attaching the MP4 and generating a caption that summarizes the experiment—“four LLMs” given the same prompt and building the same HTML5 demo, with results shown side by side. The overall message is less about any single model’s quality and more about building an agent skill that can repeatedly generate, render, and package model comparisons—ready for ongoing testing as new models appear.
Cornell Notes
The workflow builds an autonomous testing skill that runs OpenCode from Claude Code using a CLI-style command discovered from OpenCode’s documentation. With OpenRouter, the same prompt can be executed across multiple models in parallel (example models include GLM5, Minimax 2.5, Gemini 3 Pro, and Opus 4.6), and each output is saved as a model-specific HTML file. Those HTML files are then converted into a single grid-style MP4 using a Remotion skill, enabling quick visual side-by-side benchmarking. The pipeline ends by preparing an X draft that attaches the MP4 and generates a caption describing the comparison. This matters because it turns model evaluation, especially creative HTML generation, into a repeatable, automated routine.
How does Claude Code learn to run OpenCode from the command line?
What command structure enables model switching through OpenRouter?
How is the benchmark prompt turned into a parallelizable experiment?
Why does saving model-specific HTML files matter for the video comparison?
What does the pipeline automate at the end for sharing?
Review Questions
- What is the role of OpenCode’s CLI documentation in building the Claude Code skill?
- How does the workflow ensure that different models are compared fairly?
- Describe the sequence from model execution to MP4 creation and then to an X draft.
Key Points
1. Claude Code can run OpenCode via a CLI-style command discovered from OpenCode’s documentation, enabling automated model execution.
2. OpenRouter model/provider switching lets the same prompt produce different outputs across multiple LLMs.
3. A reusable Claude Code skill supports parallel runs, saving each model’s output as a model-labeled HTML file for later processing.
4. Remotion converts the set of generated HTML files into a single grid-style MP4, making creative comparisons visually consistent.
5. The pipeline can package results for social sharing by generating an X draft with the MP4 and an experiment caption.
6. The workflow is designed for repeatable benchmarking, so new models can be added to the parallel list without rebuilding the process.