My Most INSANE AI Agent Ever (OpenClaw Clone)

TL;DR

A WhatsApp command can trigger Claude Code to browse X, gather context, and return results in chat.


Briefing

An AI agent built around a dedicated Mac mini can research topics on social platforms, run scheduled tasks autonomously, and even generate and publish videos, all without relying on API keys. The workflow starts with a WhatsApp command that triggers Claude Code to navigate sites like X, gather context by opening individual posts, and send its findings back in chat. In the demonstration, the agent searches for “Maltbook,” reads what is being discussed, and returns a message summarizing what it found. This positions the system as a practical “do the browsing for me” layer rather than a purely conversational chatbot.

The core design choice is operational: the agent runs inside a browser session where accounts are already logged in (YouTube, X, LinkedIn, Gmail, and more). Instead of storing and using API keys, the system leverages authenticated web access, which the creator frames as both more secure and easier to keep running. Claude Code runs continuously on the Mac mini, tied to a $200 subscription, so scheduled and on-demand actions can happen without manual supervision.

Beyond research, the agent adds a scheduling layer using cron jobs. The system can automatically open sites such as Hacker News, check the top post, navigate into it, and produce summaries—then repeat on a daily cadence. A live test schedules a job for 15:46 (about a minute later), and the machine proceeds without direct control, demonstrating that the agent can operate like a background newsroom: collect, summarize, and potentially post or email results while the user is away. Jobs can also be removed from the queue to stop recurring behavior.
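
As a rough illustration of the scheduling layer, the sketch below builds a standard crontab entry for the 15:46 test. The wrapper script path `/path/to/run_agent.sh` and the task wording are hypothetical. The source does not show how the agent is actually launched from cron, so treat this as one possible shape rather than the creator's setup.

```python
import shlex


def cron_line(hour: int, minute: int, command: str) -> str:
    """Return a crontab entry that runs `command` every day at hour:minute."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Crontab field order: minute, hour, day of month, month, day of week.
    return f"{minute} {hour} * * * {command}"


# Hypothetical wrapper script and task text; neither appears in the source.
task = "Open Hacker News, open the top post, and write a summary"
entry = cron_line(15, 46, "/path/to/run_agent.sh " + shlex.quote(task))
print(entry)
```

Because the day, month, and weekday fields are all `*`, this entry fires every day at 15:46 until it is deleted from the crontab, which matches the point that a job has to be removed to stop the recurring behavior.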

The most ambitious capability is video production. The agent is equipped with a skill system that includes X skills, video research, video editing (using Remotion), thumbnail creation, and YouTube publishing. In an example workflow, the user provides a URL to a trending Google project (“Project Genie”), then instructs the agent to plan a story-driven video: use the main clips as silent background, add voiceover, and insert additional “spots” from clips. The agent then runs an autonomous pipeline (researching, drafting a plan, downloading or reusing clips, assembling scenes, and stitching the final output) and produces a near-finished video.

The resulting video narration focuses on Project Genie’s “world sketching,” where users describe a world in plain text and then “walk through it,” including steps like choosing a character and fine-tuning the world. After generating content, the system can extend into distribution: the creator describes posting to YouTube and then sharing outward to X, with the broader goal of automating social media workflows.

Overall, the project’s pitch is less about a single flashy model and more about an end-to-end agent stack: authenticated browsing for context, cron-based automation for routine tasks, and a Remotion-backed skill pipeline for producing and publishing videos. The stack is meant to grow by adding skills over time, including more complex actions like applying for jobs.

Cornell Notes

The system centers on an AI agent running on a dedicated Mac mini that can act on the web through logged-in browser accounts rather than API keys. A WhatsApp command triggers Claude Code to browse platforms like X, collect context, and return results in chat. Cron jobs enable autonomous scheduled tasks such as visiting Hacker News, opening the top post, and generating summaries without user control. The agent’s biggest focus is a skill-based pipeline for story-driven video creation: it uses a video research skill to gather material, Remotion-based editing to assemble scenes, and YouTube skills to publish and create thumbnails. The workflow is designed to scale by adding more skills and automating both content production and distribution.

How does the agent avoid API keys while still performing actions on sites like X and YouTube?

It relies on authenticated browser sessions. The Mac mini runs the automation with Chrome logged into accounts (e.g., YouTube, X, LinkedIn, Gmail). Claude Code then uses those existing sessions to navigate, search, open posts, and perform tasks, instead of calling services via API keys.

What does the WhatsApp-to-agent workflow look like in practice?

A user sends a command like “/claude” plus an instruction to research a topic (e.g., “research Maltbook”). Claude Code launches a task, browses X to gather context by opening relevant tweets/posts, and streams progress before sending the final findings back as a WhatsApp message.
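
The command-handling step above could look something like the following minimal sketch. Only the “/claude” prefix comes from the source. The function name `extract_instruction` and the rule that the prefix must be followed by whitespace are assumptions made for illustration, and the WhatsApp transport itself is left out.

```python
from typing import Optional

COMMAND_PREFIX = "/claude"  # prefix taken from the example in the text


def extract_instruction(message: str) -> Optional[str]:
    """Return the instruction after the prefix, or None if not a command."""
    text = message.strip()
    if not text.lower().startswith(COMMAND_PREFIX):
        return None
    rest = text[len(COMMAND_PREFIX):]
    # Assumed rule: the prefix must stand alone, so "/claudette" is ignored.
    if rest and not rest[0].isspace():
        return None
    instruction = rest.strip()
    # A bare "/claude" with no instruction is treated as not actionable.
    return instruction or None


print(extract_instruction("/claude research Maltbook"))
```

Ordinary chat messages return `None`, so only explicitly prefixed messages would be handed to the agent.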

How are autonomous scheduled tasks implemented, and what example proves it works?

The system uses cron jobs on the Mac mini. A test schedules a job for 15:46 (about a minute later): open Hacker News, check the top post, navigate into it, and write a summary. The machine then performs the steps without manual control, and the agent removes the job afterward to stop the recurring workflow.

What skill pipeline turns a URL into a finished YouTube-style video?

The agent uses a plan mode to outline the video, then runs skills for video research (gathering information and clips), video edit (using Remotion to assemble and improve the video), and YouTube operations (uploading and thumbnail-related steps). In the Project Genie example, it uses main clips as silent background with voiceover on top and adds additional clip “spots,” then stitches the final output.
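
One way to picture the pipeline described above is as an ordered list of skills that each read and extend a shared context. Everything in this sketch is hypothetical: the function names, the context keys, the placeholder file names, and the example URL are stand-ins, and the real skills (Remotion rendering, YouTube upload) are replaced by stubs. The stage order follows the paragraph above and is passed in as a list, so it can be rearranged.

```python
from typing import Callable, Dict, List

Context = Dict[str, object]
Skill = Callable[[Context], Context]


def plan(ctx: Context) -> Context:
    # Stub: a real skill would draft a scene-by-scene outline.
    ctx["plan"] = f"story-driven outline for {ctx['url']}"
    return ctx


def video_research(ctx: Context) -> Context:
    # Stub: made-up clip names standing in for downloaded material.
    ctx["clips"] = ["main_clip.mp4", "spot_clip.mp4"]
    return ctx


def video_edit(ctx: Context) -> Context:
    # Stub for the editing step (Remotion in the source).
    ctx["video"] = "final.mp4"
    return ctx


def youtube_publish(ctx: Context) -> Context:
    # Stub: a real skill would upload the video and set a thumbnail.
    ctx["published"] = True
    return ctx


def run_pipeline(url: str, skills: List[Skill]) -> Context:
    """Run each skill in order, recording which steps were executed."""
    ctx: Context = {"url": url}
    steps: List[str] = []
    for skill in skills:
        ctx = skill(ctx)
        steps.append(skill.__name__)
    ctx["steps"] = steps
    return ctx


result = run_pipeline(
    "https://example.com/project-genie",  # placeholder URL
    [plan, video_research, video_edit, youtube_publish],
)
print(result["steps"])
```

Keeping each skill as a plain function with the same signature is what would make it straightforward to add further skills later without changing the runner.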

What was the concrete demonstration topic for the video automation, and what was the video’s core message?

The demonstration used a Google project called “Project Genie.” The produced narration highlights “world sketching”: describing a world in plain text, picking a character, and walking through it, with the ability to fine-tune the world to match a vision.

Key Points

1. A WhatsApp command can trigger Claude Code to browse X, gather context, and return results in chat.
2. The system uses logged-in browser accounts on a dedicated Mac mini to reduce reliance on API keys.
3. Cron jobs provide autonomous scheduling for tasks like visiting Hacker News, opening the top post, and summarizing it.
4. A skill system enables end-to-end video generation, combining video research, Remotion-based editing, and YouTube publishing workflows.
5. The Project Genie example shows how a URL can become a story-driven, voiceover-based video with background clips and inserted “spots.”
6. After content creation, distribution can be automated across platforms such as YouTube and X.

Highlights

The agent can research on X and deliver a synthesized result back through WhatsApp after navigating posts and tweets.

Cron-based jobs let the system act while the user is away—opening Hacker News, summarizing the top story, and repeating on a schedule.

Video generation is treated as a pipeline of skills: research → planning → Remotion editing → YouTube publishing.

Topics

AI Agents
Claude Code
Cron Scheduling
Video Automation
Remotion