My First Successful AI Agent Ish Project feat OpenAI-o1!
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
A four-agent workflow can reliably turn scraped tech headlines into a polished, automated newsletter—complete with rewritten stories, an AI-generated voice MP3, and an HTML email—by using a “Mastermind” model to gate quality at each step. The key improvement driving the system’s success is stronger instruction-following from OpenAI o1 models, paired with simple pass/fail logic that forces reanalysis or skips downstream steps when outputs don’t meet quality or engagement standards.
The setup starts with a web agent that scrapes targeted content from The Verge. It pulls from a specific page section (for testing, the “Most Popular” area) and saves the results as unstructured text—hundreds of lines that are not immediately newsletter-ready. That raw material is then handed to a rewrite agent running GPT-4o mini, tasked with transforming the messy input into a cohesive set of short, captivating stories. The rewrite step includes practical requirements: keep image URLs and article URLs so readers can click back to the sources, and begin with a brief attention-grabbing introduction. The rewrite process can iterate, because the model may suggest edits to improve structure, clarity, and flow.
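The requirement that rewritten stories keep their image and article URLs can be checked mechanically. The sketch below is one possible check, not something shown in the video: the helper names `extract_urls` and `invented_urls` and the example.com addresses are hypothetical. It flags any URL in the rewrite that never appeared in the scraped text, which would suggest the model altered or made up a link.

```python
import re

# Hypothetical helpers for illustration; the source does not show this check.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)\]]+")


def extract_urls(text: str) -> set[str]:
    """Return every http(s) URL in the text, with trailing punctuation removed."""
    return {match.rstrip(".,;:!?") for match in URL_PATTERN.findall(text)}


def invented_urls(scraped: str, rewritten: str) -> set[str]:
    """URLs that appear in the rewrite but not in the scraped source."""
    return extract_urls(rewritten) - extract_urls(scraped)


scraped = (
    "Big phone launch https://example.com/phone "
    "img: https://example.com/phone.jpg\n"
    "New chip announced https://example.com/chip"
)
rewritten = (
    "A new phone is here. Read more: https://example.com/phone.\n"
    "Image: https://example.com/phone.jpg\n"
    "Also see https://example.com/made-up"
)

print(invented_urls(scraped, rewritten))  # {'https://example.com/made-up'}
```

A non-empty result could be fed back to the rewrite agent as a reason to try again.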
Next comes a voice agent that converts the rewritten stories into a script suitable for narration and generates an audio file using ElevenLabs. The system hosts the resulting MP3 via a URL so the newsletter can link to an audio version readers can play instead of reading. In testing, the voice step is described as mostly reliable, with only occasional minor suggestions.
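The voice step can be pictured as two parts: turning the stories into text that reads well aloud, then handing that text to a speech service. The sketch below assumes that "suitable for narration" means at least dropping raw URLs, which the source does not spell out. The ElevenLabs call and the hosting step are represented by an injected `synthesize` callable, because the source gives no details of either; `fake_synthesize` and its URL are placeholders.

```python
import re
from typing import Callable


def to_narration_script(stories: str) -> str:
    """Drop URLs and collapse whitespace so the text reads naturally aloud."""
    without_urls = re.sub(r"https?://\S+", "", stories)
    return re.sub(r"\s+", " ", without_urls).strip()


def run_voice_agent(stories: str, synthesize: Callable[[str], str]) -> str:
    """Build the script, pass it to a text-to-speech callable, return the MP3 URL.

    `synthesize` stands in for the ElevenLabs request plus the hosting step.
    """
    return synthesize(to_narration_script(stories))


def fake_synthesize(script: str) -> str:
    # Placeholder: a real version would call the speech service and upload the file.
    return f"https://example.com/audio/{len(script)}.mp3"


stories = "Phone launch today. https://example.com/phone\n\nChip news."
print(to_narration_script(stories))  # Phone launch today. Chip news.
print(run_voice_agent(stories, fake_synthesize))
```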
Finally, a newsletter agent converts the story set into a working HTML email using a predefined template. It may loop with the Mastermind to adjust formatting until the HTML is “ready for publication.” Once approved, a separate sending script (a fifth agent in all but name) pushes the finished HTML to a mailing list through an API and returns a confirmation that the newsletter was sent.
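Template-based HTML assembly of this kind can be sketched with the standard library alone. The markup, the `Story` fields, and the function name `render_newsletter` below are illustrative assumptions; the actual template used in the video is not shown in this summary. The point is that fixed markup plus escaped values keeps the output well-formed regardless of what the stories contain.

```python
from dataclasses import dataclass
from html import escape
from string import Template


@dataclass
class Story:
    title: str
    summary: str
    article_url: str
    image_url: str


# Hypothetical templates; the real newsletter layout is not given in the source.
PAGE = Template(
    "<html><body><h1>$heading</h1>"
    '<p><a href="$audio_url">Listen to this issue</a></p>'
    "$stories</body></html>"
)
STORY = Template(
    '<div><img src="$image_url" alt="$title">'
    '<h2><a href="$article_url">$title</a></h2><p>$summary</p></div>'
)


def render_newsletter(heading: str, stories: list[Story], audio_url: str) -> str:
    """Fill the fixed templates, escaping every value that came from a model."""
    blocks = "".join(
        STORY.substitute(
            title=escape(s.title),
            summary=escape(s.summary),
            article_url=escape(s.article_url),
            image_url=escape(s.image_url),
        )
        for s in stories
    )
    return PAGE.substitute(
        heading=escape(heading), audio_url=escape(audio_url), stories=blocks
    )


issue = render_newsletter(
    "Tech Digest",
    [Story("Q&A <live>", "Costs $5 & up.", "https://example.com/a", "https://example.com/a.jpg")],
    "https://example.com/audio/issue.mp3",
)
print(issue)
```

Sending the result through a mailing-list API is left out here because the source does not name the service.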
At the center of the orchestration is the Mastermind, running on OpenAI o1-preview. It evaluates each stage’s output for quality and engagement, and it applies explicit decision rules: if the evaluation response reports issues or weaknesses, the system triggers reanalysis; if the output meets standards, it proceeds to the next agent. One operational detail matters for accuracy: the prompts include the current date and time, because without temporal context the model can flag news items that appear to be “in the future” as fake.
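The two ideas in this paragraph, a dated prompt and a pass/fail rule, can be sketched as follows. The summary does not say exactly how "contains issues or weaknesses" is detected, so the keyword match here is an assumption, as are the prompt wording and the function names. The model call itself is omitted; only the prompt construction and the decision are shown.

```python
from datetime import datetime

# Assumed markers: the source only says the response "contains issues or weaknesses".
REJECT_MARKERS = ("issues", "weaknesses")


def build_evaluation_prompt(content: str, now: datetime) -> str:
    """Prefix the content with the current date and time so recent news is not judged fake."""
    return (
        f"Current date and time: {now.isoformat(timespec='minutes')}\n"
        "Evaluate the content below for quality and engagement. "
        "If it meets the standard, reply with APPROVED only. "
        "Otherwise describe the issues or weaknesses.\n\n"
        f"{content}"
    )


def needs_reanalysis(evaluation: str) -> bool:
    """Return True when the evaluation text mentions any rejection marker."""
    lowered = evaluation.lower()
    return any(marker in lowered for marker in REJECT_MARKERS)


prompt = build_evaluation_prompt("Chip news...", datetime(2024, 9, 20, 14, 30))
print(prompt.splitlines()[0])  # Current date and time: 2024-09-20T14:30
print(needs_reanalysis("Several issues: the intro is flat."))  # True
print(needs_reanalysis("APPROVED"))  # False
```

A plain keyword match would also reject a reply such as "no issues found", which is why the prompt in this sketch asks for a bare APPROVED on success.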
In a full automation run, the Mastermind first reviews scraped content, then triggers reanalysis when needed, and ultimately concludes that all selected articles are suitable. The rewrite and voice steps are then approved, followed by the HTML newsletter step, which is also judged ready and sent. The end result lands in a Gmail test inbox as a formatted email with embedded images, clickable links back to sources, an unsubscribe feature, and a playable MP3 link—showing how a single script can coordinate multiple specialized agents into one end-to-end publishing pipeline.
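The run described above amounts to a loop over stages, each retried until the Mastermind approves. The sketch below simplifies the pipeline to a linear chain in which each stage's output feeds the next. Agents and the judge are plain callables standing in for model calls, and `max_rounds` is an assumed retry cap that the source does not mention. A stage that never passes returns `None`, and the remaining stages are skipped.

```python
from typing import Callable, Optional

Agent = Callable[[str, Optional[str]], str]  # (input, feedback) -> output
Judge = Callable[[str], str]                 # output -> evaluation text


def run_stage(agent: Agent, judge: Judge, stage_input: str, max_rounds: int = 3) -> Optional[str]:
    """Retry one agent with the judge's feedback until approved or out of rounds."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        output = agent(stage_input, feedback)
        evaluation = judge(output)
        lowered = evaluation.lower()
        if "issues" not in lowered and "weaknesses" not in lowered:
            return output
        feedback = evaluation  # reanalysis: try again with the critique
    return None


def run_pipeline(stages: list[tuple[str, Agent]], judge: Judge, seed: str) -> Optional[str]:
    """Run stages in order; stop and return None if any stage is never approved."""
    current = seed
    for _name, agent in stages:
        result = run_stage(agent, judge, current)
        if result is None:
            return None  # downstream steps are skipped
        current = result
    return current


# Toy stand-ins for the real agents and the Mastermind.
def draft_agent(text: str, feedback: Optional[str]) -> str:
    return text.upper() if feedback else text


def stubborn_agent(text: str, feedback: Optional[str]) -> str:
    return text.lower()


def toy_judge(output: str) -> str:
    return "Approved." if output.isupper() else "Issues: not emphatic enough."


print(run_pipeline([("rewrite", draft_agent)], toy_judge, "hello"))  # HELLO
print(run_pipeline([("rewrite", stubborn_agent)], toy_judge, "hello"))  # None
```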
Cornell Notes
A multi-agent pipeline can automatically scrape tech headlines, rewrite them into short newsletter stories, generate a narrated voice MP3, and assemble everything into a template-based HTML email that gets sent to subscribers. The system’s reliability comes from a “Mastermind” using OpenAI o1-preview to evaluate each stage for quality and engagement, then either approve outputs or trigger reanalysis when issues appear. The workflow starts with a web agent scraping The Verge “Most Popular” content into unstructured text, then a rewrite agent (GPT-4o mini) structures it while keeping image and article URLs. A voice agent uses ElevenLabs to produce an MP3 hosted at a URL, and a newsletter agent formats the final HTML. Including current date/time in prompts helps prevent “future news” from being incorrectly flagged.
- How does the workflow turn messy scraped text into newsletter-ready content?
- What role does the “Mastermind” model play in keeping the pipeline from producing low-quality outputs?
- Why does the system include date and time in prompts?
- How is the voice version of the newsletter generated and delivered to readers?
- How does the system ensure the final email is actually publishable HTML?
Review Questions
- What specific quality checks does the Mastermind perform, and how do those checks change the system’s next action?
- Which agent is responsible for preserving image and article URLs, and what is the input/output relationship between the web agent and rewrite agent?
- How does the system handle temporal issues that can cause news to be flagged as fake?
Key Points
1. The pipeline uses four specialized agents (web, rewrite, voice, and newsletter) coordinated by a Mastermind model that approves or forces reanalysis at each stage.
2. Scraped content is intentionally kept unstructured at first, then converted into short stories with preserved image URLs and clickable article URLs during the rewrite step.
3. ElevenLabs generates a hosted MP3 narration, and the newsletter HTML links to that audio for readers who prefer listening.
4. Newsletter HTML is produced from a strict template, with iterative refinement when formatting needs adjustment before sending.
5. OpenAI o1-preview quality gating relies on explicit evaluation signals (issues/weaknesses) to decide whether to reanalyze or proceed.
6. Including current date and time in prompts reduces errors where “future” news gets incorrectly rejected as fake.
7. A single automation run can end with a confirmed email delivery to a mailing list via an API-based sending script.