Coding with Cursor AI: My Real Time Builder AI App
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
The builder generates HTML and CSS from a user description and renders updates in real time by streaming tokens.
Briefing
A real-time website builder that already generates HTML and CSS from a text prompt now gains an image pipeline: it can call an external image model, receive an image URL, and embed that URL directly into the generated CSS/HTML so the page visuals match the user’s theme. The practical payoff is speed and iteration—users can generate a full page, then swap images without rewriting the entire layout, keeping the “modify” workflow responsive.
The workflow starts with a prompt like “2008 Reddit clone,” which produces a live preview of HTML/CSS being rendered as it’s generated. A “modify” action keeps the existing code and applies targeted changes (for example, adding up/down votes, switching to a dark mode, or adjusting title text). Early attempts to add links show limitations: URLs can end up as placeholders (like example.com), and the system doesn’t reliably create working external references.
The new feature focuses on images. The builder runs a fast text model (transcribed as "GPT 40 mini," likely GPT-4o mini) that returns HTML and CSS separately; regex-based extraction then pulls renderable code blocks out of the API response. For real-time UX, the app streams tokens so the preview updates continuously as code is generated. Images are handled differently: instead of streaming pixels, the system calls an image-generation API and waits for a returned URL.
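The regex extraction step can be sketched as follows. The video doesn't show the exact pattern, so this is a minimal version assuming the prompt asks the model to wrap its output in ```html and ```css fences; the function name is illustrative.

```javascript
// Sketch: pull separate HTML and CSS blocks out of a model response.
// Assumes the generation prompt asks the model to return fenced ```html
// and ```css blocks; names and the exact regex are illustrative.
function extractCodeBlocks(responseText) {
  const grab = (lang) => {
    // Non-greedy match between the opening fence for `lang` and the next closing fence.
    const match = responseText.match(new RegExp("```" + lang + "\\n([\\s\\S]*?)```"));
    return match ? match[1].trim() : "";
  };
  return { html: grab("html"), css: grab("css") };
}
```

Returning the two blocks separately lets the frontend drop the HTML into the preview frame and the CSS into a `<style>` tag without further parsing.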
To keep generation fast, the image step uses Replicate’s Flux model family (Black Forest Labs’ Flux Schnell, transcribed as “Schell,” is mentioned as the fastest option). The developer installs Replicate, stores an API token in an environment variable, and builds a small “image service” module (imageService.js) that sends a prompt to Replicate and returns the resulting image URL. The prompt is wrapped with instructions that explicitly tie the image to the website description and request an image URL suitable for embedding.
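A minimal sketch of such an image service, assuming Replicate's Node client (`npm install replicate`) and a `REPLICATE_API_TOKEN` environment variable. The exact prompt wrapper from the video isn't shown, so the wording in `buildImagePrompt` is illustrative.

```javascript
// imageService.js — sketch, not the video's exact implementation.

function buildImagePrompt(siteDescription) {
  // Tie the image to the website theme, as the video describes.
  return (
    `Create a website hero image matching this site description: ${siteDescription}. ` +
    `The result will be embedded in the page via its URL.`
  );
}

async function generateImage(siteDescription) {
  // Dynamic import keeps this file loadable even where the client isn't installed.
  const { default: Replicate } = await import("replicate");
  const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });
  // flux-schnell is the fastest Flux variant hosted on Replicate.
  const output = await replicate.run("black-forest-labs/flux-schnell", {
    input: { prompt: buildImagePrompt(siteDescription) },
  });
  // The model returns a list of outputs; the app embeds the first one.
  // (Depending on client version this is a URL string or a file object.)
  return Array.isArray(output) ? output[0] : output;
}

// Export for CommonJS consumers; no-op under ESM.
if (typeof module !== "undefined" && module.exports) {
  module.exports = { buildImagePrompt, generateImage };
}
```

Because the image arrives as a URL rather than a token stream, the caller simply awaits `generateImage(...)` and injects the result once it resolves, while HTML/CSS streaming continues independently.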
Once wired in, the “generate website” flow expands: the app generates an image based on the user’s description, then injects the returned image URL into the page’s HTML/CSS so the preview shows a themed visual (e.g., an image linked to “Dark Knight,” or a World of Warcraft-themed page). A key issue appears during “modify”: if the image URL is regenerated every time code is modified (such as when toggling dark mode), the visual changes unexpectedly.
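The injection step can be as simple as a string substitution. The video doesn't show the exact mechanism, so this sketch assumes the code-generation prompt asks the model to emit a `{{IMAGE_URL}}` placeholder wherever the themed image belongs.

```javascript
// Sketch: swap the generated image URL into the page.
// Assumes the generation prompt requested a {{IMAGE_URL}} placeholder;
// the placeholder name is an assumption, not from the video.
function injectImageUrl(markup, imageUrl) {
  // split/join replaces every occurrence, in HTML and CSS alike
  // (e.g. an <img src> or a CSS background-image url()).
  return markup.split("{{IMAGE_URL}}").join(imageUrl);
}
```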
The fix is a workflow split. The app introduces a separate UI input for a “new image URL” (generated via a dedicated “generate image” step) so users can control when the image changes. With this separation, “modify” can update styling or layout while preserving the original image, and users can selectively swap visuals by generating a new image and then applying it.
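The split can be expressed as a small piece of request-building logic: "modify" reuses the page's current image URL unless the user has explicitly supplied a freshly generated one. Function and field names here are illustrative, not taken from the video.

```javascript
// Sketch of the split workflow: preserve the existing image during "modify"
// unless the user provides a new URL from the dedicated "generate image" step.

function extractCurrentImageUrl(html) {
  // Assumes the page embeds its image via a standard <img src="...">.
  const match = html.match(/<img[^>]+src="([^"]+)"/);
  return match ? match[1] : "";
}

function buildModifyRequest(currentHtml, currentCss, instruction, newImageUrl) {
  // newImageUrl is null unless the user pressed "generate image" first.
  const imageUrl = newImageUrl || extractCurrentImageUrl(currentHtml);
  return {
    imageUrl,
    prompt:
      `Modify the following website. ${instruction}\n` +
      `Keep this exact image URL in place: ${imageUrl}\n` +
      `HTML:\n${currentHtml}\nCSS:\n${currentCss}`,
  };
}
```

With this shape, toggling dark mode leaves `imageUrl` untouched, while passing a new URL swaps the visual deliberately.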
The result is a rapid creative loop: landing pages and marketing copy can be generated, images can be inserted and resized, and additional styling effects (like centering, ordering elements, and even a 3D-ish animation effect) can be layered on top. The transcript ends with multiple example pages and a code extraction step, emphasizing that the image-embedding feature works well enough to support more advanced visual experiments.
Cornell Notes
The builder generates HTML and CSS from a text description and renders it live, then adds a second capability: generating an image via Replicate’s Flux model and embedding the returned image URL into the page. Speed drives key design choices—HTML/CSS generation streams tokens for real-time preview, while image generation returns a URL that’s inserted once available. A major usability problem emerges when “modify” regenerates the image, causing visuals to change during unrelated edits like dark mode. The solution is workflow separation: a dedicated “generate image” input produces a new image URL, while “modify” can preserve the existing image unless the user explicitly swaps it. This enables fast iteration on both layout and visuals without constantly rewriting the whole page.
How does the app keep the website preview feeling “real time” while generating code?
Why does image generation behave differently from HTML/CSS generation in the workflow?
What image model is used, and what’s the reason for that choice?
What goes wrong when users press “modify” after an image has been generated?
How does the app fix the image-changing problem during “modify”?
How does the developer validate the feature in practice?
Review Questions
- What architectural difference between streaming and URL-based image embedding shapes the user experience in this builder?
- Why does separating “generate image” from “modify website” improve control, and what specific UI change enables that?
- How do regex-based extraction and separate HTML/CSS returns help the frontend render generated output reliably?
Key Points
1. The builder generates HTML and CSS from a user description and renders updates in real time by streaming tokens.
2. A new image pipeline calls Replicate’s Flux model and embeds the returned image URL into the generated page.
3. Speed is prioritized by using a fast text model for code generation and the fastest available Flux image model option.
4. A naive integration regenerates images during “modify,” causing unexpected visual changes (e.g., dark mode swapping the image).
5. The fix introduces a separate “generate image” input so users can explicitly choose when the image changes.
6. Once image control is separated, users can iterate on layout, styling, and marketing copy while preserving visuals unless they request a new image.