
VEO 3 AI: Testing YOUR Prompts Live!

MattVidPro · 5 min read

Based on MattVidPro's video on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.

TL;DR

VEO 3 often produces visually on-target results when prompts are specific about action, camera framing, and style, but reliability remains inconsistent.

Briefing

Google’s VEO 3 is capable of turning highly specific, meme-heavy prompts into short, mostly coherent text-to-video generations with audio that often lands surprisingly well—but the experience is still dominated by glitches, prompt-length limits, and frequent “failed gens” that can burn credits fast.

In a live prompt-testing session, MattVidPro feeds viewer-submitted ideas into VEO 3's text-to-video mode, ranging from absurd character transformations ("skinny, ugly, short guy drinking a special juice that turns him into a Chad") to cinematic set pieces ("drone footage of Godzilla smashing a volcano in downtown NYC"), horror vibes (a "Skinwalker" prompt), and detailed micro-world shots (a "POV cinematic shot entering human skin… flowing plasma"). Many outputs show strong visual-style adherence, especially when prompts include concrete camera language, time-period cues, and genre framing. Sound also frequently appears to work "right off rip," with the creator calling out audio as one of VEO 3's biggest strengths.

At the same time, the session makes clear how fragile the pipeline remains. The interface is described as buggy: videos sometimes render without audio, only show partial playback, or fail to display properly until the user refreshes or downloads the MP4. In at least one case, the MP4 contains no audio even though the prompt implies dialogue. Audio can also go missing in otherwise successful generations, and the creator speculates that certain terms or copyrighted character references may trigger restrictions or degrade results.

Credits become the practical limiter. The stream starts with limited access, then the creator switches accounts and eventually buys additional credits after reaching zero. Even with paid credits, the system can still reject generations; later in the stream, multiple prompts disappear when the user refreshes, and the creator notes that failed generations can lead to credit loss or refunds depending on the outcome. The session also highlights operational constraints: the platform queues only a limited number of generations at once (a maximum of five), and the creator repeatedly mentions a cap on prompt length. Longer, more complex prompts often struggle to fit into the 8-second generation window, leading to mushy motion, unclear details, or partial compliance.

Despite the jank, several prompts deliver standout results. The “Pope” prompt finally succeeds after multiple attempts, producing a dramatic, wide-angle, rap-video-like moment with cheering crowds. The “Chichin Chong and the VW bus anime style” prompt also lands, while the “biblically accurate angel” prompt gets close enough to feel convincing. The “Tron-like simulation” and the “Titanic being salvaged” concepts show strong atmosphere and texture, even if motion and exact realism can fall short.

Overall, the live test paints VEO 3 as a promising but still finicky tool: it rewards careful prompt engineering and iterative reruns, but it punishes overlong instructions, copyrighted or sensitive references, and platform bugs that can silently strip audio or waste credits. The practical takeaway is that success often comes from shorter, more targeted prompts—and from using Discord to manage and refine submissions—while expecting occasional failures as part of the workflow.

Cornell Notes

VEO 3 can generate short text-to-video clips with strong visual style and often workable audio, but the workflow is unstable. In a live testing session, many prompts produced recognizable scenes (transformations, cinematic set pieces, horror, and micro-world visuals), yet frequent glitches included missing audio, buggy playback, and “failed gens” that consumed credits. Prompt engineering mattered: concrete camera directions, genre cues, and manageable length improved results, while overly long or complex prompts struggled to fit into the 8-second window. The creator also relied on Discord to queue prompts and used account switching and credit purchases to keep testing. The big lesson: VEO 3 is powerful, but reliability and cost control remain the limiting factors.

What kinds of prompts tended to work best during the test?

Prompts that were concrete and visually anchored—camera framing (“POV cinematic shot”), style cues (“VHS footage,” “documentary style,” “90s VHS”), and clear actions—often produced more coherent results. The creator also found that shorter, more targeted prompts were easier for the model to execute within the 8-second generation window, while very long prompts frequently became mushy or incomplete.
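As a rough illustration of that structure, here is a minimal Python sketch that assembles a prompt from those three elements. The helper name and field choices are hypothetical; this is not a documented VEO 3 syntax, just a way to keep each prompt anchored and short.

```python
# Hypothetical helper: combines the three elements the session found most
# reliable (camera framing, style/era cues, one clear action) into a single
# compact prompt. This is an illustration, not a documented VEO 3 format.

def build_prompt(camera: str, style: str, action: str) -> str:
    """Join framing, style, and action into one short prompt string."""
    return f"{camera}, {style}: {action}"

prompt = build_prompt(
    camera="POV cinematic shot",
    style="90s VHS documentary style",
    action="Godzilla smashes a volcano in downtown NYC",
)
print(prompt)
# POV cinematic shot, 90s VHS documentary style: Godzilla smashes a
# volcano in downtown NYC
```

Keeping each element to a short phrase also helps the full prompt stay under the length cap the creator kept running into.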

Why did audio repeatedly fail or behave inconsistently?

Audio sometimes disappeared even when the clip visually rendered. The creator described cases where the interface showed no audio, and downloading the MP4 revealed no audio track. They also noted that certain prompts (including ones involving recognizable characters or potentially restricted content) might trigger limitations or degrade outputs, though the exact cause wasn’t confirmed.

How did credit limits shape the testing process?

Credits were the bottleneck. The session started with limited access, then the creator switched accounts and later bought additional credits. Failed generations could cost credits, and refreshing could wipe failed prompts from the queue, preventing easy reruns. The creator also mentioned a queue cap of five generations at once, which forced batching and prioritization.
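A minimal sketch of that batching discipline, assuming the five-slot cap mentioned in the stream; the function name and example prompt list are hypothetical:

```python
from collections import deque

MAX_QUEUED = 5  # queue cap mentioned in the stream; everything else is hypothetical

def batch_prompts(prompts: list[str]) -> list[list[str]]:
    """Split a backlog of prompt ideas into batches that fit the queue cap."""
    pending = deque(prompts)
    batches = []
    while pending:
        take = min(MAX_QUEUED, len(pending))
        batches.append([pending.popleft() for _ in range(take)])
    return batches

# Example: 12 viewer prompts become three batches of 5, 5, and 2.
for i, batch in enumerate(batch_prompts([f"prompt {n}" for n in range(1, 13)]), 1):
    print(f"batch {i}: {len(batch)} queued")
```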

What operational issues made the workflow harder than expected?

The interface was described as buggy: videos sometimes wouldn’t display properly, scrolling/playback could break, and audio could be missing. The creator repeatedly used refresh and download workarounds, and had to manage account sign-ins (including incognito mode) to avoid being stuck on the wrong account.

Which outputs stood out as “wins,” and what made them impressive?

The “Pope” prompt eventually succeeded after multiple attempts, producing a dramatic, wide-angle, crowd-cheer moment with strong motion and comedic timing. The “Chichin Chong and the VW bus anime style” prompt also landed. The “biblically accurate angel” prompt got close to the intended look, and the “Titanic salvage” concept delivered convincing atmosphere and texture even when motion realism wasn’t perfect.

What was the practical prompting strategy the creator converged on?

Iterate: run prompts, watch results, then rerun with tighter wording. Use Discord to store and manage prompt ideas, and consider using ChatGPT to condense or rewrite overly complex prompts into something that fits the model’s constraints. When a prompt fails, adjust rather than keep the same long instruction set.
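A sketch of that condensation step using the OpenAI Python SDK. The creator used ChatGPT interactively rather than via API, and the model name, word cap, and system instruction here are all assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Assumed instruction: distill an overlong idea down to one action, one
# camera framing, and one style cue that can play out in an 8-second clip.
SYSTEM = (
    "Rewrite the user's text-to-video prompt so that one clear action, "
    "one camera framing, and one style cue fit a single 8-second clip. "
    "Keep it under 60 words."
)

def condense(prompt: str, model: str = "gpt-4o-mini") -> str:
    """Ask a chat model to shorten an overlong text-to-video prompt."""
    resp = client.chat.completions.create(
        model=model,  # hypothetical choice; any capable chat model works
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": prompt},
        ],
    )
    return resp.choices[0].message.content.strip()
```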

Review Questions

  1. When did audio fail even though the visuals rendered, and what workaround did the creator use?
  2. How do prompt length and the 8-second generation window appear to affect coherence and detail?
  3. What credit-management tactics (account switching, queue limits, reruns) were used to keep testing going despite failures?

Key Points

  1. VEO 3 often produces visually on-target results when prompts are specific about action, camera framing, and style, but reliability remains inconsistent.

  2. Audio is a major strength when it works, yet it can vanish due to interface bugs or missing audio tracks in downloaded MP4s.

  3. Prompt length and complexity matter: long prompts frequently struggle to fit into the 8-second output window, leading to mushy motion or partial compliance.

  4. Platform constraints include a queue cap of five generations at once, forcing batching and prioritization.

  5. Credits are the real limiter: failed generations can consume credits, and refreshing can delete failed prompts so reruns aren't always possible.

  6. Workarounds such as refreshing, downloading MP4s, and using Discord to manage prompts are essential for a smoother testing workflow.

  7. Account and sign-in friction (including wrong-account auto-login) can interrupt testing, so using incognito mode or clearing cookies may be necessary.

Highlights

The “Pope” prompt finally succeeded after multiple attempts, delivering a dramatic, rap-video-like wide-angle moment with cheering crowds—one of the session’s clearest wins.
Audio frequently worked better than expected for a text-to-video model, but the interface sometimes produced silent outputs or MP4s with no audio track.
The session repeatedly demonstrated that prompt engineering is a tradeoff: more detail can help style, but too much complexity often breaks coherence within the 8-second limit.
Failed generations and credit consumption turned prompt testing into an optimization problem, not just a creativity exercise.
