Runway Gen 4 AI Video is Blowing My Mind! First Impressions

TL;DR

Runway Gen 4 shows sharper realism and more stable backgrounds than Gen 3, with improved lighting behavior such as light passing through fabric.

Briefing

Runway ML’s Gen 4 arrives with a clear jump in video realism and control—especially for character motion, physics-like effects, and background consistency—while still showing familiar failure modes like occasional anatomy drift and imperfect object interactions. In early demos and hands-on tests, the model produces scenes where lighting behaves plausibly (light passing through fabric), motion reads cleanly (cloak movement, bird wing flaps), and backgrounds stay stable without the “mushy” artifacts common in earlier generations.

The most striking improvements show up in how Gen 4 handles complex movement and coherence across shots. A character in a sandstorm keeps believable physics in clothing and lighting, while a low-depth-of-field forest walk pairs a moving subject with a readable environment. Animal motion stands out too: a vulture-like creature spreads and contracts wings with a level of timing that feels more physically grounded than typical AI motion. Even abstract sequences—like jellyfish-like movement or large-scale transformations—tend to look sharper and more defined, with less of the scribbled, hallucination-heavy texture that can undermine credibility.

Hands-on testing adds nuance. Gen 4 supports image uploading and offers multiple aspect ratios (including 16:9, 21:9, 4:3, and portrait options) with generation lengths up to 10 seconds. When prompted to animate a person from an uploaded image—staring into the camera, sprinting away, and pulling the camera upward—Gen 4 follows the intent well, including dust kicked up during the run. But it still struggles with strict continuity: an “armless” test subject gains the missing arm once the character sprints away, and a 10-second run repeats the same issue. Upscaling to 4K can improve detail, yet it doesn’t fully fix motion artifacts; warping can still appear when frames are paused.

Vehicle and physics prompts show both progress and limits. A “car speeding away with smoke and a fire trail” prompt generally works better in Gen 4 than in Gen 3, with more consistent fire and higher overall quality. Still, camera rotation can fail to match the exact instruction, and motion can look slightly washed out as the subject shrinks. A “truck smashing through a wall” test is hit-or-miss: in some attempts the truck materializes inside the room rather than breaking through, while later attempts manage a more convincing smash, which suggests that precise physical causality remains difficult.

The model also leans into stylized and narrative-friendly animation. VHS-like “creepy footage” prompts maintain the look while delivering unsettling close-ups (a lemon held to the camera). In 3D animation tests, such as a robot riding a rocket to the moon, Gen 4 can preserve character consistency and add emotion through body language, even when the prompt is relatively simple. 2D animation appears weaker, but the broader takeaway is that Gen 4 is carving out a strong niche in realism, 3D character performance, and cinematic motion, while leaving room for refinement in strict anatomy control and deterministic physics.

Cornell Notes

Runway ML’s Gen 4 is presented as a step up from Gen 3 in realism, motion clarity, and controllability—particularly for character movement, lighting, and physics-like effects. Early examples show sharper backgrounds and fewer “mushy” artifacts, with convincing behavior like light passing through fabric and complex wing motion. Hands-on tests confirm practical features such as image uploading, multiple aspect ratios, and up to 10-second generations, plus optional 4K upscaling. The tradeoff: strict continuity is still unreliable (e.g., an armless character regains an arm when sprinting), and physical interactions like “truck through wall” can sometimes materialize incorrectly. Overall, Gen 4 looks strongest for cinematic, 3D-friendly animation and realism-focused prompts.

What kinds of realism improvements stand out most in Gen 4’s early demos?

Lighting and background stability are recurring wins. Examples include light shining through fabric while the character’s arm stays coherent, cloak motion that reads with believable physics, and animal wing movement (spreading and contracting) that stays well timed and structured. Backgrounds also remain clearer and more defined, with less of the scribbled, hallucination-like texture that can break immersion.

How does Gen 4 perform when given an uploaded image and a multi-step action prompt?

It follows the overall choreography well: a character can stare into the camera, sprint away, and trigger camera movement upward to reveal the landscape. Motion details like dust kicked up during the run appear. However, continuity fails for strict anatomy—an “armless” input regains the missing arm during the sprint, and the same issue shows up again in a longer (10-second) generation.

What does Gen 4’s support for duration, aspect ratio, and upscaling enable?

Gen 4 offers durations up to 10 seconds and supports image uploading. Aspect ratio options include 16:9, 21:9, 4:3, and portrait-style formats. After generation, clips can be upscaled to 4K for more detail, but upscaling doesn’t fully correct motion problems—pausing can still reveal warping.

Where do physics and object-interaction prompts still struggle?

Physical causality remains inconsistent. A “truck smashing through the wall” prompt sometimes results in the truck appearing in the room rather than breaking through. Later attempts can produce a better smash, implying that prompt adjustments or retries may be needed to get the intended physical interaction.

How does Gen 4 compare across animation styles—especially 3D vs 2D?

The strongest results lean toward realism and 3D animation. 3D character tests (robot on a rocket to the moon) can preserve character consistency and convey emotion through body language. 2D animation is described as comparatively less compelling, with other tools mentioned as stronger in that specific lane.

What does the transcript suggest about Gen 4’s speed and practical workflow?

Generation appears quick in testing: the on-screen progress indicator is seen at roughly 10% and 12% during runs. The workflow consists of uploading an image, cropping it to a supported aspect ratio, running short (5-second) and longer (10-second) generations, and iterating on prompts based on observed failures.
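
The cropping step in that workflow can be prepared offline before uploading. The sketch below is only an illustration, not Runway's own cropping logic: `center_crop_box` and `ASPECT_RATIOS` are hypothetical names, and the 9:16 entry is an assumed example of a "portrait option", since the notes do not list the exact portrait ratios. It computes the largest centered crop of an image that matches a target aspect ratio.

```python
from fractions import Fraction

# Ratios named in the notes. The portrait entry (9:16) is an assumption,
# because the notes only say "portrait options".
ASPECT_RATIOS = {
    "16:9": Fraction(16, 9),
    "21:9": Fraction(21, 9),
    "4:3": Fraction(4, 3),
    "9:16 (assumed portrait)": Fraction(9, 16),
}


def center_crop_box(width, height, ratio):
    """Return (left, top, right, bottom) for the largest centered crop
    of a width x height image whose width/height equals `ratio`
    (rounded down to whole pixels)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if Fraction(width, height) > ratio:
        # Image is wider than the target: keep full height, trim the sides.
        new_w, new_h = int(height * ratio), height
    else:
        # Image is taller than (or equal to) the target: keep full width.
        new_w, new_h = width, int(width / ratio)
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)


if __name__ == "__main__":
    # Show the crop box each ratio would need for a 1920x1080 source image.
    for name, ratio in ASPECT_RATIOS.items():
        print(name, center_crop_box(1920, 1080, ratio))
```

The returned tuple uses the (left, upper, right, lower) order that image libraries such as Pillow accept in `Image.crop`, so it can be passed along directly if the actual crop is done locally.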

Review Questions

In the armless-character test, what specific continuity failure occurs, and at what point in the action does it show up?
Why might a 5-second generation be more likely to miss complex prompt details than a 10-second generation?
Give one example of where Gen 4 improves physical realism and one example where it still produces an incorrect physical outcome.

Key Points

1. Runway Gen 4 shows sharper realism and more stable backgrounds than Gen 3, with improved lighting behavior such as light passing through fabric.
2. Complex character motion (including animal wing movement) appears more coherent, with fewer “mushy” artifacts in the environment.
3. Gen 4 supports image uploading and multiple aspect ratios (16:9, 21:9, 4:3, and portrait options) with generation lengths up to 10 seconds.
4. 4K upscaling can increase detail, but it doesn’t reliably fix motion warping or anatomy/motion inconsistencies.
5. Strict continuity remains a weak spot: an armless input character regains the missing arm during sprinting, even in longer generations.
6. Physics-style prompts (smoke/fire trails, collapsing bridges, object impacts) often improve quality versus Gen 3, but deterministic interactions like “truck through wall” can still fail or materialize incorrectly.
7. Gen 4’s strongest niche in these tests is cinematic, realistic, and 3D-friendly animation, while 2D performance is described as comparatively less dominant.

Highlights

Gen 4’s demos emphasize clearer backgrounds and more believable lighting, including light shining through fabric while the character stays coherent.

An armless character created from an uploaded image regains the missing arm once the sprint begins—showing that anatomy continuity is still unreliable.

A “truck smashing through a wall” prompt is inconsistent: sometimes the truck appears inside the room instead of breaking through, while later attempts produce a more convincing smash.

In 3D animation tests, Gen 4 can preserve character identity and add emotion through movement, even when the prompt is minimal.

Topics

Runway Gen 4
AI Video Generation
Image Upload
Physics Prompts
3D Animation

Mentioned

Runway ML
Runway
GPT-4 Omni
Ideogram AI
Vu
Runway Gen 4
MattVidPro
AI
GPT
4K
VHS
3D