
AI Artist is Becoming a VIABLE Career Path!

MattVidPro · 5 min read

Based on MattVidPro's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Fiverr’s AI services marketplace is positioned as a practical entry point for earning money from AI video creation.

Briefing

AI video freelancing is emerging as a practical career path, and a custom Fiverr job shows how quickly “AI artist” work can move from hobby to paid, portfolio-ready production. The core takeaway is that buyers aren’t just paying for prompts—they’re paying for end-to-end creative execution: consistent characters, coherent scenes, natural voice, and the technical stitching needed to turn image generation into a finished motion piece.

To test that, MattVidPro used Fiverr’s AI services marketplace to commission an AI creator, Jonas Ai, specifically because the listing claimed use of cutting-edge tools for AI movies. The resulting short scene, set in a neon, VR-like future city, features a consistent central character (the commissioner himself) moving through a fully stylized metropolis. Character consistency came from Midjourney character reference: photos were used to get close to an accurate likeness, then the creator pushed it further into a Pixar-style cartoon look. That artistic choice also helped solve a key limitation: when the same character was tested in a more realistic style, the AI struggled to keep the depiction stable across scenarios, whereas the more stylized “cartoon” approach fit the generator’s strengths better.

Voice work also played a major role in making the piece feel complete. The creator used ElevenLabs for the voiceover, selecting a text-to-speech voice that sounded natural and matched the tone of the futuristic narration. The project relied on multiple image-to-video and motion tools, with the creator reporting 20 hours of work to produce the final result.

The production wasn’t flawless. Glitches and imperfect motion appeared in some transitions and walking sequences—typical issues for current image-to-video systems. Still, the imperfections were treated as part of the workflow: the creator used prompts designed to reduce errors, generated clips in segments, and then stitched them together into a coherent final timeline.

Beyond the finished clip, the broader career argument is straightforward: AI lowers the cost and time barrier for creative output, but it doesn’t remove the need for talent, creativity, and technical know-how. The commissioner emphasizes that meaningful work gets harder as requirements become more specific—especially when consistency, character replication, and multi-tool pipelines are involved.

Fiverr is positioned as a gateway for enthusiasts to earn early income, with an AI services tab and a growing ecosystem of freelancers. The transcript also contrasts Jonas Ai’s more affordable pricing with higher-end sellers like the Dor Brothers, who charge thousands per video and have reached mainstream visibility (including work with Snoop Dogg). The message: professional-grade results are possible on today’s tools, but the market rewards people who can combine the right software, manage limitations, and deliver a polished product, not just generate a single image.

Cornell Notes

AI video freelancing is presented as a viable career path, with a Fiverr commission used to demonstrate what paid, end-to-end production looks like. Jonas Ai was hired to create a futuristic VR-style city scene featuring a consistent character likeness, achieved through Midjourney character reference and then stylized into a Pixar-like cartoon to improve stability. The project also used ElevenLabs for natural-sounding voiceover and relied on a multi-tool pipeline (including image generation, image-to-video, and motion tools) with about 20 hours of work. Despite glitches and imperfect motion typical of current image-to-video systems, the work shows how stitching, prompting, and creative direction turn AI outputs into coherent deliverables. The takeaway: AI makes creation cheaper and faster, but talent and technical skill still determine quality.

What made the character look consistent across multiple scenes, and why did the style choice matter?

Consistency came from Midjourney character reference: the commissioner provided photos, and the creator used them to get close to an accurate depiction. The creator then applied a creative stylization—turning the character into a Pixar-style cartoon. Realistic tests didn’t hold up well across scenarios because current AI video generation struggles to perfectly recreate a person in motion; the stylized approach matched the generator’s strengths and reduced visible instability.

How did the project turn generated images into a finished video instead of a set of separate clips?

The workflow was segmented: images were generated (Midjourney), then fed into image-to-video generation for each scene. The creator produced multiple clips, prompted to remove as many errors as possible, and then stitched the clips together into one final timeline. Even with glitches and warped motion in walking/transition moments, the assembly process helped maintain overall coherence.
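The source does not say which editor was used for the stitching step, so the following is only a minimal sketch of one common way to join per-scene clips: ffmpeg's concat demuxer. The clip file names, `clips.txt`, and `final_cut.mp4` are hypothetical, and stream copy assumes every clip shares the same codec and resolution. The code only builds the list text and the command; it does not run ffmpeg.

```python
def build_concat_list(clip_paths):
    """Return the text of an ffmpeg concat-demuxer list, one clip per line, in scene order."""
    lines = []
    for clip in clip_paths:
        # The list format wraps each path in single quotes;
        # a literal quote inside a path is written as '\''.
        escaped = str(clip).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def build_ffmpeg_command(list_file, output_file):
    """Return the argument list for joining the clips without re-encoding."""
    return [
        "ffmpeg",
        "-f", "concat",   # read the list file with the concat demuxer
        "-safe", "0",     # accept paths the demuxer would otherwise reject as unsafe
        "-i", str(list_file),
        "-c", "copy",     # stream copy: assumes matching codec and resolution
        str(output_file),
    ]


# Hypothetical clip names, one per generated scene.
clips = ["scene_01.mp4", "scene_02.mp4", "scene_03.mp4"]
list_text = build_concat_list(clips)
command = build_ffmpeg_command("clips.txt", "final_cut.mp4")
```

To use it, write `list_text` to `clips.txt` and pass `command` to `subprocess.run`. Clips that differ in codec or resolution would need re-encoding instead of stream copy.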

Which tools were named for different parts of the pipeline, and what roles did they play?

Midjourney was used for images and for replicating the character via character reference. The creator also cited Gen-3, Luma Labs, “clling 1.5” and “clling 1.0” (as written; most likely Kling 1.5 and Kling 1.0), Motion Brush, and Flux, the last described as used for drafting ideas for the script. ElevenLabs was used for the voiceover, selected for a natural-sounding fit to the narration.

What limitations showed up in the final AI video, and how were they handled?

Motion wasn’t always perfect: glitches appeared, and some warping occurred, especially around walking and animation. The creator acknowledged these as current constraints of image-to-video systems and mitigated them through careful prompting to reduce errors and by editing/stitching multiple generated segments into a coherent whole.

Why does the transcript treat Fiverr as a career on-ramp rather than just a place to buy one-off outputs?

Fiverr is framed as a marketplace with an AI services category and enough demand for AI-created work that newcomers can earn early. The argument is that buyers pay for deliverables, and creators who learn toolchains and manage limitations can build repeatable, portfolio-ready services. The Dor Brothers are cited as proof that AI video work can scale into professional, high-priced contracts.

How does the pricing and success gap between Jonas Ai and the Door Brothers support the career claim?

Jonas Ai’s work impressed at a lower price point (the commissioner paid more than $45). The Dor Brothers, by contrast, charge thousands per video, have strong reviews, and have reached mainstream partnerships (including work with Snoop Dogg). Together, they illustrate a spectrum: early freelancers can start with affordable commissions, while top sellers can command premium rates by delivering consistent, professional results.

Review Questions

  1. What specific technique was used to keep the character recognizable across scenes, and what stylization change improved stability?
  2. Which named tools were used for voiceover versus image generation versus video/motion, and how did the workflow require stitching multiple outputs?
  3. What kinds of errors still appear in current AI image-to-video results, and what does that imply about the skills needed to deliver polished work?

Key Points

  1. Fiverr’s AI services marketplace is positioned as a practical entry point for earning money from AI video creation.
  2. Character consistency can be engineered using Midjourney character reference, but stylization may be necessary to reduce instability in motion.
  3. Natural-sounding narration is a major quality lever; ElevenLabs was used for the voiceover.
  4. Current image-to-video tools still produce glitches and imperfect motion, so prompting and editing are essential.
  5. High-quality results come from multi-tool pipelines and time-intensive iteration, not from a single generation step.
  6. AI lowers the barrier to making creative work, but talent, creativity, and technical know-how still determine whether output looks professional.
  7. Premium pricing in the market (e.g., the Dor Brothers) suggests buyers value reliability and end-to-end production, not just raw generation.

Highlights

A consistent character across a multi-scene AI short was achieved through Midjourney character reference, then stabilized further by shifting to a Pixar-style cartoon look.
The voiceover used ElevenLabs, and the narration quality was treated as a key factor in making the piece feel finished.
Even with advanced tools, walking and motion still showed glitches—current limitations that require prompting and careful stitching.
Jonas Ai reported spending 20 hours on the custom project, underscoring that paid AI work is still labor-intensive and skill-driven.

Topics

  • AI Video Freelancing
  • Fiverr AI Services
  • Midjourney Character Reference
  • Text-to-Speech
  • Image-to-Video Pipelines
