
AI video gets real

But what comes next won't be.


Will Smith eating spaghetti has become tech’s strangest success story.

Back in 2023, an AI-generated video of the actor slurping pasta went viral for all the wrong reasons. The clip, created by an early AI model called ModelScope, showed a nightmarish figure that vaguely resembled Smith grotesquely mangling noodles with impossible hand movements and facial contortions. It was so obviously fake and unsettling that Smith himself parodied it almost a year later, turning the AI failure into a meme.

That horrific pasta clip has since become an informal benchmark for AI video progress — a standard test that developers and researchers use to measure how far the technology has advanced. It’s the AI video equivalent of asking a chatbot to take the LSAT or solve a math problem.

Fast-forward to last month, when Google revealed Veo 3, its latest text-to-video model, which can generate a convincing Will Smith doppelganger smoothly twirling linguine — complete with chewing sounds. The only problem? The AI thinks spaghetti makes crunching noises, like eating potato chips. It’s a small glitch that reveals just how far we’ve traveled in less than two years, from digital horror show to near-perfect mimicry with only minor audio quirks.

The journey from spaghetti nightmare to convincing deepfake happened through a series of rapid-fire breakthroughs in 2024. OpenAI’s Sora, unveiled early in the year, could generate smooth, cinematic footage but remained silent — essentially high-quality GIFs. Meta’s Movie Gen followed with better character consistency across longer clips. Google’s Veo 2 improved on both but still couldn’t produce sound. Each model represented incremental progress, but none prepared observers for Veo 3’s sudden integration of synchronized audio, realistic dialogue, and ambient sound effects.

This isn’t the steady march of technological progress we’re used to. It’s a cliff jump that has left experts, filmmakers, and society scrambling to understand what just happened. The sudden leap from obviously fake AI videos to nearly indistinguishable synthetic content represents one of the most dramatic capability jumps in recent tech history.

One place embracing the technology is Hollywood. Media executives who sat nervously in conference audiences taking notes about AI experimentation as recently as a few years ago are now publicly discussing their active use of these tools. Amazon Studios recently spoke openly about integrating generative AI into its creative pipelines, marking what one industry insider called “a come to Jesus moment” — the point at which the technology became too useful to ignore. The shift makes sense: When daily shooting costs reach $200,000 in Los Angeles and traditional VFX houses are shutting down, AI isn’t just innovation — it’s survival.

But the real disruption isn’t happening in studio boardrooms. It’s in the complete democratization of sophisticated video manipulation. What once required teams of VFX artists, expensive software, and Hollywood budgets can now be accomplished by anyone with $1.50 and an internet connection. Veo 3's pricing structure puts the creation of convincing fake videos within reach of essentially everyone, collapsing barriers that previously served as natural safeguards against widespread media manipulation.

The threat was already materializing. Starting in 2023, Tom Hanks has repeatedly warned his Instagram followers about AI-generated content falsely using his likeness to promote miracle cures and wonder drugs. The Department of Homeland Security has identified deepfakes as an “increasing threat,” noting that synthetic media doesn’t need to be particularly advanced to be effective — it just needs to exploit “people’s natural inclination to believe what they see.” This latest leap in video quality will only accelerate the problem, making deception cheaper, faster, and more accessible.

The technology still shows limitations. While the viral demos circulating online look flawless, deeper experimentation reveals that Veo 3 struggles with consistency and often ignores prompts entirely. The best models have guardrails that won’t allow you to create videos showing recognizable people. But the pace of advancement suggests even current quirks will soon become obsolete. And guardrails have a way of being dismantled, leaving us with AI-generated content that’s functionally indistinguishable from reality.

The question isn’t whether we can trust what we see and hear anymore — it’s whether we can trust who’s showing it to us. In an era when sophisticated video manipulation costs less than a coffee, credibility becomes anchored not in the medium but in the messenger. The sudden maturation of AI video technology has compressed what many expected to be a decade-long societal adaptation into an immediate crisis of verification, forcing us to rebuild trust systems that assumed seeing was believing.

—Jackie Snow, Contributing Editor
