A year ago, we watched the rise of miraculous text-to-image generators with wide-open eyes. Fast forward to now: they are no longer hot news. For the last few months, the tech race has centered on AI video generators, and naturally, Google couldn’t miss out. Last week, company researchers presented their attempt at synthesizing full-frame-rate clips. Their video diffusion model, named after the fathers of cinema – the Lumière brothers – shows surprisingly consistent results and realistic, coherent motion. Unfortunately, it’s not available for public testing yet. What kind of future for AI video does Google Lumiere depict? Let’s take a closer look!
Curiously, Google has already played around with a couple of deep-learning models for generating videos: Phenaki and Imagen Video (we wrote about them here). Yet neither of them ever saw the light of day, remaining in research papers rather than becoming real-life tools. Does the same fate await Google Lumiere? Time will tell, but we hope not, as some of the features presented seem very useful for independent filmmakers.
So, what can the new diffusion model, fresh out of their AI research center, do? According to the official paper and the announcements on social media, everything that other contemporary AI video generators offer. That includes creating 5-second clips from a text description, video stylization, animating still images, and bringing motion to only selected parts of a picture.
“Introducing Lumiere 📽️ The new video diffusion model we've been working on @GoogleAI: Text-to-Video, Image-to-Video, Stylized Generation, Inpainting, Cinemagraphs, and more 🎨 W/ amazing team incl. @hila_chefer @omer_tov @InbarMosseri @talidekel @DeqingSun @oliver_wang2 pic.twitter.com/jEQcFo26Gm” — Omer Bar Tal (@omerbartal), January 24, 2024
Additionally, the researchers introduced a special feature called inpainting, which allows video creators to customize any part of an existing clip (we’ll talk about it more in a second).
However, other than the ‘inpainting’ feature, Google’s announcement is not groundbreaking news. Other available AI generators (such as Runway or Pika) already offer all these tools. Better yet, you can try them out, which is not the case with Lumiere. Still, the comments below the original tweet show huge hype and interest in the newly presented AI model. Why?
Well, it’s simple. Other – already active – video generators struggle with consistency and cannot really generate coherent motion yet. The developers of Google Lumiere, on the other hand, claim to have found a new technical approach that solves this problem. Unlike existing models, which typically generate a handful of distant keyframes and then fill in the frames between them, Lumiere uses a so-called Space-Time U-Net architecture: it processes the entire clip at once, downsampling it in both space and time, and creates full-frame-rate videos in a single, consistent pass. This allows the output to be realistic, diverse, and coherent.
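To make that idea a little more concrete, here is a minimal, illustrative sketch of what a space-time U-Net can look like, written in PyTorch. This is not Google’s code: the real architecture is described in their paper, while all module names, layer sizes, and the toy clip below are our own assumptions. The point to notice is the 3D convolutions and the stride-2 downsampling, which shrink the clip in time as well as in space, so the whole video is denoised as one block rather than frame by frame.

```python
# Illustrative sketch only, not Google's implementation. Names and sizes
# (TinySpaceTimeUNet, hidden=32, the 16-frame toy clip) are assumptions.
import torch
import torch.nn as nn

class SpaceTimeBlock(nn.Module):
    """A 3D convolution acting jointly on time (T) and space (H, W)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
        self.act = nn.SiLU()

    def forward(self, x):  # x: (batch, channels, T, H, W)
        return self.act(self.conv(x))

class TinySpaceTimeUNet(nn.Module):
    def __init__(self, ch=3, hidden=32):
        super().__init__()
        self.enc = SpaceTimeBlock(ch, hidden)
        # Stride 2 in every dimension halves T, H, AND W at once --
        # the key difference from frame-by-frame (2D) U-Nets.
        self.down = nn.Conv3d(hidden, hidden * 2, kernel_size=3, stride=2, padding=1)
        self.mid = SpaceTimeBlock(hidden * 2, hidden * 2)
        self.up = nn.ConvTranspose3d(hidden * 2, hidden, kernel_size=4, stride=2, padding=1)
        self.dec = nn.Conv3d(hidden * 2, ch, kernel_size=3, padding=1)

    def forward(self, x):
        skip = self.enc(x)
        h = self.mid(self.down(skip))
        h = self.up(h)
        # Skip connection, as in any U-Net
        return self.dec(torch.cat([h, skip], dim=1))

clip = torch.randn(1, 3, 16, 64, 64)  # 16 frames of 64x64 RGB
out = TinySpaceTimeUNet()(clip)
print(out.shape)  # torch.Size([1, 3, 16, 64, 64]) -- full frame rate preserved
```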
Showcases presented on the Google Lumiere webpage look quite convincing, don’t they? Yet we won’t be fully convinced until we can try this model ourselves.
One of the special features of Google Lumiere that piqued our interest is the possibility of editing videos locally using masks. Imagine you already have a filmed clip and want to swap the outfit of your protagonist in post (which happens sometimes, even to the best of us). Allegedly, the new AI software can take over this task and accomplish it within mere seconds. All you need to do is write a text description of what you want to appear in the selected area.
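For those curious how such mask-based editing works under the hood, here is a hedged sketch of the general masked-diffusion inpainting loop. Lumiere’s exact method isn’t public, so treat this as the widely published technique rather than Google’s implementation; the denoiser, the noise schedule, and the prompt below are all hypothetical stand-ins.

```python
# General masked-diffusion inpainting loop -- a sketch of the common published
# technique, NOT Lumiere's actual code. The denoiser API is hypothetical.
import torch

def add_noise(clean, t, steps=50):
    # Toy linear noise schedule, for illustration only.
    alpha = 1.0 - t / steps
    return alpha * clean + (1.0 - alpha) * torch.randn_like(clean)

def inpaint_video(denoiser, video, mask, prompt, steps=50):
    """video: (T, C, H, W) original clip; mask: 1 where content gets replaced."""
    x = torch.randn_like(video)  # the masked region starts as pure noise
    for t in reversed(range(steps)):
        # Hypothetical API: the model predicts a less-noisy clip,
        # guided by the text prompt.
        pred = denoiser(x, t, prompt)
        # Re-noise the ORIGINAL footage to the same noise level t, so the
        # kept and generated regions stay statistically consistent.
        noisy_orig = add_noise(video, t, steps)
        # Composite: generated content inside the mask, original outside.
        x = mask * pred + (1 - mask) * noisy_orig
    return x

# Dummy stand-in so the sketch runs end to end; a real denoiser would be
# a trained, text-conditioned video diffusion network.
dummy_denoiser = lambda x, t, prompt: x * 0.95
clip = torch.randn(16, 3, 64, 64)   # 16 frames of 64x64 RGB
mask = torch.zeros_like(clip)
mask[:, :, 16:48, 16:48] = 1.0      # region to regenerate, e.g. the outfit
result = inpaint_video(dummy_denoiser, clip, mask, "a red leather jacket")
print(result.shape)                 # torch.Size([16, 3, 64, 64])
```

The key trick is the compositing at every step: the region outside the mask is always re-noised original footage, so the generated pixels inside the mask are forced to blend seamlessly with the untouched parts of the frame.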
Adobe promised a similar feature during their annual event last year, and I think it may become the most useful way to integrate AI video generators into our workflows.
Although Google Lumiere will indeed have limitations (for instance, it cannot synthesize scenes consisting of multiple shots), it represents a new step in text-to-video generation. However, all we have for now is a research paper and some showcases published by the developers. As long as the model is not released or available for testing, we can neither confirm nor disprove its capabilities.
If you’re interested in the current state of existing AI video generators, head over here and read our review of them. Spoiler alert: they’re advancing, but they won’t replace us anytime soon.
What do you think of Google Lumiere? Are you curious to see it in action or unsure whether it will ever be available to the general public? Share your opinion with us in the comments below!
Feature image source: Google Research
Mascha Deikova is a freelance director and writer based in Salzburg, Austria. She creates concepts for and works on commercials, music videos, corporate films, and documentaries. Mascha’s great passion lies in exploring the full variety of cinematic and narrative techniques to tell her stories.