A year ago, we watched the rise of miraculous text-to-image generators with wide-open eyes. Fast forward to now: they are no longer hot news. For the last few months, the tech race has centered on AI video generators, and naturally, Google couldn’t miss out. Last week, company researchers presented their attempt at synthesizing full-frame-rate clips. Their video diffusion model, named after the fathers of cinema – the Lumière brothers – shows surprisingly consistent results and realistic, coherent motion. Unfortunately, it’s not available for public testing yet. What kind of future for AI video does Google Lumiere depict? Let’s take a closer look!
Curiously, Google has already played around with a couple of deep-learning models for generating videos: Phenaki and Imagen Video (we wrote about them here). Yet neither of them ever saw the light of day, remaining in research papers rather than becoming real-life tools. Does the same fate await Google Lumiere? Time will tell, but we hope not, as some of the features presented seem very useful for independent filmmakers.
So, what can the new diffusion model, fresh out of their AI research center, do? According to the official paper and the announcements on social media, everything that other contemporary AI video generators offer. That includes creating 5-second clips from a text description, video stylization, animating still images, and bringing motion to only selected parts of a picture.
“Introducing Lumiere 📽️ The new video diffusion model we've been working on @GoogleAI: Text-to-Video, Image-to-Video, Stylized Generation, Inpainting, Cinemagraphs, and more 🎨 W/ amazing team incl. @hila_chefer @omer_tov @InbarMosseri @talidekel @DeqingSun @oliver_wang2 pic.twitter.com/jEQcFo26Gm” — Omer Bar Tal (@omerbartal), January 24, 2024
Additionally, the researchers introduced a special feature called inpainting, which allows video creators to customize any part of an existing clip (we’ll talk about it more in a second).
However, other than the ‘inpainting’ feature, Google’s announcement is not groundbreaking news. Other available AI generators (such as Runway or Pika) already offer all these tools. Better yet, you can try them out, which is not the case with Lumiere. Still, the comments below the original tweet show huge hype and interest in the newly presented AI model. Why?
Well, it’s simple. Other – already active – video generators struggle with consistency and cannot really generate coherent motion yet. The developers of Google Lumiere, on the other hand, claim to have found a new technical approach that solves this problem. Unlike existing models, which typically generate a handful of distant keyframes and then fill in the frames between them, Lumiere uses a so-called Space-Time U-Net architecture: it processes the entire clip at once, downsampling it in both space and time, and creates full-frame-rate videos in a single, consistent pass. This allows the output to be realistic, diverse, and coherent.
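To make that idea a little more concrete, here is a minimal, illustrative sketch of what a space-time U-Net can look like, written in PyTorch. This is not Google’s code: the real architecture is described in their paper, while all module names, layer sizes, and the toy clip below are our own assumptions. The point to notice is the 3D convolutions and the stride-2 downsampling, which shrink the clip in time as well as in space, so the whole video is denoised as one block rather than frame by frame.

```python
# Illustrative sketch only, not Google's implementation. Names and sizes
# (TinySpaceTimeUNet, hidden=32, the 16-frame toy clip) are assumptions.
import torch
import torch.nn as nn

class SpaceTimeBlock(nn.Module):
    """A 3D convolution acting jointly on time (T) and space (H, W)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
        self.act = nn.SiLU()

    def forward(self, x):  # x: (batch, channels, T, H, W)
        return self.act(self.conv(x))

class TinySpaceTimeUNet(nn.Module):
    def __init__(self, ch=3, hidden=32):
        super().__init__()
        self.enc = SpaceTimeBlock(ch, hidden)
        # Stride 2 in every dimension halves T, H, AND W at once --
        # the key difference from frame-by-frame (2D) U-Nets.
        self.down = nn.Conv3d(hidden, hidden * 2, kernel_size=3, stride=2, padding=1)
        self.mid = SpaceTimeBlock(hidden * 2, hidden * 2)
        self.up = nn.ConvTranspose3d(hidden * 2, hidden, kernel_size=4, stride=2, padding=1)
        self.dec = nn.Conv3d(hidden * 2, ch, kernel_size=3, padding=1)

    def forward(self, x):
        skip = self.enc(x)
        h = self.mid(self.down(skip))
        h = self.up(h)
        # Skip connection, as in any U-Net
        return self.dec(torch.cat([h, skip], dim=1))

clip = torch.randn(1, 3, 16, 64, 64)  # 16 frames of 64x64 RGB
out = TinySpaceTimeUNet()(clip)
print(out.shape)  # torch.Size([1, 3, 16, 64, 64]) -- full frame rate preserved
```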
Showcases presented on the Google Lumiere webpage look quite convincing, don’t they? Yet we won’t be fully convinced until we can try this model ourselves.
One of the special features of Google Lumiere that piqued our interest is the possibility of editing videos locally using masks. Imagine you already have a filmed clip and want to swap the outfit of your protagonist in post (which happens sometimes, even to the best of us). Allegedly, the new AI software can take over this task and accomplish it within mere seconds. All you need to do is write a text description of what you want to appear in the selected area.
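For those curious how such mask-based editing works under the hood, here is a hedged sketch of the general masked-diffusion inpainting loop. Lumiere’s exact method isn’t public, so treat this as the widely published technique rather than Google’s implementation; the denoiser, the noise schedule, and the prompt below are all hypothetical stand-ins.

```python
# General masked-diffusion inpainting loop -- a sketch of the common published
# technique, NOT Lumiere's actual code. The denoiser API is hypothetical.
import torch

def add_noise(clean, t, steps=50):
    # Toy linear noise schedule, for illustration only.
    alpha = 1.0 - t / steps
    return alpha * clean + (1.0 - alpha) * torch.randn_like(clean)

def inpaint_video(denoiser, video, mask, prompt, steps=50):
    """video: (T, C, H, W) original clip; mask: 1 where content gets replaced."""
    x = torch.randn_like(video)  # the masked region starts as pure noise
    for t in reversed(range(steps)):
        # Hypothetical API: the model predicts a less-noisy clip,
        # guided by the text prompt.
        pred = denoiser(x, t, prompt)
        # Re-noise the ORIGINAL footage to the same noise level t, so the
        # kept and generated regions stay statistically consistent.
        noisy_orig = add_noise(video, t, steps)
        # Composite: generated content inside the mask, original outside.
        x = mask * pred + (1 - mask) * noisy_orig
    return x

# Dummy stand-in so the sketch runs end to end; a real denoiser would be
# a trained, text-conditioned video diffusion network.
dummy_denoiser = lambda x, t, prompt: x * 0.95
clip = torch.randn(16, 3, 64, 64)   # 16 frames of 64x64 RGB
mask = torch.zeros_like(clip)
mask[:, :, 16:48, 16:48] = 1.0      # region to regenerate, e.g. the outfit
result = inpaint_video(dummy_denoiser, clip, mask, "a red leather jacket")
print(result.shape)                 # torch.Size([16, 3, 64, 64])
```

The key trick is the compositing at every step: the region outside the mask is always re-noised original footage, so the generated pixels inside the mask are forced to blend seamlessly with the untouched parts of the frame.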
Adobe promised a similar feature during their annual event last year, and I think it may become the most useful way to integrate AI video generators into our workflows.
Although Google Lumiere will indeed have limitations (for instance, it cannot synthesize scenes consisting of multiple shots), it represents a new step in text-to-video generation. However, all we have for now is a research paper and some showcases published by the developers. As long as the model is not released or available for testing, we can neither confirm nor disprove its capabilities.
If you’re interested in the current state of existing AI video generators, head over here and read our review of them. Spoiler alert: they’re advancing, but they won’t replace us anytime soon.
What do you think of Google Lumiere? Are you curious to see it in action or unsure whether it will ever be available to the general public? Share your opinion with us in the comments below!
Feature image source: Google Research
Mascha Deikova is a freelance director and writer based in Salzburg, Austria. She creates concepts for and works on commercials, music videos, corporate films, and documentaries. Mascha’s great passion lies in exploring the full variety of cinematic and narrative techniques to tell her stories.