...
--- Fast Video Inpainting on a Budget

FM²FVI: Flow Matching for Fast Multi-frame Video Inpainting

Mathis Wauquiez, Yann Gousseau, Alasdair Newson, Andrés Almansa

LTCI, Télécom Paris; ISIR, Sorbonne Université; MAP5, Université Paris Descartes

[Figure: two examples, each showing the original video next to its inpainted result.]

Abstract


Removing an unwanted object from an image now takes just a few seconds on a smartphone. Doing the same on a video can require tens of minutes of computation on a modern GPU. This accessibility gap is no coincidence: video inpainting relies predominantly on diffusion models whose computational cost grows linearly with the temporal dimension.

We propose FM²FVI, a frugal approach to video inpainting based on Flow Matching, a recent generative modeling framework closely related to diffusion models. Flow Matching offers several advantages: mathematical simplicity, the ability to handle non-Gaussian source distributions, connections to optimal transport, and most importantly, fast sampling with fewer function evaluations.
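To make the "fewer function evaluations" point concrete, here is a minimal sketch of Flow Matching with the linear probability path, a common choice in the literature. It is not taken from the FM²FVI library: the function names (`interpolate`, `target_velocity`, `fm_loss`, `euler_sample`, `point_target_field`) are hypothetical, and the velocity field is a closed-form toy rather than a trained network.

```python
from typing import Callable, List

Vector = List[float]


def interpolate(x0: Vector, x1: Vector, t: float) -> Vector:
    """Linear probability path: x_t = (1 - t) * x0 + t * x1."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0: Vector, x1: Vector) -> Vector:
    """Regression target for the linear path: d x_t / dt = x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def fm_loss(predicted: Vector, x0: Vector, x1: Vector) -> float:
    """Mean squared error between a predicted velocity and the target."""
    target = target_velocity(x0, x1)
    return sum((p - u) ** 2 for p, u in zip(predicted, target)) / len(target)


def euler_sample(
    velocity_fn: Callable[[Vector, float], Vector],
    x0: Vector,
    num_steps: int,
) -> Vector:
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with explicit Euler.

    Each step costs exactly one call to velocity_fn, so num_steps is the
    number of function evaluations.
    """
    x = list(x0)
    h = 1.0 / num_steps
    for k in range(num_steps):
        t = k * h
        v = velocity_fn(x, t)
        x = [xi + h * vi for xi, vi in zip(x, v)]
    return x


def point_target_field(c: Vector) -> Callable[[Vector, float], Vector]:
    """Toy assumption: every sample is transported to the single point c.

    For the linear path the exact velocity is then (c - x) / (1 - t).
    """
    def field(x: Vector, t: float) -> Vector:
        return [(ci - xi) / (1.0 - t) for ci, xi in zip(c, x)]
    return field
```

In this toy setting the trajectories are straight lines, so `euler_sample(point_target_field([2.0, -1.0]), [0.5, 0.5], num_steps=1)` already lands on the target up to floating-point rounding. A learned field on real data is only approximately straight, so a handful of steps is typically needed rather than one.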

Our first contribution is a complete and modular library for training Flow Matching models, supporting all state-of-the-art parameterizations including schedulers, sampling strategies, loss functions, ODE solvers, and guidance modes.

We then develop two image inpainting methodologies based solely on image self-similarity, avoiding priors from large datasets that may raise ethical or legal concerns.
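One generic way to exploit self-similarity is to draw training examples only from the known region of the image being repaired. The sketch below illustrates that idea under stated assumptions; it is not a description of either of the two methodologies, and the helper `known_patches` is hypothetical.

```python
from typing import List, Sequence

Patch = List[List[float]]


def known_patches(
    image: Sequence[Sequence[float]],
    mask: Sequence[Sequence[bool]],
    size: int,
) -> List[Patch]:
    """Collect every size x size patch that contains no missing pixel.

    image: 2D grid of pixel values (single channel, for simplicity).
    mask:  2D grid of the same shape, True where the pixel must be inpainted.
    """
    height, width = len(image), len(image[0])
    patches: List[Patch] = []
    for top in range(height - size + 1):
        for left in range(width - size + 1):
            rows = range(top, top + size)
            cols = range(left, left + size)
            # Skip any patch that overlaps the hole.
            if any(mask[r][c] for r in rows for c in cols):
                continue
            patches.append([[image[r][c] for c in cols] for r in rows])
    return patches
```

A generative model trained on such patches sees only content from the image itself, which is the sense in which no external dataset prior is needed.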

Finally, we extend our most promising approach to video inpainting, demonstrating that it is possible to achieve visually satisfying results with a model using fewer than 500,000 parameters, trained on a single video. This represents a significant step toward democratizing video editing on consumer hardware while reducing energy consumption and environmental impact.

