Learning an Optimal Transport Brenier Map

With: Félix FOURREAU
Ressources: Report / slides / Code
When: 2025
Associated to: MVA
Category: Research, Generative Models, Optimal Transport

Project description

This group project’s goal was to investigate Monotone Gradient Networks. Under some mild assumptions assumptions, these networks have been shown to be universal approximators of gradients of convex functions. These architecture’s are motivated by the Brenier Theorem, in optimal transport:

Consider Monge’s OT problem:

\[\inf_{T:\, T_\# p_X = p_Y} \mathbb{E}_{x\sim p_X}\left[c\bigl(x,T(x)\bigr)\right]\]

where $T_\# p_X = p_Y$ denotes that the mapping $T$ pushes forward the source measure $p_X$ onto the target measure $p_Y$ and $c$ is a cost function (typically, the squared Euclidean distance). When $c(x,y)=|x-y|^2$, Brenier’s Theorem guarantees that, under mild conditions, the unique optimal transport map is characterized by being the gradient of a convex function:

\[T(x) = \nabla \varphi(x) \quad \text{for } p_X\text{-almost every } x.\]

This result motivates learning the gradient directly from data when an explicit convex potential is difficult to construct.

Therefore, we implemented the M-MGN & C-MGN networks, and tested them on these tasks:

Mapping one gaussian to another
Mapping one GMM to another
Mapping pixel distributions
Learning a MNIST generative network

Moreover, we introcuced the use of two losses, that were not tested in the paper:

Kullback-Leibler divergence
Wasserstein distance approximation through the adversarial formulation (WGANs)
We also experimented with Maximum Mean Discrepancy (MMD), similarly to MMD-GANs (but with a Monotone Gradient generator and a different source distribution)

We show promising results for low dimensional settings, but that the models severely lacks a sufficient complexity for learning generative networks in high-dimensional spaces.

Share on

X (formerly Twitter) Facebook LinkedIn

Mathis Wauquiez

Project description

Share on