Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2025/01/15 - Optimal transport intro #29

Open
agosztolai opened this issue Dec 29, 2024 · 17 comments
Open

2025/01/15 - Optimal transport intro #29

agosztolai opened this issue Dec 29, 2024 · 17 comments

Comments

@agosztolai
Copy link
Collaborator

Hi GenAIers,

I will attempt to put together a crash course in OT theory for our January 15 session.

Foreseeably, the topics I will cover will be

  • Monge and Kantorovich formulation of OT
  • Dual formulation
  • Dynamic formulation (Benamou-Brenier)
  • Variational formulation (Jordan-Kinderlehrer-Otto)
  • Entropic regularisation (Sinkhorn)
  • Displacement interpolation
  • Linking flow/score matching to OT

Please note that this is only preliminary and I will likely reduce this list.

I will base most of the session on The Book, Topics in Optimal Transport, Villani AMS, but I will also consider various other sources.

@moritzschaefer
Copy link
Collaborator

moritzschaefer commented Jan 9, 2025

Here is a poor-quality version of the book: https://www.math.ucla.edu/~wgangbo/Cedric-Villani.pdf

(The ebook is 70Euros, if someone got access to it please feel invited to share here)

@agosztolai mentioned that if we want to prepare, we can read the introduction (correct @agosztolai?)

@EmreTaha
Copy link
Collaborator

EmreTaha commented Jan 9, 2025

It is in djvu format
cédric-villani-topics-in-optimal-transportation-2003.zip

@moritzschaefer
Copy link
Collaborator

I built a PDF which allows copy-pasting (although mathematical symbols are not always correctly identified):

cedric-villani-topics-in-optimal-transportation-2003_ocr.pdf

@moritzschaefer
Copy link
Collaborator

moritzschaefer commented Jan 11, 2025

I particularly appreciated this statement in the book's overview:
image

If the reader is not familiar with the notion of Hausdorff dimension, we recommend that he/she just forget about it and skip those statements which involve this concept, since they are not of primary importance here.

(more of that please ^^)

@agosztolai
Copy link
Collaborator Author

This is a very nice article linking Flow Matching to Optimal Transport. I don't think I can cover this on Wednesday, but it suggests a few nice follow-up articles to round off our venture into probabilistic generative modelling.

@agosztolai
Copy link
Collaborator Author

This website is rich in resources on OT.

@agosztolai
Copy link
Collaborator Author

agosztolai commented Jan 14, 2025

Hiya, this was a toughie, but I managed to compile a crash course on OT. I have uploaded my presentation and I highly recommend you take a quick look so that you can follow the discussion tomorrow.

@animesh3008
Copy link
Collaborator

@tpinetz
Copy link

tpinetz commented Jan 15, 2025

One book I liked is "Optimal Transport for Applied Mathematicians" (https://math.univ-lyon1.fr/~santambrogio/OTAM-cvgmt.pdf). It is easier to understand than Villani, but not easy.

@agosztolai
Copy link
Collaborator Author

Hi, sorry I just noticed that next week I need to be on a panel during the reading group. I can continue in two week’s time. Could someone present something instead of me?

WGANs could be one paper, which uses the duality of Kantorovich and Rubinstein. I have not covered that, but that’s the next thing I wanted to talk about, so it’s a natural continuation.

@tpinetz
Copy link

tpinetz commented Jan 15, 2025

I can do it, I have slides on that topic from my time in Bonn anyways. I would add discrete optimal transport and solvers for that.

@agosztolai
Copy link
Collaborator Author

Great, go ahead, please! You can also use my notes (if you want) to explain Kantorovich-Rubinstein duality.

@agosztolai
Copy link
Collaborator Author

Hello,

I wanted to remind you that when you learn new maths, the way you should do it is go through the main results and place them in your understanding. Knowing the proofs (details) comes only after. Please remember. Unless you want to do mathematical research in OT, I highly recommend you do not get into the technical details because you will make no progress (I guarantee) and will distract you from the big picture.

Here's the big picture of Brenier's theorem: If you have a quadratic cost function c(x,y) = |x-y|^2, then the Monge and Kantorovich problems coincide. Additionally, you do not need to solve the OT problem - instead, you can just look for a convex potential that transports your density between the endpoints. This path will automatically be optimal.

But because you were so curious, and because I wanted to show you how technical things get, I briefly looked into the question of the convexity assumption underlying Brenier's theorem. I refer you to Villani (AMS) to check the proofs if you want to understand this.

The most standard setting requires the cost c(x,y) to be quadratic (Thm 2.32, so strictly convex and C_2(R)). However, there are various relaxations.

  1. You can have strict convexity (Thm 2.44) and strict concavity (Thm 2.45).
  2. You can have normal (non-strict) convexity as long as another condition is satisfied, as I mentioned. This is the twist condition, which is that the first derivates are injective. Thanks @shemlem for the input.

Of course, all of these convexity assumptions are taken in the c-sense, i.e., with respect to the cost function c in the sense of the Legendre-Fenchel transform. Therefore, the uniqueness of derivatives needs to be interpreted in terms of conjugacy with respect to the LF transform, which is called subdifferentiability.

Brenier theorem states that under this condition there is a convex (not strictly) potential, whose gradient is the transport map.

I hope this closes the discussion. I hope you appreciate that answering these questions is highly non-trivial (cannot be explained based on elementary calculus), so you just need to leave some details to pass to give yourself a chance to understand.

@tpinetz
Copy link

tpinetz commented Jan 16, 2025

Thank you again for the nice explanation. I agree that it is unnecessary to go into the proofs, however the conditions are nice to know, e.g. strict convexity in the cost function leads to an unique solution.

For next week I will go over your slides concerning the Kantorovich-Rubinstein duality and then introduce the original Wasserstein GAN paper (https://proceedings.mlr.press/v70/arjovsky17a/arjovsky17a.pdf), Wasserstein GAN with Gradient Penalty (https://proceedings.neurips.cc/paper_files/paper/2017/file/892c3b1c6dccd52936e27cbd0ff683d6-Paper.pdf) and the discrete Wasserstein distance and solvers such as Sinkhorn (https://proceedings.neurips.cc/paper/2013/file/af21d0c97db2e27e13572cbf59eb343d-Paper.pdf).

@moritzschaefer
Copy link
Collaborator

That's great @tpinetz, thanks a lot! I created a new issue for next week, based on your content.

@moritzschaefer
Copy link
Collaborator

Thanks Adam for providing context. In general, I (and I think many others) am very happy to leave many details to pass.

What's hard is to select them: When learning new concepts (which are all struggling to comprehend) there is no way to know, which of them will be important to understand underlying concepts vs. the ones that are just 'technical details'.

Thereby, I do appreciate a lot if you (@agosztolai and other mathematicians) cut my (and other's) questions short as "not relevant".

@dariarom94
Copy link
Collaborator

Video on Lagr. vs Eulerian flows (can't guarantee it's good): https://www.youtube.com/watch?v=DVB3hwfUr0w

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants