Advancing Visual Representation and Generation Through End-to-End Training

End-to-end training is the keystone of modern deep learning. However, diffusion training remains two-stage: stage 1 learns the representation (VAE/RAE), and stage 2 trains the generative model (DiT/SiT/JiT). Our mission is to enable end-to-end training for visual representation and generation, so that both advance together rather than in separate stages.

Let's Connect

If you are interested in our research on end-to-end diffusion models and visual representation learning, we would love to hear from you. Whether you want to collaborate, discuss our work, or explore new research directions, feel free to reach out.
