Sitemap

A list of all the posts and pages found on the site. For the robots out there, an XML version is available for digesting as well.

Pages

Posts

• Fast Generation with Flow Matching

13 minute read

Published:

Fast sampling has become a central goal in generative modeling, enabling the transition from high-fidelity but computationally intensive diffusion models to real-time generation systems. While diffusion models rely on tailored numerical solvers to mitigate the stiffness of their probability flow ODEs, flow matching defines dynamics through smooth interpolation paths, fundamentally altering the challenges of acceleration. This article provides a comprehensive overview of fast sampling in flow matching, with emphasis on path linearization strategies (e.g., Rectified Flow, ReFlow, SlimFlow, InstaFlow), the integration of consistency models, and emerging approaches such as flow generators.
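To make the sampling loop these methods accelerate concrete, here is a minimal sketch (illustrative only, not code from the post): once a velocity field is learned, generation is just numerical integration from noise to data, and the straighter the learned path, the coarser the steps can be. `velocity_model` is a hypothetical stand-in for any trained flow-matching network.

```python
import torch

def euler_sample(velocity_model, x0, num_steps=8):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (data) with fixed-step Euler.

    Few steps suffice when the learned path is nearly straight, which is
    exactly what rectification-style methods (ReFlow, InstaFlow) aim for.
    """
    x = x0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((x.shape[0],), i * dt, device=x.device)
        x = x + dt * velocity_model(x, t)  # one explicit Euler step
    return x

# usage sketch: x1 = euler_sample(model, torch.randn(16, 3, 32, 32))
```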

• From Diffusion to Flow — Seeking the Elegant Path

28 minute read

Published:

In this post, we uncover the foundations of Flow Matching: the limitations of diffusion models, the constraints of continuous flows, and the transformative idea of directly learning the path between distributions. From the intuition of Rectified Flow to the unifying lens of Stochastic Interpolants, Flow Matching emerges as more than a method: it is a paradigm that reframes generation as learning currents of transformation. With this conceptual map in hand, we are ready to move from theory to practice.
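For reference, the straight-line interpolant and regression objective behind Rectified Flow can be written compactly (standard formulation, stated here for convenience):

```latex
x_t = (1 - t)\,x_0 + t\,x_1, \qquad t \in [0, 1]

\mathcal{L}(\theta) = \mathbb{E}_{t,\,x_0,\,x_1}\,
  \bigl\| v_\theta(x_t, t) - (x_1 - x_0) \bigr\|^2
```

The network regresses the constant velocity of the straight path between a noise sample $x_0$ and a data sample $x_1$; "learning the path between distributions" is literally this regression.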

• High-Order PF-ODE Solver in Diffusion Models

40 minute read

Published:

Diffusion sampling can be cast as integrating the probability flow ODE (PF-ODE), but dropping it into a generic ODE toolbox rarely delivers the best speed–quality trade-off. This post first revisits core numerical-analysis ideas. It then explains why vanilla integrators underperform on the semi-linear, sometimes stiff PF-ODE in low-NFE regimes, and surveys families that exploit diffusion-specific structure: pseudo-numerical samplers (PLMS/PNDM) and semi-analytic/high-order solvers (DEIS, DPM-Solver/++/UniPC). The goal is a practical, unified view of when and why these PF-ODE samplers work beyond “just use RK4.”
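As background for why generic integrators struggle here, the PF-ODE has a semi-linear structure (standard VP notation), and the solver families named above exploit exactly this split:

```latex
\frac{\mathrm{d}x_t}{\mathrm{d}t}
  = \underbrace{f(t)\,x_t}_{\text{linear, solvable in closed form}}
  \;-\;
  \underbrace{\tfrac{1}{2}\,g(t)^2\,\nabla_x \log p_t(x_t)}_{\text{nonlinear score term, approximated}}
```

Exponential integrators such as DPM-Solver handle the linear drift exactly and spend every network evaluation on the score term, which is why they hold up at low NFE where vanilla RK4 degrades.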

• A Panoramic View of Diffusion Model Sampling: From Classic Theory to Frontier Research

34 minute read

Published:

This article takes a deep dive into the evolution of diffusion model sampling techniques, tracing the progression from early score-based models with Langevin Dynamics, through discrete and non-Markov diffusion processes, to continuous-time SDE/ODE formulations, specialized numerical solvers, and cutting-edge methods such as consistency models, distillation, and flow matching. Our goal is to provide both a historical perspective and a unified theoretical framework to help readers understand not only how these methods work but why they were developed.
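As a concrete anchor for the earliest point in that timeline, unadjusted Langevin dynamics draws samples using only the score function. A minimal sketch, where `score_fn` is a hypothetical approximation of the score $\nabla_x \log p(x)$:

```python
import torch

def langevin_sample(score_fn, x, step_size=1e-2, num_steps=100):
    """Unadjusted Langevin dynamics: x <- x + (eps/2) * score(x) + sqrt(eps) * z.

    With a learned score network in place of score_fn, this is the sampler
    used by early score-based generative models.
    """
    for _ in range(num_steps):
        noise = torch.randn_like(x)
        x = x + 0.5 * step_size * score_fn(x) + (step_size ** 0.5) * noise
    return x
```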

• Diffusion Architectures, Part III: Multi-modal and Generalization-Oriented Designs

65 minute read

Published:

Diffusion models are no longer limited to images; they increasingly serve as universal generative frameworks across text, video, audio, and 3D. This article explores how architectures evolve to support multi-modal conditioning, including cross-attention, joint tokenization (MMDiT), and state-space alternatives such as S4 and Mamba. We further discuss design principles that promote generalization across tasks and domains, positioning diffusion as a foundation for multi-modal AI. The analysis highlights emerging directions where architecture becomes the key enabler of flexible, general-purpose generative systems.
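To illustrate the most common of these conditioning mechanisms, here is a minimal cross-attention block in its standard form (a sketch, not code from the article; dimensions are illustrative): image tokens produce queries, and text or other conditioning tokens produce keys and values.

```python
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    """Minimal cross-attention: image tokens attend to conditioning tokens."""

    def __init__(self, dim=512, cond_dim=768):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)        # queries from image tokens
        self.to_k = nn.Linear(cond_dim, dim)   # keys from conditioning tokens
        self.to_v = nn.Linear(cond_dim, dim)   # values from conditioning tokens
        self.scale = dim ** -0.5

    def forward(self, x, cond):
        # x: (B, N, dim) image tokens; cond: (B, M, cond_dim) condition tokens
        q, k, v = self.to_q(x), self.to_k(cond), self.to_v(cond)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v  # (B, N, dim)
```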

• Diffusion Architectures Part II: Efficiency-Oriented Designs

55 minute read

Published:

Efficiency is a defining challenge for diffusion models, which often suffer from high computational cost and slow inference. This article surveys architectural strategies that enhance efficiency, from latent-space diffusion and multi-resolution cascades to lightweight convolutional blocks, efficient attention mechanisms, and parameter-efficient modules like LoRA. We also examine distillation and inference-time acceleration techniques that drastically reduce sampling steps. Together, these approaches demonstrate how architectural design can expand the reach of diffusion models — from research labs to real-time and mobile applications.
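As one concrete example of the parameter-efficient modules mentioned, here is a LoRA-style linear layer in its standard form (an illustrative sketch; the rank and scaling values are placeholder choices):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""

    def __init__(self, base: nn.Linear, rank=4, alpha=8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # base weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero-init: no-op at start
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Only `A` and `B` are trained, so the number of updated parameters scales with the rank rather than with the full weight matrix.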

• Diffusion Architectures Part I: Stability-Oriented Designs

72 minute read

Published:

This article explores how network architectures shape the stability of diffusion model training. We contrast U-Net and Transformer-based (DiT) backbones, analyzing how skip connections, residual scaling, and normalization influence gradient propagation across noise levels. By surveying stability-oriented innovations such as AdaGN, AdaLN-Zero, and skip pathway regulation, we reveal why architectural choices can determine whether training converges smoothly or collapses. The discussion provides both theoretical insights and practical design rules for building robust diffusion models.
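To ground the AdaLN-Zero idea: the modulation network is zero-initialized so that every residual branch starts as a no-op, which is the stability mechanism at stake. A simplified sketch (the DiT version emits separate modulation parameters for the attention and MLP branches):

```python
import torch
import torch.nn as nn

class AdaLNZero(nn.Module):
    """Adaptive LayerNorm whose scale/shift/gate come from the conditioning vector.

    Zero-initializing the modulation layer makes gate = 0 at the start of
    training, so each residual block initially acts as the identity.
    """

    def __init__(self, dim, cond_dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.mod = nn.Linear(cond_dim, 3 * dim)
        nn.init.zeros_(self.mod.weight)
        nn.init.zeros_(self.mod.bias)

    def forward(self, x, cond, block):
        # x: (B, N, dim); cond: (B, cond_dim); block: the attention or MLP sublayer
        shift, scale, gate = self.mod(cond).unsqueeze(1).chunk(3, dim=-1)
        h = self.norm(x) * (1 + scale) + shift
        return x + gate * block(h)  # gate == 0 at init: residual branch is off
```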

• Analysis of the Stability and Efficiency of Diffusion Model Training

65 minute read

Published:

While diffusion models have revolutionized generative AI, their training challenges stem from a combination of resource intensity, optimization intricacies, and deployment hurdles. A stable training process ensures that the model produces good-quality samples and converges efficiently without suffering from numerical instabilities.

• Unifying Discrete and Continuous Perspectives in Diffusion Models

18 minute read

Published:

Diffusion models have proven to be a highly promising approach to image generation. They treat image generation as two complementary processes: the forward process, which transforms a complex data distribution into a known prior distribution (typically a standard normal) by gradually injecting noise; and the reverse process, which transforms the prior distribution back into the data distribution by gradually removing that noise.
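In the standard discrete (DDPM-style) notation, the forward process admits a closed-form marginal at any step, and this marginal is the quantity that bridges the discrete and continuous views:

```latex
q(x_t \mid x_0)
  = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1 - \bar{\alpha}_t)\,I\right),
\qquad
\bar{\alpha}_t = \prod_{s=1}^{t} (1 - \beta_s)
```

Taking the step size to zero turns this discrete noising schedule into the continuous-time SDE formulation.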

awards

books

patents

projects

publications

services

talks