  • 23,035 views
  • Published 4 months ago by Jia-Bin Huang

Diffusion Language Models: The Next Big Shift in GenAI

Most large language models (LLMs) today are autoregressive: they predict text one token at a time, in left-to-right order. Diffusion models, by contrast, offer iterative refinement, flexible control, and faster sampling. In this video, we explore several ideas for applying diffusion models to language modeling.

00:00 Autoregressive LLMs
00:13 Limitations of autoregressive models
00:56 How diffusion models work for images
01:26 DiffusionLM: Apply diffusion to word embeddings
02:46 Latent diffusion models: Apply diffusion to paragraph embeddings
03:37 Masked diffusion models
07:41 Scaling laws of diffusion models
08:53 Comparing AR and diffusion models in data-constrained settings

References:

Continuous diffusion on word/paragraph embeddings:
- Diffusion-LM
- Latent Diffusion for Language Generation
- PLANNER

Discrete diffusion:
- D3PM
- SEED

Masked diffusion:

Large Diffusion LLM:
- LLaDA
- Dream 7B

Scaling:
- Mercury (The fastest commercial-grade diffusion LLM)

Blog: A nice overview of diffusion LLMs

Video made with Manim