Merged
Changes from 62 commits
Commits
65 commits
929522f
feat: add LLaDA2 and BlockRefinement pipelines for discrete text diff…
kashif Mar 8, 2026
b3f6cb5
feat: add BlockRefinementScheduler for commit-by-confidence scheduling
kashif Mar 8, 2026
f8220db
test: add unit tests for BlockRefinementScheduler
kashif Mar 8, 2026
bbc3592
docs: add toctree entries and standalone scheduler doc page
kashif Mar 8, 2026
2ec8083
Merge branch 'main' into llada2-support
kashif Mar 11, 2026
4bec047
Merge branch 'main' into llada2-support
kashif Mar 15, 2026
152b5bd
feat: add --revision flag and fix dtype deprecation in sample_llada2.py
kashif Mar 15, 2026
12c27eb
fix: use 1/0 attention mask instead of 0/-inf for LLaDA2 compat
kashif Mar 15, 2026
bf42b4a
refactor: consolidate training scripts into single train_block_refine…
kashif Mar 15, 2026
c6a6109
fix formatting
kashif Mar 15, 2026
73e91fd
docs: improve LLaDA2 and BlockRefinement documentation
kashif Mar 15, 2026
3d76f0c
feat: set LLaDA2Pipeline defaults to recommended model parameters
kashif Mar 15, 2026
c9d0a24
feat: default editing_threshold=0.5 for LLaDA2.1 quality mode
kashif Mar 15, 2026
3cfc78c
fix: align sampling utilities with official LLaDA2 implementation
kashif Mar 15, 2026
68a73db
refactor: remove duplicate prompt encoding, reuse mixin's _prepare_in…
kashif Mar 15, 2026
f434a9d
formatting
kashif Mar 15, 2026
317160a
fix: replace deprecated torch_dtype with dtype in examples and docstr…
kashif Mar 15, 2026
cb67651
Merge branch 'main' into llada2-support
kashif Mar 17, 2026
dceb614
Merge branch 'main' into llada2-support
kashif Mar 18, 2026
a74514e
remove BlockRefinementPipeline
kashif Mar 18, 2026
f16cfb2
cleanup
kashif Mar 18, 2026
841b5d2
fix readme
kashif Mar 18, 2026
dd0be36
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
0f2e62a
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
e12b66d
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
ab94b94
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
fffe371
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
2edc179
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
d83bac0
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
4670881
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
954ed0f
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
d857afd
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
a0cc832
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
a115365
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
edfae84
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
5628769
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
93c340b
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 18, 2026
1f79c50
removed DiscreteDiffusionPipelineMixin
kashif Mar 18, 2026
c4ed8ec
add support for 2d masks for flash attn
kashif Mar 18, 2026
44d2101
Update src/diffusers/training_utils.py
kashif Mar 19, 2026
d00d1ad
Update src/diffusers/training_utils.py
kashif Mar 19, 2026
b195ee8
fix issues from review
kashif Mar 19, 2026
3d2ef8d
Merge branch 'main' into llada2-support
kashif Mar 19, 2026
31698b6
added tests
kashif Mar 19, 2026
4b09f40
Merge branch 'llada2-support' of https://github.com/kashif/diffusers …
kashif Mar 19, 2026
872eee9
formatting
kashif Mar 19, 2026
f600f25
add check_eos_finished to scheduler
kashif Mar 19, 2026
3a5b962
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 21, 2026
c3b6c1e
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 21, 2026
8fb4124
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 21, 2026
65b625d
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 21, 2026
8f58057
Update src/diffusers/schedulers/scheduling_block_refinement.py
kashif Mar 21, 2026
766cc1f
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 21, 2026
53d4237
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 21, 2026
e03d63a
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 21, 2026
86fbcfa
Update src/diffusers/schedulers/scheduling_block_refinement.py
kashif Mar 21, 2026
33925ac
fix renaming issues and types
kashif Mar 21, 2026
b855128
remove duplicate check
kashif Mar 21, 2026
11da078
Merge branch 'main' into llada2-support
kashif Mar 21, 2026
6a9ca8e
Merge branch 'main' into llada2-support
kashif Mar 24, 2026
946d443
Update docs/source/en/api/pipelines/llada2.md
kashif Mar 24, 2026
7eaf099
Merge branch 'main' into llada2-support
kashif Mar 24, 2026
bde9849
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 25, 2026
5466fba
Update src/diffusers/pipelines/llada2/pipeline_llada2.py
kashif Mar 25, 2026
825d50f
Merge branch 'main' into llada2-support
kashif Mar 25, 2026
4 changes: 4 additions & 0 deletions docs/source/en/_toctree.yml
@@ -580,6 +580,8 @@
  title: Latent Diffusion
- local: api/pipelines/ledits_pp
  title: LEDITS++
- local: api/pipelines/llada2
  title: LLaDA2
- local: api/pipelines/longcat_image
  title: LongCat-Image
- local: api/pipelines/lumina2
@@ -718,6 +720,8 @@
- sections:
  - local: api/schedulers/overview
    title: Overview
  - local: api/schedulers/block_refinement
    title: BlockRefinementScheduler
  - local: api/schedulers/cm_stochastic_iterative
    title: CMStochasticIterativeScheduler
  - local: api/schedulers/ddim_cogvideox
83 changes: 83 additions & 0 deletions docs/source/en/api/pipelines/llada2.md
@@ -0,0 +1,83 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# LLaDA2

[LLaDA2](https://huggingface.co/collections/inclusionAI/llada21) is a family of discrete diffusion language models
that generate text through block-wise iterative refinement. Instead of autoregressive token-by-token generation,
LLaDA2 starts with a fully masked sequence and progressively unmasks tokens by confidence over multiple refinement
steps.
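
The refinement loop described above can be illustrated with a toy sketch. This is not the pipeline's implementation: `MASK`, `toy_model`, and `generate` are hypothetical names, and the "model" just returns random tokens and confidences. It only shows the control flow of starting fully masked, working block by block, and committing positions by confidence.

```python
import random

MASK = -1  # hypothetical mask token id, used only for this illustration


def toy_model(tokens, rng):
    """Stand-in for a real denoiser: one (token, confidence) guess per position."""
    return [(rng.randrange(100), rng.random()) for _ in tokens]


def generate(gen_length=8, block_length=4, steps=4, threshold=0.7, seed=0):
    rng = random.Random(seed)
    tokens = [MASK] * gen_length  # start from a fully masked sequence
    for start in range(0, gen_length, block_length):  # blocks, left to right
        block = range(start, min(start + block_length, gen_length))
        for step in range(steps):
            masked = [i for i in block if tokens[i] == MASK]
            if not masked:
                break  # block is fully committed
            preds = toy_model(tokens, rng)
            # Commit masked positions whose confidence clears the threshold.
            chosen = [i for i in masked if preds[i][1] >= threshold]
            if step == steps - 1:
                # Final step: commit everything still masked in this block.
                chosen = masked
            elif not chosen:
                # Nothing cleared the threshold: commit the single most
                # confident position so every step makes progress.
                chosen = [max(masked, key=lambda i: preds[i][1])]
            for i in chosen:
                tokens[i] = preds[i][0]
    return tokens


print(generate())
```

In this sketch a lower `threshold` commits more positions per step, so a block finishes in fewer steps; a higher one defers low-confidence positions to later steps.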

## Usage

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

from diffusers import BlockRefinementScheduler, LLaDA2Pipeline

model_id = "inclusionAI/LLaDA2.1-mini"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
scheduler = BlockRefinementScheduler()

pipe = LLaDA2Pipeline(model=model, scheduler=scheduler, tokenizer=tokenizer)
output = pipe(
    prompt="Write a short poem about the ocean.",
    gen_length=256,
    block_length=32,
    num_inference_steps=32,
    threshold=0.7,
    editing_threshold=0.5,
    max_post_steps=16,
    temperature=0.0,
)
print(output.texts[0])
```

## Callbacks

Callbacks run after each refinement step and can inspect or modify the current tokens.

```py
def on_step_end(pipe, step, timestep, callback_kwargs):
    cur_x = callback_kwargs["cur_x"]
    # Inspect or modify `cur_x` here.
    return {"cur_x": cur_x}

out = pipe(
    prompt="Write a short poem.",
    callback_on_step_end=on_step_end,
    callback_on_step_end_tensor_inputs=["cur_x"],
)
```
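
As a further illustration, a callback can be built by a small factory so that it keeps state between steps. The sketch below assumes only the callback signature shown above; `make_step_recorder` is a hypothetical helper, not part of the library.

```python
def make_step_recorder():
    """Return (callback, history); the callback appends one entry per step."""
    history = []

    def on_step_end(pipe, step, timestep, callback_kwargs):
        cur_x = callback_kwargs["cur_x"]
        # Record which step ran; `cur_x` could be inspected here as well.
        history.append({"step": step, "timestep": timestep})
        # Hand the tokens back unchanged so generation is not affected.
        return {"cur_x": cur_x}

    return on_step_end, history


callback, history = make_step_recorder()
# Pass `callback_on_step_end=callback` and
# `callback_on_step_end_tensor_inputs=["cur_x"]` to `pipe(...)`,
# then read `history` after the call returns.
```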

## Recommended parameters

LLaDA2.1 models support two modes:

| Mode | `threshold` | `editing_threshold` | `max_post_steps` |
|------|-------------|---------------------|------------------|
| Quality | 0.7 | 0.5 | 16 |
| Speed | 0.5 | 0.0 | 16 |

For LLaDA2.0 models, disable editing by passing `editing_threshold=None`.

For all models: `block_length=32`, `temperature=0.0`, `num_inference_steps=32`.
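
One way to keep these settings in one place is a small preset helper. This is only a sketch: `RECOMMENDED`, `COMMON`, and `recommended_kwargs` are hypothetical names, the values are copied from the table above, and the step count uses the `num_inference_steps` keyword from the usage example.

```python
# Per-mode values from the table above (LLaDA2.1).
RECOMMENDED = {
    "quality": {"threshold": 0.7, "editing_threshold": 0.5, "max_post_steps": 16},
    "speed": {"threshold": 0.5, "editing_threshold": 0.0, "max_post_steps": 16},
}
# Values recommended for all models.
COMMON = {"block_length": 32, "temperature": 0.0, "num_inference_steps": 32}


def recommended_kwargs(mode="quality", editing=True):
    """Build keyword arguments for the pipeline call for a given mode."""
    kwargs = {**COMMON, **RECOMMENDED[mode]}
    if not editing:
        # LLaDA2.0 models: disable editing.
        kwargs["editing_threshold"] = None
    return kwargs


print(recommended_kwargs("speed"))
```

The result can then be unpacked into the call, for example `pipe(prompt=..., **recommended_kwargs("quality"))`.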

## LLaDA2Pipeline
[[autodoc]] LLaDA2Pipeline
  - all
  - __call__

## LLaDA2PipelineOutput
[[autodoc]] pipelines.LLaDA2PipelineOutput
1 change: 1 addition & 0 deletions docs/source/en/api/pipelines/overview.md
@@ -63,6 +63,7 @@ The table below lists all the pipelines currently available in 🤗 Diffusers an
| [Latent Diffusion](latent_diffusion) | text2image, super-resolution |
| [Latte](latte) | text2image |
| [LEDITS++](ledits_pp) | image editing |
| [LLaDA2](llada2) | text2text |
| [Lumina-T2X](lumina) | text2image |
| [Marigold](marigold) | depth-estimation, normals-estimation, intrinsic-decomposition |
| [MultiDiffusion](panorama) | text2image |
25 changes: 25 additions & 0 deletions docs/source/en/api/schedulers/block_refinement.md
@@ -0,0 +1,25 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# BlockRefinementScheduler

The `BlockRefinementScheduler` manages block-wise iterative refinement for discrete token diffusion. At each step it
commits the most confident tokens and optionally edits already-committed tokens when the model predicts a different
token with high confidence.

This scheduler is used by [`LLaDA2Pipeline`].
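
A single commit-and-edit step can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the scheduler's actual code: `MASK` and `refine_step` are hypothetical, and the real scheduler may select and edit tokens differently.

```python
MASK = -1  # hypothetical mask token id, used only for this illustration


def refine_step(tokens, pred_tokens, pred_conf, threshold=0.7, editing_threshold=0.5):
    """One toy step: commit confident masked tokens, then optionally edit."""
    out = list(tokens)
    masked = [i for i, t in enumerate(tokens) if t == MASK]

    # Commit: fill masked positions whose confidence clears `threshold`.
    commit = [i for i in masked if pred_conf[i] >= threshold]
    if masked and not commit:
        # Nothing cleared the threshold: commit the most confident position.
        commit = [max(masked, key=lambda i: pred_conf[i])]
    for i in commit:
        out[i] = pred_tokens[i]

    # Edit: overwrite an already-committed token when the model predicts a
    # different token with confidence at or above `editing_threshold`.
    if editing_threshold is not None:
        for i, t in enumerate(tokens):
            if t != MASK and pred_tokens[i] != t and pred_conf[i] >= editing_threshold:
                out[i] = pred_tokens[i]
    return out


tokens = [5, MASK, MASK, 7]
pred_tokens = [5, 11, 12, 9]
pred_conf = [0.9, 0.8, 0.2, 0.6]
print(refine_step(tokens, pred_tokens, pred_conf))  # [5, 11, -1, 9]
```

Here position 1 is committed, position 2 stays masked for a later step, and position 3 is edited from 7 to 9. Passing `editing_threshold=None` skips the edit pass, so position 3 would keep its original token.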

## BlockRefinementScheduler
[[autodoc]] BlockRefinementScheduler

## BlockRefinementSchedulerOutput
[[autodoc]] schedulers.scheduling_block_refinement.BlockRefinementSchedulerOutput
50 changes: 50 additions & 0 deletions examples/discrete_diffusion/README.md
@@ -0,0 +1,50 @@
# Discrete Token Diffusion (Experimental)

This folder contains **training and sampling examples** for *discrete diffusion over token IDs* (language-model style), built to follow the `diffusers` + `accelerate` training conventions.

## LLaDA2

[LLaDA2](https://huggingface.co/collections/inclusionAI/llada21) generates text through block-wise iterative refinement. Instead of autoregressive token-by-token generation, it starts with a fully masked sequence and progressively unmasks tokens by confidence over multiple refinement steps.

### Train

The training script uses a confidence-aware loss and works with any causal LM from the Hub (e.g. Qwen, Llama, Mistral):

```bash
accelerate launch examples/discrete_diffusion/train_llada2.py \
  --model_name_or_path Qwen/Qwen2.5-0.5B \
  --dataset_name wikitext \
  --dataset_config_name wikitext-2-raw-v1 \
  --text_column text \
  --output_dir llada2-output \
  --max_train_steps 1000 \
  --prompt_length 32 \
  --block_length 32 \
  --lambda_conf 2.0 \
  --conf_temperature 0.5
```

If you don't want to download a dataset, you can use random-token data:

```bash
accelerate launch examples/discrete_diffusion/train_llada2.py \
  --model_name_or_path Qwen/Qwen2.5-0.5B \
  --output_dir llada2-output \
  --use_dummy_data \
  --num_dummy_samples 2048
```

### Sample

```bash
python examples/discrete_diffusion/sample_llada2.py \
  --model_id inclusionAI/LLaDA2.1-mini \
  --prompt "Write a short poem about the ocean." \
  --gen_length 256 \
  --num_inference_steps 32 \
  --threshold 0.7 \
  --editing_threshold 0.5 \
  --max_post_steps 16 \
  --use_chat_template \
  --add_generation_prompt
```