Commit dc20d8f

arcticflyclaude and Claude committed
Fix LocalBackend fork_checkpoint to overwrite initial LoRA for vLLM
When forking a checkpoint, the source checkpoint was copied to checkpoints/{source_step} in the destination model directory. However, model.register(backend) already created an empty LoRA at checkpoints/0000. When vLLM starts, it loads @0, which resolves to the empty 0000 checkpoint rather than the forked one.

Fix by also copying the forked weights to checkpoints/0000 so that vLLM loads the correct weights on startup.

Fixes #651

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 4cbfa15 commit dc20d8f
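To make the failure mode concrete, here is a minimal, self-contained sketch of the layout described in the commit message. It is not the ART codebase itself: the checkpoints/NNNN naming and the local get_step_checkpoint_dir helper are assumptions modeled on the diff below, and the file contents are stand-ins for real LoRA weights.

```python
import os
import shutil
import tempfile


def get_step_checkpoint_dir(model_dir: str, step: int) -> str:
    # Assumed layout: {model_dir}/checkpoints/0000, 0001, ... (per the diff)
    return os.path.join(model_dir, "checkpoints", f"{step:04d}")


root = tempfile.mkdtemp()
source_model_dir = os.path.join(root, "source-model")
dest_model_dir = os.path.join(root, "forked-model")

# The source model has trained weights at, say, step 5.
source_step = 5
source_checkpoint_dir = get_step_checkpoint_dir(source_model_dir, source_step)
os.makedirs(source_checkpoint_dir)
with open(os.path.join(source_checkpoint_dir, "adapter_model.bin"), "w") as f:
    f.write("trained weights")  # stand-in for real LoRA tensors

# model.register(backend) has already created an empty LoRA at step 0.
os.makedirs(get_step_checkpoint_dir(dest_model_dir, 0))

# The fork copies the source checkpoint to checkpoints/{source_step} only.
dest_checkpoint_dir = get_step_checkpoint_dir(dest_model_dir, source_step)
shutil.copytree(source_checkpoint_dir, dest_checkpoint_dir)

# vLLM loads @0, i.e. checkpoints/0000 -- still empty at this point.
step0_dir = get_step_checkpoint_dir(dest_model_dir, 0)
print("step 0 before fix:", os.listdir(step0_dir))  # []

# The fix: overwrite step 0 with the forked weights as well.
if os.path.exists(step0_dir) and step0_dir != dest_checkpoint_dir:
    shutil.rmtree(step0_dir)
    shutil.copytree(dest_checkpoint_dir, step0_dir)
print("step 0 after fix:", os.listdir(step0_dir))  # ['adapter_model.bin']
```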

1 file changed: src/art/local/backend.py (9 additions, 0 deletions)
@@ -1434,6 +1434,15 @@ async def _experimental_fork_checkpoint(

         shutil.copytree(source_checkpoint_dir, dest_checkpoint_dir)

+        # Also overwrite the initial empty checkpoint at step 0 so vLLM
+        # loads the forked weights on startup (it uses @0 by default)
+        step0_dir = get_step_checkpoint_dir(dest_model_dir, 0)
+        if os.path.exists(step0_dir) and step0_dir != dest_checkpoint_dir:
+            if verbose:
+                print(f"Overwriting initial checkpoint at {step0_dir} with forked weights")
+            shutil.rmtree(step0_dir)
+            shutil.copytree(dest_checkpoint_dir, step0_dir)
+
         if verbose:
             print(
                 f"Successfully forked checkpoint from {from_model} (step {selected_step}) to {model.name}"
