
Conversation

@renovate renovate bot commented Jan 16, 2026

ℹ️ Note

This PR body was truncated due to platform limits.

This PR contains the following updates:

Package: cache_dit
Change: ==1.1.10 → ==1.2.2

Release Notes

vipshop/cache-dit (cache_dit)

v1.2.2

Compare Source

What's Changed

Full Changelog: vipshop/cache-dit@v1.2.1...v1.2.2

v1.2.1: USP, 2D/3D Parallel

Compare Source

🎉 The v1.2.1 release is ready. The major updates include: Ring Attention w/ batched P2P, USP (Hybrid Ring and Ulysses), Hybrid 2D and 3D Parallelism (💥USP + TP), and reduced VAE-P communication overhead.

# Hybrid 2D/3D Parallelism in Cache-DiT is fully compatible w/ torch.compile, 
# Cache Acceleration, Text Encoder Parallelism, VAE Parallelism and more.
torchrun --nproc_per_node=8 -m cache_dit.generate flux2 --config parallel_2d.yaml --compile
torchrun --nproc_per_node=8 -m cache_dit.generate flux2 --config parallel_3d.yaml --compile
torchrun --nproc_per_node=8 -m cache_dit.generate --parallel ulysses_tp --cache --compile
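
For reference, here is a minimal programmatic sketch of the same hybrid 2D (USP + TP) setup. It assumes that the ulysses_size and tp_size options shown in the v1.2.0 examples below can be combined in a single ParallelismConfig, and uses FLUX.1-dev plus default DBCache settings purely as illustrative choices:

# Hedged sketch: combine context (Ulysses) and tensor parallelism in one config.
import cache_dit
from cache_dit import DBCacheConfig, ParallelismConfig
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("black-forest-labs/FLUX.1-dev")  # example model
cache_dit.enable_cache(
    pipe,
    cache_config=DBCacheConfig(),           # default DBCache settings (assumed valid)
    parallelism_config=ParallelismConfig(
        ulysses_size=4,                     # context parallel (Ulysses) degree
        tp_size=2,                          # tensor parallel degree
    ),
)
image = pipe("a photo of a cat").images[0]

# torchrun --nproc_per_node=8 parallel_2d.py  (4 x 2 = 8 processes)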

What's Changed

New Contributors

Full Changelog: vipshop/cache-dit@v1.2.0...v1.2.1

v1.2.0: Major Release: NPU, TE-P, VAE-P, CN-P, ...

Compare Source

v1.2.0 Major Release: NPU, TE-P, VAE-P, CN-P, ...

Overviews

v1.2.0 is a major release following v1.1.0. It introduces many updates that further improve the ease of use and performance of Cache-DiT. We sincerely thank the contributors of Cache-DiT. The main updates in this release are:

  • 🎉New Models Support
  • 🎉Request level cache context
  • 🎉HTTP Serving Support
  • 🎉Context Parallelism Optimization
  • 🎉Text Encoder Parallelism
  • 🎉Auto Encoder (VAE) Parallelism
  • 🎉ControlNet Parallelism
  • 🎉Ascend NPU Support
  • 🎉Community Integration
🔥New Models Support
  • Qwen-Image:
    • Image: Qwen-Image-2512, Qwen-Image-Layered
    • Edit: Qwen-Image-Edit-2511, Qwen-Image-Edit-2509
    • ControlNet: Qwen-Image-ControlNet, Qwen-Image-ControlNet-Inpainting
  • Qwen-Image-Lightning: Qwen-Image-Lightning series, Qwen-Image-Edit-Lightning series
  • Wan: Wan 2.1 VACE, Wan 2.2 VACE.
  • Z-Image: Z-Image-Turbo, Z-Image-Turbo-Fun-ControlNet-2.0, Z-Image-Turbo-Fun-ControlNet-2.1
  • FLUX.2: FLUX.2-dev, FLUX.2-Klein-4B, FLUX.2-Klein-base-4B, FLUX.2-Klein-9B, FLUX.2-Klein-base-9B
  • LTX-2: LTX-2-I2V, LTX-2-T2V by @​BBuf
  • Ovis-Image: Ovis-Image
  • LongCat-Image: LongCat-Image, LongCat-Image-Edit
  • Nunchaku INT4 Models: Z-Image-Turbo, Qwen-Image-Edit-2511
🔥Request level cache context

If you need a different num_inference_steps for each user request instead of a fixed value, use it in conjunction with the refresh_context API: before running inference for each user request, update the cache context with the actual number of steps. Please refer to 📚run_cache_refresh for an example.

import cache_dit
from cache_dit import DBCacheConfig
from diffusers import DiffusionPipeline

# Init cache context with num_inference_steps=None (default)
pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image")
pipe = cache_dit.enable_cache(pipe.transformer, cache_config=DBCacheConfig(num_inference_steps=None))

# Assume num_inference_steps is 28, and we want to refresh the context
cache_dit.refresh_context(pipe.transformer, num_inference_steps=28, verbose=True)
output = pipe(...) # Just call the pipe as normal.
stats = cache_dit.summary(pipe.transformer) # Then, get the summary

# Update the cache context with new num_inference_steps=50.
cache_dit.refresh_context(pipe.transformer, num_inference_steps=50, verbose=True)
output = pipe(...) # Just call the pipe as normal.
stats = cache_dit.summary(pipe.transformer) # Then, get the summary

# Update the cache context with new cache_config.
cache_dit.refresh_context(
    pipe.transformer,
    cache_config=DBCacheConfig(
        residual_diff_threshold=0.1,
        max_warmup_steps=10,
        max_cached_steps=20,
        max_continuous_cached_steps=4,
        # The cache settings should all be located in the cache config 
        # if cache config is provided. Otherwise, we will skip it.
        num_inference_steps=50,
    ),
    verbose=True,
)
output = pipe(...) # Just call the pipe as normal.
stats = cache_dit.summary(pipe.transformer) # Then, get the summary
🔥HTTP Serving Support
  • Built-in HTTP serving deployment support with simple REST APIs by @​BBuf: deploy cache-dit models behind an HTTP API for text-to-image, image editing, multi-image editing, and text/image-to-video generation (a client-side sketch follows below).
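
A minimal client-side sketch of calling such a deployment; the endpoint path, port, payload fields, and response shape below are assumptions for illustration only, not the documented cache-dit serving API:

# Hypothetical client for a cache-dit HTTP serving deployment.
# The URL, JSON fields, and response format are assumed, not taken from cache-dit docs.
import base64
import requests

resp = requests.post(
    "http://localhost:8000/generate",                                 # assumed endpoint
    json={"prompt": "a photo of a cat", "num_inference_steps": 28},   # assumed payload
    timeout=600,
)
resp.raise_for_status()
with open("output.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["image"]))                   # assumed response field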
🔥Context Parallelism Optimization
🔥Text Encoder Parallelism

Currently, cache-dit supports text encoder parallelism for the T5Encoder, UMT5Encoder, Llama, Gemma 1/2/3, Mistral, Mistral-3, Qwen-3, Qwen-2.5 VL, Glm and Glm-4 model series, which covers almost 🔥ALL pipelines in diffusers.

Users can set the extra_parallel_modules parameter in parallelism_config (when using Tensor Parallelism or Context Parallelism) to specify additional modules to parallelize beyond the main transformer, e.g., the text_encoder in Flux2Pipeline. This can further reduce the per-GPU memory requirement and slightly improve the inference performance of the text encoder.

# pip3 install "cache-dit[parallelism]"
import cache_dit
from cache_dit import DBCacheConfig, ParallelismConfig

# Transformer Tensor Parallelism + Text Encoder Tensor Parallelism
cache_dit.enable_cache(
    pipe, 
    cache_config=DBCacheConfig(...),
    parallelism_config=ParallelismConfig(
        tp_size=2,
        parallel_kwargs={
            "extra_parallel_modules": [pipe.text_encoder], # FLUX.2
        },
    ),
)
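
As with the ControlNet example below, this script would be launched with torchrun, e.g. torchrun --nproc_per_node=2 parallel_te.py, so that the number of processes matches tp_size=2 (the script name here is just a placeholder).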
🔥Auto Encoder (VAE) Parallelism

Currently, cache-dit supports auto encoder (VAE) parallelism for the AutoencoderKL, AutoencoderKLQwenImage, AutoencoderKLWan, and AutoencoderKLHunyuanVideo series, which covers almost 🔥ALL pipelines in diffusers. It can further reduce the per-GPU memory requirement and slightly improve the inference performance of the auto encoder. Users can enable it via the extra_parallel_modules parameter in parallelism_config, for example:

# pip3 install "cache-dit[parallelism]"
import cache_dit
from cache_dit import DBCacheConfig, ParallelismConfig

# Transformer Context Parallelism + Text Encoder Tensor Parallelism + VAE Data Parallelism
cache_dit.enable_cache(
    pipe, 
    cache_config=DBCacheConfig(...),
    parallelism_config=ParallelismConfig(
        ulysses_size=2,
        parallel_kwargs={
            "extra_parallel_modules": [pipe.text_encoder, pipe.vae], # FLUX.1
        },
    ),
)
🔥ControlNet Parallelism

Further, cache-dit also supports ControlNet parallelism for specific models, such as Z-Image-Turbo with ControlNet. Users can enable it via the extra_parallel_modules parameter in parallelism_config, for example:

# pip3 install "cache-dit[parallelism]"
import cache_dit
from cache_dit import DBCacheConfig, ParallelismConfig

# Transformer Context Parallelism + Text Encoder Tensor Parallelism
# + VAE Data Parallelism + ControlNet Context Parallelism
cache_dit.enable_cache(
    pipe, 
    cache_config=DBCacheConfig(...),
    parallelism_config=ParallelismConfig(
        ulysses_size=2,
        # case: Z-Image-Turbo-Fun-ControlNet-2.1
        parallel_kwargs={
            "extra_parallel_modules": [pipe.text_encoder, pipe.vae, pipe.controlnet],
        },
    ),
)

# torchrun --nproc_per_node=2 parallel_cache.py
🔥Ascend NPU Support

Cache-DiT now provides native support for Ascend NPU (by @​gameofdimension @​luren55 @​DefTruth). Theoretically, nearly all models supported by Cache-DiT can run on Ascend NPU with most of Cache-DiT’s optimization technologies, including:

  • Hybrid Cache Acceleration (DBCache, DBPrune, TaylorSeer, SCM and more)
  • Context Parallelism (w/ Extended Diffusers' CP APIs, UAA, Async Ulysses, ...)
  • Tensor Parallelism (w/ PyTorch native DTensor and Tensor Parallelism APIs)
  • Text Encoder Parallelism (w/ PyTorch native DTensor and Tensor Parallelism APIs)
  • Auto Encoder (VAE) Parallelism (w/ Data or Tile Parallelism, avoid OOM)
  • ControlNet Parallelism (w/ Context Parallelism for ControlNet module)
  • Built-in HTTP serving deployment support with simple REST APIs

Please refer to the Ascend NPU Supported Matrix for more details.
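
As a minimal sketch of what NPU usage might look like, assuming torch_npu is installed and registers the "npu" device, and that default DBCache settings are acceptable (the model choice is only an example):

import torch
import torch_npu  # assumption: Ascend CANN toolkit + torch_npu are installed and register "npu"
import cache_dit
from cache_dit import DBCacheConfig
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image", torch_dtype=torch.bfloat16)
pipe.to("npu")  # place the pipeline on the Ascend NPU instead of a CUDA device
cache_dit.enable_cache(pipe, cache_config=DBCacheConfig())  # hybrid cache acceleration as on GPU
image = pipe("a photo of a cat").images[0]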

🔥Community Integration

Full Changelogs


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate bot force-pushed the renovate/cache_dit-1.x branch from 9463dfe to fd0ef19 on February 2, 2026 05:23
@renovate renovate bot changed the title from "Update dependency cache_dit to v1.2.0" to "Update dependency cache_dit to v1.2.1" on Feb 2, 2026
@renovate renovate bot force-pushed the renovate/cache_dit-1.x branch from fd0ef19 to c65665d on February 10, 2026 08:50
@renovate renovate bot changed the title from "Update dependency cache_dit to v1.2.1" to "Update dependency cache_dit to v1.2.2" on Feb 10, 2026
