v5.5.2

release notes

Published 4/9/2026

Patch · Safe upgrade

Release notes

A small patch release dedicated to optimizing gemma4, fixing inference with use_cache=False (an issue caused by k/v state sharing between layers), and fixing the conversion mappings for some models that serialized their weight names inconsistently. It contains the following PRs:

Add MoE to Gemma4 TP plan (#45219) by @sywangyi and @Cyrilvallez
[gemma4] Dissociate kv states sharing from the Cache (#45312) by @Cyrilvallez
[gemma4] Remove all shared weights, and silently skip them during loading (#45336) by @Cyrilvallez
Fix conversion mappings for vlms (#45340) by @Cyrilvallez
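
As a rough illustration of what "dissociating k/v state sharing from the Cache" (#45312) can mean, here is a toy sketch in plain Python. It is not the transformers implementation: ToyLayer, run_forward, kv_source, and the arithmetic standing in for attention are all hypothetical. The only point it makes is that if shared k/v states travel in a per-forward structure rather than in the cache, a layer that reuses another layer's k/v behaves the same whether or not a cache exists.

```python
from typing import Dict, List, Optional


class ToyLayer:
    """Hypothetical layer: either computes its own k/v or reuses another layer's."""

    def __init__(self, index: int, kv_source: Optional[int] = None):
        self.index = index
        self.kv_source = kv_source  # None means "compute my own k/v"

    def forward(self, hidden: int, shared_kv: Dict[int, int],
                cache: Optional[Dict[int, List[int]]] = None) -> int:
        if self.kv_source is None:
            kv = hidden + self.index      # stand-in for a real k/v projection
            shared_kv[self.index] = kv    # published for later layers, cache or not
            if cache is not None:
                cache.setdefault(self.index, []).append(kv)
        else:
            # Read from the per-forward dict, never from the cache,
            # so this also works when no cache was created.
            kv = shared_kv[self.kv_source]
        return hidden + kv                # stand-in for attention output


def run_forward(layers: List[ToyLayer], hidden: int, use_cache: bool) -> int:
    shared_kv: Dict[int, int] = {}        # lives for a single forward pass only
    cache: Optional[Dict[int, List[int]]] = {} if use_cache else None
    for layer in layers:
        hidden = layer.forward(hidden, shared_kv, cache)
    return hidden


# Layer 2 reuses the k/v states produced by layer 1.
layers = [ToyLayer(0), ToyLayer(1), ToyLayer(2, kv_source=1)]
assert run_forward(layers, 1, use_cache=True) == run_forward(layers, 1, use_cache=False)
```

In this sketch, a variant where layer 2 read from the cache instead would fail with use_cache=False, because no cache is created in that case.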

Diff stats: +199-9846 files. Main areas: src, examples, tests, config.

Latest release

Version v5.9.0 is out. See release notes.

huggingface/transformers