Gemma 4 Fixes #4921
shimmyshimmer announced in Announcements
Hey everyone, we've updated Gemma 4 training and quants with many fixes. The bugs are universal: they affected all packages and implementations and did not originate from Unsloth. We identified and fixed them, so Gemma 4 training now works properly in Unsloth.
Only 8GB of VRAM is needed to train Gemma-4-E2B locally. Unsloth trains Gemma 4 ~1.5x faster with ~60% less VRAM than FA2 (Flash Attention 2) setups.
You can also train the 26B-A4B and 31B models, or train via Unsloth Studio. Studio and the notebooks support Vision, Text, Audio, and inference.
For more details, plus the guide and notebooks on training Gemma 4, see our blog: https://unsloth.ai/docs/models/gemma-4/train
Gemma 4 Training Fixes:
For full fix details, see our blog.
- `use_cache=False` produced gibberish for E2B and E4B - see "[Gemma 4] `use_cache=False` corrupts attention computation, producing garbage logits" huggingface/transformers#45242
- If you see losses higher than 13-15 (like 100 or 300), gradient accumulation is most likely not being accounted for properly - we have fixed this as part of Unsloth and Unsloth Studio.
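The gradient-accumulation symptom above can be sketched in plain Python. This is an illustrative example (the numbers are made up, not from the release), assuming the common failure mode where per-micro-batch mean losses are averaged instead of normalizing by the total token count, which mis-weights short micro-batches:

```python
# Sketch of the gradient-accumulation normalization issue, assuming each
# micro-batch is summarized as (summed_loss, token_count). Illustrative only.

def naive_accumulated_loss(batches):
    # Buggy weighting: average of per-micro-batch mean losses.
    # A 5-token batch counts as much as a 100-token batch.
    return sum(total / n_tokens for total, n_tokens in batches) / len(batches)

def fixed_accumulated_loss(batches):
    # Correct weighting: sum all losses, divide by the total token count,
    # matching what a single large batch would have produced.
    total_loss = sum(total for total, _ in batches)
    total_tokens = sum(n for _, n in batches)
    return total_loss / total_tokens

# (summed_loss, token_count) per micro-batch; the second batch is short.
batches = [(400.0, 100), (30.0, 5)]
print(fixed_accumulated_loss(batches))  # 430 / 105 ≈ 4.095
print(naive_accumulated_loss(batches))  # (4.0 + 6.0) / 2 = 5.0
```

The inflated value from the naive average is why mis-accounted accumulation shows up as losses well above the expected range.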
Gemma 4 Quant Re-uploads
We also updated our Gemma 4 GGUFs, so you will need to re-download them. Once again, the quant issues are not related to and did not originate from Unsloth:
- `<unused24>` tokens
- CUDA: check for buffer overlap before fusing ggml-org/llama.cpp#21566

Unsloth Studio Updates
What's Changed
New Contributors
Full Changelog: v0.1.35-beta...v0.1.36-beta
This discussion was created from the release Gemma 4 Fixes.