Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2024-04-04 | 1.7 kB | |
v0.1.2 source code.tar.gz | 2024-04-04 | 105.6 kB | |
v0.1.2 source code.zip | 2024-04-04 | 135.0 kB | |
Totals: 3 Items | | 242.3 kB | 0 |
- MQA implementation
- Ops refactorings and optimizations
- Bugfixes
- Model exporting script (`util/convert_weights.py`)
Important Note: With the MQA implementation, older 2B model artifacts need to be updated. Please re-download weights from Kaggle and ensure you have the latest version (-mqa or version 3).
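For readers unfamiliar with MQA (multi-query attention), the sketch below illustrates the general idea: every query head attends over one shared key head and one shared value head, which is why the stored attention weights have a different shape and older artifacts no longer match. This is a minimal illustration only, not the code from this release; the function name `MultiQueryAttention`, its parameters, and the flat row-major buffer layout are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical illustration of multi-query attention for one new token.
// Not taken from gemma.cpp; names and memory layout are assumptions.
//
// q:       [num_heads * head_dim]  one query vector per head
// k_cache: [seq_len * head_dim]    single key head shared by all query heads
// v_cache: [seq_len * head_dim]    single value head shared by all query heads
// returns: [num_heads * head_dim]  attention output per head
std::vector<float> MultiQueryAttention(const std::vector<float>& q,
                                       const std::vector<float>& k_cache,
                                       const std::vector<float>& v_cache,
                                       std::size_t num_heads,
                                       std::size_t head_dim,
                                       std::size_t seq_len) {
  assert(seq_len > 0 && head_dim > 0);
  assert(q.size() == num_heads * head_dim);
  assert(k_cache.size() == seq_len * head_dim);
  assert(v_cache.size() == seq_len * head_dim);

  std::vector<float> out(num_heads * head_dim, 0.0f);
  std::vector<float> scores(seq_len);
  const float scale = 1.0f / std::sqrt(static_cast<float>(head_dim));

  for (std::size_t h = 0; h < num_heads; ++h) {
    const float* qh = q.data() + h * head_dim;

    // Scaled dot product of this head's query with every cached key.
    // Note that the key index does not depend on h: the keys are shared.
    float max_score = std::numeric_limits<float>::lowest();
    for (std::size_t t = 0; t < seq_len; ++t) {
      float dot = 0.0f;
      for (std::size_t d = 0; d < head_dim; ++d) {
        dot += qh[d] * k_cache[t * head_dim + d];
      }
      scores[t] = dot * scale;
      if (scores[t] > max_score) max_score = scores[t];
    }

    // Numerically stable softmax over the sequence positions.
    float sum = 0.0f;
    for (std::size_t t = 0; t < seq_len; ++t) {
      scores[t] = std::exp(scores[t] - max_score);
      sum += scores[t];
    }

    // Weighted sum of the shared values.
    for (std::size_t t = 0; t < seq_len; ++t) {
      const float w = scores[t] / sum;
      for (std::size_t d = 0; d < head_dim; ++d) {
        out[h * head_dim + d] += w * v_cache[t * head_dim + d];
      }
    }
  }
  return out;
}
```

In this layout the key/value cache holds `2 * seq_len * head_dim` floats regardless of `num_heads`, whereas standard multi-head attention would need `num_heads` times as many.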
What's Changed
- Clean up docs for developers by @austinvhuang in https://github.com/google/gemma.cpp/pull/102
- MQA Implementation for 2B models by @ufownl in https://github.com/google/gemma.cpp/pull/114
- Enhancing Utility Functions in ops.h by @enum-class in https://github.com/google/gemma.cpp/pull/105
- Added a missing space in app.h by @villesundell in https://github.com/google/gemma.cpp/pull/115
- Fix compilation error when `HWY_COMPILER_GCC_ACTUAL < 1300` by @ufownl in https://github.com/google/gemma.cpp/pull/120
- .bazelversion: Bazel 7.1.1 by @LINKIWI in https://github.com/google/gemma.cpp/pull/122
- Add standalone tool to compress weights. by @szabadka in https://github.com/google/gemma.cpp/pull/125
- 1.07x speedup: merge MQA parallel sections as suggested by @veluca93 by @copybara-service in https://github.com/google/gemma.cpp/pull/126
- Fix off-by-one errors in generation code and token streaming callback. by @szabadka in https://github.com/google/gemma.cpp/pull/127
New Contributors
- @villesundell made their first contribution in https://github.com/google/gemma.cpp/pull/115
- @LINKIWI made their first contribution in https://github.com/google/gemma.cpp/pull/122
- @szabadka made their first contribution in https://github.com/google/gemma.cpp/pull/125
Full Changelog: https://github.com/google/gemma.cpp/compare/v0.1.1...v0.1.2