What's Changed
- feat: added awq marlin qlinear by @guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/315
- build: speed up compilation for marlin kernels by @guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/316
- test: added unittests for marlin kernels by @guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/317
- refactor: clean up build warnings and refactor marlin kernels by @guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/318
- fix: clean up build warnings: "LOG" redefined by @guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/319
- cmake: make includes private and disable jinja2cpp build by @guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/320
- ci: allow build without requiring a physical gpu device by @guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/321
- fix: put item into asyncio.Queue in a thread-safe way by @guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/324
- refactor: added static switch for marlin kernel dispatch by @guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/325
- feat: fix and use marlin kernel for awq by default by @guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/326
Full Changelog: https://github.com/vectorch-ai/ScaleLLM/compare/v0.2.0...v0.2.1
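
Note on the `asyncio.Queue` fix above (#324): `asyncio.Queue` is not thread-safe, so putting items into it directly from a non-event-loop thread can misbehave. The snippet below is a minimal sketch of the general pattern using `loop.call_soon_threadsafe`. It is not taken from the PR; the names `producer`, `consume`, and `run_demo` are hypothetical and exist only for illustration, and the actual change in ScaleLLM may differ.

```python
import asyncio
import threading


def producer(loop, queue, items):
    # Runs in a worker thread. asyncio.Queue is not thread-safe, so instead of
    # calling queue.put_nowait() here, schedule it on the event loop thread.
    for item in items:
        loop.call_soon_threadsafe(queue.put_nowait, item)
    # Sentinel (illustrative convention) telling the consumer to stop.
    loop.call_soon_threadsafe(queue.put_nowait, None)


async def consume(items):
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    worker = threading.Thread(target=producer, args=(loop, queue, items))
    worker.start()

    received = []
    while True:
        item = await queue.get()
        if item is None:
            break
        received.append(item)

    # The sentinel has arrived, so the worker has finished scheduling its puts.
    worker.join()
    return received


def run_demo():
    return asyncio.run(consume(["a", "b", "c"]))
```

Callbacks scheduled with `call_soon_threadsafe` run on the event loop in the order they were submitted, so the consumer sees items in the order the worker produced them.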