GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
LLM inference in C/C++
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
TT-NN operator library and TT-Metalium low-level kernel programming
This website is a free, open-source guide on prompt engineering
Efficient MoE model for million-token reasoning and coding