Port of Facebook's LLaMA model in C/C++
Awesome multilingual OCR toolkits based on PaddlePaddle
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference
Fast, Sharp & Reliable Agentic Intelligence
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
FAIR Sequence Modeling Toolkit 2
Clean and efficient FP8 GEMM kernels with fine-grained scaling
FlashMLA: Efficient Multi-head Latent Attention Kernels
Runtime extension of Proximus enabling deployment on AMD Ryzen™ AI
Locally run an Instruction-Tuned Chat-Style LLM
Learning embeddings for classification, retrieval and ranking