Port of Facebook's LLaMA model in C/C++
Official inference repo for FLUX.2 models
MiniMax M2.1, a SOTA model for real-world dev & agents.
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference
Flux 2 image generation model pure C inference
Awesome multilingual OCR toolkits based on PaddlePaddle
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Fast, Sharp & Reliable Agentic Intelligence
FAIR Sequence Modeling Toolkit 2
Clean and efficient FP8 GEMM kernels with fine-grained scaling
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Foundational Models for State-of-the-Art Speech and Text Translation
Open-source large language model family from Tencent Hunyuan
FlashMLA: Efficient Multi-head Latent Attention Kernels
Powerful open source image generation model
Real-time behaviour synthesis with MuJoCo, using Predictive Control
llama.go is like llama.cpp in pure Golang
Locally run an Instruction-Tuned Chat-Style LLM
ChatGPT integration with Unity Editor
Learning embeddings for classification, retrieval and ranking
Learning Continuous Signed Distance Functions for Shape Representation
Code for reproducing key results in the paper
Text-to-image model optimized for artistic quality and safe generation