Awesome multilingual OCR toolkits based on PaddlePaddle
Port of Facebook's LLaMA model in C/C++
The most powerful local music generation model
State-of-the-art TTS model under 25MB
Phi-3.5 for Mac: Locally-run Vision and Language Models
New set of lightweight state-of-the-art, open foundation models
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
CodeGeeX2: A More Powerful Multilingual Code Generation Model
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Block Diffusion for Ultra-Fast Speculative Decoding
OCR expert VLM powered by Hunyuan's native multimodal architecture
26M-parameter function-calling model that runs on incredibly small devices
GLM-4 series: Open Multilingual Multimodal Chat LMs
Clean and efficient FP8 GEMM kernels with fine-grained scaling
Extension index for stable-diffusion-webui
Personalize Any Characters with a Scalable Diffusion Transformer
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Memory-efficient and performant finetuning of Mistral's models
Uncommon Objects in 3D dataset
Blazeface is a lightweight model that detects faces in images
This repository contains the official implementation of research
PyTorch implementation of MAE