Python bindings for llama.cpp
Official repository for LTX-Video
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Inference framework for 1-bit LLMs
Contexts Optical Compression
LTX-Video Support for ComfyUI
The official repo of the Qwen chat & pretrained large language models
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Open-source multi-speaker long-form text-to-speech model
Sharp Monocular Metric Depth in Less Than a Second
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Video understanding codebase from FAIR for reproducing video models
OCR expert VLM powered by Hunyuan's native multimodal architecture
Official implementation of DreamCraft3D
Diffusion Transformer with Fine-Grained Chinese Understanding
Large Multimodal Models for Video Understanding and Editing
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Multi-modal large language model designed for audio understanding
Large language model & vision-language model based on linear attention
Dataset of GPT-2 outputs for research in detection, biases, and more
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Detect faces in an image
A Conversational Speech Generation Model
Encoder of greater-than-word length text trained on a variety of data