Deep learning optimization library making distributed training easy
AI agents running research on single-GPU nanochat training
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
GPT4V-level open-source multi-modal model based on Llama3-8B
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
A Model Context Protocol server for searching and analyzing arXiv
High-Fidelity and Controllable Generation of Textured 3D Assets
Faster and easier training and deployments
Traditional Mandarin LLMs for Taiwan
4M: Massively Multimodal Masked Modeling
This repository contains the official implementation of FastVLM
ICLR 2024 Spotlight: curation/training code, metadata, distribution
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Official implementation of DreamCraft3D
Framework for building, orchestrating, and deploying AI agents
Code and models for ICML 2024 paper, NExT-GPT
LightLLM is a Python-based LLM (Large Language Model) inference
Large Audio Language Model built for natural interactions
Open-source framework for intelligent speech interaction
OCR expert VLM powered by Hunyuan's native multimodal architecture
Open-Source Financial Large Language Models
Open source AI pair programmer for coding, debugging, automation
Hackable and optimized Transformers building blocks
Positron, a next-generation data science IDE
An on-premises, OCR-free unstructured data extraction