Open-source deep-learning framework
Open-Source Financial Large Language Models
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
State-of-the-art TTS model under 25MB
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Industrial-level controllable zero-shot text-to-speech system
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Provides convenient access to the Anthropic REST API from any Python 3
CodeGeeX2: A More Powerful Multilingual Code Generation Model
RGBD video generation model conditioned on camera input
Revolutionizing Database Interactions with Private LLM Technology
AlphaFold 3 inference pipeline
A 0.1B Omni model trained from scratch
Recovering the Visual Space from Any Views
Python SDK for Claude Agent
PyTorch code and models for the DINOv2 self-supervised learning
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Tooling for the Common Objects In 3D dataset
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Generating Immersive, Explorable, and Interactive 3D Worlds
State-of-the-art (SoTA) text-to-video pre-trained model
Qwen-Image is a powerful image generation foundation model
NetEase Youdao's open-source embedding and reranker models
Phi-3.5 for Mac: Locally-run Vision and Language Models