Chat and pretrained large vision-language model
Open Source Differentiable Computer Vision Library
Hub of ready-to-use datasets for ML models
A multi-function Discord bot
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Inference script for Oasis 500M
A minimal yet professional single-agent demo project
Interface for OuteTTS models
Converts text to speech in realtime
Toolkit for audio, music, and speech generation
Omnilingual ASR: Open-Source Multilingual Speech Recognition
The official Meta Llama 3 GitHub site
Utilities intended for use with Llama models
Set of tools to assess and improve LLM security
Code for Cicero, an AI agent that plays the game of Diplomacy
PyTorch code and models for V-JEPA self-supervised learning from video
A PyTorch library for implementing flow matching algorithms
PyTorch3D is FAIR's library of reusable components for deep learning
Official implementation of DreamCraft3D
A neural network that transforms a design mock-up into static websites
Diffusion Transformer with Fine-Grained Chinese Understanding
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
A research prototype of a human-centered web agent
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
SAPIEN Manipulation Skill Framework