Official implementation of DreamCraft3D
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Research code artifacts for Code World Model (CWM)
Controllable & emotion-expressive zero-shot TTS
DeepMind model for tracking arbitrary points across videos & robotics
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Language modeling in a sentence representation space
Repo of Qwen2-Audio chat & pretrained large audio language model
Open-weight, large-scale hybrid-attention reasoning model
Large language model & vision-language model based on Linear Attention
Capable of understanding text, audio, vision, video
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Towards Real-World Vision-Language Understanding
Stable Diffusion with Core ML on Apple Silicon
The ChatGPT Retrieval Plugin lets you easily find personal documents
Pushing the Limits of Mathematical Reasoning in Open Language Models
Chat & pretrained large vision language model
Chat & pretrained large audio language model proposed by Alibaba Cloud
High-Resolution Image Synthesis with Latent Diffusion Models
AI-powered tool to quickly remove watermarks from images flawlessly
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
AI Suite for upscaling, interpolating & restoring images/videos