Ling is a MoE LLM provided and open-sourced by InclusionAI
RGBD video generation model conditioned on camera input
Bidirectional token-classification model for identifiable info
Achieving 3x+ generation speedup on reasoning tasks
Long-form streaming TTS system for multi-speaker dialogue generation
This repository contains the official implementation of FastVLM
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Diffusion Transformer with Fine-Grained Chinese Understanding
DeepMind model for tracking arbitrary points across videos & robotics
FAIR Sequence Modeling Toolkit 2
Code for Mesh R-CNN, ICCV 2019
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
GPT-4V-level open-source multimodal model based on Llama3-8B
A SOTA open-source image editing model
Chinese and English multimodal conversational language model
High-Resolution Image Synthesis with Latent Diffusion Models
Open-source, high-performance Mixture-of-Experts large language model
Powerful open-source image generation model
Chat & pretrained large vision language model
Pushing the Limits of Mathematical Reasoning in Open Language Models
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
StudioOllamaUI is a local, portable interface for Ollama
A Conversational Speech Generation Model