Ultra-Efficient LLMs on End Devices
4M: Massively Multimodal Masked Modeling
A PyTorch library for implementing flow matching algorithms
Hackable and optimized Transformers building blocks
Official implementation of DreamCraft3D
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Controllable & emotion-expressive zero-shot TTS
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
FAIR Sequence Modeling Toolkit 2
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
An AI-powered security review GitHub Action using Claude
Designed for text embedding and ranking tasks
A Unified Framework for Text-to-3D and Image-to-3D Generation
Large-language-model & vision-language-model based on Linear Attention
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Genome modeling and design across all domains of life
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Capable of understanding text, audio, vision, video
Audio foundation model excelling in audio understanding
Easy Docker setup for Stable Diffusion with user-friendly UI
ChatGPT interface with better UI
A state-of-the-art open visual language model
High-Resolution Image Synthesis with Latent Diffusion Models
Towards Real-World Vision-Language Understanding
AI Suite for upscaling, interpolating & restoring images/videos