A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
Official code for Style Aligned Image Generation via Shared Attention
4M: Massively Multimodal Masked Modeling
This repository contains the official implementation of FastVLM
ICLR 2024 Spotlight: curation/training code, metadata, distribution
A PyTorch library for implementing flow matching algorithms
Official DeiT repository
Memory-efficient and performant finetuning of Mistral's models
Diffusion Transformer with Fine-Grained Chinese Understanding
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Large language model and vision-language model based on Linear Attention
Pokee Deep Research Model Open Source Repo
Unified Multimodal Understanding and Generation Models
Language modeling in a sentence representation space
Dataset of GPT-2 outputs for research in detection, biases, and more
The ChatGPT Retrieval Plugin lets you easily find personal documents
FlashMLA: Efficient Multi-head Latent Attention Kernels
High-Resolution Image Synthesis with Latent Diffusion Models
A Conversational Speech Generation Model
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Open-Source Financial Large Language Models!
Powerful open source image generation model
Open-source, high-performance Mixture-of-Experts large language model