Capable of understanding text, audio, vision, video
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
The official PyTorch implementation of Google's Gemma models
Inference code for scalable emulation of protein equilibrium ensembles
A state-of-the-art open visual language model
Chat & pretrained large vision language model
Implementation of "MobileCLIP" CVPR 2024
Official implementation of Watermark Anything with Localized Messages
Towards Real-World Vision-Language Understanding
CLIP, Predict the most relevant text snippet given an image
Ling is a MoE LLM provided and open-sourced by InclusionAI
A Unified Framework for Text-to-3D and Image-to-3D Generation
Personalize Any Characters with a Scalable Diffusion Transformer
Chat & pretrained large audio language model proposed by Alibaba Cloud
Official code for Style Aligned Image Generation via Shared Attention
4M: Massively Multimodal Masked Modeling
This repository contains the official implementation of FastVLM
ICLR2024 Spotlight: curation/training code, metadata, distribution
A Production-ready Reinforcement Learning AI Agent Library
A PyTorch library for implementing flow matching algorithms
Official DeiT repository
PyTorch code and models for the DINOv2 self-supervised learning
CogView4, CogView3-Plus and CogView3(ECCV 2024)
Memory-efficient and performant finetuning of Mistral's models
Towards Ultimate Expert Specialization in Mixture-of-Experts Language