Language modeling in a sentence representation space
Multimodal Diffusion with Representation Alignment
Transformers4Rec is a flexible and efficient library
LLM training code for MosaicML foundation models
A Repo For Document AI
Chat & pretrained large vision language model
Toolkit for conversational AI
Extract schema, statistics and entities from datasets
Generate 3D objects conditioned on text or images
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
Designed for text embedding and ranking tasks
Guiding Instruction-based Image Editing via Multimodal Large Language
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
Implementation of the AudioLM audio generation model in PyTorch
Implementation of "MobileCLIP" (CVPR 2024)
Memory-efficient and performant finetuning of Mistral's models
Open-source framework for conversational voice AI agents
Open-source choice to scale, assess and maintain natural language data
Aider is AI pair programming in your terminal
A Model Context Protocol server for searching and analyzing arXiv
Refer and Ground Anything Anywhere at Any Granularity
Deterministic LLM Outputs for AI Applications and AI Agents
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Open-source all-in-one platform for engineering AI products