Unified Multimodal Understanding and Generation Models
The official PyTorch implementation of Google's Gemma models
Code for the paper "Language Models are Unsupervised Multitask Learners"
Large Multimodal Models for Video Understanding and Editing
Dynamic Documents for R
ktrain is a Python library that makes deep learning and AI more accessible
Multimodal Diffusion with Representation Alignment
Official code for Style Aligned Image Generation via Shared Attention
Memory-efficient and performant finetuning of Mistral's models
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Low-latency REST API for serving text-embeddings
Aider is AI pair programming in your terminal
The open-source data curation platform for LLMs
Official Python implementation of UTCP. UTCP is an open standard
InvokeAI is a leading creative engine for Stable Diffusion models
Solve end-to-end problems using the Llama model family
Conversational voice AI agents
Efficient few-shot learning with Sentence Transformers
Matplotlib style sheets to nicely format figures for scientific papers
Virtual AI anchor that combines state-of-the-art technology
Open-source implementation of Microsoft's VALL-E X zero-shot TTS model
Pushing the Limits of Mathematical Reasoning in Open Language Models
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Embed images and sentences into fixed-length vectors
LLM-powered fuzzing via OSS-Fuzz