Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Official Python implementation of UTCP. UTCP is an open standard
Concatenate a directory full of files into a single prompt
Multi-lingual large voice generation model, providing inference
Superduper: Integrate AI models and machine learning workflows
A Customizable Image-to-Video Model based on HunyuanVideo
The official PyTorch implementation of Google's Gemma models
Get a ChatGPT plugin up and running in under 5 minutes
Code for the paper "Language Models are Unsupervised Multitask Learners"
Multimodal Diffusion with Representation Alignment
Implementation of Vision Transformer, a simple way to achieve SOTA
PyTorch3D is FAIR's library of reusable components for deep learning
Pushing the Limits of Mathematical Reasoning in Open Language Models
Foundation Models for Time Series
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Sample code and notebooks for Generative AI on Google Cloud
Towards Human-Sounding Speech
The no-nonsense RAG chunking library
Easily turn large sets of image URLs into an image dataset
Documentation for Google's Gen AI site - including Gemini API & Gemma
Collection of reference environments, offline reinforcement learning
Towards Real-World Vision-Language Understanding
Open source platform for the machine learning lifecycle
MemU is an open-source memory framework for AI companions
A Powerful Native Multimodal Model for Image Generation