Audiocraft is a library for audio processing and generation
A lightweight, powerful framework for multi-agent workflows
An open-source package that allows video game creators
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Official Python implementation of UTCP. UTCP is an open standard
Concatenate a directory full of files into a single prompt
Multi-lingual large voice generation model, providing inference
Superduper: Integrate AI models and machine learning workflows
Pushing the Limits of Mathematical Reasoning in Open Language Models
A Customizable Image-to-Video Model based on HunyuanVideo
The official PyTorch implementation of Google's Gemma models
Get a ChatGPT plugin up and running in under 5 minutes
Code for the paper "Language Models are Unsupervised Multitask Learners"
Multimodal Diffusion with Representation Alignment
Implementation of Vision Transformer, a simple way to achieve SOTA
PyTorch3D is FAIR's library of reusable components for deep learning
Foundation Models for Time Series
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Industrial-level controllable zero-shot text-to-speech system
Towards Human-Sounding Speech
Benchmarking Multimodal Agents for Open-Ended Tasks
The no-nonsense RAG chunking library
Easily turn large sets of image URLs into an image dataset
Towards Real-World Vision-Language Understanding
Open source platform for the machine learning lifecycle