Ongoing research training transformer models at scale
Open-sourced unified customization model
Audiocraft is a library for audio processing and generation
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
A lightweight, powerful framework for multi-agent workflows
An open-source package that allows video game creators
The official PyTorch implementation of Google's Gemma models
Official Python implementation of UTCP. UTCP is an open standard
Concatenate a directory full of files into a single prompt
Multi-lingual large voice generation model, providing inference
Pushing the Limits of Mathematical Reasoning in Open Language Models
A Customizable Image-to-Video Model based on HunyuanVideo
Superduper: Integrate AI models and machine learning workflows
Foundation Models for Time Series
Get a ChatGPT plugin up and running in under 5 minutes
Code for the paper "Language Models are Unsupervised Multitask Learners"
Multimodal Diffusion with Representation Alignment
Implementation of Vision Transformer, a simple way to achieve SOTA
PyTorch3D is FAIR's library of reusable components for deep learning
Industrial-level controllable zero-shot text-to-speech system
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Towards Human-Sounding Speech
Towards Real-World Vision-Language Understanding
Benchmarking Multimodal Agents for Open-Ended Tasks
The no-nonsense RAG chunking library