Audiocraft is a library for audio processing and generation
A lightweight, powerful framework for multi-agent workflows
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Ongoing research training transformer models at scale
An open-source package that allows video game creators
Concatenate a directory full of files into a single prompt
Official Python implementation of UTCP. UTCP is an open standard
Multi-lingual large voice generation model, providing inference
Pushing the Limits of Mathematical Reasoning in Open Language Models
A Customizable Image-to-Video Model based on HunyuanVideo
Superduper: Integrate AI models and machine learning workflows
Multimodal Diffusion with Representation Alignment
Implementation of Vision Transformer, a simple way to achieve SOTA
The official PyTorch implementation of Google's Gemma models
PyTorch3D is FAIR's library of reusable components for deep learning
Get a ChatGPT plugin up and running in under 5 minutes
Code for the paper Language Models are Unsupervised Multitask Learners
Towards Human-Sounding Speech
Foundation Models for Time Series
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Industrial-level controllable zero-shot text-to-speech system
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
A Powerful Native Multimodal Model for Image Generation
When LLM Meets Domain Experts
Diversity-driven optimization and large-model reasoning ability