Foundation model for image generation
Implementation of Video Diffusion Models
Automated translation solution for visual novels
A speech-text foundation model for real-time dialogue
Document content and metadata extraction microservice
Controllable and fast Text-to-Speech for over 7000 languages
The Python code to reproduce illustrations from Machine Learning Book
Python library for scraping and analyzing online news articles easily
Python MUD/MUX/MUSH/MU* development system
A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
Lightweight package to simplify LLM API calls
Qwen3-ASR is an open-source series of ASR models
Generate audiobooks from e-books
Fast stable diffusion on CPU and AI PC
Instant voice cloning by MIT and MyShell. Audio foundation model
Interface for OuteTTS models
Scalable data preprocessing and curation toolkit for LLMs
User toolkit for analyzing and interfacing with Large Language Models
Automatically translates the text of a video based on a subtitle file
A very simple framework for state-of-the-art NLP
The open-source data curation platform for LLMs
Code and models for ICML 2024 paper, NExT-GPT
Extract audio and video content and organize it into a Markdown note
Implementation of the AudioLM audio generation model in PyTorch
Minimalist Vim Plugin Manager