Foundation model for image generation
A speech-text foundation model for real-time dialogue
Controllable and fast Text-to-Speech for over 7000 languages
Instant voice cloning by MIT and MyShell. Audio foundation model
Search all of YouTube from the command line
A list of free LLM inference resources accessible via API
Qwen3-ASR is an open-source series of ASR models
Fast stable diffusion on CPU and AI PC
The Python code to reproduce illustrations from Machine Learning Book
Python library for scraping and analyzing online news articles easily
Automatically translates the text of a video based on a subtitle file
Turn words into chords
Main repository for the Sphinx documentation builder
Public opinion analysis system
An open-source implementation of NotebookLM with more flexibility
Scalable data preprocessing and curation toolkit for LLMs
User toolkit for analyzing and interfacing with Large Language Models
Open source terminal session recorder
A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
State-of-the-art (SoTA) text-to-video pre-trained model
Check code for common misspellings
Code and models for the ICML 2024 paper NExT-GPT
Extract audio and video content and organize it into a Markdown note
Implementation of AudioLM audio generation model in Pytorch
Document content and metadata extraction microservice