A Python library for AMR parsing, generation, and visualization
A sound cloning tool with a web interface, using your voice
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
End-to-end speech processing toolkit
A TTS model capable of generating ultra-realistic dialogue
A Repo For Document AI
Interface for OuteTTS models
User toolkit for analyzing and interfacing with Large Language Models
Automatically translates the text of a video based on a subtitle file
State-of-the-art (SoTA) text-to-video pre-trained model
A simple, high-quality voice conversion tool focused on ease of use
Scalable generative AI framework built for researchers and developers
MARS5 speech model (TTS) from CAMB.AI
Qwen3 is the large language model series developed by the Qwen team
Large-language-model & vision-language-model based on Linear Attention
Documentation for Google's Gen AI site - including Gemini API & Gemma
Chat & pretrained large vision language model
Foundational model for human-like, expressive TTS
Create videos with Stable Diffusion
Instant voice cloning by MIT and MyShell. Audio foundation model
Real-time voice interactive digital human
Towards Real-World Vision-Language Understanding
Multi-lingual large voice generation model, providing inference
Python framework for adversarial attacks and data augmentation
LLM abstractions that aren't obstructions