GUI for a Vocal Remover that uses Deep Neural Networks
1 min voice data can also be used to train a good TTS model
Instant voice cloning by MIT and MyShell. Audio foundation model
Python inference and LoRA trainer package for the LTX-2 audio–video
From Images to High-Fidelity 3D Assets
Run Local LLMs on Any Device. Open-source
Open-source infrastructure for Computer-Use Agents. Sandboxes
Offline Text To Speech synthesis for Python
Awesome multilingual OCR toolkits based on PaddlePaddle
Long-form streaming TTS system for multi-speaker dialogue generation
SOTA Open Source TTS
Spark-TTS Inference Code
Document Image Parsing via Heterogeneous Anchor Prompting
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Supercharge Your LLM with the Fastest KV Cache Layer
A Family of Open Sourced Music Foundation Models
Definitions for AI/ML tasks like dataset creation
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Asynchronous multi-platform robot framework written in Python
Multilingual sentence & image embeddings with BERT
Run LLM prompts from your shell
Efficient Triton Kernels for LLM Training
ImageBind: One Embedding Space to Bind Them All