An open-source text-to-speech system built by inverting Whisper
Scalable generative AI framework built for researchers and developers
Interface for OuteTTS models
MARS5 speech model (TTS) from CAMB.AI
Plug-and-play library to enable agents to call MCP and UTCP tools
This repository provides an advanced RAG
An MCP server that autonomously evaluates web applications
Chinese and English multimodal conversational language model
Repo of Qwen2-Audio chat & pretrained large audio language model
Python package for AutoML on Tabular Data with Feature Engineering
MII makes low-latency and high-throughput inference possible
Toloka-Kit is a Python library for working with Toloka API
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
Tensor search for humans
The data structure for multimodal data
Django-friendly finite state machine support
Jittor is a high-performance deep learning framework
Implementation of Imagen, Google's Text-to-Image Neural Network
Open Source Differentiable Computer Vision Library
Build cross-modal and multimodal applications on the cloud
A library for deep learning end-to-end dialog systems and chatbots
A Python toolbox for scalable outlier detection
Deep learning library
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Experimental, AI/ML-powered, open-source Marketing Mix Modeling