Build Vision Agents quickly with any model or video provider
An Open Source text-to-speech system built by inverting Whisper
MARS5 speech model (TTS) from CAMB.AI
A command-line utility for taking automated screenshots of websites
This repository provides an advanced RAG
MetricFlow allows you to define, build, and maintain metrics in code
An MCP server that autonomously evaluates web applications
Get started w/ building Fullstack Agents using Gemini 2.5 & LangGraph
Chinese and English multimodal conversational language model
Repo of Qwen2-Audio chat & pretrained large audio language model
A disk shredder application for Debian Linux systems
Helping you get the most out of AWS, wherever you use MCP
The best free open source website change detection and restock service
A distributed and persistent archive replay system using IPFS
Refactoring ChatBot+LLM, GPT-3.5-turbo, ChatGPT Bot/Voice Assistant
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
Deep learning optimization library: makes distributed training easy
Tensor search for humans
The data structure for multimodal data
Command-line tool to delete merged Git branches
Django-friendly finite state machine support
Misago is a fully featured modern forum application
Implementation of Imagen, Google's Text-to-Image Neural Network
Open Source Differentiable Computer Vision Library
Build cross-modal and multimodal applications on the cloud