Automatic Speech Recognition with Word-level Timestamps
Open-source multi-speaker long-form text-to-speech model
Advanced LLM-powered brute-force tool combining AI intelligence
AI framework for automated short video creation and editing tools
A Python tool that uses GPT-4, FFmpeg, and OpenCV
ClawTeam: Agent Swarm Intelligence (One Command → Full Automation)
Refer and Ground Anything Anywhere at Any Granularity
MOSS‑TTS Family open‑source speech and sound generation model
Unleashing 10,000+ Word Generation from Long Context LLMs
ChatGPT extension for scientific research work
An open-source text-to-speech system built by inverting Whisper
Open-source, high-performance AI model with advanced reasoning
Tools to build web AI agents that can authenticate
Agentic, Reasoning, and Coding (ARC) foundation models
A specialized Claude Code workspace for creating long-form
Structured RAG: ingest, index, query
Marrying Grounding DINO with Segment Anything & Stable Diffusion
Long-form streaming TTS system for multi-speaker dialogue generation
Using AI models to automatically provide commentary and edit videos
Fully Local Manus AI. No APIs, No $200 monthly bills
Uplift modeling and causal inference with machine learning algorithms
The Multi-Agent Framework
Make websites accessible for AI agents
Open-source libraries and APIs to build custom preprocessing pipelines
Parallax is a distributed model serving framework