Open-source framework for intelligent speech interaction
This repo contains the code for the 1D tokenizer and generator
A Universal Customization Method for Single and Multi Conditioning
Bailing is a voice dialogue robot similar to GPT-4o
Reading book source
MARS5 speech model (TTS) from CAMB.AI
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
A trainable PyTorch reproduction of AlphaFold 3
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA
LLM-based agent for general purpose software engineering tasks
Multi-modal large language model designed for audio understanding
Images to inference with no labeling
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Deploy and share agents with open infrastructure
An MCP server that autonomously evaluates web applications
The leading agent orchestration platform for Claude
No-code multi-agent framework to build LLM Agents, workflows
Refactoring ChatBot+LLM, GPT-3.5-turbo, ChatGPT Bot/Voice Assistant
Tensor search for humans
The data structure for multimodal data
Hub of ready-to-use datasets for ML models
Build cross-modal and multimodal applications on the cloud
GUI Exploration Lab. One of the best GUI agent solutions