A Pragmatic VLA Foundation Model
Multimodal embedding and reranking models built on Qwen3-VL
Taming Stable Diffusion for Lip Sync
High-resolution models for human tasks
A lightweight text-to-speech model with zero-shot voice cloning
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Context management for Claude Code. Hooks maintain state via ledgers
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Interface for OuteTTS models
Refactoring ChatBot+LLM, GPT-3.5-turbo, ChatGPT Bot/Voice Assistant
Scalable machine learning for time series forecasting
On-device Speech-to-Intent engine powered by deep learning
Ray Aviary - evaluate multiple LLMs easily
OpenMMLab Model Deployment Framework
Graphical User Interface Face Anonymization Tool
Embed images and sentences into fixed-length vectors
Official code for Style Aligned Image Generation via Shared Attention
Shinkai allows you to create advanced local AI agents effortlessly
Free, local, open-source AI app builder
This repository contains the complete code and data for studying primo
AI-powered PC monitoring that explains what is happening instead of just showing numbers and spikes.
Zylthra: A PyQt6 app to generate synthetic datasets with DataLLM.
Open source implementation of Microsoft's VALL-E X zero-shot TTS model
CoTracker is a model for tracking any point (pixel) on a video