Open Source Generative Process Automation
Towards Human-Sounding Speech
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Document content and metadata extraction microservice
AI-Powered Personalized Learning Assistant
48 kHz stereo neural audio codec for general audio
Machine Learning automation and tracking
AI tool that generates tests to improve code coverage quickly
Open source multimodal creative AI assistant with infinite canvas tool
AI assistant for ComfyUI workflow generation, debugging, and tuning
A general fine-tuning kit geared toward image/video/audio diffusion
Using AI models to automatically provide commentary and edit videos
Run LLMs locally on Cloud Workstations
Sharp Monocular Metric Depth in Less Than a Second
SDG is a specialized framework
Collection of awesome LLM apps with AI Agents and RAG using OpenAI
Refine and quantize messy AI pixel art into clean, perfect pixels
Build GenAI applications quickly and easily
4M: Massively Multimodal Masked Modeling
Set of tools to assess and improve LLM security
Official implementation of DreamCraft3D
Agent Framework / shim to use Pydantic with LLMs
Synthetic data generators for structured and unstructured text
Controllable & emotion-expressive zero-shot TTS
A fast TTS architecture with conditional flow matching