[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences
A Web UI for easy subtitle generation using the Whisper model
Qwen-Image is a powerful image generation foundation model
Standalone, small, language-neutral
OCR model for complex documents with layout-aware structured outputs
RAG-Anything: All-in-One RAG Framework
A Model Context Protocol (MCP) server
Controllable & emotion-expressive zero-shot TTS
Controllable and fast Text-to-Speech for over 7000 languages
Framework for building real-time voice and multimodal AI agents
Python library for scraping and analyzing online news articles easily
A Python library that makes AMR parsing, generation and visualization
Instant voice cloning by MIT and MyShell. Audio foundation model
CLI tool to extract (meta)data from PDF and manipulate PDF files
Python Terminal Toolkit - a Spiced Up TUI Library
A Markdown parser written in Go. Easy to extend, standards-compliant
Fast stable diffusion on CPU and AI PC
A Repo For Document AI
Foundational model for human-like, expressive TTS
A Coverage-Guided, Native Python Fuzzer
Stable Diffusion web UI
Scalable data preprocessing and curation toolkit for LLMs
User toolkit for analyzing and interfacing with Large Language Models
Public opinion analysis system
Data Infrastructure providing an approach to multimodal AI workloads