OCR expert VLM powered by Hunyuan's native multimodal architecture
InvokeAI is a leading creative engine for Stable Diffusion models
Skills, a Chinese software copyright application material generator
Edit videos with Claude Code
OpenRecall is a fully open-source, privacy-first alternative
Powerful open source team chat application
A Python tool that uses GPT-4, FFmpeg, and OpenCV
go1pylib is a Python library designed to control the Go1 robot
Phi-3.5 for Mac: Locally-run Vision and Language Models
AI-Powered Personalized Learning Assistant
Adding guardrails to large language models
Windrecorder is a memory search app that records everything
This repo contains the code for a 1D tokenizer and generator
Python tool for crawling and extracting structured data from news sites
Multi-modal large language model designed for audio understanding
SOTA discrete acoustic codec models with 40/75 tokens per second
One-click deployment (including offline integration package)
Solve end-to-end problems using the Llama model family
Pretrained model hub for Keras 3
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Implementation of AudioLM audio generation model in PyTorch
Repo of Qwen2-Audio chat & pretrained large audio language model
MOSS‑TTS Family: open‑source speech and sound generation models
Open-source Video Translation Skill
Motivation-driven skill for learning from strong academic papers