Toolkit for conversational AI
Enhances Tesseract OCR output using LLMs (local or API)
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Capable of understanding text, audio, vision, video
Repository for Qwen2-Audio, a chat & pretrained large audio-language model
Chat & pretrained large vision-language model
Qwen3-Omni is a natively end-to-end, omni-modal LLM
GLM-4-Voice | End-to-End Chinese-English Conversational Model
LLM for live-stream sales hosts ("selling anchors")
Integrating LLMs into structured NLP pipelines
Refer and Ground Anything Anywhere at Any Granularity
Label, clean and enrich text datasets with LLMs