Translate a video from one language to another and embed the dubbing
Fast multimodal LLM for real-time voice interaction and AI apps
Advanced NLP with spaCy: A free online course
Large Audio Language Model built for natural interactions
StreamSpeech is a seamless model for offline speech recognition
Conversational voice AI agents
Open source annotation tool for machine learning practitioners
Ready-to-use OCR with 80+ supported languages
Persian NLP Toolkit
CLI tool to extract (meta)data from PDF and manipulate PDF files
Models for the spaCy Natural Language Processing (NLP) library
The no-nonsense RAG chunking library
A Web UI for easy subtitle generation using the Whisper model
Visual Causal Flow
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Real-time voice interactive digital human
Semantic search and workflows for medical/scientific papers
AI-powered tool for generating, optimizing, and translating subtitles
Powerful Android AI agent with tools, automation, and Linux shell
The behavior guidance framework for customer-facing LLM agents
An on-premises, OCR-free unstructured data extraction tool
Get your documents ready for gen AI
Capable of understanding text, audio, vision, and video
Windrecorder is a memory search app that records everything
A Foundation Model for the Language of Financial Markets