Repo of Qwen2-Audio chat & pretrained large audio language model
A proof-of-concept Jupyter extension which converts English queries
Semantic search and workflows for medical/scientific papers
An open and fair framework for everyone to build AI agents
Large language model for live-stream sales hosts
Persian NLP Toolkit
Get your documents ready for gen AI
The behavior guidance framework for customer-facing LLM agents
Open-Source AI Camera. Empower any camera/CCTV
A framework to enable multimodal models to operate a computer
Open speech-to-speech models and pipelines by Hugging Face
Large Audio Language Model built for natural interactions
StreamSpeech is a seamless model for offline speech recognition
Industrial-strength Natural Language Processing (NLP)
Video understanding codebase from FAIR for reproducing video models
OCR expert VLM powered by Hunyuan's native multimodal architecture
Stanford NLP Python library for many human languages
Translate the video from one language to another and embed dubbing
A simple tool for reading in poorly redacted documents
Real-time voice interactive digital human
Integrating LLMs into structured NLP pipelines
Advanced NLP with spaCy: A free online course
Python Audio Analysis Library: Feature Extraction, Classification
UI Automation Framework for Games and Apps
Conversational voice AI agents