Sharp Monocular Metric Depth in Less Than a Second
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
A Lightweight Face Recognition and Facial Attribute Analysis
Qwen2.5-VL is the multimodal large language model series
A game-theoretic approach to explain the output of ML models
Comprehensive Gradio WebUI for audio processing
A simple native web interface that uses ChatTTS to synthesize text
Python binding to the Apache Tika™ REST services
Source code of PyGAD, Python 3 library for building genetic algorithms
The official Python client for the Hugging Face Hub
A Model Context Protocol server for searching and analyzing arXiv
Python Client for Supabase. Query Postgres from Flask, Django
Code for the paper "Language Models are Unsupervised Multitask Learners"
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Easily turn large sets of image URLs into an image dataset
GUI/CLI tool for downloading Xiaohongshu
Code for running inference with the SAM 3D Body Model 3DB
A sound cloning tool with a web interface, using your voice
The most intuitive, flexible way for researchers to build models
Uncommon Objects in 3D dataset
Free, open source crypto trading bot
Controllable and fast Text-to-Speech for over 7000 languages
Photorealistic Synthetic Dataset for Holistic Indoor Scene
Data Lake for Deep Learning. Build, manage, and query datasets
Ling is a MoE LLM provided and open-sourced by InclusionAI