Full-stack AI software engineer
Transforming Multimodal Content into Captivating Multilingual Audio
Spark-TTS Inference Code
Tool for exploring and debugging transformer model behaviors
A Python tool that uses GPT-4, FFmpeg, and OpenCV
GUI Exploration Lab. One of the best GUI agent solutions
TFDS is a collection of datasets ready to use with TensorFlow,
Command-line YAML, XML, TOML processor
Multi-class confusion matrix library in Python
Voilà turns Jupyter notebooks into standalone web applications
Machine learning metrics for distributed, scalable PyTorch applications
The Open Source Cowork Desktop to Unlock Your Exceptional Productivity
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Recovering the Visual Space from Any Views
Real-World Centric Foundation GUI Agents
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Chat & pretrained large vision language model
Get started with building Fullstack Agents using Gemini 2.5 & LangGraph
From Paper to Presentation in One Click
Code for Mesh R-CNN, ICCV 2019
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
TFX is an end-to-end platform for deploying production ML pipelines
Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
Interpretable prompting and models for NLP
Speakr is a personal, self-hosted web application