Document (PDF, Word, PPTX ...) extraction and parsing API
Implementation for MatMul-free LM
Skywork-R1V is an advanced multimodal AI model series
A dataset consisting of 15,140 ChatGPT prompts from Reddit
Code and models for the ICML 2024 paper NExT-GPT
Examples and tutorials to help developers build AI systems
LightLLM is a Python-based LLM (Large Language Model) inference
Robust recipes to align language models with human and AI preferences
An Open-source Framework for Data-centric Language Agents
Open Source Deep Research Alternative to Reason and Search
Anomaly detection related books, papers, videos, and toolboxes
Retrieval and Retrieval-augmented LLMs
A lightweight vLLM implementation built from scratch
Real-time Claude Code usage monitor with predictions and warnings
Pretrained time-series foundation model developed by Google Research
AWS Skills for Agents
A simple yet powerful agent framework for personal assistants
The absolute trainer to light up AI agents
Natural language workflows for AI agents
A lightweight text-to-speech model with zero-shot voice cloning
Follow along with my AI Agents Masterclass videos
Inference script for Oasis 500M
Document Image Parsing via Heterogeneous Anchor Prompting
StreamSpeech is a seamless model for offline speech recognition
Generate Any 3D Scene in Seconds