Replace OpenAI GPT with another LLM in your app
Proofs, cases, concept supplements, and reference explanations
Petastorm library enables single-machine or distributed training
The Triton Inference Server provides an optimized cloud
Build Vision Agents quickly with any model or video provider
The data structure for multimodal data
Python binding to the Apache Tika™ REST services
Large Multimodal Models for Video Understanding and Editing
Benchmarking synthetic data generation methods
An advanced paper search agent powered by large language models
Capable of understanding text, audio, vision, video
Constrained Value Alignment via Safe Reinforcement Learning
Automatically visualize any dataset, any size
An opinionated CLI to transcribe audio files with Whisper on-device
GPU environment management and cluster orchestration
Visual Automation IDE — automate anything you see on screen
Towards Real-World Vision-Language Understanding
airda (Air Data Agent)
AI-powered PC monitoring that explains what is happening instead of only showing numbers and spikes.
Plug-n-play module turning text-to-image models into animation
Automatic question answering for local knowledge bases, based on LLMs
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
AI Suite for upscaling, interpolating & restoring images/videos
Fault-tolerant, highly scalable GPU orchestration
Run LLMs locally on Cloud Workstations