Build Vision Agents quickly with any model or video provider
TextWorld is a sandbox learning environment for the training
Diffusion Transformer with Fine-Grained Chinese Understanding
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Open Source Generative Process Automation
Multi-Agent daTa geneRation Infra and eXperimentation framework
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Lightweight Python library for adding real-time multi-object tracking
Python binding to the Apache Tika™ REST services
A library for deep learning end-to-end dialog systems and chatbots
The official repo of Qwen chat & pretrained large language model
Model Context Protocol with Neo4j
LLM-based data scientist, AI-native data application
A cross-platform Python library for differentiable programming
Open-source multi-speaker long-form text-to-speech model
The fastest way to bring multi-agent workflows to production
A framework to enable multimodal models to operate a computer
Turns Data and AI algorithms into production-ready web applications
The AI toolkit for the AI developer
Research-oriented chatbot framework
Self-Modifying Framework from the Future
Towards Human-Sounding Speech
Sharp Monocular Metric Depth in Less Than a Second
Video understanding codebase from FAIR for reproducing video models
Superfast AI decision-making and processing of multimodal data