VMZ: Model Zoo for Video Modeling
Official implementation of Watermark Anything with Localized Messages
Video understanding codebase from FAIR for reproducing video models
CLIP: predict the most relevant text snippet given an image
Ling is a MoE LLM provided and open-sourced by InclusionAI
Operating LLMs in production
Conditional GAN for generating synthetic tabular data
Connect any LLM to your internal knowledge sources
TokenSpeed is a speed-of-light LLM inference engine
TensorRT LLM provides users with an easy-to-use Python API
MCP server that integrates Confluence and Jira
OpenDILab Decision AI Engine
Data and tools for generating and inspecting OLMo pre-training data
Efficient Retrieval Augmentation and Generation Framework
TextWorld is a sandbox learning environment for the training
A Multi-Modal World Model for Reconstructing, Generating, Simulation
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Generate blog articles from video or audio
High-Fidelity and Controllable Generation of Textured 3D Assets
DeepMind model for tracking arbitrary points across videos & robotics
Designed for text embedding and ranking tasks
Leveraging BERT and c-TF-IDF to create easily interpretable topics
Benchmarking Multimodal Agents for Open-Ended Tasks
Qwen2.5-VL is the multimodal large language model series
Fast image augmentation library and an easy-to-use wrapper