Superfast AI decision making and processing of multi-modal data
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
SGLang is a fast serving framework for large language models
VideoRAG: Chat with Your Videos
Multimodal embedding and reranking models built on Qwen3-VL
High-resolution models for human tasks
Qwen2.5-VL is the multimodal large language model series
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Code and models for ICML 2024 paper, NExT-GPT
A Pioneering Open-Source Alternative to GPT-4o
GPT4V-level open-source multi-modal model based on Llama3-8B
Multi-modal large language model designed for audio understanding
Virtual AI anchor that combines state-of-the-art technology
SUMO is a microscopic, multi-modal traffic simulation
Radiation Spectrum Method: a modal BPM (Beam Propagation Method)
Embed images and sentences into fixed-length vectors
Transformers4Rec is a flexible and efficient library
An Open Toolkit for Knowledge Graph Extraction and Construction
Implementation of Nougat Neural Optical Understanding
LangChain Apps in Production with Jina & FastAPI
Task-oriented finetuning for better embeddings on neural search
A multi-language debugging system for Vim
Implementation of research papers on Deep Learning + NLP + CV in Python
Collaborative Editing for Vim