GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Qwen2.5-VL is the multimodal large language model series
Code and models for the ICML 2024 paper NExT-GPT
GPT4V-level open-source multi-modal model based on Llama3-8B
A Pioneering Open-Source Alternative to GPT-4o
Multi-modal large language model designed for audio understanding
Virtual AI anchor that combines state-of-the-art technology
Embed images and sentences into fixed-length vectors
Transformers4Rec is a flexible and efficient library
An Open Toolkit for Knowledge Graph Extraction and Construction
Implementation of Nougat Neural Optical Understanding
LangChain Apps in Production with Jina & FastAPI
Task-oriented finetuning for better embeddings on neural search
Implementation of research papers on Deep Learning + NLP + CV in Python