Multi-modal large language model designed for audio understanding
NetEase Youdao's open-source embedding and reranker models
Detect faces in an image
A CNN model that predicts human joint locations from RGB images of a person
The ChatGPT Retrieval Plugin lets you easily find personal documents
Implementation of model-parallel autoregressive transformers on GPUs
Code for reproducing key results in the paper
Reasoning-powered OCR VLM for converting complex documents to Markdown
BGE-Large v1.5: High-accuracy English embedding model for retrieval
Speculative-decoding accelerator for the 675B Mistral Large 3