PyTorch code and models for the DINOv2 self-supervised learning method
A Family of Open-Source Music Foundation Models
Reference PyTorch implementation and models for DINOv3
Accurate × Fast × Comprehensive
Open-Source Financial Large Language Models
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Open-source repository for the Pokee Deep Research model
Lightweight multimodal translation model for 55 languages
Multimodal Transformer for document image understanding and layout
Qwen2.5-VL-3B-Instruct: Multimodal model for chat, vision & video
ClinicalBERT model trained on MIMIC notes for clinical NLP tasks
Small 3B-base multimodal model ideal for custom AI on edge hardware