CLIP: Predict the most relevant text snippet given an image
New family of code large language models (LLMs)
4M: Massively Multimodal Masked Modeling
Python inference and LoRA trainer package for the LTX-2 audio–video
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
An experimental version of the DeepSeek model
A Powerful Native Multimodal Model for Image Generation
Block Diffusion for Ultra-Fast Speculative Decoding
PyTorch code and models for the DINOv2 self-supervised learning
The ChatGPT Retrieval Plugin lets you easily find personal documents
Pretrained time-series foundation model developed by Google Research
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Open-source, high-performance Mixture-of-Experts large language model
Official code for Style Aligned Image Generation via Shared Attention
A library for Multilingual Unsupervised or Supervised word Embeddings
Large language model developed and released by NVIDIA
Portuguese ASR model fine-tuned on XLSR-53 for 16 kHz audio input