Bringing BERT into modernity via both architecture changes and scaling
PyTorch code and models for V-JEPA self-supervised learning from video
Collection of Gemma 3 variants that are trained for performance
PyTorch code and models for VJEPA2 self-supervised learning from video
Self-supervised visual learning using momentum contrast in PyTorch
Visual Causal Flow
Fast inference engine for Transformer models
Moonshot's most powerful AI model
Provides code for running inference with the Segment Anything Model
C++ Implementation of PyTorch Tutorials for Everyone
Accurate × Fast × Comprehensive
Official inference repo for FLUX.2 models
OCR expert VLM powered by Hunyuan's native multimodal architecture
Taming Stable Diffusion for Lip Sync
Generate 3D objects conditioned on text or images
Basaran, an open-source alternative to the OpenAI text completion API
Neural machine translation and sequence learning using TensorFlow
Singing voice conversion based on Whisper, with LoRA for singing voice cloning
CPT: A Pre-Trained Unbalanced Transformer
Deep learning for text to speech
An implementation of Tacotron 2 that supports multilingual experiments
End-to-end object detection with transformers
CakeChat: Emotional Generative Dialog System
Code for "Image Generation from Scene Graphs", Johnson et al, CVPR 201