Vision Transformer model fine-tuned for facial age classification
Flexible text-to-text transformer model for multilingual NLP tasks
Transformer model for image classification with patch-based input
Custom BLEURT model for evaluating text similarity using PyTorch
Speaker segmentation model for 10-second audio chunks with powerset labels
Zero-shot image-text classification with ViT-B/32 encoder