CLIP: Predicts the most relevant text snippet for a given image
Implementation of "MobileCLIP" (CVPR 2024)
Bidirectional token-classification model for personally identifiable information
Audio foundation model excelling in audio understanding
Encoder for greater-than-word-length text, trained on a variety of data
Dataset of GPT-2 outputs for research in detection, biases, and more
RoBERTa for Chinese: Pre-trained RoBERTa model for Chinese text
Flexible text-to-text transformer model for multilingual NLP tasks
T5-Small: Lightweight text-to-text transformer for NLP tasks
CLIP model fine-tuned for zero-shot fashion product classification
CLIP ViT-bigG/14: Zero-shot image-text model trained on LAION-2B
Robust BERT-based model for English with improved MLM training
Multimodal Transformer for document image understanding and layout
Lightweight on-device model for private AI text redaction
CTC-based forced aligner for audio-text in 158 languages
Versatile 8B-base multimodal LLM, flexible foundation for custom AI
Small 3B-base multimodal model ideal for custom AI on edge hardware