Face recognition software using the OpenCV library, developed for face ID
Face recognition, Liveness detection, ID document recognition
Detects speech activity in audio using pyannote.audio 2.1 pipeline
BERT-based Chinese language model for fill-mask and NLP tasks
Portuguese ASR model fine-tuned from XLSR-53 for 16kHz audio input
Russian ASR model fine-tuned on Common Voice and CSS10 datasets
Zero-shot image-text matching with ViT-B/32 Transformer encoder
ClinicalBERT model trained on MIMIC notes for clinical NLP tasks
CLIP model for zero-shot image-text tasks using 336x336 input images
A multimodal approach to controlling an Unmanned Aerial Vehicle