Robust Speech Recognition via Large-Scale Weak Supervision
Speech-to-text, text-to-speech, and speaker recognition
Automatic Speech Recognition with Word-level Timestamps
A free, open source, and extensible speech-to-text application
A Web UI for easy subtitle generation using the Whisper model
Bailing is a voice dialogue robot similar to GPT-4o
A fast, powerful, and simple hierarchical vision transformer
Amica is an open source interface for interactive communication
Human Activity Recognition example using TensorFlow on smartphone
Efficient 3D human pose estimation in video using 2D keypoint
Resources about activity recognition