Robust Speech Recognition via Large-Scale Weak Supervision
Automatic Speech Recognition with Word-level Timestamps
A Web UI for easy subtitle generation using the Whisper model
Bailing is a voice dialogue robot similar to GPT-4o
A fast, powerful, and simple hierarchical vision transformer
Human Activity Recognition example using TensorFlow on smartphone
Efficient 3D human pose estimation in video using 2D keypoint