activity recognition free download

Whisper

Robust Speech Recognition via Large-Scale Weak Supervision

OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. ...

Downloads: 71 This Week

Last Update: 2025-06-26

WhisperX

Automatic Speech Recognition with Word-level Timestamps

WhisperX is an advanced speech recognition system built on top of OpenAI’s Whisper model, designed to improve transcription accuracy and timing precision for long-form audio. It addresses key limitations of standard Whisper implementations by introducing voice activity detection and forced alignment techniques to produce word-level timestamps. The system enables batched inference, significantly increasing transcription speed while maintaining high accuracy.

Downloads: 88 This Week

Last Update: 2026-05-25

Windrecorder

Windrecorder is a memory search app that records everything

Windrecorder is an open-source personal memory search engine that continuously records on-screen activity in a highly optimized and storage-efficient format. It captures screen content locally and builds a searchable database using OCR and image understanding, allowing users to rewind and rediscover anything they have previously seen. The system indexes only meaningful visual changes, extracting text, browser data, and contextual information to improve search accuracy and reduce storage...
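To illustrate the idea of indexing only meaningful visual changes, here is a minimal sketch. It is not Windrecorder's code: the function names, the flattened grayscale frame representation, and both thresholds (`pixel_tol`, `min_change`) are assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence

# Assumption: a frame is a flattened list of grayscale pixel values (0-255).
Frame = Sequence[int]


def changed_fraction(prev: Frame, curr: Frame, pixel_tol: int = 16) -> float:
    """Return the fraction of pixels that moved by more than pixel_tol."""
    if len(prev) != len(curr):
        raise ValueError("frames must have the same size")
    if not curr:
        return 0.0
    changed = sum(1 for a, b in zip(prev, curr) if abs(a - b) > pixel_tol)
    return changed / len(curr)


def select_frames_to_index(frames: List[Frame], min_change: float = 0.05) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    The first frame is always kept; later frames are kept only when at
    least min_change of their pixels changed since the last kept frame.
    """
    kept: List[int] = []
    last: Optional[Frame] = None
    for i, frame in enumerate(frames):
        if last is None or changed_fraction(last, frame) >= min_change:
            kept.append(i)
            last = frame
    return kept
```

Comparing against the last kept frame, rather than the immediately preceding one, means slow drift still triggers a new index entry eventually. Only the kept frames would then go on to OCR and search indexing.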

Downloads: 19 This Week

Last Update: 2026-04-24

Whisper-WebUI

A Web UI for easy subtitle generation using the Whisper model

...It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. Whisper WebUI also includes advanced preprocessing and postprocessing features such as voice activity detection, background music separation, and speaker diarization, enabling more accurate and structured outputs.

Downloads: 6 This Week

Last Update: 2026-03-18

Bailing

Bailing is a voice dialogue robot similar to GPT-4o

Bailing is an open-source voice-dialogue assistant designed to deliver natural voice-based conversations by combining automatic speech recognition (ASR), voice activity detection (VAD), a large language model (LLM), and text-to-speech (TTS) in a single pipeline. Its goal is to offer a “voice-first” chat experience similar to what one might expect from a system like GPT-4o, but fully open and deployable by users. The project is modular: each core function — ASR, VAD, LLM, TTS — exists as a separately replaceable component, which allows flexibility in picking your preferred models depending on resources or languages. ...
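The modular design described above can be sketched roughly as follows. This is not Bailing's actual code: the interface names (`VAD`, `ASR`, `LLM`, `TTS`), their method names, and the stand-in components are all hypothetical, picked only to show how separately replaceable stages fit into one pipeline.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


# Hypothetical interfaces: any object with the matching method can be plugged in.
class VAD(Protocol):
    def has_speech(self, audio: bytes) -> bool: ...


class ASR(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LLM(Protocol):
    def reply(self, prompt: str) -> str: ...


class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    vad: VAD
    asr: ASR
    llm: LLM
    tts: TTS

    def handle(self, audio: bytes) -> Optional[bytes]:
        """Run one turn: skip silence, else transcribe, answer, and speak."""
        if not self.vad.has_speech(audio):
            return None
        text = self.asr.transcribe(audio)
        answer = self.llm.reply(text)
        return self.tts.synthesize(answer)


# Trivial stand-ins so the sketch runs without any real models.
class NonEmptyVAD:
    def has_speech(self, audio: bytes) -> bool:
        return len(audio) > 0


class Utf8ASR:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UpperCaseLLM:
    def reply(self, prompt: str) -> str:
        return prompt.upper()


class Utf8TTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = VoicePipeline(NonEmptyVAD(), Utf8ASR(), UpperCaseLLM(), Utf8TTS())
```

Swapping a component means passing a different object for that field, for example a different `asr` for another language, while the rest of the pipeline stays unchanged.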

Downloads: 0 This Week

Last Update: 2025-11-28

Hiera

A fast, powerful, and simple hierarchical vision transformer

...Community discussions cover topics like dataset pretrains, integration in other frameworks, and comparisons with related implementations. Security and contribution guidelines follow Meta’s open-source practices, and activity shows ongoing interest and usage across the community.

Downloads: 0 This Week

Last Update: 2025-10-08

LSTMs for Human Activity Recognition

Human Activity Recognition example using TensorFlow on smartphone

LSTM-Human-Activity-Recognition is a machine learning project that demonstrates how recurrent neural networks can be used to recognize human activities from sensor data. The repository implements a deep learning model based on Long Short-Term Memory (LSTM) networks to classify physical activities using time-series data collected from wearable sensors.
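As a rough illustration of how an LSTM consumes a time series one step at a time, here is a toy cell with a single hidden unit in plain Python. It is not the repository's TensorFlow model: the weight names, the hand-set weights, the labels, and the threshold readout (a real classifier would typically use a trained output layer) are all assumptions.

```python
import math
from typing import Dict, Sequence, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h: float, c: float, w: Dict[str, float]) -> Tuple[float, float]:
    """One LSTM step for a scalar input and a single hidden unit."""
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])  # candidate cell value
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new


def classify(window: Sequence[float], w: Dict[str, float], threshold: float = 0.0) -> str:
    """Feed a sensor window through the cell and threshold the final state."""
    h, c = 0.0, 0.0
    for x in window:
        h, c = lstm_step(x, h, c, w)
    # Hypothetical two-class readout, standing in for a trained output layer.
    return "active" if h > threshold else "still"


# Hand-set (untrained) weights, for illustration only.
KEYS = ["wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg"]
toy_weights = {k: (0.0 if k.startswith("b") else 1.0) for k in KEYS}
```

The point of the sketch is the recurrence: the hidden state `h` and cell state `c` carry information from earlier samples in the window to later ones, which is what makes this family of models a fit for wearable-sensor sequences.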

Downloads: 17 This Week

Last Update: 2026-03-11

tom_core

tom_core - a tool for automating events on a computer

tom_core is a software tool for automating everything that happens on your computer. With it, you can record your activity on the computer, starting the recording at any moment you choose. The application then replays all your clicks, drags, keystrokes, hotkeys, and so on, with exactly the timing and number of repetitions you need. Built-in tools such as optical recognition and voice control let you branch recordings into more complex forms, which brings programming within reach of people who have no programming skills or experience.

Downloads: 0 This Week

Last Update: 2022-05-17

VideoPose3D

Efficient 3D human pose estimation in video using 2D keypoint

...The framework includes pretrained models, data preprocessing utilities, visualization tools, and evaluation scripts for standard benchmarks like Human3.6M. VideoPose3D has been used widely in computer vision research for human motion understanding, activity recognition, and animation generation.

Downloads: 1 This Week

Last Update: 2025-10-07

Search Results for "activity recognition"

Showing 9 open source projects for "activity recognition"
