stt download | SourceForge.net

stt is a standalone speech recognition tool that converts speech in audio or video files into text locally. It needs no internet access, so your data stays on your machine and you do not depend on external APIs. It uses open-source speech models such as Faster-Whisper to transcribe human speech into plain text, structured JSON objects, or subtitle files with time codes, which makes it suitable for both personal and professional transcription. The project is designed to be easy to deploy: you run a local Python server that exposes an HTTP API for uploading audio or video files and retrieving transcriptions in the format you choose. It uses GPU acceleration when compatible hardware is available, for faster processing, and also runs on a CPU alone.
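
As a rough illustration of the upload-and-retrieve flow described above, the sketch below posts a media file to a locally running server using only the Python standard library. The endpoint URL, the `file` and `response_format` field names, and the helper names `build_multipart` and `transcribe` are all assumptions made for this example, not the project's documented API; check the project's own documentation for the real values.

```python
import mimetypes
import os
import urllib.request
import uuid

# Hypothetical endpoint: the real host, port, and path may differ.
API_URL = "http://127.0.0.1:8000/transcribe"


def build_multipart(fields, file_field, file_name, file_bytes):
    """Encode form fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    mime = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{file_name}"\r\n'
         f"Content-Type: {mime}\r\n\r\n").encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, response_format="srt", url=API_URL):
    """Upload one file and return the response body as text.

    The field names "file" and "response_format" are assumed, not confirmed.
    """
    with open(path, "rb") as fh:
        body, content_type = build_multipart(
            {"response_format": response_format},
            "file",
            os.path.basename(path),
            fh.read(),
        )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With a server running, a call such as `transcribe("interview.mp4", response_format="json")` would return whatever the server sends back; the request never leaves the local machine because it targets the loopback address.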

Features

Offline speech-to-text transcription
Outputs text, JSON, and SRT subtitle formats
Local HTTP API for easy integration
Supports multiple model sizes
Optional GPU acceleration
Standalone desktop deployment
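
To make the SRT output format concrete, here is a minimal sketch of how timed transcript segments can be rendered as subtitle blocks. It is not taken from the project's source: the `start`, `end`, and `text` keys (times in seconds) and the function names are assumptions chosen for illustration.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as the SRT time code HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Render segments (dicts with assumed keys start, end, text) as SRT."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # SRT separates numbered blocks with a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 1.25, "text": " Hello there. "},
        {"start": 1.25, "end": 3.0, "text": "This is a test."},
    ]
    print(to_srt(demo))
```

Each block carries a sequence number, a start and end time code, and the text, which is the structure subtitle players expect.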

License

GNU General Public License version 3.0 (GPLv3)

Additional Project Details

Operating Systems: Linux, Mac, Windows
Programming Language: Python
Related Categories: Python Artificial Intelligence Software
Registered: 2026-02-17
