Showing 26 open source projects for "timing"

  • 1
    SoniTranslate

    Synchronized Translation for Videos

    ...It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets the generated dub track stay in sync with the original video structure. The project supports a wide range of languages for translation, spanning major world languages (English, Spanish, French, German, Chinese, Arabic, etc.) and many regional or less widely spoken languages, making it suitable for broad internationalization. ...
    Downloads: 41 This Week
  • 2
    Story Flicks

    Generate high-definition story short videos with one click using AI

    Story Flicks is an open-source project in the AI-assisted video generation and editing space, focused on creating short, story-style videos from script or prompt inputs. It aims to let users generate high-definition short movies or video stories with minimal manual effort, using AI models under the hood to assemble visuals, timing, and possibly narration or subtitles. For creators who want to produce narrative short-form content, whether for social media, storytelling, or prototyping video ideas, Story Flicks offers a lightweight, code-backed alternative to complex video editing suites. Because the project is open and modifiable, developers can customize the generation pipeline: adjust story structure, alter rendering parameters, tweak video quality or resolution, or integrate with other AI models (e.g. for audio, voice-over, or image-to-video). ...
    Downloads: 6 This Week
  • 3
    NeuralNote

    Audio Plugin for Audio to MIDI transcription using deep learning

    ...NeuralNote supports polyphonic transcription, meaning it can detect multiple notes played simultaneously, making it useful for instruments such as piano or guitar. The system relies on neural network models to analyze audio signals and infer pitch, timing, and other musical attributes that can be represented as MIDI data. The resulting MIDI output can be edited, quantized, or exported to other instruments within a music production workflow.
    Downloads: 137 This Week
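The quantization step mentioned in the NeuralNote description can be illustrated with a minimal sketch. This is not NeuralNote's code; the function name `quantize_onsets` and the choice of beats as the time unit are assumptions made for illustration. It snaps each transcribed note onset to the nearest point on a fixed grid.

```python
def quantize_onsets(onsets_in_beats, grid=0.25):
    """Snap each note onset (in beats) to the nearest multiple of `grid`.

    Hypothetical helper for illustration only; a grid of 0.25 beats
    corresponds to sixteenth notes when one beat is a quarter note.
    """
    return [round(onset / grid) * grid for onset in onsets_in_beats]


# Slightly early or late onsets, as a transcription model might produce.
raw_onsets = [0.02, 0.49, 1.03, 1.74]
quantized = quantize_onsets(raw_onsets)
```

A finer grid keeps more of the performer's original timing, while a coarser grid produces a cleaner score at the cost of expressiveness.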
  • 4
    LTX-2.3

    Official Python inference and LoRA trainer package

    ...The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while simultaneously producing corresponding audio elements such as speech, music, ambient sound, or effects. This unified approach allows creators to generate complete multimedia sequences where motion, timing, and sound are aligned automatically. LTX-2 is designed for both research and production workflows and can generate high-resolution video clips with precise control over structure, motion, and camera behavior.
    Downloads: 82 This Week
  • 5
    RealtimeSTT

    A robust, efficient, low-latency speech-to-text library

    RealtimeSTT is a Python-based realtime speech-to-text engine emphasizing low latency, wake-word detection, voice activity detection, and automatic speech segmentation. It provides asynchronous callbacks, nanosecond-precision timestamps, and CLI tools, suitable for building voice assistants, meeting transcribers, or live caption systems.
    Downloads: 3 This Week
  • 6
    WhisperX

    Automatic Speech Recognition with Word-level Timestamps

    WhisperX is an advanced speech recognition system built on top of OpenAI’s Whisper model, designed to improve transcription accuracy and timing precision for long-form audio. It addresses key limitations of standard Whisper implementations by introducing voice activity detection and forced alignment techniques to produce word-level timestamps. The system enables batched inference, significantly increasing transcription speed while maintaining high accuracy. It is particularly effective for long recordings, where traditional approaches may suffer from drift, repetition, or misalignment. WhisperX also supports speaker diarization, allowing identification of different speakers within a conversation. ...
    Downloads: 21 This Week
  • 7
    WhisperJAV

    Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

    WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces...
    Downloads: 21 This Week
  • 8
    AutoSubs

    Instantly generate AI-powered subtitles on your device

    ...The tool leverages speech-to-text models, including OpenAI Whisper, to produce high-quality transcriptions and can differentiate between speakers using diarization techniques. Users can customize subtitle styling, adjust timing, and export results in multiple formats, making it suitable for content creators, filmmakers, and editors. AutoSubs is designed with performance in mind, offering efficient processing through a Rust-based backend and supporting multiple operating systems including Windows, macOS, and Linux.
    Downloads: 12 This Week
  • 9
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    ...After transcription, large language models are used to intelligently restructure subtitles into natural sentences, correct wording, and improve readability for viewers. It can also translate subtitles into other languages while preserving the original timing, making it suitable for multilingual video publishing and accessibility. In addition to generating subtitles, it supports editing, formatting, and embedding subtitles into videos as either hard or soft subtitles.
    Downloads: 15 This Week
  • 10
    Auto Synced & Translated Dubs

    Automatically translates the text of a video based on a subtitle file

    Auto-Synced-Translated-Dubs is a toolchain that automatically translates and re-dubs videos using AI voices while keeping the new speech aligned to the original timing via subtitle files. It assumes you have a human-made SRT (or similar) subtitle file; the script then uses translation services such as Google Cloud or DeepL to generate translated subtitle tracks in one or more target languages. Using the timestamps of each subtitle line, it computes the required duration of each spoken segment and synthesizes audio via neural TTS services, producing one audio clip per subtitle entry. ...
    Downloads: 0 This Week
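The duration step described for Auto Synced & Translated Dubs can be sketched as follows. This is an illustrative example, not the project's code: the names `TIMING`, `to_ms`, `segment_durations`, and `SAMPLE_SRT` are hypothetical, and the parser assumes a plain SRT layout (index line, timing line, text lines, blank line between entries).

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def segment_durations(srt_text):
    """Return one (start_ms, duration_ms, text) tuple per subtitle entry."""
    segments = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                fields = match.groups()
                start, end = to_ms(*fields[:4]), to_ms(*fields[4:])
                text = " ".join(part.strip() for part in lines[i + 1:])
                segments.append((start, end - start, text))
                break  # one timing line per entry
    return segments


SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,500
Hello and welcome.

2
00:00:04,000 --> 00:00:06,250
Let's get started.
"""

segments = segment_durations(SAMPLE_SRT)
```

Each duration is the time budget a synthesized clip would have to fit for that subtitle entry; how a given tool stretches or compresses the audio to fit is outside this sketch.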
  • 11
    Model Explorer

    A modern model graph visualizer and debugger

    Model Explorer is a visual tool for exploring, debugging, and optimizing ML models deployed on edge devices. Developed by Google AI Edge, it offers a browser-based interface to inspect layer-wise performance, memory usage, and inference timing of TensorFlow Lite and other supported models. It’s a powerful utility for developers optimizing models for constrained environments.
    Downloads: 0 This Week
  • 12
    Scriberr

    Self-hosted AI audio transcription

    ...Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. The application includes a polished user interface that simplifies the management of recordings, transcripts, and annotations, making it suitable for both casual users and professionals handling large volumes of audio. Beyond transcription, Scriberr also integrates features such as summarization, tagging, and interaction with language models, allowing users to extract insights from conversations or meetings efficiently.
    Downloads: 4 This Week
  • 13
    BruteForceAI

    Advanced LLM-powered brute-force tool combining AI intelligence

    BruteForceAI is an open-source security testing tool that applies large language models to the analysis of login forms and authentication flows in web applications. At a high level, the project uses AI to inspect HTML content, identify the relevant form elements, and automate selector discovery so that a tester does not need to hand-map every field before evaluation. It combines that analysis layer with automated credential testing workflows, framing itself as a more adaptive alternative to...
    Downloads: 6 This Week
  • 14
    Violin

    Open-source Video Translation Skill

    ...It transcribes the original speech, translates the text, generates natural-sounding speech in the target language, and remuxes the new audio back into the video. The project is designed to keep the generated speech aligned with the original timing so the final result feels closer to a real dubbed video. It can be used from the command line, through a FastAPI web app, or as a Claude Code skill. Violin supports multilingual workflows and is useful for creators, educators, localization teams, and developers building automated video translation pipelines. It is especially practical for turning lectures, tutorials, interviews, demos, and social videos into accessible content for wider audiences.
    Downloads: 3 This Week
  • 15
    Open Infra Index

    Production-tested AI infrastructure tools

    ...The repo's README describes the project as sharing “humble building blocks” of their online service—code that is documented, deployed, and battle-tested in production. The timing of its opening matches DeepSeek’s “Open-Source Week” campaign (starting around February 2025) when they gradually released internal infrastructure components publicly. It is licensed under CC0-1.0 (Creative Commons Zero) to maximize openness.
    Downloads: 0 This Week
  • 16
    Seamless Communication

    Foundational Models for State-of-the-Art Speech and Text Translation

    ...The research prototype includes components for visual grounding (understanding when a user references something in view), gesture recognition and synthesis, and turn-taking mechanisms that mirror human conversational timing. Because latency and synchronization are critical, the codebase invests in asynchronous scheduling, overlap of perception and reasoning, and fast fallback responses.
    Downloads: 0 This Week
  • 17
    snntorch

    Deep and online learning with spiking neural networks in Python

    ...This allows researchers to train spiking neural models using familiar deep learning workflows while taking advantage of GPU acceleration and automatic differentiation. snnTorch provides implementations of common spiking neuron models, surrogate gradient training methods, and utilities for handling temporal neural dynamics. Because spiking neural networks operate over time and encode information through spike timing, the library includes tools for simulating temporal behavior.
    Downloads: 0 This Week
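To make the idea of encoding information through spike timing concrete, here is a minimal leaky integrate-and-fire simulation in plain Python. It deliberately does not use the snnTorch API; the function name `lif_spike_times` and the parameters `beta` (membrane decay) and `threshold` are assumptions chosen for illustration, and the reset-by-subtraction rule is one common convention among several.

```python
def lif_spike_times(currents, beta=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    Hypothetical helper for illustration: the membrane potential decays by
    `beta` each step, accumulates the input current, and emits a spike
    (then subtracts `threshold`) whenever it reaches the threshold.
    Returns the list of time steps at which spikes occurred.
    """
    membrane = 0.0
    spike_times = []
    for step, current in enumerate(currents):
        membrane = beta * membrane + current
        if membrane >= threshold:
            spike_times.append(step)
            membrane -= threshold  # soft reset
    return spike_times


# A stronger constant input drives the neuron to threshold sooner.
strong = lif_spike_times([0.6] * 10)
weak = lif_spike_times([0.3] * 10)
```

In this toy setting the stronger input produces an earlier first spike, which is the basic sense in which spike timing can carry information about input intensity.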
  • 18
    Melodfy

    ✨:AI-Powered Piano Audio to MIDI Converter 🎶

    Melodfy is an application that utilizes the power of artificial intelligence (developed by ByteDance) to seamlessly convert audio recordings of piano playing into playable MIDI files.
    Downloads: 2 This Week
  • 19
    Weather Cast

    A desktop weather app powered by AI

    Weather Cast is a desktop weather app for Windows that shows detailed weather information for the searched city. The dashboard shows the city's current temperature, a description of the temperature, pressure, wind, humidity, dew point, UV index, local time, and air pollution index.
    Downloads: 0 This Week
  • 20
    SoftVC VITS Singing Voice Conversion

    SoftVC VITS Singing Voice Conversion

    SoftVC VITS Singing Voice Conversion is a deep learning project focused on singing voice conversion, allowing users to transform one voice into another while preserving melody and timing. Unlike traditional text-to-speech systems, it specializes specifically in singing scenarios and does not provide general TTS functionality. The project leverages neural network architectures derived from VITS and SoftVC research to achieve high-quality voice transformation. It is commonly used in creative audio workflows, especially in communities experimenting with synthetic singing and character voices. ...
    Downloads: 7 This Week
  • 21
    VideoSubFinder
    The main purpose of this program is to extract hardcoded subtitles (hardsub) from video. It provides two main features: 1) Autodetection of video frames that contain hardcoded text (hardsub), saving information about their timing positions. 2) Generation of text images with the background cleared, which lets OCR programs (such as FineReader, Subtitle Edit, Google Drive) generate complete subtitles with the original text and timing. Running this program on Windows requires the "Microsoft Visual C++ Redistributable runtime libraries 2022": https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads The latest versions were built and tested on: Windows 10 x64, Ubuntu 20.04.5 LTS, openSUSE Leap 15.4, Arch Linux (EndeavourOS Cassini Nova 03-2023). For faster support with bug fixes, please contact me at: https://vk.com/skosnits To donate: https://sourceforge.net/projects/videosubfinder/donate
    Downloads: 529 This Week
  • 22
    DiffSinger

    Singing Voice Synthesis via Shallow Diffusion Mechanism

    ...The core idea is to view generation of a sung voice (mel-spectrogram) as a diffusion process: starting from noise, the model iteratively “denoises” while being conditioned on a music score (lyrics, pitch, musical timing). This avoids some of the typical problems of prior SVS models — like over-smoothing or unstable GAN training — and produces more realistic, expressive, and natural-sounding singing. The method introduces a “shallow diffusion” mechanism: instead of diffusing over many steps, generation begins at a shallow step determined adaptively, which leverages prior knowledge learned by a simple mel-spectrogram decoder and speeds up inference.
    Downloads: 104 This Week
  • 23
    Subtitle Workshop

    Free subtitle editor

    Subtitle Workshop is a free application for creating, editing, and converting text-based subtitle files. It supports all the subtitle formats you need and has all the features you would want.
    Downloads: 1,026 This Week
  • 24
    Darwin 2: Java framework for evolutionary computation (genetic algorithm, GA). A true framework with out-of-the-box functionality and extensibility of all classes. It uses an interface-based pattern with dependency injection to configure components.
    Downloads: 0 This Week
  • 25
    RoboBeans is an interface to the "Robocup 2D Soccer Simulation Server" that lets developers write Robocup teams/agents while concentrating on behaviour and AI, without having to worry about communication syntax or network issues.
    Downloads: 0 This Week