Showing 9 open source projects for "ffmpeg"

  • 1
    AI YouTube Shorts Generator

    A Python tool that uses GPT-4, FFmpeg, and OpenCV

    AI-YouTube-Shorts-Generator is a Python-based tool that automates the creation of short-form vertical video clips (“shorts”) from longer source videos — ideal for adapting content for platforms like YouTube Shorts, Instagram Reels, or TikTok. It analyzes input video (whether a local file or a YouTube URL), transcribes audio (with optional GPU-accelerated speech-to-text), uses an AI model to identify the most compelling or engaging segments, and then crops/resizes the video and applies...
    Downloads: 13 This Week
  • 2
    ChatTTS webUI & API

    A simple native web interface that uses ChatTTS to synthesize text

    ChatTTS-ui is a local web interface and API wrapper around the ChatTTS speech synthesis system, designed to make advanced TTS models easy to use from a browser. It runs a small backend server (Python + Torch + ffmpeg) and exposes a simple webpage where you can type text, adjust parameters, and generate audio. The project supports Chinese, English, and mixed text with digits and control symbols, making it suitable for bilingual content and numerically heavy text like announcements or prompts. From version 0.96 onward, ffmpeg installation is required for deployment, and previous CSV/PT voice tables are no longer valid, so users instead work with updated “voice value” parameters. ...
    Downloads: 1 This Week
  • 3
    SoniTranslate

    Synchronized Translation for Videos

    SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets...
    Downloads: 32 This Week
  • 4
    ChatGPT Telegram Bot

    A Telegram bot that integrates with OpenAI's official ChatGPT APIs

    A Telegram bot that integrates with OpenAI's official ChatGPT, DALL·E and Whisper APIs to provide answers. Ready to use with minimal configuration required.
    Downloads: 0 This Week
  • 5
    CSM (Conversational Speech Model)

    A Conversational Speech Generation Model

    The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.
    Downloads: 4 This Week
  • 6
    Warlock-Studio

    Suite with Real-ESRGAN, BSRGAN, RealESRNet, IRCNN, GFPGAN & RIFE.

    v5.1.1. Warlock-Studio is a Windows application that uses the Real-ESRGAN, BSRGAN, IRCNN, GFPGAN, RealESRNet, RealESRAnime and RIFE AI models to upscale, restore faces, interpolate frames and reduce noise in images and videos. The application supports GPU acceleration (including multi-GPU setups) and offers batch processing for large workloads. It includes drag-and-drop handling for single or multiple files, optional pre-resize functions, and an automatic tiling system...
    Downloads: 21 This Week
  • 7
    DCVGAN

    DCVGAN: Depth Conditional Video Generation, ICIP 2019.

    This paper proposes a new GAN architecture for video generation from depth videos and color videos. The proposed model explicitly uses the depth information in a video sequence as additional input to a GAN-based video generation scheme so that the model understands scene dynamics more accurately. The model is trained on pairs of color video and depth video and generates a video in two steps. The first step generates the depth video to model the scene dynamics based on the geometrical...
    Downloads: 0 This Week
  • 8
    3D ResNets for Action Recognition

    3D ResNets for Action Recognition (CVPR 2018)

    We uploaded the pretrained models described in this paper, including ResNet-50 pretrained on the combined Kinetics-700 and Moments in Time dataset. We significantly updated our scripts. If you want to use older versions to reproduce our CVPR 2018 paper, you should use the scripts in the CVPR2018 branch.
    Downloads: 1 This Week
  • 9
    JAVT - Just Another Voice Transformer

    Just Another Speech Recognition and Text to Speech software.

    JAVT, or Just Another Voice Transformer (formerly Just Another Video Transcriber), is speech recognition software that also supports text-to-speech and simple media conversion. JAVT lets you convert video files to WAV audio files using ffmpeg, and then transcribe the audio to text using either Microsoft SAPI or CMU Sphinx. You can also open a text file and have JAVT read it aloud through text-to-speech conversion.
    Downloads: 0 This Week
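
Several of the projects listed above, JAVT among them, rely on ffmpeg to turn a video file into a WAV file before speech recognition. The following is a minimal sketch of that step in Python, not code from any listed project. The helper name build_wav_command, the 16 kHz default sample rate, and the mono output are illustrative assumptions; the ffmpeg options used (-y, -i, -vn, -acodec, -ar, -ac) are standard ffmpeg command-line flags.

```python
from pathlib import Path


def build_wav_command(video_path: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg argument list that extracts audio as a WAV file.

    Hypothetical helper: the name, the default sample rate, and the
    mono channel layout are assumptions made for illustration.
    """
    src = Path(video_path)
    dst = src.with_suffix(".wav")  # same name, .wav extension
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", str(src),          # input video
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM audio
        "-ar", str(sample_rate), # output sample rate in Hz
        "-ac", "1",              # downmix to a single channel
        str(dst),
    ]


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(" ".join(build_wav_command("talk.mp4")))
```

If ffmpeg is installed and on the PATH, the returned list can be passed to subprocess.run(cmd, check=True) to perform the conversion.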