Showing 63 open source projects for "java audio effects"

  • 1
    Pedalboard

    A Python library for audio

    pedalboard is a Python library for working with audio: reading, writing, rendering, adding effects, and more. It supports the most popular audio file formats and a number of common audio effects out of the box and also allows the use of VST3® and Audio Unit formats for loading third-party software instruments and effects. pedalboard was built by Spotify’s Audio Intelligence Lab to enable using studio-quality audio effects from within Python and TensorFlow. ...
    Downloads: 8 This Week
  • 2
    AudioCraft

    Audiocraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. ...
    Downloads: 1 This Week
  • 3
    HunyuanVideo-Foley

    Multimodal Diffusion with Representation Alignment

    HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks. Produces high-quality 48 kHz audio output suitable for professional use. ...
    Downloads: 0 This Week
  • 4
    HunyuanVideo-Avatar

    Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

    HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT) model by Tencent Hunyuan for animating static avatar images into dynamic, emotion-controllable, and multi-character dialogue videos, conditioned on audio. It addresses challenges of motion realism, identity consistency, and emotional alignment. Innovations include a character image injection module, an Audio Emotion Module for transferring emotion cues, and a Face-Aware Audio Adapter to isolate audio effects on faces, enabling multiple characters to be animated in a scene. Character image injection module for better consistency between training and inference conditioning. ...
    Downloads: 0 This Week
  • 5
    LTX-Video

    Official repository for LTX-Video

    LTX-Video is a sophisticated multimedia processing framework from Lightricks designed to handle high-quality video editing, compositing, and transformation tasks with performance and scalability. It provides runtime components that efficiently decode, encode, and manipulate video streams, frame buffers, and audio tracks while exposing a rich API for building customized editing features like transitions, effects, color grading, and keyframe automation. The toolkit is built with both real-time and offline workflows in mind, enabling applications from consumer editing to professional content creation and batch processing. Internally optimized for multi-core processors and hardware acceleration where available, LTX-Video makes it feasible to work with high-resolution content and complex timelines without sacrificing responsiveness.
    Downloads: 6 This Week
  • 6
    comfyui-mixlab-nodes

    Workflow and speech recognition app

    ...The project also brings Real-time Design features like screen capture and floating video nodes, enabling creative pipelines that mix live screen content, generative models, and visual effects. For audio and speech, it provides nodes for SpeechRecognition and SpeechSynthesis, plus workflows that combine voice generation with real-time face swapping and other audio-visual effects. On the AI side, it integrates multiple LLM providers (cloud and local), supports OpenAI-compatible endpoints, Siliconflow models, and includes prompt-focused utilities for random prompt generation, Chinese prompts, clip interrogation.
    Downloads: 3 This Week
  • 7
    LTX-2

    Python inference and LoRA trainer package for the LTX-2 audio–video

    LTX-2 is a powerful, open-source toolkit developed by Lightricks that provides a modular, high-performance base for building real-time graphics and visual effects applications. It is architected to give developers low-level control over rendering pipelines, GPU resource management, shader orchestration, and cross-platform abstractions so they can craft visually compelling experiences without starting from scratch. Beyond basic rendering scaffolding, LTX-2 includes optimized math libraries,...
    Downloads: 46 This Week
  • 8
    sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.
    Downloads: 50 This Week
  • 9
    Voxal voice changer

    Transform your voice in real time with Voxal Voice Changer

    Voxal Voice Changer is a program that allows you to modify your voice by applying various effects (e.g. pitch change, echo, etc.) in real-time. Effects can be added in any sequence and in any combination, allowing you to distort your voice beyond recognition. Take your audio to the next level! Our powerful Voice Changer software lets you morph your voice in real-time with stunning AI-powered quality. Whether you're looking to have fun, protect your privacy, or create engaging content, we have the perfect voice for you. ...
    Downloads: 25 This Week
  • 10
    txtai

    Build AI-powered semantic search applications

    txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications. Traditional search systems use keywords to find data. Semantic search applications have an understanding of natural language and identify results that have the same meaning, not necessarily the same keywords. Backed by state-of-the-art machine learning models, data is transformed into vector representations for search (also known as embeddings). Innovation is happening at a rapid...
    Downloads: 3 This Week
  • 11
    elevenlabs-api

    elevenlabs-api is an open source Java wrapper around the ElevenLabs

    Elevenlabs-api is an open-source Java wrapper around the ElevenLabs Voice Synthesis and Cloning Web API. Compiled JARs are available via the Releases tab. To find your ElevenLabs API key, go to the official website and open the 'Profile' tab, where your xi-API-key is shown. To set up the key, you must register it with the ElevenLabsAPI Java API.
    Downloads: 1 This Week
  • 12
    eGuideDog free software for the blind
    eGuideDog project develops free software for the blind. Currently, we focus on WebSpeech, Ekho TTS and WebAnywhere.
    Downloads: 140 This Week
  • 13
    Conversations

    Java app for chatting with a generative AI (using TTS and STT)

    Java application for chatting with the generative AI Llama3. The user can speak into the microphone (speechToText), edit the recognized text, and send it to the AI. The AI responds, the server returns that response in real time, the sentences are converted to audio (textToSpeech), and the application plays them through the speaker.
    Downloads: 0 This Week
  • 14
    MyBox

    Easy Tools of PDF, Image, File, Network, Data, and Medias

    javafx-desktop-apps pdf image ocr icc barcode color-palette text bytes markdown html archive compress digest video audio editor converter media https://github.com/Mararsh/MyBox Self-contained packages need neither a Java environment nor installation. Jar packages need Java 16 or higher.
    Downloads: 3 This Week
  • 15
    Intelligent Java

    Integrate with the latest language models, image generation and speech

    Intelligent Java (IntelliJava) is a tool for integrating with the latest language models and deep learning frameworks using Java. The library provides intuitive functions for sending input to models like ChatGPT and DALL·E, and receiving generated text, speech or images. With just a few lines of code, you can access cutting-edge AI models from your projects.
    Downloads: 2 This Week
  • 16
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you...
    Downloads: 3 This Week
  • 17
    Meihu-FaceBeauty-Live

    Beauty can be applied to live broadcasts, short videos, and selfies

    Meihu Beauty SDK is a mobile SDK built around face recognition technology. It provides professional-grade real-time beautification, eye enlargement and face slimming, beauty filters, dynamic stickers and other filters for building multi-functional video beauty software. The goal is to fully meet customers' beautification needs in audio and video business scenarios such as live-stream beauty and short-video beauty. The open source version is now available for iOS, and the Android open source...
    Downloads: 0 This Week
  • 18
    AlphaPlayer

    AlphaPlayer is a video animation engine

    AlphaPlayer is positioned as a multimedia or media-player library or application under ByteDance, likely intended to provide video/audio playback functionality, streaming, or media rendering capabilities. It probably serves as a foundation for building media-heavy applications — offering features like playback control, streaming support, adaptive media handling, and possibly integration with custom codecs or streaming protocols. For developers building web, desktop, or mobile applications...
    Downloads: 0 This Week
  • 19

    FastoCloud PRO

    IPTV/NVR/CCTV/Video cloud https://fastocloud.com

    IPTV/Video cloud Features: Cross-platform (Linux, MacOSX, FreeBSD, Raspbian/Armbian) GPU/CPU Encode/Decode/Post Processing Stream statistics CCTV Adaptive hls streams Load balancing Temporary urls HLS push EPG scanning Subtitles to text conversions AD insertion Logo overlay Video effects Relays Timeshifts Catchups Playlists Restream/Transcode from online streaming services like Youtube, Twitch ...
    Downloads: 0 This Week
  • 20
    jMIR

    Music research software

    jMIR is an open-source software suite implemented in Java for use in music information retrieval (MIR) research. It can be used to study music in the form of audio recordings, symbolic encodings and lyrical transcriptions, and can also mine cultural information from the Internet. It also includes tools for managing and profiling large music collections and for checking audio for production errors. jMIR includes software for extracting features, applying machine learning algorithms, applying heuristic error checkers, mining metadata and analyzing metadata.
    Downloads: 9 This Week
  • 21
    ILA - teachable voice assistant

    ILA is a fully customizable and teachable voice assistant for Java

    ILA stands for (kind of) intelligent, learning assistant and is a speech recognition system, aka voice assistant, very similar to Siri, Google Now and Cortana. ILA is fully customizable and you can teach her/him/it new things yourself, like executing system commands, opening web pages, programs and apps, or just some basic conversation :-) ILA runs on Java and is thus compatible with Windows, Mac and Linux. It is designed to integrate with your home environment and for example build up your own,...
    Downloads: 0 This Week
  • 22
    Ansj Chinese word segmentation

    Ansj word segmentation

    The real java implementation of ict. The word segmentation effect is faster than the open source version of ict. Chinese word segmentation, name recognition, part-of-speech tagging, user-defined dictionary. This is a java implementation of Chinese word segmentation based on n-Gram+CRF+HMM. The word segmentation speed reaches about 2 million words per second (tested under mac air), and the accuracy rate can reach more than 96%.
    Downloads: 0 This Week
  • 23
    MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
    Downloads: 2 This Week
  • 24
    J-Syncker
    This application assists in generating pre-compositional material based on a computational interpretation of the 'Schillinger System of Musical Composition' (Schillinger 1946).
    Downloads: 0 This Week
  • 25
    jaivox

    Speech recognition application builder and library

    Java library and tools to create open source speech recognition applications. Generates dialogs for conversational interfaces. Works with a popular open source speech recognition library.
    Downloads: 0 This Week