voice analysis free download

Showing 49 open source projects for "voice analysis"

1

Qwen2-Audio

Repo of Qwen2-Audio chat & pretrained large audio language model

Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound classification, emotion, etc.), and offers pretrained models (e.g. 7B) released via ModelScope and Hugging Face. ...

Downloads: 0 This Week

Last Update: 2025-09-23
See Project
2

SEO Machine

A specialized Claude Code workspace for creating long-form

...The architecture emphasizes context-awareness, using brand voice, style guides, and keyword strategies to maintain consistency across outputs. It also includes performance evaluation tools that score content and suggest improvements before publishing.

Downloads: 0 This Week

Last Update: 2026-04-10
See Project
3

Ultravox

Fast multimodal LLM for real-time voice interaction and AI apps

...Internally, it leverages pretrained language models and speech encoders, with a multimodal adapter that integrates both modalities for inference and training. Ultravox is optimized for low latency, achieving fast response times suitable for interactive voice agents and real-time applications. It supports use cases such as conversational AI agents, speech-to-speech translation, and analysis of spoken audio content. Ultravox also includes tooling and configuration systems for training, evaluation, and dataset integration.

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
4

Amazing-Python-Scripts

Curated collection of Amazing Python scripts

...The repository encourages community contributions, allowing developers to add their own scripts and improve existing ones through pull requests. Examples include scripts for sentiment analysis, data scraping, web automation, log analysis, and interactive applications such as games or voice-controlled tools. The project also provides contribution guidelines and documentation so that developers can easily collaborate and expand the collection of scripts.

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
5

Big-AGI

AI suite powered by state-of-the-art models and providing advanced AI

Big-AGI is a comprehensive, open-source AI workspace built to serve as a powerful multi-model interface for developers, researchers, and professionals who want deep control over generative AI workflows and outputs. It unifies access to multiple large language models (LLMs) and AI services through a modern web UI that emphasizes efficient interaction, flexibility, and extensibility, enabling users to conduct multi-model chats, execute code, generate images, and perform voice or text-based...

Downloads: 1 This Week

Last Update: 2026-05-13
See Project
6

NVIDIA NeMo

Toolkit for conversational AI

NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...

Downloads: 3 This Week

Last Update: 2026-04-22
See Project
7

MiniCPM-o

A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

MiniCPM-o 2.6 is a cutting-edge multimodal large language model (MLLM) designed for high-performance tasks across vision, speech, and video. Capable of running on end-side devices such as smartphones and tablets, it provides powerful features like real-time speech conversation, video understanding, and multimodal live streaming. With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It supports...

Downloads: 0 This Week

Last Update: 2025-05-15
See Project
8

Open Interpreter

A natural language interface for computers

Open Interpreter is an open-source tool that provides a natural-language interface for interacting with your computer. It lets large language models (LLMs) run code locally (Python, JavaScript, shell, etc.), enabling you to ask your computer to do tasks like data analysis, file manipulation, browsing, etc. in human terms (“chat with your computer”), with safeguards. Runs locally or via configured remote LLM servers/inference backends, giving flexibility to use models you trust or have...

Downloads: 18 This Week

Last Update: 18 hours ago
See Project
9

Eliza

Autonomous agents for everyone

Build and deploy autonomous AI agents with consistent personalities across Discord, Twitter, and Telegram. Full support for voice, text, and media interactions. Built-in RAG memory system, document processing, media analysis, and autonomous trading capabilities. Supports multiple AI models including Llama, GPT-4, and Claude. Create custom actions, add new platform integrations, and extend functionality through a modular plugin system. Full TypeScript support.

Downloads: 1 This Week

Last Update: 2 days ago
See Project
10

Auto-Commenter

A Claude skill that automatically posts personalized comments

Auto-Commenter is a Claude-oriented automation project built to help users write and post comments that sound natural and context-aware in targeted online communities. It centers on learning a user’s writing style from their real comment history, then applying that style to generate responses that feel consistent with the user rather than generic template text. The workflow emphasizes deeper post analysis so the system can respond to what is actually being discussed, instead of replying with...

Downloads: 0 This Week

Last Update: 2026-03-15
See Project
11

MiMoCode

Where Models and Agents Co-Evolve

...It can read and edit code, run commands, manage Git, and preserve project knowledge across sessions. The tool includes multiple agent modes, including a build mode for development, a plan mode for read-only analysis, and a compose mode for structured workflows. Its persistent memory system stores project notes, checkpoints, scratch notes, and task progress so the assistant can resume work with context. It also supports subagents, goal checking, voice input, MCP connections, and custom provider configuration. MiMo-Code is useful for developers who want an autonomous coding assistant that combines terminal workflows, long-running task management, and project-aware memory.

Downloads: 1 This Week

Last Update: 4 days ago
See Project
12

VOIP-VOICE-TO-TEXT&ANALYS

Convert VoIP calls to text and analyze them with AI

The VoIP voice-to-text software for Issabel is an AI-based solution that converts calls into accurate Persian text. After each call, the audio file is sent to the GPT-4o AI engine, which produces an editable transcript. The software also provides AI-powered call analysis, extracting key points, customer requests, satisfaction levels, and sensitive topics, all of which are stored in the database.

1 Review

Downloads: 0 This Week

Last Update: 2025-11-22
See Project
13

Recorder

HTML5 JavaScript recording in mp3, wav, ogg, webm, and amr formats

Supports microphone recording and real-time processing in most mobile and PC browsers that implement getUserMedia, mainly Chrome, Firefox, Safari, iOS 14.3+, Android WebView, the Tencent Android X5 kernel (QQ, WeChat, Mini Program WebView), uni-app (App, H5), and the built-in browsers of most Android phones updated after 2021. Not supported: UC-based kernels (typically Alipay), the built-in browsers of most older domestic mobile phones that have not been updated, and any other...

Downloads: 1 This Week

Last Update: 2025-01-11
See Project
14

Luna AI

Virtual AI anchor that combines state-of-the-art technology

Luna AI is a virtual AI streamer framework designed to power an interactive VTuber that can go live on major platforms and chat with viewers in real time. It is built around a core assistant persona called “Luna AI,” which can be driven by a wide range of large language models and platforms, including GPT-style APIs, Claude, LangChain-based backends, ChatGLM, Kimi, Ollama, and many others. The project supports multiple rendering backends for the avatar, such as Live2D, Unreal Engine (UE),...

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
15

SpectrumNotes

Live analysis of pitches, harmonics, chords, and keys.

Windows 10+ 64-bit desktop application that analyzes live audio (mic or output) and displays it as color-coded pitches, with analysis of chords, keys, and harmonics and a built-in instrument tuner.

Downloads: 1 This Week

Last Update: 2026-03-15
See Project
16

Qwen-Audio

Chat & pretrained large audio language model proposed by Alibaba Cloud

Qwen-Audio is a large audio-language model developed by Alibaba Cloud, built to accept various types of audio input (speech, natural sounds, music, singing) along with text input, and to output text. There is also an instruction-tuned version called Qwen-Audio-Chat, which supports multi-round conversational interaction, audio + text input, creative tasks, and reasoning over audio. It uses multi-task training over more than 30 different audio tasks, and achieves strong performance across multiple benchmarks...

Downloads: 0 This Week

Last Update: 2025-09-23
See Project
17

InstrumentalMusic

Application which detects musical notes from the microphone.

Application that detects musical notes from the microphone. It listens to the microphone and plays the detected notes to the output (as MIDI). Features include multilanguage support, zoom, a dark mode option, and JDK 17 compatibility. Since v1.2 it includes a pitch shifter (making the voice lower or higher with a slider). A demo video showing how it works is available from the Help menu of the application. You can also see the pitch-shifter demo version...

Downloads: 0 This Week

Last Update: 2026-03-05
See Project
18

Amphion

Toolkit for audio, music, and speech generation

Amphion is a toolkit from OpenMMLab dedicated to audio, music, and speech generation, aimed at both reproducible research and helping newcomers get started in generative audio. It provides standardized implementations and recipes for classic and state-of-the-art generative models in audio, including TTS, music generation, and voice conversion. A distinctive feature of Amphion is its emphasis on visualization: it offers interactive visualizations of model architectures and generation...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
19

DoSA-2D

2D open source actuator simulation software

DoSA-2D is a two-dimensional open source software for magnetic force analysis of actuators and solenoids. Not only individuals but also companies can use the program for free and participate in the development of it themselves. The program environment is developed to be similar to that of product development, so even product developers who have not majored in analysis can easily analyze the magnetic force of actuators or solenoids. DoSA-2D is responsible for an easy working...

Downloads: 2 This Week

Last Update: 2024-04-14
See Project
20

DoSA-3D

3D open source actuator simulation software

DoSA-3D is a 3D open source software for magnetic force analysis of actuators and solenoids. Not only individuals but also companies can use the program for free and participate in the development of it themselves. The program environment is developed to be similar to that of product development, so even product developers who have not majored in analysis can easily analyze the magnetic force of actuators. In DoSA-3D, three programs are connected and operated as follows. - DoSA-3D :...

Downloads: 1 This Week

Last Update: 2024-04-14
See Project
21

Feishu ChatGPT

Voice dialogue, role-playing, multi-topic discussion, picture creation

Feishu × (GPT-3.5 + DALL·E + Whisper) = a work experience that feels like flying. Features include voice dialogue, role-playing, multi-topic discussion, picture creation, table analysis, and document export. The project is written in Go with the gin framework, and its author describes familiarity with the SDKs of DingTalk, Feishu, Qiwei, and other platforms for developing and integrating further functions.

Downloads: 0 This Week

Last Update: 2023-11-20
See Project
22

VoiceFixer

General Speech Restoration

VoiceFixer is a machine-learning framework for “speech restoration”: given a degraded or distorted audio recording — with noise, clipping, low sampling rate, reverberation, or other artifacts — it attempts to recover high-fidelity, clean speech. The architecture works in two stages: first an analysis stage that tries to extract “clean” intermediate features from the noisy audio (e.g. removing noise, denoising, dereverberation, upsampling), and then a neural vocoder-based synthesis stage that reconstructs a high-quality waveform from those features. Unlike many single-purpose noise reduction tools, VoiceFixer targets a “general speech restoration” problem (GSR), capable of handling multiple types of distortions at once, which makes it suitable for old recordings, phone-call audio, amateur voice recordings, or archival media. ...

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
23

vocoder_chung

vocoder_chung is a small educational vocoder that works on a discrete Fourier transform (FFT) spectrum, written in FreeBASIC, an easy, fast compiled language. Update 24/12/2019: uses the fast and accurate FFTdll.dll. Update 28/03/2020: an algorithmic voice cloning / change / morphing experiment was added.

Downloads: 0 This Week

Last Update: 2020-06-03
See Project
24

Resemblyzer

A python package to analyze and compare voices with deep learning

Resemblyzer is a Python package for analyzing and comparing voices with deep learning. It works by turning speech audio into a compact voice embedding that represents the speaker’s vocal characteristics. These embeddings can then be used for speaker similarity, clustering, diarization experiments, voice comparison, and audio dataset exploration. The project is useful for researchers and developers who need a practical way to reason about speaker identity without building a voice encoder from...

Downloads: 1 This Week

Last Update: 2026-06-10
See Project
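The Resemblyzer entry above describes comparing speakers through voice embeddings. A common way to compare two embedding vectors is cosine similarity; the Python sketch below shows that idea with toy 4-dimensional vectors. It does not call Resemblyzer itself: the vectors are made-up placeholders for real embeddings, and the `same_speaker` helper and its 0.75 threshold are assumptions for illustration, not values from the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(a, b, threshold=0.75):
    """Hypothetical decision rule: the threshold is an assumed value to tune."""
    return cosine_similarity(a, b) >= threshold


if __name__ == "__main__":
    # Toy stand-ins for embeddings; a real voice encoder yields far longer vectors.
    alice_1 = [0.9, 0.1, 0.3, 0.2]
    alice_2 = [0.8, 0.2, 0.35, 0.15]
    bob = [0.1, 0.9, 0.2, 0.6]
    print("alice vs alice:", same_speaker(alice_1, alice_2))
    print("alice vs bob:  ", same_speaker(alice_1, bob))
```

In practice the threshold would be chosen on labeled recordings, since the right cut-off depends on the encoder and the audio conditions.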
25

Psygraph

Code for the Psygraph mobile application

Psygraph is a Personal Data Collector (PDC) and activity timer. It includes a stopwatch, timer, counter, and note taker (voice recorder), each of which collects data from the device’s sensors (e.g. the device velocity and location (via GPS)). Although the interface is simple (a button or two on each screen), the data is saved for later analysis and display (you can store and view the data on WordPress). It is a scientific instrument that is easy to use.

Downloads: 0 This Week

Last Update: 2023-04-20
See Project