Showing 27 open source projects for "image text input"

  • 1
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...
    Downloads: 7 This Week
  • 2
    Tagify

    Lightweight, efficient Tags input component in Vanilla JS

    Transforms an input field or a textarea into a Tags component in an easy, customizable way, with great performance and a small code footprint, packed with features. Customizable HTML templates for the different areas of the component (wrapper, tags, dropdown, dropdown item, dropdown header, dropdown footer). Shows a suggestions list (flexible settings & styling) at full (component) width or next to the typed text (caret). Allows setting suggestions' aliases for easier fuzzy-searching....
    Downloads: 0 This Week
  • 3
    PersonaPlex

    PersonaPlex code

    ...PersonaPlex also supports persona and voice control, allowing developers to define the role and speaking style of the agent using text prompts and voice conditioning, making it suitable for applications like customized voice assistants, interactive character agents, or domain-specific conversational tools. Internally, it processes continuous audio streams in a hybrid input format so that speech understanding and generation occur jointly.
    Downloads: 2 This Week
  • 4
    Speakr

    Speakr is a personal, self-hosted web application

    Speakr is an open-source, real-time text-to-speech (TTS) web application that allows users to convert written text into natural-sounding speech in just a few clicks. It provides a clean, user-friendly interface where users can input text, choose a voice style or language, and immediately hear the output, making it ideal for accessibility, content creation, and learning applications.
    Downloads: 0 This Week
  • 5
    Moshi

    A speech-text foundation model for real time dialogue

    ...Moshi models two streams of audio: one corresponds to Moshi, and the other to the user. At inference, the user's stream is taken from the audio input, and Moshi's stream is sampled from the model's output. Alongside these two audio streams, Moshi predicts text tokens corresponding to its own speech, its inner monologue, which greatly improves the quality of its generation. A small Depth Transformer models inter-codebook dependencies for a given time step, while a large, 7B-parameter Temporal Transformer models the temporal dependencies.
    Downloads: 2 This Week
  • 6
    Podcastfy.ai

    Transforming Multimodal Content into Captivating Multilingual Audio

    Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, and YouTube videos, as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources, enabling customization and scale.
    Downloads: 0 This Week
  • 7
    Shutter Encoder

    Free professional video converter Windows|Mac|Linux

    Shutter Encoder is a video, audio, and image converter based on FFmpeg and other great tools. It has been designed by video editors to be as accessible and efficient as possible. It's a Swiss Army knife tool for any video editor. Link to website & downloads: https://www.shutterencoder.com - Without conversion: Cut without re-encoding, Replace audio, Rewrap, Conform, Merge, Extract, Subtitling, Video inserts - Sound conversions: WAV, AIFF, FLAC, ALAC, MP3, AAC, AC3,...
    Downloads: 90 This Week
  • 8
    Drumstick MIDI Monitor

    MIDI monitor for Linux

    Drumstick MIDI Monitor is a MIDI monitor for Linux that uses the ALSA sequencer and a Qt5 user interface.
    Downloads: 10 This Week
  • 9
    JMP3Renamer
    JMP3Renamer is a plugin-based renamer/tagger written in Java. It supports automatic assignment of the data to the files and magic cookies to specify the filename format. Currently available plugins: Discogs, MusicBrainz, Filename, Filetag, Mp3, Ogg
    Downloads: 0 This Week
  • 10
    ARITA

    Extraordinary audio player for FreeBSD & GNU/Linux

    ...As for 'cuesheets': tracks are merged into a single continuous audio file and a supplementary text file, which provides information on where tracks start and end.
    Downloads: 0 This Week
  • 11

    psgdump

    Dump psg/ym chip tune files to txt and midi format

    The PSGDump tool is a parser and converter for chip tune files. It supports PSG and YM input file formats, focusing on AY/YM chip tunes from the ZX Spectrum and Atari ST. The tool produces text output of the notes played and creates a multi-track MIDI file.
    Downloads: 0 This Week
  • 12
    quickplot

    interactive 2D plotter

    Quickplot is a fast interactive 2D plotter with infinite zooming, value picking, pipe input, and unlimited plots displayed. Quickplot is meant for looking at your data quickly; making static pictures of your data is of secondary importance.
    Downloads: 1 This Week
  • 13

    NotesTyper

    Convert your text typing into music

    The NotesTyper system converts text typing into music. You will need a computer keyboard, a microphone (a notebook mic is OK), and the Chrome or Firefox browser. NotesTyper has multiple operating modes and settings, which allow it to produce different music from the same text. Computer keyboards do not capture keypress velocity, but NotesTyper overcomes this restriction by processing the microphone input level as you type.
    Downloads: 0 This Week
  • 14

    Guitar Chord and Scale Diagrammer

    Quickly create, edit, and print guitar chord/scale fingering diagrams

    ...Click any added dot to remove it (change it back to an empty position). Right-Click any fret on any string to add a character of your choice (add finger numbers, root note labels, interval labels, etc.). Click the title text to give the chord/scale/arpeggio a name. Right-Click the title to save the diagram to a .png image. Press the "p" key on your keyboard to create an HTML or Image layout of any selected diagrams you've created. It's fast, Fast, FAST. Create totally customized collections of chord fingerings in seconds for songs, or for specific topics. ...
    Downloads: 0 This Week
  • 15
    EnKoDeur-Mixeur
    EnKoDeur-Mixeur (EKD) is open source software for video, picture, and audio post-production. It can also be used to convert videos to many formats. It is written in Python and uses the PyQt4 bindings.
    Downloads: 0 This Week
  • 16
    BasicSynth is a software sound synthesis system written in C++. It includes C++ classes implementing a variety of signal generators, processors, synthesis instruments, and score processing, as well as command-line and GUI synthesizers built with the libraries.
    Downloads: 2 This Week
  • 17
    jMorse generates audible Morse code from input text. jMorse is written in Java and can be used via Ant, log4j, Java logging, or directly in code.
    Downloads: 0 This Week
  • 18
    SteGUI is a graphical front-end to Steghide. It lets users view the images and play the sounds that Steghide allows as cover files, and command the program all with one tool. It also embeds a simple text editor to manage text payload files.
    Downloads: 0 This Week
  • 19
    Algomusic is a Java tool made for introducing algorithmic music. Originally a college project, it's still quite immature and needs help going forward. Plans include a GUI and a fully developed toolkit/API for possible educational use. (Ver: Alpha 0.1)
    Downloads: 0 This Week
  • 20
    Text-user-interface-based SHOUTcast tuner. tuxshout runs comfortably in low-memory and legacy environments. It supports mouse input as well as the keyboard, so you can use it like a GUI.
    Downloads: 0 This Week
  • 21
    An audio CD image extractor library and console application. It can unpack various audio image formats and store audio tracks as media files. Supported input formats: CUE, NRG (Nero), WavPack. Supported output formats: MP3, WAV.
    Downloads: 0 This Week
  • 22
    Convert text to International Morse Code. Input is ASCII text. Output can be: - . -..- - on the console, raw 8-bit PCM suitable for piping to /dev/audio, .wav files, or even (mp3|ogg). Good for headlines on your MP3 player or code practice.
    Downloads: 4 This Week
  • 23

    DuMP3 - duplicate & similar file finder

    DuMP3 is a duplicate and similar file finder. It finds exact duplicate binaries by hash, similar text files by substring content, images (JPG, BMP, GIF, PNG, etc) by color and audio files (MP3, WAV, OGG, etc) by wave data. Future: fonts, video.
    Downloads: 0 This Week
  • 24
    EMC is an entertainment system for playing and displaying multimedia data such as audio, video, text, ....
    Downloads: 0 This Week
  • 25
    Amiant Navigator is a cross-platform, plug-in-based, all-in-one file manager, media content viewer/editor/converter, archiver, text editor, and FTP browser.
    Downloads: 0 This Week