Showing 5 open source projects for "image text input"

  • 1
    PersonaPlex

    PersonaPlex code

    ...PersonaPlex also supports persona and voice control, allowing developers to define the role and speaking style of the agent using text prompts and voice conditioning, making it suitable for applications like customized voice assistants, interactive character agents, or domain-specific conversational tools. Internally, it processes continuous audio streams in a hybrid input format so that speech understanding and generation occur jointly.
    Downloads: 1 This Week
  • 2
    Moshi

    A speech-text foundation model for real time dialogue

    ...Moshi models two streams of audio: one corresponds to Moshi, and the other to the user. At inference, the user's stream is taken from the audio input, and Moshi's stream is sampled from the model's output. Alongside these two audio streams, Moshi predicts text tokens corresponding to its own speech, its inner monologue, which greatly improves the quality of its generation. A small Depth Transformer models inter-codebook dependencies for a given time step, while a large, 7B-parameter Temporal Transformer models the temporal dependencies.
    Downloads: 2 This Week
  • 3
    Virtual Speech Mechanism System

    Virtual Speech Mechanism System converts text to voice.

    Virtual Speech Mechanism System is a .NET-based application written in C#. It can convert text to speech either in interactive mode or by taking input from a text file. Its output can either be directed to the speakers or saved as a WAV file that can be played with any audio player. The output wave can use 1 or 2 channels; the default is 2. The speech rate can be set from -10 to 10, and the volume from 0 to 100%. ...
    Downloads: 0 This Week
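    The output options described for Virtual Speech Mechanism System (1 or 2 channels, rate from -10 to 10, volume from 0 to 100%) can be illustrated with a minimal Python sketch. This is not the project's code, which is written in C#: the names `SpeechSettings` and `write_placeholder_wav` are hypothetical, and a sine tone stands in for synthesized speech because no speech engine is assumed here.

```python
import io
import math
import struct
import wave
from dataclasses import dataclass


@dataclass
class SpeechSettings:
    """Hypothetical settings holder; field names are not taken from the project."""
    rate: int = 0        # allowed range -10 to 10, per the description
    volume: int = 100    # percent, 0 to 100
    channels: int = 2    # 1 or 2; the description gives 2 as the default

    def validate(self) -> None:
        if not -10 <= self.rate <= 10:
            raise ValueError("rate must be between -10 and 10")
        if not 0 <= self.volume <= 100:
            raise ValueError("volume must be between 0 and 100")
        if self.channels not in (1, 2):
            raise ValueError("channels must be 1 or 2")


def write_placeholder_wav(target, settings, seconds=0.25, sample_rate=16000):
    """Write a sine tone (a stand-in for speech) as a 16-bit PCM WAV."""
    settings.validate()
    # Scale the peak amplitude by the volume percentage.
    amplitude = int(32767 * settings.volume / 100)
    frames = bytearray()
    for n in range(int(seconds * sample_rate)):
        sample = int(amplitude * math.sin(2 * math.pi * 440 * n / sample_rate))
        # Repeat the sample once per channel to build an interleaved frame.
        frames += struct.pack("<h", sample) * settings.channels
    with wave.open(target, "wb") as wav:
        wav.setnchannels(settings.channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
```

    Calling `write_placeholder_wav("out.wav", SpeechSettings(volume=50, channels=1))` would write a quarter-second mono file at half amplitude. The rate setting is only range-checked in this sketch; applying it would require a real speech engine.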
  • 4
    JRTalk is a speech synthesis program. It allows users with disabilities to use a mouse and the keyboard to select phrases and type words and sentences. The software converts this text input into speech and plays it through the speakers attached to the sound card. It also a
    Downloads: 0 This Week
  • 5
    A machine translation program designed to accept spoken or text input and provide translated output as text or synthesized speech. It makes use of three existing open-source projects. The source is currently C/C++ and embedded Perl.
    Downloads: 0 This Week