
#10 Support Voice Input and Output

open
nobody
2025-08-18
2025-08-18
Anonymous
No

Originally created by: haiphucnguyen

Enable Askimo to handle voice-based interactions in addition to text. This includes:

  • Voice Input (Speech-to-Text): Allow users to provide prompts by speaking into their microphone or supplying an audio file.

  • Voice Output (Text-to-Speech): Read AI responses aloud using a TTS engine.

This will make Askimo more accessible and useful in hands-free or accessibility-focused workflows.

Actions:

  • Integrate a speech-to-text engine (e.g., OpenAI Whisper, Vosk, or other provider APIs).

  • Extend the CLI to accept audio file inputs (e.g., askimo --audio input.wav). Streaming audio input would also be useful, but how to implement it is still an open question.

  • Add optional real-time microphone capture for interactive sessions.

  • Integrate a text-to-speech engine (e.g., OpenAI TTS, ElevenLabs, Amazon Polly).

  • Provide CLI flags for enabling voice output (e.g., askimo --speak).

  • Document supported audio formats and example commands in README.md.
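
As a rough illustration of the CLI actions above, here is a minimal sketch of how the two proposed flags could be parsed and how a WAV input could be checked before it is handed to a speech-to-text engine. It is written in Python purely for illustration and says nothing about how Askimo is actually implemented. Only the flag names --audio and --speak come from this ticket. The names AudioInfo, build_parser, parse_args, and inspect_wav are hypothetical, and restricting the check to WAV is an assumption (the supported formats are still to be decided).

```python
import argparse
import wave
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class AudioInfo:
    """Basic properties of a WAV file (hypothetical helper type)."""
    channels: int
    sample_rate_hz: int
    duration_s: float


def build_parser() -> argparse.ArgumentParser:
    # Flag names follow the ticket; the rest of the parser is an assumption.
    parser = argparse.ArgumentParser(prog="askimo")
    parser.add_argument("prompt", nargs="?",
                        help="text prompt (optional when --audio is given)")
    parser.add_argument("--audio", metavar="FILE",
                        help="audio file to transcribe into a prompt")
    parser.add_argument("--speak", action="store_true",
                        help="read the AI response aloud")
    return parser


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Require at least one source for the prompt: text or audio.
    if args.prompt is None and args.audio is None:
        parser.error("provide a text prompt or --audio FILE")
    return args


def inspect_wav(path: str) -> AudioInfo:
    """Read the WAV header; raises wave.Error if the file is not valid WAV."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return AudioInfo(
            channels=wav.getnchannels(),
            sample_rate_hz=rate,
            duration_s=wav.getnframes() / rate,
        )
```

With this sketch, parse_args(["--audio", "input.wav", "--speak"]) yields a namespace whose audio is "input.wav" and whose speak is True. The transcription and text-to-speech steps are deliberately left out, since the choice of engine is still open.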
