Thread: [tuxdroid-user] Speech software suite: TTS & Recognition
From: Florent T. <ft...@gm...> - 2007-03-23 12:49:29
*********** Speech recognition:

* Option 1: Xvoice + IBM ViaVoice

I found some interesting stuff. I don't know why, but Sphinx really seems hard to use: it almost killed the Xvoice project. So the best option for voice recognition / command launching seems to be the old Xvoice, using IBM's ViaVoice (which isn't available from IBM anymore). The following link is a ViaVoice & Xvoice tutorial (last updated 2004):
http://taint.org/wk/ViaVoiceModernLinux

+ It's supposed to work quite well
- old, unsupported, no future, non-free

* Option 2: Sphinx

Considering Tux's microphone performance, our best chance is with PocketSphinx:
http://www.speech.cs.cmu.edu/pocketsphinx/
There's a package / ipkg already in OpenEmbedded:
http://www.openembedded.org/viewmtn/revision.psp?id=5369901ff675d4c926113d856ce9da4c8c12fed0

*********** Text to speech

* Option 1: stick with the Tux Droid team's choice (Acapela)
* Option 2: use the simple & efficient eSpeak

I was really surprised by its plug-and-play functionality and its light resource footprint.
http://espeak.sourceforge.net
I may switch to this one.

We may want to keep the possibility of choosing between several packages (i.e. no hard Acapela-specific integration). Is it possible to keep it modular?
From: Florent T. <ft...@gm...> - 2007-03-23 13:08:37
************* CVoiceControl

Well, I spoke too fast. It seems to me this one may be the right one (it needs some testing though): it's really pattern-matching oriented, simple, and intended for launching command-line scripts.

"CVoiceControl is a tool that gives the user voice control over unix commands. A template matching based speaker dependent isolated word recognition approach is employed."

But it's 16 kHz, 16 bit, mono, not 8 kHz...
http://www.kiecza.net/daniel/linux/

There's even a deb! I'll test it when I'm home :)

Depends on:
* Ncurses library and header files
* Pthreads library
* OSS sound library (sys/soundcard.h)

************* Yet another speech recognition program: Julius

We could also try Julius:
http://julius.sourceforge.jp/en_index.php
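To make the "template matching" idea in the CVoiceControl description more concrete, here is a minimal sketch of isolated-word matching with dynamic time warping (DTW). That CVoiceControl works exactly this way is an assumption; DTW is just one common way to compare a spoken word against recorded templates. The names `dtw_distance` and `recognize` are hypothetical, and a real recognizer would compare sequences of spectral feature vectors rather than plain numbers.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i items of a
    # with the first j items of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def recognize(utterance, templates):
    """Return the name of the stored template closest to the utterance."""
    return min(templates,
               key=lambda word: dtw_distance(utterance, templates[word]))


if __name__ == "__main__":
    # Toy "feature" sequences standing in for two recorded command words.
    templates = {"up": [1, 2, 3, 4], "down": [4, 3, 2, 1]}
    # A slower rendition of "up": same shape, different timing.
    print(recognize([1, 1, 2, 3, 3, 4], templates))
```

Because the templates are recorded by the user, this kind of matching is speaker dependent, which fits the quoted description.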
From: David B. <da...@ja...> - 2007-03-23 14:19:37
On Fri, 23 Mar 2007 14:08:30 +0100, Florent THIERY <ft...@gm...> wrote:

> But it's 16 kHz, 16 bit, mono, not 8 kHz...

It should be fine to do some resampling if necessary.

I really think the quality of the microphone is not that bad. There are 2 problems:
1. when the motors are running or the speaker is in use, the microphone picks up that noise at a high level, and
2. there's some 500 Hz noise due to the RF digital modulation.

We have to do some tests. Once you've selected something that works for you, I'll give it a shot with Tux and also with an external microphone.

Thanks for your information,
david
From: Florent T. <ft...@gm...> - 2007-03-23 15:14:01
> It should be fine to do some resample if necessary

16 to 8 kHz downsampling + signal filtering.

For instance, Audacity's filtering does it just fine: you select a portion of the recording that contains only the noise, and it automatically removes it. Really a great feature. We may be able to use the libs/plugins...

Any ideas on:
- how to downsample in real time
- how to filter the 500 Hz noise

> There are 2 problems: 1. when the motors are running or the speaker is in use,
> the microphone picks up that noise at a high level

Well, personally I'm not going to use Tux's motors that much: only when it needs a lot of attention, for instance, so I guess it won't be a big problem for me. Really, regarding the animation capabilities, what can we imagine that would exploit them?

I was also wondering: for now, animation is restricted to 1/4 turn actions. When the firmware is more sophisticated, will we have finer control over movements? How precise are the step-by-step motors? Will we be able to do smaller/slower movements? This would reduce the noise.
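On the question of filtering the 500 Hz noise in real time, one possible approach is a narrow notch filter applied sample by sample. The sketch below uses the standard second-order (biquad) notch; the class name `NotchFilter`, the quality factor `q=5.0` and the 8 kHz sample rate are assumptions for illustration, not something taken from the Tux software. The filter keeps its state between calls, so audio can be fed in small chunks as it arrives.

```python
import math


def notch_coefficients(f0, fs, q=5.0):
    """Normalized biquad notch coefficients centred on f0 (Hz) at rate fs (Hz)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0)
    a = (-2.0 * cos_w0 / a0, (1.0 - alpha) / a0)
    return b, a


class NotchFilter:
    """Streaming notch filter; state is preserved across process() calls."""

    def __init__(self, f0=500.0, fs=8000.0, q=5.0):
        self.b, self.a = notch_coefficients(f0, fs, q)
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def process(self, samples):
        b0, b1, b2 = self.b
        a1, a2 = self.a
        out = []
        for x in samples:
            # Direct form I difference equation.
            y = (b0 * x + b1 * self.x1 + b2 * self.x2
                 - a1 * self.y1 - a2 * self.y2)
            self.x2, self.x1 = self.x1, x
            self.y2, self.y1 = self.y1, y
            out.append(y)
        return out


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    fs = 8000
    hum = [math.sin(2 * math.pi * 500 * n / fs) for n in range(fs)]
    filtered = NotchFilter(500.0, fs).process(hum)
    # Compare levels once the filter has settled.
    print(rms(hum[4000:]), rms(filtered[4000:]))
```

A larger `q` makes the notch narrower, which removes less of the surrounding speech but takes longer to settle.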
From: Philippe T. <ph...@te...> - 2007-03-23 17:35:23
Florent THIERY wrote:

>> It should be fine to do some resample if necessary
>
> 16 to 8 kHz downsampling + signal filtering

Here it's about upsampling from 8 to 16 kHz, as the microphone is 8 kHz and the software requires its input at 16 kHz.

I created a page with all your good finds so we can track those alternatives:
http://www2.tux-is-alive.com/wiki/Test-to-speech

Phil
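As a rough illustration of the 8 kHz to 16 kHz upsampling mentioned above, the sketch below doubles the sample rate by linear interpolation. The function name `upsample_2x` is hypothetical, and this is a deliberately crude method: a proper resampler would use a better interpolation filter. Either way, upsampling cannot add information above the 4 kHz already captured by the microphone; it only gives the recognizer the input rate it expects.

```python
def upsample_2x(samples):
    """Double the sample rate by inserting the midpoint between neighbours.

    The last sample has no successor, so it is simply repeated.
    """
    out = []
    n = len(samples)
    for i, cur in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < n else cur
        out.append(cur)
        out.append((cur + nxt) / 2)
    return out


if __name__ == "__main__":
    block_8k = [0, 2, 4]            # three samples at 8 kHz
    block_16k = upsample_2x(block_8k)
    print(block_16k)                # six samples at 16 kHz
```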
From: Philippe T. <ph...@te...> - 2007-03-23 18:27:54
> http://www2.tux-is-alive.com/wiki/Test-to-speech

Stupid typo when I created the page. I moved it to
http://www2.tux-is-alive.com/wiki/Text-to-speech
From: David B. <da...@ja...> - 2007-03-24 12:05:41
On Fri, 23 Mar 2007 16:13:58 +0100, Florent THIERY <ft...@gm...> wrote:

>> It should be fine to do some resample if necessary
>
> 16 to 8 kHz downsampling + signal filtering
>
> For instance, audacity filtering does it just fine: you select a
> portion of the recording that has only the noise in it, and it
> automatically removes it, really a great feature. We may be able to
> use the libs/plugins...
>
> Any ideas:
> - how to downsample in real time
> - how to filter the 500 Hz noise

It's indeed upsampling from 8 kHz to 16 kHz. Audacity does it fine, and there are probably a lot of small command-line tools that can do the same.

I'm not sure we'll need to remove the 500 Hz noise; it's not that important, and the voice recognition software may not be sensitive to it. But if you want to record something with your Tux, filtering the 500 Hz noise in Audacity afterwards improves the quality.

> For now, animation is restricted to 1/4 turn actions. When the
> firmware will be more sophisticated, will we have more fine control
> over movements? How precise are the step-by-step motors? Will we have
> the possibility to do smaller/slower movements? This would reduce the
> noise.

Probably not from the firmware. We don't use step-by-step motors but standard ones. The motor is stopped whenever a tooth pushes on a position switch, and we can only detect quarter turns that way. It is possible, though, to use the motors in another way from the software side, by running the motor for a specific time period instead of for a specific angle. Start-wait-stop with a smaller PWM should give you finer movements, but I'm afraid the position will never be very accurate.

This brings me to the new functions I want to develop in the daemon. For now, the firmware only does something like:
- start_spinning(pwm)
- spin_right(angle, pwm)
- spin_left(angle, pwm)
- stop_spinning()

I don't think these functions should be accessible from the API, at least not in the standard set of functions. I would want the daemon to add a level on top of that which provides some more complex functions that can also benefit from the status received and stored. For example, the functions accessible from the API could provide spinning functionality like:
- spin for a specific angle
- spin for x turns
- spin for a given time
- set an absolute angle (where 0 could be used to reset to the initial position; the absolute position should be stored in the daemon and updated from the status and commands sent)
- stop spinning
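The daemon layer described above could look roughly like the following sketch. Everything here is hypothetical: `FakeFirmware` only records calls to the four primitives named in the message, `SpinController` is an invented name, and the `angle` argument is assumed to be counted in quarter turns because that is the only step the position switch can detect. It is meant to show how the absolute position might be stored and updated from the commands sent, not how the real daemon is written.

```python
import time


class FakeFirmware:
    """Stand-in for the firmware: records the primitive commands it receives."""

    def __init__(self):
        self.log = []

    def start_spinning(self, pwm):
        self.log.append(("start_spinning", pwm))

    def spin_right(self, angle, pwm):
        self.log.append(("spin_right", angle, pwm))

    def spin_left(self, angle, pwm):
        self.log.append(("spin_left", angle, pwm))

    def stop_spinning(self):
        self.log.append(("stop_spinning",))


class SpinController:
    """Higher-level spinning API that tracks position in quarter turns."""

    QUARTERS_PER_TURN = 4

    def __init__(self, firmware, pwm=5, sleep=time.sleep):
        self.firmware = firmware
        self.pwm = pwm
        self.sleep = sleep
        self.position = 0  # 0 = initial position, None = unknown

    def spin(self, quarters):
        """Spin by a number of quarter turns (positive right, negative left)."""
        if quarters > 0:
            self.firmware.spin_right(quarters, self.pwm)
        elif quarters < 0:
            self.firmware.spin_left(-quarters, self.pwm)
        if self.position is not None:
            self.position = (self.position + quarters) % self.QUARTERS_PER_TURN

    def spin_turns(self, turns):
        self.spin(turns * self.QUARTERS_PER_TURN)

    def spin_for(self, seconds):
        """Timed spin; the final angle is not measured, so position is lost."""
        self.firmware.start_spinning(self.pwm)
        self.sleep(seconds)
        self.firmware.stop_spinning()
        self.position = None

    def update_from_status(self, position):
        """Resynchronize with a position reported by the robot."""
        self.position = position % self.QUARTERS_PER_TURN

    def set_absolute(self, target):
        """Go to an absolute position by the shortest route (0 = initial)."""
        if self.position is None:
            raise RuntimeError("position unknown; wait for a status update")
        delta = (target - self.position) % self.QUARTERS_PER_TURN
        if delta > self.QUARTERS_PER_TURN // 2:
            delta -= self.QUARTERS_PER_TURN
        self.spin(delta)

    def stop(self):
        self.firmware.stop_spinning()
```

For example, after `spin(1)` a call to `set_absolute(0)` sends a single `spin_left(1, pwm)` rather than three quarter turns to the right.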
From: Florent T. <ft...@gm...> - 2007-03-24 13:54:26
> This brings me to the new functions I want to develop in the daemon.
> For now, the firmware only does something like:
> - start_spinning(pwm)
> - spin_right(angle, pwm)
> - spin_left(angle, pwm)
> - stop_spinning()
>
> I don't think these functions should be accessible from the api, at least
> not in the standard set of functions. I would want the daemon to add a
> level on top of that which provides some more complex functions that can
> also benefit from the status received and stored. For example, the
> functions that should be accessible from the API could provide spinning
> functionalities like:
> - spin for a specific angle
> - spin for x turns
> - spin for a given time
> - sets in absolute angle (where 0 could be used to reset in initial
> position, the absolute position should be stored in the daemon and updated
> from the status and commands sent)
> - stop spinning

OK.

YAQ (yet another question :p): is it possible to control the speed of the motors?
From: Florent T. <ft...@gm...> - 2007-03-24 14:18:00
Btw, thanks for creating the wiki articles :) I'll add the things I find over time...
From: David B. <da...@ja...> - 2007-03-24 15:31:27
On Sat, 24 Mar 2007 14:54:24 +0100, Florent THIERY <ft...@gm...> wrote:

> YAQ (yet another question :p): is it possible to control the speed of
> the motors?

There are 5 values for the PWM of the wings and the spinning. I think the API already uses this, but by default it's at maximum speed. 0 is stopped, 1 is slow and 5 is maximum speed. There's a good chance, though, that 1 and maybe 2 are too small for the current to even start the motor. I think the daemon should control this and only offer the speeds that make sense in its functions.
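The idea of letting the daemon expose only speeds that make sense could be sketched as below. The function name `usable_pwm` and the default `min_usable=3` are assumptions: the message only says that 1 "and maybe 2" may be too low to start the motor, so the real threshold would have to be found by testing.

```python
PWM_STOP = 0
PWM_MAX = 5


def usable_pwm(requested, min_usable=3):
    """Map a requested speed onto a value the motor can actually run at.

    0 (or less) always means stop; anything else is clamped into the
    range [min_usable, PWM_MAX].
    """
    if requested <= PWM_STOP:
        return PWM_STOP
    return max(min_usable, min(requested, PWM_MAX))


if __name__ == "__main__":
    for speed in range(0, 7):
        print(speed, "->", usable_pwm(speed))
```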
From: Florent T. <ft...@gm...> - 2007-03-24 15:40:40
> There are 5 values for the PWM of the wings and the spinning. I think the
> api already uses this but by default it's at maximum speed. 0 is stopped,
> 1 is slow and 5 is maximum speed.

Good news!!!

> Though there are much chances 1 and
> maybe 2 are too small values so the current won't be able to even start
> the motor.

Well, just drop support for the useless ones in the API.

> I think the daemon should control this and only provide the
> speeds that make sense in the functions.

In fact, I was thinking of linear speed modulation, so that the movements are less rough:

1 - 2 - 3 - 4 - 5 - 5 - 5 - 5 - 4 - 3 - 2 - 1
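The linear speed modulation suggested above amounts to a ramp-up, hold, ramp-down sequence of PWM values. A minimal sketch, with the hypothetical helper `ramp_profile` and the assumption that each value is applied for one fixed time step (the message does not say how long each step would last):

```python
def ramp_profile(peak=5, hold=4, start=1):
    """PWM sequence that ramps up to peak, holds it, then ramps back down."""
    up = list(range(start, peak))
    return up + [peak] * hold + up[::-1]


if __name__ == "__main__":
    # Joined with dashes this gives the sequence written in the message.
    print(" - ".join(str(v) for v in ramp_profile()))
```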
From: David B. <da...@ja...> - 2007-03-24 16:21:28
On Sat, 24 Mar 2007 16:40:36 +0100, Florent THIERY <ft...@gm...> wrote:

>> There are 5 values for the PWM of the wings and the spinning. I think the
>> api already uses this but by default it's at maximum speed. 0 is
>> stopped,
>> 1 is slow and 5 is maximum speed.
>
> Good news !!!
>
>> Though there are much chances 1 and
>> maybe 2 are too small values so the current won't be able to even start
>> the motor.
>
> Well, just suppress the support of the useless ones in the api
>
>> I think the daemon should control this and only provide the
>> speeds that make sense in the functions.
>
> In fact, i was thinking of linear speed modulation, so that the
> movements are less rough.
>
> 1 - 2 - 3 - 4 - 5 - 5 - 5 - 5 - 4 - 3 - 2 - 1

Well, movements look rough because I wanted to stop the movement on the switch. When the eyes are closed, they need to be stopped very quickly, otherwise they reopen. So I had to brake by reversing the motor for a short time. The mouth is less critical, as going a bit too far doesn't open the mouth completely, but if we want the switch to have a correct meaning it should be pushed, so again we need to stop very fast. For the wings there's maybe a slight braking, but I'm not sure, and there's nothing for spinning.

So it should be possible to start more smoothly, but stopping smoothly will be much more complex. That has to be implemented in firmware anyway.