This section will show (and tell) you how to use speech synthesis with sub-utterance units, make you aware of the drawbacks, and indicate how to stay informed about speech delivery progress.
This part requires you to actually implement/code. You will extend some classes in Java and maybe write some of your own.
There's some example code that we'll be relying on in the remainder of the tutorial. Make yourself a nice directory somewhere and git clone https://bitbucket.org/timobaumann/SimplisticBabelfish/ (or svn checkout svn+ssh://gate.spectrum.uni-bielefeld.de/vol/acl/projects/InPro/INPRO_SVN/Text/IS2013tutorial/demo).
Import the project that you now find in the directory into Eclipse. Ideally (given that you have the Inpro Eclipse project set up), everything should just work. If not, you will have to add your Inpro project, as well as all the Jars in Inpro/lib, to the build path of the new project.
In your newly imported Eclipse project, take a look at synthesis/SynthesisRunner, which extends IUModule. Note the leftBufferUpdate method; we do not plan to accept but only to produce IUs. Run the new module (right-click, start as Java application).
Notice how it didn't work? You need to have MaryTTS installed, including the tweaks that are necessary for incremental processing support, and tell Java about its path. Everything should be set up in the lab (if not, have a look at the setup instructions); all you need to do is add -Dmary.base=/path/to/mary/install/directory to your Eclipse run configuration (second tab, VM arguments).
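If the module still fails to start, it can help to confirm that the VM argument actually reached the JVM. The following is a minimal sketch of such a check; the class `MaryBaseCheck` and its method are hypothetical helpers for illustration and are not part of InproTK or MaryTTS. It only relies on the standard `System.getProperty` call and the `mary.base` property name given above.

```java
import java.io.File;

/** Hypothetical helper for illustration; not part of InproTK or MaryTTS. */
class MaryBaseCheck {

    /** Returns null if the value looks usable, otherwise a description of the problem. */
    static String problemWith(String maryBase) {
        if (maryBase == null || maryBase.trim().isEmpty()) {
            return "mary.base is not set; add -Dmary.base=/path/to/mary/install/directory to the VM arguments";
        }
        if (!new File(maryBase).isDirectory()) {
            return "mary.base points to '" + maryBase + "', which is not a directory";
        }
        return null;
    }

    public static void main(String[] args) {
        // -Dmary.base=... on the command line shows up as a system property.
        String problem = problemWith(System.getProperty("mary.base"));
        System.out.println(problem == null ? "mary.base looks fine" : problem);
    }
}
```

Run it with the same run configuration as the module; it only checks that the directory exists, not that the MaryTTS installation inside it is complete.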
Re-run the new module and rejoice.
The previous program was hardly incremental (internally it was, but we didn't make use of this). How about adding two PhraseIUs, one for "Dies ist ein langer" and another for "und inkrementell erstellter Satz."? Try this. Is there a difference whether you call notifyListeners() twice or whether you add both IUs in one go? (Is there a difference internally?)
Second, add a short pause between the PhraseIUs that you send. Do this to simulate the case that your module just doesn't know how to finish the utterance, maybe because it is waiting for more data, or because processing takes very long. Use Thread.sleep(). What happens prosodically as you make the pause longer?
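To make the two variants concrete before trying them in SynthesisRunner, here is a toy model of the experiment. It is a sketch under stated assumptions: `ToyProducer` and its `Listener` are hypothetical stand-ins written for this example, not InproTK's IUModule or its buffers, and the model says nothing about prosody. It only shows what a downstream listener gets to see: one batch per notification.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a producing module; NOT InproTK's IUModule. */
class ToyProducer {
    /** Receives each batch of newly added phrases. */
    interface Listener { void batch(List<String> added); }

    private final List<String> pending = new ArrayList<>();
    private final Listener listener;

    ToyProducer(Listener listener) { this.listener = listener; }

    void add(String phrase) { pending.add(phrase); }

    /** Hands all phrases added since the last call to the listener as one batch. */
    void notifyListeners() {
        if (pending.isEmpty()) return;
        listener.batch(new ArrayList<>(pending));
        pending.clear();
    }

    /** Adds the phrases, notifying after each one (pausing in between) or only once at the end. */
    static List<List<String>> run(boolean notifyEach, long pauseMillis, String... phrases) {
        List<List<String>> batches = new ArrayList<>();
        ToyProducer producer = new ToyProducer(batches::add);
        for (int i = 0; i < phrases.length; i++) {
            producer.add(phrases[i]);
            if (notifyEach) {
                producer.notifyListeners();
                if (i < phrases.length - 1) {
                    try {
                        Thread.sleep(pauseMillis); // simulate "don't know how to continue yet"
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }
        }
        producer.notifyListeners(); // no-op if everything was already handed over
        return batches;
    }

    public static void main(String[] args) {
        String first = "Dies ist ein langer";
        String second = "und inkrementell erstellter Satz.";
        System.out.println(run(true, 500, first, second));
        System.out.println(run(false, 0, first, second));
    }
}
```

In the first call the listener receives two batches of one phrase each, with a pause in between; in the second it receives a single batch containing both phrases. Whether the real synthesizer treats these cases differently is exactly what the exercise asks you to find out.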
Third, you can add a HesitationIU after your first phrase, in order to cover the pause. What happens if there is no pause?
Next, you could add LabelWriters, both to your current module (in addition to the synthesizer) and to the synthesis module. What output do they generate? Can you also use the CurrentHypothesisViewer? (Why is the CurrentHypothesisViewer boring? You will later be able to improve it!)
Finally, try adding a bunch of phrases, calling notifyListeners(), and then revoking the last few IUs. Does this work? In what cases does it not work? Why? Do you have ideas to improve the current behaviour? Find out how to implement them...
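As a thinking aid for the revoke experiment, the toy model below encodes one plausible constraint: a phrase can only be taken back while its delivery has not yet begun. This is an assumption made for illustration, not a description of how InproTK's synthesis module behaves, and `ToyQueue` is a hypothetical class written for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a synthesis queue; NOT InproTK code. */
class ToyQueue {
    private final List<String> phrases = new ArrayList<>();
    private int started = 0; // number of phrases whose delivery has begun

    void add(String phrase) { phrases.add(phrase); }

    /** Simulates the synthesizer starting on the next phrase; returns false if none is left. */
    boolean startNext() {
        if (started >= phrases.size()) return false;
        started++;
        return true;
    }

    /** Revokes the last phrase unless its delivery has already begun (assumed rule). */
    boolean revokeLast() {
        if (phrases.isEmpty() || phrases.size() <= started) return false;
        phrases.remove(phrases.size() - 1);
        return true;
    }

    /** Returns a copy of the phrases currently in the queue. */
    List<String> contents() { return new ArrayList<>(phrases); }

    public static void main(String[] args) {
        ToyQueue queue = new ToyQueue();
        queue.add("one");
        queue.add("two");
        queue.add("three");
        queue.startNext();                      // "one" is now being spoken
        System.out.println(queue.revokeLast()); // "three" has not started yet
        System.out.println(queue.revokeLast()); // "two" has not started yet
        System.out.println(queue.revokeLast()); // "one" has already started
        System.out.println(queue.contents());
    }
}
```

Comparing this model with what you observe in the real system is a good way to pin down where the actual behaviour differs and what you might want to improve.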
So far, the SynthesisRunner does not get any feedback about speech synthesis progress. All we can do is guess how much time we have, or add hesitations to account for possible delays.
Look at the interface IUUpdateListener defined inside inpro.incremental.unit.IU. Now implement that interface (either with your main class or an internal class). Put some
If you want to use the Eclipse debugger, you need to stop the whole VM, not only the current thread (right-click a breakpoint -> Breakpoint Properties -> Suspend VM).
Concurrency plays an important role in the speech synthesis component, so you need to be familiar with the concept of multiple threads simultaneously participating in a common task. Specifically, in incremental speech synthesis two tasks run in parallel:
InproTK's inter-module communication is limited to left-to-right processing. (Remember? There is a right-buffer object but there is no left-buffer object which could potentially feed back information to a previous module -- also, there are good reasons for this.) To feed back information about delivery status, the synthesis module updates the incoming PhraseIUs' progress information and notifies any listeners of the PhraseIUs.
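The feedback pattern described here can be sketched in a few lines. The classes below (`ToyPhrase`, its `Progress` enum and `UpdateListener`, and `ToyFeedbackDemo`) are hypothetical stand-ins written for this example; they are not InproTK's PhraseIU or IUUpdateListener, whose exact signatures you should look up in the source. The point is only the direction of information flow: the producer registers a listener on the unit it sends, and the consumer reports progress by updating that unit.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a unit that carries delivery progress; NOT InproTK's PhraseIU. */
class ToyPhrase {
    enum Progress { UPCOMING, ONGOING, COMPLETED }

    /** Callback through which interested parties learn about progress changes. */
    interface UpdateListener { void update(ToyPhrase updated); }

    final String text;
    private Progress progress = Progress.UPCOMING;
    private final List<UpdateListener> listeners = new ArrayList<>();

    ToyPhrase(String text) { this.text = text; }

    void addUpdateListener(UpdateListener listener) { listeners.add(listener); }

    Progress getProgress() { return progress; }

    /** Called by the consumer (the "synthesizer") to report delivery status. */
    void setProgress(Progress newProgress) {
        progress = newProgress;
        for (UpdateListener listener : listeners) listener.update(this);
    }
}

class ToyFeedbackDemo {
    /** Plays both roles in one thread and returns what the producer's listener observed. */
    static List<String> deliver(String text) {
        List<String> log = new ArrayList<>();
        ToyPhrase phrase = new ToyPhrase(text);
        // Producer side: register for updates before handing the phrase over.
        phrase.addUpdateListener(p -> log.add(p.text + ": " + p.getProgress()));
        // Consumer side: report progress as delivery proceeds.
        phrase.setProgress(ToyPhrase.Progress.ONGOING);
        phrase.setProgress(ToyPhrase.Progress.COMPLETED);
        return log;
    }

    public static void main(String[] args) {
        deliver("Dies ist ein langer").forEach(System.out::println);
    }
}
```

This sketch runs in a single thread for simplicity. In a concurrent setting the callback would be invoked on the consumer's thread, so a real listener should keep its work short and guard any shared state.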
InproTK's incremental speech synthesis is based on MaryTTS, a state-of-the-art speech synthesis and text-to-speech toolkit written in Java.