AddingNewLanguage

Markus Toman

Adding new HTS voices and languages to flite+hts_engine
=======================================================

This document describes the process of adding new languages and HMM voices to flite+hts_engine.
Please feel free to contact the project members if you would like to contribute!

External resources
==================

Flite: http://www.festvox.org/flite/
Flite Documentation: http://www.festvox.org/flite/doc/index.html
Flite+hts_engine: http://hts-engine.sourceforge.net/
HTS: http://hts.sp.nitech.ac.jp/

Building HMM voices for flite+hts_engine
========================================

At the time of writing, the current versions of flite and hts_engine are:
- Flite+hts_engine version 1.04 (December 25, 2012)
- hts_engine API version 1.07 (December 25, 2012)

If you have an existing HTS-voice you should make sure it is in the correct format for hts_engine 1.07.
The current HTS version producing this combined format (.htsvoice) is HTS 2.3 alpha.
A straightforward option is to train your voice with the demonstration training scripts of HTS.

Conversion of existing festival data for flite
==============================================

The original Flite package (not the flite+hts_engine distribution) contains a `tools` folder to assist with this process.
Most of these files are SIOD Scheme scripts, so the most straightforward way to run them is with a festival distribution.

If you encounter problems with the heap at any point, try adding the command line option

    --heap 10000000

to the festival call.

In the following code snippets we will use

- `$festival` as the path to the festival binary
- `$festpath` as the path to the festival directory
- `$tools` as the path to the flite tool scripts
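
Since every snippet in this section depends on these three paths, a quick sanity check can save a failed conversion run. The following is a minimal sketch, not part of flite or festival: the function name `check_conversion_paths` is made up for illustration, and it only looks for the items this document itself relies on (the festival binary, its `lib` directory and `make_phoneset.scm`).

```shell
#!/bin/sh
# Hypothetical helper (not part of flite or festival): verify the three
# paths used in the snippets of this document before starting a conversion.
check_conversion_paths() {
    festival=$1   # path to the festival binary
    festpath=$2   # path to the festival directory (must contain lib/)
    tools=$3      # path to the flite tool scripts
    status=0
    if [ ! -x "$festival" ]; then
        echo "festival binary not executable: $festival" >&2
        status=1
    fi
    if [ ! -d "$festpath/lib" ]; then
        echo "festival lib directory missing: $festpath/lib" >&2
        status=1
    fi
    # make_phoneset.scm is one of the flite tool scripts used in this section.
    if [ ! -f "$tools/make_phoneset.scm" ]; then
        echo "flite tool script missing: $tools/make_phoneset.scm" >&2
        status=1
    fi
    return "$status"
}
```

A typical call would be `check_conversion_paths "$festival" "$festpath" "$tools" || exit 1` at the top of a conversion script.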

Conversion of phone set
-----------------------

Using `make_phoneset.scm` you can generate the C code for the flite phone set definition.
The output file is called `${phonesetname}_phoneset.c` and is placed in `$outputdir`.

~~~~~~~~~~~~~~~~~
phonesetname=myPhoneset
phonesetdef=/path/to/phoneset.scm
silence=sil
outputdir=output

$festival --libdir $festpath/lib \
          $tools/make_phoneset.scm \
          -b "(phonesettoC $phonesetname (car (load \"$phonesetdef\" t)) $silence $outputdir)"
~~~~~~~~~~~~~~~~~~~

The output file should later be placed in flite+hts_engine/flite/lang/$language.

Conversion of lexicon
---------------------

From `flite/tools` you need `make_lex.scm` and `huff_table`; we put them into a directory `scripts` here.

In huff_table you should edit the following paths to match your configuration:

   $ESTDIR/../festival/bin/festival -b $FLITEDIR/tools/make_lex.scm '(utf8entries "huff.entries.corpus" "huff.tmp.corpus")' 

Alternatively, you can set the `ESTDIR` and `FLITEDIR` environment variables and put the scripts at the corresponding locations.

You can convert your lexicon as follows:

~~~~~
$festival --heap 10000000 \
          --libdir $festpath/lib \
          scripts/make_lex.scm \
          -b "(begin (lextoC \"$lexname\" \"$inputlex\" \"$output\"))"

scripts/huff_table phones $output/${lexname}_lex_data $output/${lexname}_lex_phones_huff_table.c
scripts/huff_table entries $output/${lexname}_lex_data $output/${lexname}_lex_entries_huff_table.c
~~~~~

Set `$lexname` to a name you choose for your lexicon.
Set `$inputlex` to the path to your lexicon file.
Set `$output` to the output directory where the .c and .h files should be placed.

Conversion of letter-to-sound rules
-----------------------------------

From `flite/tools` you need `make_lts_wfst.scm` and `make_lts.scm`; we put them into a directory `scripts` here.

For `make_lts_wfst.scm`, make sure `$ESTDIR` points to your Edinburgh Speech Tools directory (the one containing `bin/wfst_build`). Otherwise, change the path in `make_lts_wfst.scm` here:

    (system
     (format nil
             "$ESTDIR/bin/wfst_build -heap 10000000 -type rg -detmin -o %s/%s.tree.wfst %s/%s.tree.rg"
             odir (car a)
             odir (car a)))

You can then convert your Scheme LTS rules as follows:

    $festival --heap 10000000 \
              --libdir $festpath/lib \
              scripts/make_lts_wfst.scm \
              scripts/make_lts.scm \
              $inputlts \
              -b "(begin (lts_to_rg_to_wfst lts_rules \"$output\") (ltsregextoC \"$ltsname\" lts_rules \"$output\" \"$output\"))"

Set `$inputlts` to the path of your LTS Scheme file.
Set `$output` to an arbitrary output directory.
Set `$ltsname` to a name for your LTS rules.

You will encounter problems if your rules use features other than those defined in `make_lts.scm`:

    (define (lts_feat trans)
      "(lts_feat trans)
    Returns the feature number represented in this transition name."
      (format t "lts_feat %s\n" trans)
      (let ((fname (substring trans 5 (- (length trans) 11))))
        (if (string-matches fname ".*i?")
            (set! fname (string-before fname "
    ")))
        (cond
         ((string-equal fname "p.p.p.p.name") 0)
         ((string-equal fname "p.p.p.name") 1)
         ((string-equal fname "p.p.name") 2)
         ((string-equal fname "p.name") 3)
         ((string-equal fname "n.name") 4)
         ((string-equal fname "n.n.name") 5)
         ((string-equal fname "n.n.n.name") 6)
         ((string-equal fname "n.n.n.n.name") 7)
         (t (error (format nil "ltsregex2C: unknown feat %s %s\n" fname trans ))))))

Adding a new language to flite+hts_engine
=========================================

Here we take the approach of copying and modifying the existing "usenglish" code.
We will refer to your new language as `$language`; replace occurrences of "usenglish" with `$language`.
We will refer to the short name of your language as `$ln`; replace occurrences of "us" with `$ln`.
`$voice` stands for the newly added voice (if any) and `$lexicon` for the name of the newly added lexicon.

If you don't plan to use unit selection voices, you can remove all blocks of
`#ifndef FLITE_PLUS_HTS_ENGINE ... #endif`
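
Removing these blocks by hand is error-prone when they are nested or have an `#else` branch. As a rough sketch (the function name `strip_unit_selection` is hypothetical, and `#elif` branches of the guarded block are not handled), an awk filter can drop everything guarded by `#ifndef FLITE_PLUS_HTS_ENGINE` while keeping a matching `#else` branch:

```shell
#!/bin/sh
# Hypothetical helper: print a C file without the code guarded by
# "#ifndef FLITE_PLUS_HTS_ENGINE". An "#else" branch of the guard is kept.
strip_unit_selection() {
    awk '
    # Opening guard: start dropping lines.
    depth == 0 && /^[ \t]*#[ \t]*ifndef[ \t]+FLITE_PLUS_HTS_ENGINE([ \t]|$)/ {
        depth = 1; keep = 0; next
    }
    # Nested #if, #ifdef or #ifndef inside the guarded block.
    depth > 0 && /^[ \t]*#[ \t]*if/ { depth++ }
    # #else of the guard itself: keep what follows.
    depth == 1 && /^[ \t]*#[ \t]*else/ { keep = 1; next }
    # #endif: leave one nesting level; drop the one closing the guard.
    depth > 0 && /^[ \t]*#[ \t]*endif/ { depth--; if (depth == 0) next }
    depth == 0 || keep { print }
    ' "$1"
}
```

Call it as `strip_unit_selection file.c > file.c.new` and review the result before replacing the original. Where the `unifdef` utility is installed, `unifdef -DFLITE_PLUS_HTS_ENGINE file.c` is a more robust alternative.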

flite/lang/$language
--------------------

First copy `flite+hts_engine/flite/lang/usenglish` to `flite+hts_engine/flite/lang/$language`.
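
The copy and the mechanical part of the renaming can be scripted. The sketch below is only an illustration under stated assumptions: the function name `clone_lang_dir` is made up, only `.c` and `.h` files directly inside the directory are touched, the short name is replaced only where it appears as an identifier prefix such as `us_`, and the names passed in must not contain characters that are special to `sed`. Stand-alone strings such as "us" or "en" and upper-case include guards still need manual editing.

```shell
#!/bin/sh
# Hypothetical helper: copy a language directory and rename the language
# name and the short prefix in file names and file contents.
# Usage: clone_lang_dir SRC DST OLD_LANG NEW_LANG OLD_LN NEW_LN
clone_lang_dir() {
    src=$1; dst=$2
    old_lang=$3; new_lang=$4
    old_ln=$5; new_ln=$6

    # Refuse to overwrite an existing target directory.
    if [ -e "$dst" ]; then
        echo "target exists: $dst" >&2
        return 1
    fi
    cp -R "$src" "$dst" || return 1

    for f in "$dst"/*.c "$dst"/*.h; do
        [ -f "$f" ] || continue
        # 1. language name anywhere, 2. prefix at the start of a line,
        # 3. prefix after a non-identifier character, so that names such
        #    as "status_" are left alone.
        sed -e "s/$old_lang/$new_lang/g" \
            -e "s/^${old_ln}_/${new_ln}_/" \
            -e "s/\([^A-Za-z0-9_]\)${old_ln}_/\1${new_ln}_/g" \
            "$f" > "$f.tmp" && mv "$f.tmp" "$f" || return 1

        # Rename the file itself if it starts with the old name or prefix.
        base=${f##*/}
        case $base in
            "$old_lang"*)
                rest=${base#"$old_lang"}
                mv "$f" "$dst/$new_lang$rest" ;;
            "${old_ln}_"*)
                rest=${base#"${old_ln}_"}
                mv "$f" "$dst/${new_ln}_$rest" ;;
        esac
    done
}
```

For example, `clone_lang_dir flite/lang/usenglish flite/lang/$language usenglish $language us $ln` would be run from the `flite+hts_engine` directory. Check the result with `grep` before building, since this is a textual replacement and knows nothing about C.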

The main entry points for your text analysis are `$language.c` and `$language.h`.

flite/lang/$voice
-----------------

If you plan to use a new voice model, create a directory `flite/lang/$voice` for now.
See the separate chapter for adding a new voice to flite+hts_engine.

flite/lang/$lexicon
-------------------

Create a directory `flite/lang/$lexicon`.
If you have converted your lexicon from a festival lexicon, copy the files here.
If you have converted your lts rules from a festival lts-tree, copy the files here.

TODO: How to create a new lexicon and lts rules instead of using conversion from festival

From `flite/lang/cmulex` copy cmu_lex.c, cmu_lex.h, cmu_postlex.h. Rename them and replace all occurrences of "cmu" with your lexicon name.

bin/Makefile.am
---------------

Copy the lines

~~~~~~
::::bash
-I$(top_srcdir)/flite/lang/cmu_us_kal \
-I$(top_srcdir)/flite/lang/cmulex \
-I$(top_srcdir)/flite/lang/usenglish \
~~~~~~

and replace `cmu_us_kal` with `$voice`, `cmulex` with `$lexicon` and `usenglish` with `$language`.
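
The three resulting lines can also be generated rather than edited by hand. A small sketch, assuming the directory names passed in are the ones you created under `flite/lang` (the function name and the example names in the usage note are placeholders):

```shell
#!/bin/sh
# Hypothetical helper: print one include flag per directory name, in the
# form used in bin/Makefile.am.
print_include_flags() {
    for dir in "$@"; do
        # \$ keeps $(top_srcdir) literal so that make expands it later.
        printf '%s\n' "-I\$(top_srcdir)/flite/lang/$dir \\"
    done
}
```

Calling `print_include_flags "$voice" "$lexicon" "$language"` prints the lines to paste below the existing ones.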

flite/lang/$language/$language.h
--------------------------------

Don't forget to change the `#ifndef` and `#define` directives to something unique.

    /* Voices call this to use usenglish. */
    void usenglish_init(cst_voice *v);

This is the initialization function for your new module - rename "usenglish" to `$language`.

    /* Default functions and values that you might need. */
    extern const cst_phoneset us_phoneset;
    extern const cst_cart us_phrasing_cart; //TODO: perhaps remove this
    extern const cst_cart us_int_accent_cart; //TODO: perhaps remove this
    extern const cst_cart us_int_tone_cart; //TODO: perhaps remove this
    extern const cst_cart us_pos_cart; //TODO: perhaps remove this

Rename these instances.
You might want to remove or comment the CART trees later on if you don't have these for your language.
Another possibility is to try the English CART trees.

flite/lang/$language/$language.c
--------------------------------

Replace all occurrences of "en" or "us".

TODO: Change CARTS

flite/lang/$language/${ln}_text.h
----------------------------------

This is the main entry point for the text analysis.

Replace all occurrences of "en", "us" or "usenglish".

    extern const cst_cart us_nums_cart;

Remove or comment this number CART tree if you don't plan to use one.

flite/lang/$language/${ln}_phoneset.c
-------------------------------------

If you have a scheme phone set for festival you can generate this file (see the appropriate section above).

    static const char * const at_phonenames[] = {

This is the list of phones in your new language, adapt it.

    static const int at_fv_000[] = { 0, 1, 1, 1, 1, 1, 1, 0, -1 };
    ...

These are feature vectors for each phone.
Adapt them and also adapt all following structures to reflect this.

    const cst_phoneset at_phoneset = {
        "at",
        at_featnames,
        at_featvals,
        at_phonenames,
        "pau",
        96,
        at_fvtable
    };

Enter the correct names from above here and don't forget to correct the number of phones (96 in this example).
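
To avoid miscounting, the number of phones can be read off the file itself. The following sketch rests on two assumptions that you should check against your own file: the name array is called `<prefix>_phonenames[]` as in the example above, and the number to enter equals the number of quoted names in that array. The function name `count_phones` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count the string literals in the *_phonenames[]
# array of a phoneset C file and print the total.
count_phones() {
    awk '
    /_phonenames\[\][ \t]*=/ { inside = 1 }
    # gsub returns the number of matches; "&" puts each match back unchanged.
    inside { n += gsub(/"[^"]*"/, "&") }
    # The closing brace of the array ends the count.
    inside && /}[ \t]*;/ { print n + 0; exit }
    ' "$1"
}
```

For example, `count_phones flite/lang/$language/${ln}_phoneset.c` prints the value to compare with the number in the `cst_phoneset` structure.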

lib/Makefile.am
---------------

To `libflhtse_a_SOURCES` add all your new source files.
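
A list of candidate entries can be produced from the new directories. This is a sketch only: the function name is hypothetical, and the paths are printed exactly as passed in, so they may need the same prefix that the existing entries in `lib/Makefile.am` use.

```shell
#!/bin/sh
# Hypothetical helper: print every .c and .h file of the given directories
# as tab-indented continuation lines for a Makefile.am source list.
list_new_sources() {
    for dir in "$@"; do
        for f in "$dir"/*.c "$dir"/*.h; do
            [ -f "$f" ] || continue   # skip patterns that matched nothing
            printf '\t%s \\\n' "$f"
        done
    done
}
```

For example, `list_new_sources flite/lang/$language flite/lang/$lexicon flite/lang/$voice` prints one line per file; remove the trailing backslash from the last entry of the list.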

Call `automake` and `configure` in the project root directory.

Example configure call:

~~~~
./configure --with-hts-engine-header-path=/.../hts_engine_API-1.07/include \
            --with-hts-engine-library-path=/.../hts_engine_API-1.07/lib
~~~~
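
Both options point into the same hts_engine API directory, so they can be derived from a single path. A minimal sketch, assuming the `include` and `lib` subdirectories shown in the example call; the function name and the `HTS_ENGINE_DIR` variable in the usage note are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the two configure options for a given
# hts_engine API directory (expects include/ and lib/ below it).
hts_engine_configure_flags() {
    printf '%s\n' \
        "--with-hts-engine-header-path=$1/include" \
        "--with-hts-engine-library-path=$1/lib"
}
```

It can be used as `./configure $(hts_engine_configure_flags "$HTS_ENGINE_DIR")`. The command substitution is left unquoted on purpose so that the two options become separate arguments, which means the path must not contain whitespace.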

Adding a new voice to flite+hts_engine
======================================

Here we will use `$voice` as the name of the new voice to be added.

As a first step, copy `flite/lang/cmu_us_kal` to `flite/lang/$voice`.
If you don't plan to use unit selection, you can remove all blocks of `#ifndef FLITE_PLUS_HTS_ENGINE ... #endif`.

Instead of

    #include "usenglish.h"
    #include "cmu_lex.h"

include the headers of your own text analysis and lexicon.

Replace all occurrences of "cmu_us_kal" with `$voice`.

In `register_$voice`, replace

    v->name = "kal";

with a voice name of your choice and

    usenglish_init(v);

with the initialization function of your language.

Also, change

    /* Lexicon */
    lex = at_lex_init();

to your own lexicon initialization function.

Finally, find

    static cst_utterance *$voice_postlex(cst_utterance *u)
    {
        cmu_postlex(u);

and replace "cmu_postlex" with your own postlex function.