From: Mate A. <ele...@gm...> - 2015-07-22 01:29:36
I would like to compare the forced alignments I generated for the TIMIT dataset with the ground truth provided in the corpus. However, the 'text' file from the data preparation step and the PHN files provided in the corpus (which hold the ground truth) give different phoneme sequences for the same utterance.

For instance, take the sample utterance FAEM0_SX42:

phoneme sequence in the *text* file:
sil b ih vcl b l ih cl k el s cl k aa l er z sil aa r vcl g y uw hh ih s cl t r iy sil

phoneme sequence in the *PHN* file (ground truth):
h# b ih bcl b l ih kcl k el s kcl k aa l er z pau q aa r gcl g y ux hv ih s tcl t r iy h#

As you can see, the two sequences differ in several phonemes even if the h#/sil labels are disregarded: the closures (bcl, kcl, gcl, tcl versus vcl, cl), pau versus sil, ux versus uw, hv versus hh, and the q that appears only in the PHN file.

1. Is this normal? How can I accurately test the validity of my alignments when the ground truth specifies different phoneme sequences than my generated alignments?
2. Is there a script that would provide the phoneme error rate for the generated alignments?
3. What kind of metric can I use to compare my forced alignments to the ground truth?
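
To make questions 2 and 3 concrete, here is a minimal sketch of one possible sequence-level metric: a phone error rate computed as the Levenshtein (edit) distance between the two label sequences, divided by the length of the reference. The function names `edit_distance` and `phone_error_rate` are hypothetical and not part of any toolkit, and the sketch assumes the raw labels are compared as-is, with no phone-set mapping applied. It only measures label mismatches and says nothing about how well the segment boundaries agree in time.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists, with unit costs."""
    # prev[j] holds the distance between the first i-1 ref tokens
    # and the first j hyp tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r from the reference
                curr[j - 1] + 1,         # insert h from the hypothesis
                prev[j - 1] + (r != h),  # match (cost 0) or substitution (cost 1)
            ))
        prev = curr
    return prev[-1]


def phone_error_rate(ref, hyp):
    """Edit distance normalised by the reference length."""
    if not ref:
        raise ValueError("reference sequence is empty")
    return edit_distance(ref, hyp) / len(ref)


# The two sequences for FAEM0_SX42 quoted above.
ref = ("h# b ih bcl b l ih kcl k el s kcl k aa l er z pau q aa r "
       "gcl g y ux hv ih s tcl t r iy h#").split()
hyp = ("sil b ih vcl b l ih cl k el s cl k aa l er z sil aa r "
       "vcl g y uw hh ih s cl t r iy sil").split()

# 11 edits (10 substitutions, 1 deletion) against 33 reference phones.
print(edit_distance(ref, hyp), len(ref), phone_error_rate(ref, hyp))
```

On this utterance every counted error comes from a label that exists in only one of the two files, so a number like this mostly reflects the difference in phone inventories rather than the quality of the alignment itself.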