CMU Sphinx / Forums / Help: map_adapt problem Linux x64

ITPhoenix - 2011-06-21

The adaptation recordings were made without problems up to this point, which I
do not understand:

The -agc none parameter is very important. Make sure the arguments here match
the parameters in feat.params file inside the acoustic model folder. Please
note that not all the parameters from feat.params are supported by bw, only a
few of them. bw for example doesn't support upperf or other feature extraction
params. But those that are supported should match.

So I proceeded to MLLR which failed, but I did not record the errors.

I then tried MAP and got this:

linux-2i5i:/home/vince/adaptation # ./map_adapt > -meanfn hub4wsj_sc_8k/means > -varfn hub4wsj_sc_8k/variances > -mixwfn hub4wsj_sc_8k/mixture_weights > -tmatfn hub4wsj_sc_8k/transition_matrices > -accumdir . > -mapmeanfn hub4wsj_sc_8kadapt/means > -mapvarfn hub4wsj_sc_8kadapt/variances > -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights > -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
INFO: cmd_ln.c(559): Parsing command line: ./map_adapt hub4wsj_sc_8k/means hub4wsj_sc_8k/variances hub4wsj_sc_8k/mixture_weights hub4wsj_sc_8k/transition_matrices . hub4wsj_sc_8kadapt/means hub4wsj_sc_8kadapt/variances hub4wsj_sc_8kadapt/mixture_weights hub4wsj_sc_8kadapt/transition_matrices
ERROR: "cmd_ln.c", line 614: Unknown argument name 'hub4wsj_sc_8k/means'
ERROR: "cmd_ln.c", line 705: cmd_ln_parse_r failed
ERROR: "cmd_ln.c", line 754: cmd_ln_parse failed, forced exit

Note: These lines

export LD_LIBRARY_PATH=/usr/local/lib
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig

needed to be entered every time the program was run. So I created a file
called local.conf in the subdirectory /etc/ld.so.conf.d containing just the
line /usr/local/lib. That is,

Contents of /etc/ld.so.conf.d/local.conf:

/usr/local/lib

This works every time.

I also have reason to suspect the onboard sound card and inexpensive desktop
microphone are causing problems with noise and limited input level. The system
does work except only about 2% of all results are accurate.

Is there a way to compensate for the substandard equipment such as "beam
tuning" and such? Increasing the gain caused so much noise the engine thought
there was input.

Any help appreciated.

Nickolay V. Shmyrev - 2011-06-21

The adaptation recordings were made without problems up to this point, which I
do not understand:

This paragraph was corrected. I hope it is clearer now.

I then tried MAP and got this:

To resolve the issue, you just need to read the output of the command. It told
you that you didn't specify the command correctly. The command must be

map_adapt \
-meanfn hub4wsj_sc_8k/means \
-varfn hub4wsj_sc_8k/variances \
-mixwfn hub4wsj_sc_8k/mixture_weights \
-tmatfn hub4wsj_sc_8k/transition_matrices \
-accumdir . \
-mapmeanfn hub4wsj_sc_8kadapt/means \
-mapvarfn hub4wsj_sc_8kadapt/variances \
-mapmixwfn hub4wsj_sc_8kadapt/mixture_weights \
-maptmatfn hub4wsj_sc_8kadapt/transition_matrices

And not

./map_adapt hub4wsj_sc_8k/means hub4wsj_sc_8k/variances
hub4wsj_sc_8k/mixture_weights hub4wsj_sc_8k/transition_matrices .
hub4wsj_sc_8kadapt/means hub4wsj_sc_8kadapt/variances
hub4wsj_sc_8kadapt/mixture_weights hub4wsj_sc_8kadapt/transition_matrices

The command shouldn't contain redirection symbols (>). You can learn more
about shell commands and options by reading the shell manual.
The system does work except only about 2% of all results are accurate. Is
there a way to compensate for the substandard equipment such as "beam tuning"
and such? Increasing the gain caused so much noise the engine thought there
was input.

I don't think that noise or substandard equipment is the cause. The real issue
is somewhere else. To improve accuracy, you need to provide more information
about what you are trying to do: what command you are running, what speech you
are trying to recognize, and what results you get. Be as precise as possible;
it will help you get to a solution quickly.


ITPhoenix - 2011-06-22

Thank you for responding.

I use

pocketsphinx_continuous

to start the program.

I am recognizing English of the Northeast United States. The kind that
pronounces Rs hard. In other words, not NYC and not Massachusetts, and
certainly not British.

Here are typical, consistent results. The spoken word(s) are listed first
and the result second:

Hello 000000000: both

What is wrong with you
INFO: ngram_search.c(474): Resized score stack to 200000 entries
INFO: ngram_search.c(466): Resized backpointer table to 10000 entries
INFO: ngram_search.c(466): Resized backpointer table to 20000 entries
INFO: ngram_search.c(466): Resized backpointer table to 40000 entries
INFO: ngram_search.c(466): Resized backpointer table to 80000 entries
INFO: ngram_search.c(474): Resized score stack to 400000 entries
INFO: ngram_search.c(466): Resized backpointer table to 160000 entries
INFO: ngram_search.c(474): Resized score stack to 800000 entries
INFO: ngram_search.c(466): Resized backpointer table to 320000 entries
pocketsphinx_continuous: feat.c:362: feat_array_alloc: Assertion `nfr > 0' failed.
Aborted

(This happens very infrequently and sometimes returns to the "READY" prompt
after a minute or two)

Restart.........

What time is it 000000000: couple times it

Are you having trouble 000000001: are to keep having trouble bit

Inconsistent 000000002: it is that a a systems to have

And so on......

I tried the MAP commands above and now come up with this:

linux-2i5i:/home/vince/adaptation # ./map_adapt \ -meanfn hub4wsj_sc_8k/means \ -varfn hub4wsj_sc_8k/variances \ -mixwfn hub4wsj_sc_8k/mixture_weights \ -tmatfn hub4wsj_sc_8k/transition_matrices \ -accumdir . \ -mapmeanfn hub4wsj_sc_8kadapt/means \ -mapvarfn hub4wsj_sc_8kadapt/variances \ -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights \ -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
INFO: cmd_ln.c(559): Parsing command line: ./map_adapt -meanfn hub4wsj_sc_8k/means -varfn hub4wsj_sc_8k/variances -mixwfn hub4wsj_sc_8k/mixture_weights -tmatfn hub4wsj_sc_8k/transition_matrices -accumdir . -mapmeanfn hub4wsj_sc_8kadapt/means -mapvarfn hub4wsj_sc_8kadapt/variances -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
ERROR: "cmd_ln.c", line 614: Unknown argument name ' -meanfn'
ERROR: "cmd_ln.c", line 705: cmd_ln_parse_r failed
ERROR: "cmd_ln.c", line 754: cmd_ln_parse failed, forced exit

Again, much thanks for your interest.

Nickolay V. Shmyrev - 2011-06-22

Here are typical, consistent results. The spoken word(s) are listed first
and the result second:

Please run

pocketsphinx_continuous -rawlogdir .

Note the dot (current directory) after -rawlogdir. It will also dump the audio
it is trying to recognize to files. Pack the files into an archive, upload it
to a public file sharing service, and post a link here.
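One possible way to pack the dumped audio is sketched below. The helper name pack_raw_logs and the archive name are made up for illustration, and the sketch assumes the dumped files end in .raw; check the directory to confirm what was actually written.

```shell
# Hypothetical helper: pack every .raw file found in a directory into one
# gzip-compressed tar archive. The archive path is taken relative to that
# directory unless it is absolute.
pack_raw_logs() {
    log_dir=$1
    archive=$2
    (cd "$log_dir" && tar -czf "$archive" ./*.raw)
}

# Example call for files dumped with "-rawlogdir ." (commented out so that
# pasting the block has no side effects):
# pack_raw_logs . utterances.tar.gz
```

The resulting archive keeps the audio in its original raw form, so it can be inspected with an audio tool on the other end.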

I am recognizing English of the Northeast United States. The kind that
pronounces Rs hard. In other words, not NYC and not Massachusetts, and
certainly not British.

It doesn't matter. The most probable reason is that the default language model
is not really suitable for the text you are trying to recognize. You need to
build your own language model or download a generic one like lm_giga.

./map_adapt \ -meanfn hub4wsj_sc_8k/means

In the shell, \ is used to escape characters and to join lines. This way you
pass the argument " -meanfn" to the command (note the space, which is escaped
by the backslash). When I gave you the command in my previous post, it was
meant to be a multiline command. If you want to enter it as a single line, use:

./map_adapt -meanfn hub4wsj_sc_8k/means -varfn hub4wsj_sc_8k/variances -mixwfn ...

Without backslashes.
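The difference between the two uses of the backslash can be checked with a small sketch. show_args is a hypothetical stand-in for map_adapt that just prints what it receives, and model/means is a made-up path.

```shell
# Hypothetical stand-in for map_adapt: prints each argument it receives.
show_args() {
    for arg in "$@"; do printf '[%s]\n' "$arg"; done
}

# Backslash followed by a space: the space is escaped and becomes part of
# the argument, so the program receives " -meanfn" with a leading space.
escaped=$(show_args \ -meanfn model/means)

# Backslash followed by a newline: the two lines are joined and nothing is
# added, so the program receives a clean "-meanfn".
joined=$(show_args \
    -meanfn model/means)

printf 'backslash-space:\n%s\n' "$escaped"
printf 'backslash-newline:\n%s\n' "$joined"
```

The first form reproduces the leading space seen in the "Unknown argument name ' -meanfn'" error; the second is what a multiline command relies on.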

ITPhoenix - 2011-06-22

Ok. 5 utterances @ http://www.flickr.com/photos/56063668@N07/

I have not yet run any adaptation scripts. These are from the v. 0.7 release,
out of the box. The .raw format had to be converted to PNG-24 in Photoshop on
Windows 7 since nothing on openSUSE could do it. Let me know if they need
improvement.

Thank you for the CLI pointers.


ITPhoenix - 2011-06-22

Sorry: http://www.flickr.com/photos/56063668@N07/


Nickolay V. Shmyrev - 2011-06-22

Sorry, I wanted to get your raw files. What am I supposed to do with that
flickr link?


Nickolay V. Shmyrev - 2011-06-22

A file sharing resource is, for example, http://dropbox.com


ITPhoenix - 2011-06-22

ftp://ftp.earlybir.w06.winhost.com/

username: earlybir

passwd: a123


Nickolay V. Shmyrev - 2011-06-22

Hm, looking at your files, it seems that adaptation will not help you.

You have some issue with the driver or with the pocketsphinx sound input API.
The audio is not recorded properly; it contains skips and jumps. Most likely
it is an issue with the driver.

Can you record audio at all, outside pocketsphinx? Can you record audio through the ALSA API with arecord? Can you record through pulseaudio with parecord?

Which pocketsphinx/sphinxbase version are you using?

In the sphinxbase snapshot we implemented the pulseaudio API. Can you try it?

ITPhoenix - 2011-06-23

I have successfully recorded with Audacity for the adaptation recordings and
they played back well but the recording level required speaking directly into
the mic, otherwise weak. I will have to check the others. I do not like the
ALSA driver and mixer. There just seems to be something funny about it.
openSUSE is notorious for sound difficulties, in general. No help from ALSA,
there is nobody home, so to speak.

pocketsphinx 0.7 sphinxbase 0.7

I do not know what sphinxbase snapshot is or what to do with it if I found it.

Here is my system info in case it helps:

OS: openSUSE 11.4 x86_64
Kernel: Linux 2.6.37.6-0.5-desktop
Desktop: KDE 4.6.00 rel 6
Machine: HP xw9400 AMD 64 Opteron
Chipset: nVidia nForce Pro 3600 and 3050 (proprietary Tyan Thunder)
Drive: OCZ Vertex 60 GB dedicated system--single boot
RAM: 4 GB ECC

Video: nVidia GT200 (GeForce 210) 512 MB
2D Driver: nouveau
3D Driver: swrast (no 3D acceleration) (7.10)

Audio: Onboard "card 0": nVidia MCP55 Analog Stereo
Onboard "card 1": nVidia Corporation Digital Stereo
Chip: Realtek ALC262
Alsa Driver: v. 1.0.23


ITPhoenix - 2011-06-23

Update:

Reinstalled ALSA drivers..no improvement.

Ran ALSA diagnostics. Some problems seem to be present. Notably ACPI...may
have to disable in BIOS. And, Clocksource tsc unstable, ALC262 codec not
ready, etc. I will have to research further. Disabling ACPI is easy, and ACPI
has been known to cause problems in Linux distros if left enabled. Maybe IRQ
adjustments, too. I noticed there was a swap, which may be part of the
diagnostic routine, but if not, there should be no swapping since the system
has 4GB RAM.

ALSA dmesg available here:
http://www.alsa-project.org/db/?f=e88fdf992b5297289437ddfa9e4e6206fcce27b8

It is quite extensive.


ITPhoenix - 2011-06-24

Update: Ran an arecord test .wav recording. The playback was terrible, could
hardly understand it, sounded broken up!


Nickolay V. Shmyrev - 2011-06-24

Try to build sphinxbase with OSS or pulseaudio support. Maybe those systems
will work better for you.


ITPhoenix - 2011-06-24

I will try rebuilding sphinxbase but I do not know how to add support for OSS
or Pulseaudio.

I did try

pocketsphinx_continuous -adcdev [device]

with both of the only two devices, 0 and 1 and it returns

ad_oss.c(103): Failed to open audio device(0): No such file or directory
FATAL_ERROR: "continuous.c", line 242: Failed top open audio device

for example.

Changes were made to ALSA and Pulseaudio and recognition has improved greatly.
Would you please check some new rawlogdir files at:
ftp:\ftp.earlybirdmaintenance.net

user: earlybir

pass: a123

Also the MAP commands were corrected but it returns:

linux-2i5i:/home/vince/adaptation # ./map_adapt  -meanfn hub4wsj_sc_8k/means  -varfn hub4wsj_sc_8k/variances  -mixwfn hub4wsj_sc_8k/mixture_weights  -tmatfn hub4wsj_sc_8k/transition_matrices  -accumdir .  -mapmeanfn hub4wsj_sc_8kadapt/means  -mapvarfn hub4wsj_sc_8kadapt/variances  -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights  -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
INFO: cmd_ln.c(559): Parsing command line:
./map_adapt \
        -meanfn hub4wsj_sc_8k/means \
        -varfn hub4wsj_sc_8k/variances \
        -mixwfn hub4wsj_sc_8k/mixture_weights \
        -tmatfn hub4wsj_sc_8k/transition_matrices \
        -accumdir . \
        -mapmeanfn hub4wsj_sc_8kadapt/means \
        -mapvarfn hub4wsj_sc_8kadapt/variances \
        -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights \
        -maptmatfn hub4wsj_sc_8kadapt/transition_matrices

Current configuration:
[NAME]          [DEFLT] [VALUE]
-accumdir               .,
-bayesmean      yes     yes
-example        no      no
-fixedtau       no      no
-help           no      no
-mapmeanfn              hub4wsj_sc_8kadapt/means
-mapmixwfn              hub4wsj_sc_8kadapt/mixture_weights
-maptmatfn              hub4wsj_sc_8kadapt/transition_matrices
-mapvarfn               hub4wsj_sc_8kadapt/variances
-meanfn                 hub4wsj_sc_8k/means
-mixwfn                 hub4wsj_sc_8k/mixture_weights
-mwfloor        0.00001 1.000000e-05
-tau            10.0    1.000000e+01
-tmatfn                 hub4wsj_sc_8k/transition_matrices
-tpfloor        0.0001  1.000000e-04
-varfloor       0.00001 1.000000e-05
-varfn                  hub4wsj_sc_8k/variances

INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
INFO: s3mixw_io.c(116): Read hub4wsj_sc_8k/mixture_weights [5150x3x256 array]
INFO: s3tmat_io.c(115): Read hub4wsj_sc_8k/transition_matrices [50x3x4 array]
INFO: main.c(430): Reading and accumulating observation counts from .
ERROR: "s3acc_io.c", line 339: Unable to access ./gauden_counts
FATAL_ERROR: "main.c", line 441: Error in reading densities from .

Thank you for your help as it appears progress is being made.

ITPhoenix - 2011-06-26

Sorry the link is: ftp://ftp.earlybirdmaintenance.net


Nickolay V. Shmyrev - 2011-06-26

Changes were made to ALSA and Pulseaudio and recognition has improved
greatly. Would you please check some new rawlogdir files at:
ftp:\ftp.earlybirdmaintenance.net

No, the recordings are still bad. See how it looks:

https://dl-web.dropbox.com/get/Public/a.png?w=27168a88

Compare to proper audio

http://cmusphinx.sourceforge.net/wiki/tutorialconcepts

ERROR: "s3acc_io.c", line 339: Unable to access ./gauden_counts
FATAL_ERROR: "main.c", line 441: Error in reading densities from .

It failed to create the file gauden_counts on the previous step.
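A quick check before calling map_adapt can catch this earlier. The sketch below assumes only what the error message shows, namely that map_adapt expects a file called gauden_counts inside the directory given with -accumdir; the helper name have_gauden_counts is made up, and the "previous step" is presumably the bw statistics run mentioned earlier in the thread.

```shell
# Hypothetical helper: succeeds only if the accumulator file named in the
# error message exists in the given directory.
have_gauden_counts() {
    [ -f "$1/gauden_counts" ]
}

# Directory to check; defaults to the current directory, as in "-accumdir .".
accumdir=${1:-.}

if have_gauden_counts "$accumdir"; then
    echo "found $accumdir/gauden_counts, map_adapt should be able to read it"
else
    echo "no gauden_counts in $accumdir: rerun the previous step and read its log" >&2
fi
```

If the file is missing, the output of the previous step is the place to look, since that is where the accumulators should have been written.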


The site asked for a login which I do not have.
https://dl-web.dropbox.com/get/Public/a.png?w=27168a88 giving:

Error (403) It seems you don't belong here! You should probably try logging
in?

But I remember when making my adaptation recordings, there were level
problems. I had to speak directly on top of the mic to get a fat waveform, or
it would be so thin that obviously they were no good.

Ok, so you were right, it is not the program, it is the OS or drivers. I may
have to obtain a real audio card that has ALSA/Linux support, since SUSE or
anyone else cannot help.

My distro's sound involves KDE desktop, ALSA, Pulseaudio, and a player. A
change in any one of them could have drastic, systemwide effects.

It failed to create the file gauden_counts on the previous step.

Is there something wrong here and can it be fixed?

I tried MLLR again this time being careful with the commands. First this
returned:

linux-2i5i:/home/vince/adaptation # ./mllr_solve     -meanfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuouau_1s_c_d_dd/means     -varfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances     -outmllrfn mllr_matrix -accumdir .
bash: ./mllr_solve: No such file or directory

Then mllr_solve was copied to my "working directory" which is the adaptation
folder where everything related to the adaptation procedure is located.
Rerunning returned:

linux-2i5i:/home/vince/adaptation # ./mllr_solve     -meanfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means     -varfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances     -outmllrfn mllr_matrix -accumdir .
INFO: cmd_ln.c(559): Parsing command line:
./mllr_solve \
        -meanfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means \
        -varfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances \
        -outmllrfn mllr_matrix \
        -accumdir .

Current configuration:
[NAME]          [DEFLT] [VALUE]
-accumdir               .,
-cb2mllrfn      .1cls.  .1cls.
-cdonly         no      no
-example        no      no
-fullvar        no      no
-help           no      no
-meanfn                 /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means
-mllradd        yes     yes
-mllrmult       yes     yes
-moddeffn
-outmllrfn              mllr_matrix
-varfloor       1e-3    1.000000e-03
-varfn                  /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances


INFO: main.c(387): -- 1. Read input mean, (var) and accumulation.
WARN: "s3io.c", line 256: Unable to open /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means for reading; No such file or directory
FATAL_ERROR: "main.c", line 397: Couldn't read /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means

So it is looking for installation files in /usr/local/share. In the tutorial
it was unclear to me what the "working directory" actually was and where it
should be located. At least everything is located in one place. I did
successfully create the recordings with the corresponding .mfc files. All the
other files are there as instructed as well.

I am aware that adaptation is not going to solve my accuracy problem. I just
want to be ready after the driver/software problem is corrected.

Then, after it is proved the engine works acceptably, I will invest the time
and effort in my application. This appears to be what amounts to Dragon for
Linux, or hopefully IBM's Watson (I may need some help here).

Thank you very much for your help. There seems to be hope!!

ITPhoenix - 2011-06-27

Ran:

./mk_s2sendump \
-pocketsphinx yes \
-moddeffn hub4wsj_sc_8kadapt/mdef.txt \
-mixwfn hub4wsj_sc_8kadapt/mixture_weights \
-sendumpfn hub4wsj_sc_8kadapt/sendump

with no errors or complaints. Recognition seems slightly improved, except at
the initial utterance. The beginnings of sentences usually return bizarre
results, as always, while the latter parts are fairly good. But "excuse me"
almost invariably returns perfectly. I have never seen "hello" by itself come
back. It always returns "both" or something else.

Further research into the OS/driver issue on my machine is pointing to
possible APIC issues, although it may still be the driver as well.

Nickolay V. Shmyrev - 2011-06-27

Try this link

http://dl.dropbox.com/u/26073448/a.png

Or just use wavesurfer to explore your raw files:

https://sourceforge.net/projects/wavesurfer/

As for missing files, they are indeed missing. You need to specify the path
properly, then the file will be successfully processed.
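As a rough sketch of "specify the path properly", the wrapper below refuses to start mllr_solve until the means and variances files can actually be read. The function name solve_mllr and the example directory hub4wsj_sc_8k are assumptions for illustration; the mllr_solve options are the ones already used in this thread.

```shell
# Hypothetical wrapper: run mllr_solve only if the model files exist at the
# paths we are about to pass. model_dir and accumdir are supplied by the
# caller and depend on your own layout.
solve_mllr() {
    model_dir=$1
    accumdir=$2
    for f in "$model_dir/means" "$model_dir/variances"; do
        if [ ! -r "$f" ]; then
            echo "cannot read $f: fix the path before running mllr_solve" >&2
            return 1
        fi
    done
    ./mllr_solve \
        -meanfn "$model_dir/means" \
        -varfn "$model_dir/variances" \
        -outmllrfn mllr_matrix \
        -accumdir "$accumdir"
}

# Example call, assuming the model was copied into the working directory
# (commented out so that pasting the block has no side effects):
# solve_mllr hub4wsj_sc_8k .
```

Pointing the wrapper at a directory that does not contain the model, such as the /usr/local/share/sphinx3 path above, stops with a readable message instead of the FATAL_ERROR from mllr_solve.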


ITPhoenix - 2011-07-08

Success!! A Creative X-Fi Titanium Fatal1ty Pro was installed and solved the
mic input level and distortion problem. Pocketsphinx works with guesstimated
90% accuracy OOTB!!

For the record, I did not have the linking correct. For openSUSE create a text
file called local.conf in the subdirectory /etc/ld.so.conf.d containing just
the lines:

/usr/local/lib
include /etc/ld.so.conf.d/*.conf

Then as root run:

ldconfig

I reinstalled Pocketsphinx after this and the install proceeded without any
warnings and make check approved everything.
Running

pkg-config --cflags --libs pocketsphinx sphinxbase

to confirm installation, returned the proper information according to Building
Application notes.

It appears this modification to ld.so.conf.d is to be made BEFORE Pocketsphinx
is installed.

Proceeding to my application, the literature and code in Sphinx4 appear to be
where to go, since the goal is passing the Total Turing Test. Sphinx4 could be
made faster with a fast processor and a fast SSD on a dedicated system board
(or two).

I will start a new thread on programming after further research in a language
processing textbook and learning some Java.

If you have any pointers here, kindly let me know............

Thank you so much for your patience and help!!