map_adapt problem Linux x64

Help
ITPhoenix
2011-06-21
2012-09-22
  • ITPhoenix

    ITPhoenix - 2011-06-21

    Adaptation recordings were made and no problems until this point which I do
    not understand:

    The -agc none parameter is very important. Make sure the arguments here match
    the parameters in the feat.params file inside the acoustic model folder. Please
    note that not all the parameters from feat.params are supported by bw, only a
    few of them. For example, bw doesn't support upperf or other feature
    extraction params. But those that are supported should match.

    So I proceeded to MLLR which failed, but I did not record the errors.

    I then tried MAP and got this:

    linux-2i5i:/home/vince/adaptation # ./map_adapt >     -meanfn hub4wsj_sc_8k/means >     -varfn hub4wsj_sc_8k/variances >     -mixwfn hub4wsj_sc_8k/mixture_weights >     -tmatfn hub4wsj_sc_8k/transition_matrices >     -accumdir . >     -mapmeanfn hub4wsj_sc_8kadapt/means >     -mapvarfn hub4wsj_sc_8kadapt/variances >     -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights >     -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
    INFO: cmd_ln.c(559): Parsing command line:
    ./map_adapt hub4wsj_sc_8k/means hub4wsj_sc_8k/variances hub4wsj_sc_8k/mixture_weights hub4wsj_sc_8k/transition_matrices . hub4wsj_sc_8kadapt/means hub4wsj_sc_8kadapt/variances hub4wsj_sc_8kadapt/mixture_weights hub4wsj_sc_8kadapt/transition_matrices
    
    ERROR: "cmd_ln.c", line 614: Unknown argument name 'hub4wsj_sc_8k/means'
    ERROR: "cmd_ln.c", line 705: cmd_ln_parse_r failed
    ERROR: "cmd_ln.c", line 754: cmd_ln_parse failed, forced exit
    

    Note: These lines

    export LD_LIBRARY_PATH=/usr/local/lib
    
    export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
    

    needed to be entered every time the program was run. So I created a file
    called local.conf in the subdirectory /etc/ld.so.conf.d containing just the
    line /usr/local/lib. That is,

    Contents of /etc/ld.so.conf.d/local.conf:

    /usr/local/lib
    

    This works every time.

    I also have reason to suspect the onboard sound card and inexpensive desktop
    microphone are causing problems with noise and limited input level. The system
    does work except only about 2% of all results are accurate.

    Is there a way to compensate for the substandard equipment such as "beam
    tuning" and such? Increasing the gain caused so much noise the engine thought
    there was input.

    Any help appreciated.

     
  • Nickolay V. Shmyrev

    Adaptation recordings were made and no problems until this point which I do
    not understand:

    This paragraph was corrected. I hope it's more clear now.

    I then tried MAP and got this:

    To resolve this issue you just need to read the output of the command. It
    told you that you didn't specify the command correctly. The command must be

    map_adapt \
    -meanfn hub4wsj_sc_8k/means \
    -varfn hub4wsj_sc_8k/variances \
    -mixwfn hub4wsj_sc_8k/mixture_weights \
    -tmatfn hub4wsj_sc_8k/transition_matrices \
    -accumdir . \
    -mapmeanfn hub4wsj_sc_8kadapt/means \
    -mapvarfn hub4wsj_sc_8kadapt/variances \
    -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights \
    -maptmatfn hub4wsj_sc_8kadapt/transition_matrices

    And not

    ./map_adapt hub4wsj_sc_8k/means hub4wsj_sc_8k/variances
    hub4wsj_sc_8k/mixture_weights hub4wsj_sc_8k/transition_matrices .
    hub4wsj_sc_8kadapt/means hub4wsj_sc_8kadapt/variances
    hub4wsj_sc_8kadapt/mixture_weights hub4wsj_sc_8kadapt/transition_matrices

    The command shouldn't have redirection symbols inside. You can learn more
    about shell commands and options by reading the shell manual.
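
    A minimal sketch of what the shell did with the pasted > characters. It
    uses a hypothetical show_args helper in place of map_adapt, and made-up
    model/means and model/variances values. Each "> -option" is parsed as an
    output redirection to a file named after the option, so only the values
    reach the program, which matches the "Parsing command line" output above.

```shell
# Hypothetical stand-in for map_adapt: prints each argument it receives.
show_args() { for a in "$@"; do printf '[%s]\n' "$a"; done; }

# Work in a scratch directory so the stray files are easy to find.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# "> -meanfn" redirects stdout into a file literally named "-meanfn";
# the option name never reaches the program, only its value does.
show_args > -meanfn model/means > -varfn model/variances

ls -- ./-*        # lists the files ./-meanfn and ./-varfn
cat ./-varfn      # the last redirection receives the output
```

    The arguments that arrive are only the two values, which is why the
    parser complained about an unknown argument name.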

    The system does work except only about 2% of all results are accurate. Is
    there a way to compensate for the substandard equipment such as "beam tuning"
    and such? Increasing the gain caused so much noise the engine thought there
    was input.

    I don't think that noise or substandard equipment is the cause. The real
    issue is somewhere else. To improve accuracy you need to provide more
    information about what you are trying to do: what command you are running,
    what speech you are trying to recognize, and what results you get. Be as
    precise as possible; it will help you get to the solution quickly.

     
  • ITPhoenix

    ITPhoenix - 2011-06-22

    Thank you for responding.

    I use

    pocketsphinx_continuous
    

    to start the program.

    I am recognizing English of the Northeast United States. The kind that
    pronounces Rs hard. In other words, not NYC and not Massachusetts, and
    certainly not British.

    Here are typical, consistent results. The spoken word(s) are listed first
    and the result second:

    Hello 000000000: both

    What is wrong with you
    INFO: ngram_search.c(474): Resized score stack to 200000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 10000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 20000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 40000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 80000 entries
    INFO: ngram_search.c(474): Resized score stack to 400000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 160000 entries
    INFO: ngram_search.c(474): Resized score stack to 800000 entries
    INFO: ngram_search.c(466): Resized backpointer table to 320000 entries
    pocketsphinx_continuous: feat.c:362: feat_array_alloc: Assertion `nfr > 0' failed.
    Aborted

    (This happens very infrequently and sometimes returns to the "READY" prompt
    after a minute or two)

    Restart.........

    What time is it 000000000: couple times it

    Are you having trouble 000000001: are to keep having trouble bit

    Inconsistent 000000002: it is that a a systems to have

    And so on......

    I tried the MAP commands above and now come up with this:

    linux-2i5i:/home/vince/adaptation # ./map_adapt \ -meanfn hub4wsj_sc_8k/means \ -varfn hub4wsj_sc_8k/variances \ -mixwfn hub4wsj_sc_8k/mixture_weights \ -tmatfn hub4wsj_sc_8k/transition_matrices \ -accumdir . \ -mapmeanfn hub4wsj_sc_8kadapt/means \ -mapvarfn hub4wsj_sc_8kadapt/variances \ -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights \ -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
    INFO: cmd_ln.c(559): Parsing command line:
    ./map_adapt  -meanfn hub4wsj_sc_8k/means  -varfn hub4wsj_sc_8k/variances  -mixwfn hub4wsj_sc_8k/mixture_weights  -tmatfn hub4wsj_sc_8k/transition_matrices  -accumdir .  -mapmeanfn hub4wsj_sc_8kadapt/means  -mapvarfn hub4wsj_sc_8kadapt/variances  -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights  -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
    
    ERROR: "cmd_ln.c", line 614: Unknown argument name ' -meanfn'
    ERROR: "cmd_ln.c", line 705: cmd_ln_parse_r failed
    ERROR: "cmd_ln.c", line 754: cmd_ln_parse failed, forced exit
    

    Again, much thanks for your interest.

     
  • Nickolay V. Shmyrev

    Here are typical, consistent results. The spoken word(s) are listed first
    and the result second:

    Please run

    pocketsphinx_continuous -rawlogdir .
    

    Note the dot (the current directory) after -rawlogdir. This also dumps the
    audio it is trying to recognize to files. Pack the files into an archive,
    upload it to a public file-sharing service, and post the link here.
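
    A rough sketch for inspecting the dumps before uploading them. It assumes
    the raw files are headerless 16-bit signed little-endian mono PCM at
    16000 Hz, that sox is installed for the conversion, and that a dump is
    named something like 000000000.raw; all three are assumptions, not
    something confirmed in this thread.

```shell
# Assumed format of the -rawlogdir dumps: headerless 16-bit signed
# little-endian mono PCM at 16000 Hz.
RATE=16000
BYTES_PER_SAMPLE=2

# Print the duration of a raw dump in whole seconds, from its size alone.
raw_seconds() {
    size=$(wc -c < "$1")
    echo $(( size / (RATE * BYTES_PER_SAMPLE) ))
}

# Wrap a raw dump in a WAV header so ordinary players can open it
# (requires sox; writes NAME.wav next to NAME.raw).
raw_to_wav() {
    sox -t raw -r "$RATE" -e signed-integer -b 16 -c 1 -L "$1" "${1%.raw}.wav"
}

# Usage with a hypothetical dump name:
#   raw_seconds 000000000.raw
#   raw_to_wav  000000000.raw
```

    A dump whose computed duration is much shorter than the time actually
    spoken is one hint that samples are being dropped.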

    I am recognizing English of the Northeast United States. The kind that
    pronounces Rs hard. In other words, not NYC and not Massachusetts, and
    certainly not British.

    It doesn't matter. The most probable reason is that the default language
    model is not really suitable for the text you are trying to recognize. You
    need to build your own language model or download a generic one such as
    lm_giga.

    ./map_adapt \ -meanfn hub4wsj_sc_8k/means

    In the shell, \ is used to escape characters and to join lines. This way you
    pass the argument " -meanfn" to the command (note the space, which is escaped
    by the backslash). When I gave you the command in my previous post, it was
    meant to be a multiline command. If you want to enter it as a single line, use:

     ./map_adapt  -meanfn hub4wsj_sc_8k/means  -varfn hub4wsj_sc_8k/variances  -mixwfn ...
    

    Without backslashes.
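
    The difference can be seen with a small sketch. show_args is a
    hypothetical helper that stands in for map_adapt, and model/means is a
    made-up value; the quoting behavior itself is standard shell.

```shell
# Hypothetical stand-in for map_adapt: prints each argument it receives.
show_args() { for a in "$@"; do printf '[%s]\n' "$a"; done; }

# Backslash followed by a space: the space is escaped and becomes part of
# the argument, so the program sees " -meanfn" (with a leading space).
one_line=$(show_args \ -meanfn model/means)

# Backslash at the very end of a line: the newline is removed and the two
# lines are joined, so the program sees a clean "-meanfn".
multi_line=$(show_args \
    -meanfn model/means)

printf 'single line with backslash:\n%s\n' "$one_line"
printf 'multi-line continuation:\n%s\n' "$multi_line"
```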

     
  • ITPhoenix

    ITPhoenix - 2011-06-22

    Ok. 5 utterances @ http://www.flickr.com/photos/56063668@N07/

    I have not run any adaptation scripts yet. These are from the v. 0.7 release,
    out of the box. The .raw format had to be converted to PNG-24 in Photoshop on
    Windows 7 since nothing on openSUSE could do it. Let me know if they need
    improvement.

    Thank you for the CLI pointers.

     
  • Nickolay V. Shmyrev

    Sorry, I wanted to get your raw files. What am I supposed to do with that
    flickr link?

     
  • Nickolay V. Shmyrev

    An example of a file-sharing resource is http://dropbox.com

     
  • ITPhoenix

    ITPhoenix - 2011-06-22

    ftp://ftp.earlybir.w06.winhost.com/

    username: earlybir

    passwd: a123

     
  • Nickolay V. Shmyrev

    Hm, looking at your files, it seems that adaptation will not help you.

    You have some issue with the driver or with the pocketsphinx sound input API.
    The audio is not recorded properly; it contains skips and jumps. Most likely
    it is an issue with the driver.

    1. Can you record audio at all outside pocketsphinx? Can you record through the ALSA API with arecord? Can you record through pulseaudio with parecord?
    2. Which pocketsphinx/sphinxbase version are you using?
    3. In the sphinxbase snapshot we implemented support for the pulseaudio API. Can you try it?
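
    For the first question, a sketch of two test recordings. The format
    (16 kHz, 16-bit, mono) and the output file names are assumptions chosen
    for illustration; adjust them to your setup.

```shell
# Record 5 seconds of 16 kHz, 16-bit, mono audio through ALSA.
record_alsa() {
    arecord -f S16_LE -r 16000 -c 1 -d 5 "$1"
}

# Record the same format through PulseAudio; parecord runs until it is
# interrupted, so stop it with Ctrl+C.
record_pulse() {
    parecord --format=s16le --rate=16000 --channels=1 --file-format=wav "$1"
}

# Report which of the two tools are installed on this machine.
for tool in arecord parecord; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: available"
    else
        echo "$tool: not installed"
    fi
done

# Usage (hypothetical file names), then listen with aplay or paplay:
#   record_alsa  alsa_test.wav
#   record_pulse pulse_test.wav
```

    If both recordings sound broken outside pocketsphinx, the problem is
    below pocketsphinx, in the driver or sound server.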
     
  • ITPhoenix

    ITPhoenix - 2011-06-23

    I have successfully recorded with Audacity for the adaptation recordings and
    they played back well but the recording level required speaking directly into
    the mic, otherwise weak. I will have to check the others. I do not like the
    ALSA driver and mixer. There just seems to be something funny about it.
    openSUSE is notorious for sound difficulties, in general. No help from ALSA,
    there is nobody home, so to speak.

    pocketsphinx 0.7 sphinxbase 0.7

    I do not know what sphinxbase snapshot is or what to do with it if I found it.

    Here is my system info in case it helps:

    OS: openSUSE 11.4 x86_64
    Kernel: Linux 2.6.37.6-0.5-desktop
    Desktop: KDE 4.6.00 rel 6
    Machine: HP xw9400 AMD 64 Opteron
    Chipset: nVidia nForce Pro 3600 and 3050 (proprietary Tyan Thunder)
    Drive: OCZ Vertex 60 GB dedicated system--single boot
    RAM: 4 GB ECC

    Video: nVidia GT200 (GeForce 210) 512 MB
    2D Driver: nouveau
    3D Driver: swrast (no 3D acceleration) (7.10)

    Audio: Onboard "card 0": nVidia MCP55 Analog Stereo
    Onboard "card 1": nVidia Corporation Digital Stereo
    Chip: Realtek ALC262
    Alsa Driver: v. 1.0.23

     
  • ITPhoenix

    ITPhoenix - 2011-06-23

    Update:

    Reinstalled ALSA drivers..no improvement.

    Ran ALSA diagnostics. Some problems seem to be present, notably ACPI, which I
    may have to disable in the BIOS. Also: clocksource tsc unstable, ALC262 codec
    not ready, etc. I will have to research further. ACPI is easy to disable and
    has been known to cause problems in Linux distros if left enabled. Maybe IRQ
    adjustments, too. I noticed there was a swap, which may be part of the
    diagnostic routine, but if not, there should be no swapping since the system
    has 4 GB RAM.

    ALSA dmesg available here:
    http://www.alsa-project.org/db/?f=e88fdf992b5297289437ddfa9e4e6206fcce27b8

    It is quite extensive.

     
  • ITPhoenix

    ITPhoenix - 2011-06-24

    Update: Ran an arecord test .wav recording. The playback was terrible; I
    could hardly understand it, and it sounded broken up!

     
  • Nickolay V. Shmyrev

    Try to build sphinxbase with OSS or pulseaudio support. Maybe those systems
    will work better for you.

     
  • ITPhoenix

    ITPhoenix - 2011-06-24

    I will try rebuilding sphinxbase but I do not know how to add support for OSS
    or Pulseaudio.

    I did try

    pocketsphinx_continuous -adcdev [device]
    

    with both of the only two devices, 0 and 1 and it returns

    ad_oss.c(103): Failed to open audio device(0): No such file or directory
    FATAL_ERROR: "continuous.c", line 242: Failed top open audio device
    

    for example.

     
  • ITPhoenix

    ITPhoenix - 2011-06-26

    Changes were made to ALSA and Pulseaudio and recognition has improved greatly.
    Would you please check some new rawlogdir files at:
    ftp:\ftp.earlybirdmaintenance.net

    user: earlybir

    pass: a123

    Also the MAP commands were corrected but it returns:

    linux-2i5i:/home/vince/adaptation # ./map_adapt  -meanfn hub4wsj_sc_8k/means  -varfn hub4wsj_sc_8k/variances  -mixwfn hub4wsj_sc_8k/mixture_weights  -tmatfn hub4wsj_sc_8k/transition_matrices  -accumdir .  -mapmeanfn hub4wsj_sc_8kadapt/means  -mapvarfn hub4wsj_sc_8kadapt/variances  -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights  -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
    INFO: cmd_ln.c(559): Parsing command line:
    ./map_adapt \
            -meanfn hub4wsj_sc_8k/means \
            -varfn hub4wsj_sc_8k/variances \
            -mixwfn hub4wsj_sc_8k/mixture_weights \
            -tmatfn hub4wsj_sc_8k/transition_matrices \
            -accumdir . \
            -mapmeanfn hub4wsj_sc_8kadapt/means \
            -mapvarfn hub4wsj_sc_8kadapt/variances \
            -mapmixwfn hub4wsj_sc_8kadapt/mixture_weights \
            -maptmatfn hub4wsj_sc_8kadapt/transition_matrices
    
    Current configuration:
    [NAME]          [DEFLT] [VALUE]
    -accumdir               .,
    -bayesmean      yes     yes
    -example        no      no
    -fixedtau       no      no
    -help           no      no
    -mapmeanfn              hub4wsj_sc_8kadapt/means
    -mapmixwfn              hub4wsj_sc_8kadapt/mixture_weights
    -maptmatfn              hub4wsj_sc_8kadapt/transition_matrices
    -mapvarfn               hub4wsj_sc_8kadapt/variances
    -meanfn                 hub4wsj_sc_8k/means
    -mixwfn                 hub4wsj_sc_8k/mixture_weights
    -mwfloor        0.00001 1.000000e-05
    -tau            10.0    1.000000e+01
    -tmatfn                 hub4wsj_sc_8k/transition_matrices
    -tpfloor        0.0001  1.000000e-04
    -varfloor       0.00001 1.000000e-05
    -varfn                  hub4wsj_sc_8k/variances
    
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/means [1x3x256 array]
    INFO: s3gau_io.c(166): Read hub4wsj_sc_8k/variances [1x3x256 array]
    INFO: s3mixw_io.c(116): Read hub4wsj_sc_8k/mixture_weights [5150x3x256 array]
    INFO: s3tmat_io.c(115): Read hub4wsj_sc_8k/transition_matrices [50x3x4 array]
    INFO: main.c(430): Reading and accumulating observation counts from .
    ERROR: "s3acc_io.c", line 339: Unable to access ./gauden_counts
    FATAL_ERROR: "main.c", line 441: Error in reading densities from .
    

    Thank you for your help as it appears progress is being made.

     
  • ITPhoenix

    ITPhoenix - 2011-06-26

    Sorry the link is: ftp://ftp.earlybirdmaintenance.net

     
  • Nickolay V. Shmyrev

    Changes were made to ALSA and Pulseaudio and recognition has improved
    greatly. Would you please check some new rawlogdir files at:
    ftp:\ftp.earlybirdmaintenance.net

    No, the recordings are still bad. Here is how one looks:

    https://dl-web.dropbox.com/get/Public/a.png?w=27168a88

    Compare to proper audio

    http://cmusphinx.sourceforge.net/wiki/tutorialconcepts

    ERROR: "s3acc_io.c", line 339: Unable to access ./gauden_counts
    FATAL_ERROR: "main.c", line 441: Error in reading densities from .

    It failed to create the file gauden_counts on the previous step.
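
    A small sketch for checking the accumulation directory before running
    map_adapt. The name gauden_counts comes from the error message; the
    names mixw_counts and tmat_counts are assumptions about what the
    previous step (bw) normally writes, and check_accumdir is a
    hypothetical helper.

```shell
# Accumulator files map_adapt is assumed to expect in -accumdir.
# gauden_counts comes from the error message; the other two names are
# assumptions about what bw writes.
ACCUM_FILES="gauden_counts mixw_counts tmat_counts"

# Print any missing accumulator files; succeed only if none are missing.
check_accumdir() {
    dir=$1
    missing=0
    for f in $ACCUM_FILES; do
        if [ ! -r "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: run this in the adaptation folder before map_adapt.
#   check_accumdir . && echo "accumulators present"
```

    If files are reported missing, rerun bw and read its output for errors
    before moving on.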

     
  • ITPhoenix

    ITPhoenix - 2011-06-26

    The site asked for a login which I do not have.
    https://dl-web.dropbox.com/get/Public/a.png?w=27168a88
    giving:

    Error (403) It seems you don't belong here! You should probably try logging
    in?

    But I remember when making my adaptation recordings, there were level
    problems. I had to speak directly on top of the mic to get a fat waveform, or
    it would be so thin that the recordings were obviously no good.

    Ok, so you were right, it is not the program, it is the OS or drivers. I may
    have to obtain a real audio card that has ALSA/Linux support, since SUSE or
    anyone else cannot help.

    My distro's sound involves KDE desktop, ALSA, Pulseaudio, and a player. A
    change in any one of them could have drastic, systemwide effects.

    It failed to create the file gauden_counts on the previous step.

    Is there something wrong here and can it be fixed?

    I tried MLLR again this time being careful with the commands. First this
    returned:

    linux-2i5i:/home/vince/adaptation # ./mllr_solve     -meanfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuouau_1s_c_d_dd/means     -varfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances     -outmllrfn mllr_matrix -accumdir .
    bash: ./mllr_solve: No such file or directory
    

    Then mllr_solve was copied to my "working directory" which is the adaptation
    folder where everything related to the adaptation procedure is located.
    Re-running returned:

    linux-2i5i:/home/vince/adaptation # ./mllr_solve     -meanfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means     -varfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances     -outmllrfn mllr_matrix -accumdir .
    INFO: cmd_ln.c(559): Parsing command line:
    ./mllr_solve \
            -meanfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means \
            -varfn /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances \
            -outmllrfn mllr_matrix \
            -accumdir .
    
    Current configuration:
    [NAME]          [DEFLT] [VALUE]
    -accumdir               .,
    -cb2mllrfn      .1cls.  .1cls.
    -cdonly         no      no
    -example        no      no
    -fullvar        no      no
    -help           no      no
    -meanfn                 /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means
    -mllradd        yes     yes
    -mllrmult       yes     yes
    -moddeffn
    -outmllrfn              mllr_matrix
    -varfloor       1e-3    1.000000e-03
    -varfn                  /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances
    
    
    INFO: main.c(387): -- 1. Read input mean, (var) and accumulation.
    WARN: "s3io.c", line 256: Unable to open /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means for reading; No such file or directory
    FATAL_ERROR: "main.c", line 397: Couldn't read /usr/local/share/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means
    

    So it is looking for installation files in /usr/local/share. In the tutorial
    it was unclear to me what the "working directory" actually was and where it
    should be located. At least everything is located in one place. I did
    successfully create the recordings with the corresponding .mfc files. All the
    other files are there as instructed as well.

    I am aware that adaptation is not going to solve my accuracy problem. I just
    want to be ready after the driver/software problem is corrected.

    Then, after it is proved the engine works acceptably, I will invest the time
    and effort in my application. This appears to be what amounts to Dragon for
    Linux, or hopefully IBM's Watson (I may need some help here).

    Thank you very much for your help. There seems to be hope!!

     
  • ITPhoenix

    ITPhoenix - 2011-06-27

    Ran:

     ./mk_s2sendump \
        -pocketsphinx yes \
        -moddeffn hub4wsj_sc_8kadapt/mdef.txt \
        -mixwfn hub4wsj_sc_8kadapt/mixture_weights \
        -sendumpfn hub4wsj_sc_8kadapt/sendump
    

    with no errors or complaints. Recognition seems slightly improved, except at
    the initial utterance. The beginning of a sentence usually returns bizarre
    results, as always, while the latter parts are fairly good. But "excuse me"
    almost invariably returns perfectly. I have never seen "hello" by itself come
    back. It always returns "both" or something else.

    Further research into the OS/driver issue on my machine is pointing to
    possible APIC issues, although it may still be the driver as well.

     
  • Nickolay V. Shmyrev

    Try this link

    http://dl.dropbox.com/u/26073448/a.png

    Or just use wavesurfer to explore your raw files:

    https://sourceforge.net/projects/wavesurfer/

    As for missing files, they are indeed missing. You need to specify the path
    properly, then the file will be successfully processed.

     
  • ITPhoenix

    ITPhoenix - 2011-07-08

    Success!! A Creative X-Fi Titanium Fatal1ty Pro was installed and solved the
    mic input level and distortion problem. Pocketsphinx works with guesstimated
    90% accuracy OOTB!!

    For the record, I did not have the linking correct. For openSUSE create a text
    file called local.conf in the subdirectory /etc/ld.so.conf.d containing just
    the lines:

    /usr/local/lib 
    include /etc/ld.so.conf.d/*.conf
    

    Then as root run:

    ldconfig
    

    I reinstalled Pocketsphinx after this and the install proceeded without any
    warnings and make check approved everything.
    Running

    pkg-config --cflags --libs pocketsphinx sphinxbase
    

    to confirm installation, returned the proper information according to Building
    Application notes.
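
    Besides pkg-config, the linker cache itself can be checked. This is a
    sketch that assumes the libraries are installed under the names
    libsphinxbase and libpocketsphinx; filter_sphinx_libs is a hypothetical
    helper.

```shell
# Keep only the cache entries for the two Sphinx libraries.
filter_sphinx_libs() {
    grep -E 'lib(sphinxbase|pocketsphinx)\.so'
}

# ldconfig -p prints the current contents of the linker cache.
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | filter_sphinx_libs || echo "Sphinx libraries not in the linker cache"
else
    echo "ldconfig not found on PATH (on some systems it lives in /sbin)"
fi
```

    If the libraries are listed, exporting LD_LIBRARY_PATH for every run
    should no longer be necessary.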

    It appears this modification to ld.so.conf.d is to be made BEFORE Pocketsphinx
    is installed.

    Proceeding to my application, the literature and code in Sphinx4 appear to be
    where to go, since the goal is passing the Total Turing Test. Sphinx4 could be
    made faster with a fast processor and a fast SSD on a dedicated system board
    (or two).

    I will start a new thread on programming after further research in a language
    processing textbook and learning some Java.

    If you have any pointers here, kindly let me know.

    Thank you so much for your patience and help!

     
