I've been following the Robust Group tutorial using Sphinx2 and the an4 database in a Linux environment. I've had my share of problems along the way but managed to hack my way through most of them. However, when I try to do a preliminary decode using the slave.pl script, I get the following error:
> perl scripts_pl/decode/slave.pl
MODULE: DECODE Decoding using models previously trained
Decoding 130 segments starting at 0 (part 1 of 1)
Using files: stack smashing detected : /afs/ec.auckland.ac.nz/users/a/p/apar116/unixhome/ResearchProject/ResearcMaterial/SPHINX/tutorial/an4/bin/sphinx2_batch terminated
0% ERROR: ERROR: "lm_3g.c", line 1057: '<UNK>' is in LM unigrams but not in dictionary
ERROR: ERROR: "lm_3g.c", line 1057: 'HALL' is in LM unigrams but not in dictionary
ERROR: ERROR: "lm_3g.c", line 1057: 'LANE' is in LM unigrams but not in dictionary
ERROR: ERROR: "lm_3g.c", line 1057: 'MEMORY' is in LM unigrams but not in dictionary
ERROR: ERROR: "lm_3g.c", line 1057: 'TWELVTH' is in LM unigrams but not in dictionary
ERROR: ERROR: "lm_3g.c", line 1057: 'WEAN' is in LM unigrams but not in dictionary
Finished
Can't open /afs/ec.auckland.ac.nz/users/a/p/apar116/unixhome/ResearchProject/ResearcMaterial/SPHINX/tutorial/an4/result/an4-1-1.match
SENTENCE ERROR: 0.000% (0/0) WORD ERROR RATE: 0.000% (0/0)
The an4-1-1.log file contains, among other things, the following:
WARNING: "lm_3g.c", line 1066: 6 LM words not in dict; ignored
INFO: lm_3g.c(1075): bo_wt(</s>) changed from -99.0000 to -99.0000
INFO: lm_3g.c(1081): prob(<s>) changed from -99.0000 to -99.0000
INFO: lm_3g.c(1119): prob(<s>,</s>) changed from 0.0000 to -99.0000
INFO: lm_3g.c(1727): 10.00 = Language Weight
INFO: lm_3g.c(1728): 1.00 = Unigram Weight
INFO: lm_3g.c(1729): -16095 = LOG (Insertion Penalty (0.20))
INFO: lm_3g.c(1223): LM("") added
INFO: lm_3g.c(1153): Adding 0 initial OOV words to LM
INFO: areadint.c(95): Byte reversing /afs/ec.auckland.ac.nz/...tutorial/an4/model_parameters/an4.cd_semi_1000.s2models/AA.ccode
INFO: areadint.c(95): Byte reversing /afs/ec.auckland.ac.nz/.../tutorial/an4/model_parameters/an4.cd_semi_1000.s2models/AA.d2code
INFO: areadint.c(95): Byte reversing /afs/ec.auckland.ac.nz/.../tutorial/an4/model_parameters/an4.cd_semi_1000.s2models/AA.p3code
INFO: areadint.c(95): Byte reversing /afs/ec.auckland.ac.nz/.../tutorial/an4/model_parameters/an4.cd_semi_1000.s2models/AA.xcode
Aborted
When I tried to open the files in an4.cd_semi_1000.s2models with a normal audio player in Linux, the ones listed above opened fine, but the others couldn't be recognised. I wonder if the files are corrupted somehow: viewing them in binary form shows them to be far smaller than the ones that opened successfully.
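Since these files sit under model_parameters, they appear to be model data rather than recordings, so an audio player is probably not a meaningful check. A rough way to compare them is to list their sizes and peek at the first few bytes. The following is only a minimal sketch; the helper names file_sizes and head_hex are made up for illustration, and it makes no assumptions about the Sphinx2 file layout:

```python
import os
import sys


def file_sizes(directory):
    """Return (name, size_in_bytes) pairs for regular files, smallest first."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    # Sort by size, then by name so the order is stable
    return sorted(entries, key=lambda entry: (entry[1], entry[0]))


def head_hex(path, count=16):
    """Return the first `count` bytes of a file as a hex string."""
    with open(path, "rb") as handle:
        return handle.read(count).hex()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python inspect_models.py <path to the .s2models directory>
    model_dir = sys.argv[1]
    for name, size in file_sizes(model_dir):
        print(f"{size:>10}  {head_hex(os.path.join(model_dir, name))}  {name}")
```

Files of size zero, or files much smaller than others with the same extension, would be the first ones to look at.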
All the previous steps from the tutorial seem to be working fine, without any serious errors, and I'm really stuck on this.
I would really appreciate any help.
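The "is in LM unigrams but not in dictionary" errors in the output above can be cross-checked outside the decoder. The following is a minimal sketch, assuming the language model is a text file in ARPA format and the dictionary has one entry per line with the word first, followed by its phones. The function names, the sample data, the handling of WORD(2)-style alternate pronunciations, and the decision to skip <s> and </s> are all assumptions for illustration, not taken from the Sphinx2 source:

```python
import re


def lm_unigram_words(lm_lines):
    """Return the set of words listed in the 1-grams section of an ARPA LM."""
    words = set()
    in_unigrams = False
    for raw in lm_lines:
        line = raw.strip()
        if line.startswith("\\"):
            # Section markers: \data\, \1-grams:, \2-grams:, ..., \end\
            in_unigrams = (line == "\\1-grams:")
            continue
        if in_unigrams and line:
            fields = line.split()
            # Unigram entry: log-probability, word, optional back-off weight
            if len(fields) >= 2:
                words.add(fields[1])
    return words


def dictionary_words(dict_lines):
    """Return the set of base words in a pronunciation dictionary."""
    words = set()
    for raw in dict_lines:
        fields = raw.split()
        if not fields:
            continue
        # Assumption: alternate pronunciations are written WORD(2), WORD(3), ...
        words.add(re.sub(r"\(\d+\)$", "", fields[0]))
    return words


def missing_from_dictionary(lm_lines, dict_lines, ignore=("<s>", "</s>")):
    """Return LM unigram words with no dictionary entry, sorted."""
    missing = lm_unigram_words(lm_lines) - dictionary_words(dict_lines)
    return sorted(missing - set(ignore))


if __name__ == "__main__":
    # Hypothetical sample data; replace with lines read from the real files
    sample_lm = [
        "\\data\\", "ngram 1=6", "ngram 2=0", "",
        "\\1-grams:",
        "-1.0 <UNK> -0.3", "-99.0 <s> -0.3", "-1.0 </s>",
        "-1.2 HALL -0.3", "-1.2 ONE -0.3", "-1.2 TWO -0.3", "",
        "\\2-grams:", "", "\\end\\",
    ]
    sample_dict = ["ONE W AH N", "ONE(2) HH W AH N", "TWO T UW"]
    print(missing_from_dictionary(sample_lm, sample_dict))
```

With the real files, the two arguments would be the open LM and dictionary files, and the result can be compared against the six words the decoder reports.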
I'm using Ubuntu 6.10 as my operating system. I'm not sure which compiler you mean exactly, but I'm using Perl v5.8.8, and gcc 4.1.2 as my C compiler.
Is there any way you know of to debug the bug I inadvertently found? :)
Thanks heaps!
Congratulations, you have found a bug in Sphinx2! Looks like there is some kind of buffer overflow or other illegal memory access in the code.
What compiler and operating system are you using?