Hello,
I'm training an Arabic acoustic model using pocketsphinx. I ran the experiments, everything went fine, and I got results. However, in the result folder, the sentences in the align file are not in Arabic (the words show up as squares), although the sentences in the match file are in Arabic. How can I get the align file in Arabic?
I attached the match file and the align file.
I highly appreciate your cooperation and assistance.
Thanks
Last edit: safia hammad 2015-08-30
Does anyone have any idea about this problem?
Looks like it was a bug that was fixed in https://github.com/cmusphinx/sphinxtrain/commit/6fcd71ec7091c1f06ea76c9d4171b8b939e9811d. When I upgraded to the latest sphinxtrain code, it worked for me.
Hello, I tried updating to the 5prealpha version, but got the same align file. I also tried changing the encoding in the word_align.pl script,
from use encoding 'utf8'; to use encoding 'utf16' or ANSI.
I got this error: "Can't locate object method "cat_decode" via package "Encode::Unicode" at /usr/local/lib/sphinxtrain/scripts/decode/word_align.pl line 18."
What encoding can I use? How do I get the sentences in the align file in Arabic?
Last edit: safia hammad 2015-11-05
This has no relation to your problem; it specifies the encoding of the source code, not the encoding of the input file.
This error tells you that your Perl installation is incomplete. I have no idea how you installed Perl, but you should probably consider installing it properly.
Thank you @Nickolay V. Shmyrev
"This has no relation to your problem" ... so what about my problem? Is it specific to Arabic, or does it affect any other language? How do I get the align file in Arabic? I've tried many ways but didn't find a solution.
I wrote to you that the problem is related to your Perl installation; it is not related to Arabic or any other language.
I installed Cygwin and I installed Perl.
When I run "perl --version" in the Cygwin terminal, I get this (is it right or not?):
$ perl --version
This is perl 5, version 14, subversion 4 (v5.14.4) built for cygwin-thread-multi
(with 14 registered patches, see perl -V for more detail)
Copyright 1987-2013, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
OK, share your current word_align.pl file.
Thank you.
This is word_align.pl.
The error you quoted corresponds to the case where you changed utf-8 to utf-16. Are you sure you get the cat_decode error with the original file?
You can also try updating to the latest sphinxtrain from GitHub; it has a different word_align.pl with compatibility fixes.
I got the error only when I changed utf8 to utf16 or ANSI. I tried that change to get the sentences in the align file in Arabic.
You should not change anything in the script to align files in Arabic.
Yes, okay. Nickolay, do you have any idea why I get the align file in this form (see the attached align file), or how to get the align file in Arabic? I also tried opening the file in Notepad++ and changing the encoding, but didn't find a solution.
To get help on this issue, you need to provide the hypothesis and reference files.
You also need to try with the latest sphinxtrain from GitHub.
@Nickolay V. Shmyrev how do I get the hypothesis and reference files? Are they the same as the match file in the result folder?
Also, I tried the latest sphinxtrain from GitHub, but I got this align file (attached).
The sentence is in this form: ( \xe1\xf3\xc7 \xca\xf5\xd4\xf3\xc7\xe6\xf6\xd1\xf5\xe6\xc7 )
Last edit: safia hammad 2015-11-09
Share result/match or the test transcription file.
This is the match file and the test transcription file.
Our tools expect input in UTF-8 encoding; your files are encoded in cp1256. You need to use UTF-8 encoding for all your files.
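For anyone hitting the same symptom, here is a minimal Python sketch of the re-encoding step suggested above. It is not part of sphinxtrain: the helper names `to_utf8` and `convert_file` are hypothetical, and the sketch assumes the files really are cp1256, as diagnosed in this thread. Bytes that already decode as UTF-8 are left untouched, which is only a heuristic.

```python
from pathlib import Path


def to_utf8(raw: bytes, source_encoding: str = "cp1256") -> bytes:
    """Return raw re-encoded as UTF-8.

    Bytes that are already valid UTF-8 (including plain ASCII) pass through
    unchanged. Anything else is assumed to be in source_encoding.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        return raw.decode(source_encoding).encode("utf-8")


def convert_file(path: str, source_encoding: str = "cp1256") -> None:
    """Rewrite one file in place as UTF-8 (hypothetical helper)."""
    p = Path(path)
    p.write_bytes(to_utf8(p.read_bytes(), source_encoding))


# First word of the escaped sentence quoted earlier in this thread.
sample = b"\xe1\xf3\xc7"
converted = to_utf8(sample)
print(converted)  # the same word, now as UTF-8 bytes
```

Something like `convert_file` could then be run on each text file used for training and decoding; the exact paths depend on your own setup. Keep a backup first, since the file is rewritten in place.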