CMU Sphinx / Forums / Help: g2p-seq2seq command raises error -model.params not exists.

Cindy Chang - 2019-02-21

I downloaded the Mandarin model from sourceforge.net on 2019-02-20 and ran:

g2p-seq2seq --decode wordlist_date.txt --model_dir /home/cindy951357/PycharmProjects/pocketsphinx/zh_cn.cd_cont_5000/

Download link

Full errors:

Traceback (most recent call last):
File "/usr/local/bin/g2p-seq2seq", line 11, in <module>
load_entry_point('g2p-seq2seq==6.2.2a0', 'console_scripts', 'g2p-seq2seq')()

File "/usr/local/lib/python3.6/dist-packages/g2p_seq2seq-6.2.2a0-py3.6.egg/g2p_seq2seq/app.py", line 116, in main
params.hparams = g2p_trainer_utils.load_params(FLAGS.model_dir)

File "/usr/local/lib/python3.6/dist-packages/g2p_seq2seq-6.2.2a0-py3.6.egg/g2p_seq2seq/g2p_trainer_utils.py", line 260, in load_params
raise Exception("File {} not exists.".format(params_file_path))

Exception: File /home/cindy951357/PycharmProjects/pocketsphinx/zh_cn.cd_cont_5000/model.params not exists.

How can I generate model.params if the official zipped folder doesn't contain it?
Please help, thanks.
- Nickolay V. Shmyrev - 2019-02-21
  
  g2p-seq2seq will not work for Chinese because the pronunciation of Hanzi is not very predictable from the characters. You have to look for a specialized tool for CC-CEDICT instead.
  
  - Cindy Chang - 2019-02-21
    
    Thank you. Following your recommendation, I found this CEDICT project. It only romanizes Hanzi, but I think what I need are "phonemes". How do I generate "phonemes" from romanized Hanzi? Can I give romanized Hanzi like "jintian" to the g2p-seq2seq command?
    
    - Nickolay V. Shmyrev - 2019-02-21
      
      There is a dictionary in zh_cn.cd_cont_5000; you can simply use it. It is derived from CEDICT, so you will not get anything new from the original source. To transcribe new words, you simply concatenate the phonetic transcriptions of the individual characters.
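A minimal sketch of that concatenation step, assuming the dictionary uses the usual CMUSphinx text layout (one entry per line, the word followed by space-separated phones, alternate pronunciations marked as `word(2)`). The sample entries and phone symbols below are hypothetical stand-ins, not copied from the zh_cn.cd_cont_5000 dictionary, and the helper names `load_lexicon` and `transcribe` are made up for illustration.

```python
import re


def load_lexicon(lines):
    """Parse 'word phone phone ...' lines into {word: [phones]}.

    Only the first pronunciation of each word is kept; alternates written
    as 'word(2)' are ignored in this sketch.
    """
    lexicon = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = re.sub(r"\(\d+\)$", "", parts[0])
        lexicon.setdefault(word, parts[1:])
    return lexicon


def transcribe(word, lexicon):
    """Return the phones for a word, concatenating per-character entries
    when the whole word is not in the lexicon."""
    if word in lexicon:
        return list(lexicon[word])
    phones = []
    for char in word:
        if char not in lexicon:
            raise KeyError("no dictionary entry for character: " + char)
        phones.extend(lexicon[char])
    return phones


# Hypothetical sample entries; a real run would read the model's dictionary file.
SAMPLE_LINES = ["今 j in", "天 t ian", "气 q i", "气(2) q i"]
lexicon = load_lexicon(SAMPLE_LINES)
new_word = "今天"
print(new_word, " ".join(transcribe(new_word, lexicon)))
```

With a real dictionary, the lines would come from opening the dictionary file with `encoding="utf-8"` rather than from `SAMPLE_LINES`. Characters with several readings need a manual choice, which this sketch does not attempt.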
      
      If you are interested in converting romanized Hanzi to the phonemes used by the cmusphinx model, you have to write a Python script for that.
      
      g2p-seq2seq is totally irrelevant for Chinese, as I told you above.
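For the romanized route mentioned above, one possible starting point is to split each pinyin syllable into an initial and a final. This is only a sketch under assumptions: it expects syllables already separated by spaces (so "jin tian", not "jintian"), it treats "y" and "w" as initials, and it assumes the model's phone set is the initial/final split shown here. The phones must be checked against the phone list actually used in the zh_cn.cd_cont_5000 dictionary and remapped where they differ. The names `split_syllable` and `pinyin_to_phones` are hypothetical.

```python
# Pinyin initials, two-letter ones first so "zh" is tried before "z".
# Treating "y" and "w" as initials is a simplifying assumption.
INITIALS = [
    "zh", "ch", "sh",
    "b", "p", "m", "f", "d", "t", "n", "l",
    "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w",
]


def split_syllable(syllable):
    """Split one toneless pinyin syllable into [initial, final].

    Syllables without an initial (for example "an") are returned whole.
    """
    for initial in INITIALS:
        if syllable.startswith(initial) and len(syllable) > len(initial):
            return [initial, syllable[len(initial):]]
    return [syllable]


def pinyin_to_phones(text):
    """Convert space-separated pinyin syllables to a flat phone list."""
    phones = []
    for syllable in text.lower().split():
        phones.extend(split_syllable(syllable))
    return phones


print(" ".join(pinyin_to_phones("jin tian")))
```

Unsegmented input such as "jintian" would first need syllable segmentation, which can be ambiguous and is left out here.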
      
