My problem is: I am trying to build a robust ASR system for product ordering. There are a few thousand product numbers and around ten thousand addresses. Orders follow a constrained pattern like: "I want 10 Nike shoes to be delivered at XXX (address) at XXX (time)." Currently I am trying to compile a language model into an FST for decoding.
I wonder what is the best way to get good ASR performance?
1> I could generate all possible sentences and then train a 3-gram language model on them... but Number_of_products x Number_of_addresses x Number_of_times could be very huge.
2> Modify the backoff weights of "I want", "delivered at", ... (Actually I don't know how to do this, it's just a feeling that this might work.)
3> Is there a fixed grammar (or class-based language model) that can be used in an FST?
Sorry for these rookie questions. Any suggestions will be greatly appreciated.
Best
Orders follow a constrained pattern like: "I want 10 Nike shoes to be delivered at XXX (address) at XXX (time)." Currently I am trying to compile a language model into an FST for decoding.
People never stick to a constrained pattern; if you look at real-life logs, you never know what you will hear there.
I could generate all possible sentences and then train a 3-gram language model on them... but Number_of_products x Number_of_addresses x Number_of_times could be very huge.
You do not need to generate every possible sentence; a reasonable number of examples is enough. Ideally those examples should be distributed according to the true frequency of requests, following already-known search queries. Google does this, for example: they train a language model from web search queries. If you don't have such data, you might want to collect it yourself. For bootstrapping you can use some artificially created data; its quality does not matter much, because it is only for bootstrapping.
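A minimal sketch of this sampling idea (the inventories and template below are made-up placeholders, not real data): instead of enumerating the full cross-product, draw a modest number of sentences from the pattern. Once you have real logs, you would weight the slot choices by observed frequencies instead of sampling uniformly.

```python
import random

# Hypothetical inventories; in practice these come from the real
# product and address lists (a few thousand / ten thousand entries).
products = ["Nike shoes", "Adidas shirts", "Puma socks"]
addresses = ["12 Main St", "34 Oak Ave", "56 Pine Rd"]
times = ["9 am", "noon", "6 pm"]

template = "I want {n} {product} to be delivered at {address} at {time}"

def sample_sentences(k, seed=0):
    """Sample k training sentences instead of enumerating the full
    products x addresses x times cross-product."""
    rng = random.Random(seed)
    return [
        template.format(
            n=rng.randint(1, 20),
            product=rng.choice(products),
            address=rng.choice(addresses),
            time=rng.choice(times),
        )
        for _ in range(k)
    ]

corpus = sample_sentences(1000)
```

The resulting text file can then be fed to any n-gram toolkit to train the 3-gram model.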
Is there a fixed grammar (or class-based language model) that can be used in an FST?
You can convert a class-based language model to an FST, but in our practice a class-based model does not bring much; you can simply train a plain language model.
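For illustration, here is a hedged sketch of the usual alternative to a true class-based model: keep class tags like <PRODUCT> in the template text, expand each tag from its member list, and then train a plain n-gram model on the expanded text. The tag names and member lists are invented for the example.

```python
import random

# Expand class tags in tagged training text so a plain n-gram
# model can be trained on the result.
classes = {
    "<PRODUCT>": ["nike shoes", "adidas shirts"],
    "<ADDRESS>": ["12 main st", "34 oak ave"],
}

def expand(line, rng):
    """Replace every class tag with a sampled member of that class."""
    for tag, members in classes.items():
        while tag in line:
            line = line.replace(tag, rng.choice(members), 1)
    return line

rng = random.Random(0)
tagged = ["i want <PRODUCT> delivered at <ADDRESS>"]
plain = [expand(line, rng) for line in tagged]
```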
Hi Nickolay
Thank you very much for your quick reply.
So I'll:
1> Collect reasonable real queries
2> Add the full list of product names and the addresses
3> Train a language model out of it.
I guess the backoff probabilities will be the key to giving unseen queries reasonable probabilities.
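The backoff idea can be illustrated with a toy "stupid backoff" scorer (a deliberately simplified scheme, not what a production toolkit implements): a seen trigram uses its relative frequency, and an unseen one falls back to bigram and then unigram counts with a fixed penalty.

```python
from collections import Counter

# Toy corpus and count tables for a stupid-backoff trigram scorer.
corpus = "i want nike shoes delivered at main street".split()
uni = Counter(corpus)
bi = Counter(zip(corpus, corpus[1:]))
tri = Counter(zip(corpus, corpus[1:], corpus[2:]))
total = len(corpus)
ALPHA = 0.4  # fixed backoff penalty

def score(w1, w2, w3):
    """Score P(w3 | w1, w2), backing off to lower orders if unseen."""
    if tri[(w1, w2, w3)] > 0:
        return tri[(w1, w2, w3)] / bi[(w1, w2)]
    if bi[(w2, w3)] > 0:
        return ALPHA * bi[(w2, w3)] / uni[w2]
    return ALPHA * ALPHA * uni[w3] / total
```

An unseen history like ("x", "nike", "shoes") still gets a usable score via the bigram ("nike", "shoes"), which is exactly the behavior that rescues queries not covered by the training sentences.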
All the best
Xiaofeng
Yes, but overall discounting is not quite straightforward for queries, so it is better to tune the backoff on some small real-life set. You do not need much; a couple of hundred queries is enough to select the best discounting.
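One way to sketch this tuning loop (toy data, absolute discounting on bigrams; a real toolkit exposes this as a discount parameter): compute held-out perplexity for a few candidate discount values and keep the best.

```python
import math
from collections import Counter

# Toy train/dev split; in practice dev would be a couple of hundred
# real queries, as suggested above.
train = "i want shoes i want socks i want shoes".split()
dev = "i want shoes".split()

uni = Counter(train)
bi = Counter(zip(train, train[1:]))
V = len(uni)
total = len(train)

def prob(w1, w2, d):
    """Absolute-discounted bigram prob, backing off to a smoothed unigram."""
    p_uni = (uni[w2] + 1) / (total + V)  # add-one smoothed unigram
    c1 = uni[w1]
    if c1 == 0:
        return p_uni
    n_types = len([b for b in bi if b[0] == w1])
    lam = d * n_types / c1               # mass reserved for backoff
    return max(bi[(w1, w2)] - d, 0) / c1 + lam * p_uni

def perplexity(data, d):
    logp = sum(math.log(prob(a, b, d)) for a, b in zip(data, data[1:]))
    return math.exp(-logp / (len(data) - 1))

# Grid-search the discount on the held-out set.
best = min([0.1, 0.3, 0.5, 0.7, 0.9], key=lambda d: perplexity(dev, d))
```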
Thank you very much!!