Hello, I am trying to modify pocketsphinx so that it answers requests containing chunks of audio data, for a home automation project that I'm creating in C#. I want to use multiple microphones, run voice activity detection on each stream to cut out audio chunks, and send those chunks to PocketSphinx after a little noise reduction.
I use NAudio to collect audio data, which arrives as a C# byte[], so I've got an array of values 0-255. Then I transfer it via a NamedPipe to the PocketSphinx process. The problem is that those values do not match the values in the int16 adbuf[] array when standard sound sources are used. What should I do? Is NAudio's standard microphone capture format valid for Sphinx at all, or am I doing something wrong?
Last edit: Michael Wityk 2017-06-08
> ... after a little noise reduction.
Noise reduction usually reduces accuracy because it corrupts the speech signal.
> I use NAudio to collect audio data, which arrives as a C# byte[], so I've got an array of values 0-255. Then I transfer it via a NamedPipe to the PocketSphinx process.
We actually have C# bindings working:
https://github.com/cmusphinx/pocketsphinx/tree/master/swig/csharp
Figure out the capture format and convert it to a format supported by CMUSphinx. CMUSphinx expects mono, 16 kHz, 16-bit PCM data.
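To illustrate why a byte[] of values 0-255 will not match int16 adbuf[] directly, here is a minimal sketch of the conversion on the receiving (C) side. It assumes the capture is already 16-bit little-endian PCM, which you should verify from the NAudio WaveFormat in use. The function names bytes_to_int16_le and stereo_to_mono are hypothetical helpers, not part of PocketSphinx or NAudio. Resampling to 16 kHz is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reassemble 16-bit little-endian PCM bytes into
 * signed samples. Each sample is two bytes (low byte first), so a byte
 * stream holds n_bytes / 2 samples; a trailing odd byte is ignored.
 * Returns the number of samples written to out. */
size_t bytes_to_int16_le(const uint8_t *in, size_t n_bytes,
                         int16_t *out, size_t max_samples)
{
    size_t n = n_bytes / 2;
    if (n > max_samples)
        n = max_samples;
    for (size_t i = 0; i < n; i++) {
        /* Combine low and high byte into an unsigned 16-bit value. */
        unsigned u = (unsigned)in[2 * i] | ((unsigned)in[2 * i + 1] << 8);
        /* Map 0x8000..0xFFFF to the negative range explicitly. */
        out[i] = (int16_t)(u >= 0x8000u ? (int)u - 0x10000 : (int)u);
    }
    return n;
}

/* Hypothetical helper: average interleaved stereo frames (L, R, L, R, ...)
 * into mono. out must have room for n_frames samples. */
size_t stereo_to_mono(const int16_t *in, size_t n_frames, int16_t *out)
{
    for (size_t i = 0; i < n_frames; i++)
        out[i] = (int16_t)(((int)in[2 * i] + (int)in[2 * i + 1]) / 2);
    return n_frames;
}
```

If the samples come out right after this step but recognition is still poor, the sample rate is the next thing to check, since a capture at a rate other than 16 kHz would need resampling before decoding.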
Thank you for your response,
I've been trying to get the C# bindings working all day on Windows, but I get this exception: "Additional information: Unable to find an entry point named 'CSharp_Pocketsphinx_Decoder_DefaultConfig' in DLL 'pocketsphinxwrap'". I compiled pocketsphinxwrap.c and sphinxbasewrap.c into .dll libraries, and all the .cs files into another .dll library that uses those two, but I can't get it working. Do you have any tips or instructions for deploying on Windows?