I have a Python TCP socket server running on Ubuntu and a Java client running on Windows. The Java client sends an audio input stream to the Python application. The problem is that I am getting no hypothesis from the decoder. At first I developed everything in Python without a TCP socket: I used PyAudio to get the input stream from my microphone, read the data from the stream, and passed it to the decoder for processing. That worked very well. Now that I receive the input stream from the client and pass the data to the decoder, nothing is recognized at all.
Client
Audio format used for recording:
static final AudioFormat DEFAULT_AUDIO_FORMAT = new AudioFormat(16000f, 16, 1, true, false);
public static void main(String[] args) {
    try {
        AudioRecorder ar = new AudioRecorder();
        ar.setAudioSystem(new ASWrapper());
        AudioInputStream is = ar.startRecording();
        Socket socket = new Socket("192.168.98.37", 9876);
        OutputStream os = socket.getOutputStream();
        byte buffer[] = new byte[512];
        while (true) {
            int read = is.read(buffer);
            ByteBuffer byteBuffer = ByteBuffer.wrap(buffer, 0, read);
            byteBuffer.order(ByteOrder.LITTLE_ENDIAN);
            os.write(byteBuffer.array(), 0, read);
            os.flush();
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Server
import socket
import pyaudio
import wave
import time
import os
from pocketsphinx.pocketsphinx import *
from sphinxbase.sphinxbase import *

CHUNK = 4096
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
WAVE_OUTPUT_FILENAME = "server_output.wav"
WIDTH = 2
frames = []

p = pyaudio.PyAudio()
stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                output=True,
                input=True,
                frames_per_buffer=CHUNK)

HOST = ''    # Symbolic name meaning all available interfaces
PORT = 9876  # Arbitrary non-privileged port

modeldir = "./"
datadir = "../../../test/data"

# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'de-de/de-de'))
config.set_string('-dict', os.path.join(modeldir, 'de-de/train.dic'))
config.set_string('-kws', os.path.join(modeldir, 'de-de/kws.txt'))
#config.set_string('-lm', os.path.join(modeldir, 'de-de/train.lm'))
#config.set_string('-logfn', os.path.join(modeldir, '/dev/null'))

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(1)
conn, addr = s.accept()

decoder = Decoder(config)
print('Connected by', addr)
data = conn.recv(CHUNK)

while data != '':
    # just to hear the sound quality
    stream.write(data)
    decoder.start_utt()
    decoder.process_raw(data, False, False)
    hypothesis = decoder.hyp()
    if hypothesis != None:
        print('Best hypothesis: ', time.time(), hypothesis.hypstr,
              " model score: ", hypothesis.best_score,
              " confidence: ", hypothesis.prob)
    decoder.end_utt()
    data = conn.recv(CHUNK)

stream.stop_stream()
stream.close()
p.terminate()
conn.close()
Last edit: unrated 2018-09-11
Make sure you transfer the data properly. Extra zeros and a switched byte order are very frequent mistakes. You can dump the stream data to a file and listen to it.
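One way to do that check on the server side is sketched below, using only the Python standard library. The helper names `dump_to_wav` and `mean_abs_amplitude` are made up for this example, and the constants assume the format from the client (16 kHz, 16-bit, mono, signed, little-endian). The first helper writes the received chunks to a WAV file you can play back. The second is only a rough heuristic: speech at a normal level usually has a much larger average amplitude when its bytes are read in the wrong order, so comparing both interpretations can hint at a byte order switch.

```python
import struct
import wave

SAMPLE_RATE = 16000  # assumed: matches the client's AudioFormat
SAMPLE_WIDTH = 2     # assumed: 16-bit samples
CHANNELS = 1         # assumed: mono


def dump_to_wav(chunks, path):
    """Write raw PCM chunks, as returned by conn.recv(), to a WAV file.

    Returns the number of 16-bit samples written.
    """
    raw = b"".join(chunks)
    # recv() can return an odd number of bytes; keep whole samples only.
    raw = raw[: len(raw) - (len(raw) % SAMPLE_WIDTH)]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(raw)
    return len(raw) // SAMPLE_WIDTH


def mean_abs_amplitude(raw, little_endian=True):
    """Average absolute sample value for the given byte-order interpretation."""
    count = len(raw) // SAMPLE_WIDTH
    if count == 0:
        return 0.0
    fmt = ("<" if little_endian else ">") + str(count) + "h"
    samples = struct.unpack(fmt, raw[: count * SAMPLE_WIDTH])
    return sum(abs(s) for s in samples) / count
```

In the server loop you would append each `data` chunk to a list and call `dump_to_wav(received_chunks, "server_output.wav")` after the client disconnects. If the file sounds like noise, or `mean_abs_amplitude(raw, little_endian=False)` comes out much smaller than the little-endian value, the byte order is the first thing to look at.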