Continuous: nonstop talking OK, silence not

Halle
2010-04-03
2012-09-22
<< < 1 2 (Page 2 of 2)
  • Nickolay V. Shmyrev

    So is there any issue now, or is everything OK?

     
  • Halle

    Halle - 2010-04-06

    No, it still never gets to "Listening"; sorry I didn't mention that. Here is
    what happens: it does about 250 loops of calibration, each of which shows
    the following values throughout ad_read:

    In ad_read, length before read (max * 2) is 512
    In ad_read, length after read (actual bytes read) 512
    In ad_read, length on return (half bytes read) is 256

    Which looks reasonable. Then, it prints "READY..." and if I am talking at the
    time that "READY..." prints, I get "Listening..." (which never returns a hyp)
    and the following values in and out of ad_read:

    In ad_read, length before read (max * 2) is 8192
    In ad_read, length after read (actual bytes read) 8192
    In ad_read, length on return (half bytes read) is 4096

    If I am not talking when "READY..." prints, I never get "Listening..." and
    these are the kinds of values that are going in and out of ad_read (k is just
    me monitoring what is being returned from cont_ad_read):

    k is 0

    2010-04-06 13:28:38.617 Continuous In cont_ad_read_internal, about to do the
    first ad_read
    In ad_read, length before read (max * 2) is 54472
    In ad_read, length after read (actual bytes read) 54472
    In ad_read, length on return (half bytes read) is 27236

    2010-04-06 13:28:38.618 Continuous In cont_ad_read_internal, about to do the
    second ad_read
    In ad_read, length before read (max * 2) is 63488
    In ad_read, length after read (actual bytes read) 63488
    In ad_read, length on return (half bytes read) is 31744

    k is 0

    2010-04-06 13:28:38.720 Continuous In cont_ad_read_internal, about to do the
    first ad_read
    In ad_read, length before read (max * 2) is 67584
    In ad_read, length after read (actual bytes read) 67584
    In ad_read, length on return (half bytes read) is 33792

    2010-04-06 13:28:38.721 Continuous In cont_ad_read_internal, about to do the
    second ad_read
    In ad_read, length before read (max * 2) is 50688
    In ad_read, length after read (actual bytes read) 50688
    In ad_read, length on return (half bytes read) is 25344

    k is 0

    2010-04-06 13:28:38.822 Continuous In cont_ad_read_internal, about to do the
    first ad_read
    In ad_read, length before read (max * 2) is 80384
    In ad_read, length after read (actual bytes read) 80384
    In ad_read, length on return (half bytes read) is 40192

    2010-04-06 13:28:38.823 Continuous In cont_ad_read_internal, about to do the
    second ad_read
    In ad_read, length before read (max * 2) is 37888
    In ad_read, length after read (actual bytes read) 37888
    In ad_read, length on return (half bytes read) is 18944
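
    For reference, the three numbers in each log group above are related by a
    factor of two because ad_read counts 16-bit samples while the underlying
    read counts bytes. A minimal sketch of that conversion, assuming 16-bit
    mono samples; read_bytes and sample_read are hypothetical names standing
    in for the real file read and are not part of sphinxbase or Core Audio:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical byte source standing in for the real audio file read.
   Copies up to `want` bytes and returns how many were delivered. */
static size_t read_bytes(const uint8_t *src, size_t avail,
                         uint8_t *dst, size_t want)
{
    size_t n = want < avail ? want : avail;
    memcpy(dst, src, n);
    return n;
}

/* ad_read-style wrapper: `max` is a count of 16-bit samples, so the byte
   request is max * 2 and the return value is bytes read / 2. */
static int32_t sample_read(const uint8_t *src, size_t avail,
                           int16_t *buf, int32_t max)
{
    assert(max >= 0);
    size_t want = (size_t)max * sizeof(int16_t);      /* length before read */
    size_t got = read_bytes(src, avail, (uint8_t *)buf, want); /* after read */
    return (int32_t)(got / sizeof(int16_t));          /* length on return */
}
```

    With max = 256 this asks for 512 bytes and returns 256 samples when the
    source has enough data, which matches the calibration lines in the log.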

     
  • Halle

    Halle - 2010-04-06

    An interesting thing is that if I comment out cont_ad_calib() from my
    utterance loop, the results are exactly the same.

     
  • Halle

    Halle - 2010-04-06

    OK, I have some improvement, and a little clue about where the issue might
    lie. For simplicity's sake I've switched over to a function that reads packets
    instead of bytes so I have a 1:1 relationship between what is going into
    ad_read and what is supposed to go out (I think - correct me if I'm wrong
    about that). As expected, this didn't change any results. I also noticed that
    my ad_read had gotten too simple because it wasn't returning correct values
    when there was an EOF outcome during reads, so I fixed that. This got things
    working again, to the extent that I can always get "Stopped listening,
    please wait..." followed by a result after the very first utterance as long as
    I am talking while continuous starts up. This is what is in my ad_read now:

    UInt32 length = max;  // in: packets requested, out: packets actually read
    UInt32 numBytes;      // out: bytes actually read
    OSStatus status = AudioFileReadPackets(r->recorder->GetAudioFileID(),
                                           false,      // don't use the cache
                                           &numBytes,
                                           NULL,       // no packet descriptions
                                           0,          // starting packet
                                           &length,
                                           buf);

    if (status == -39 && r->recording == 0) {
        // status -39 is EOF, in this case while not recording, which
        // shouldn't be happening
        return -1;
    } else if (status != 0) {
        // status 0 is success; other possibilities are an EOF, a parameter
        // error or something else
        if (status == -39 && r->recording == 1) { // must be ==, not =
            // EOF while still recording: return what was read.
            // length is unsigned, so it can never be negative.
            return length;
        } else if (status == -50) { // bad parameter, this isn't really happening
            return -1;
        } else { // an unknown error, this hasn't happened to date
            printf("unknown error is %d", (int)status);
            return -1;
        }
    } else {
        return length;
    }

    The next thing I tried was changing the number of buffers my audio file uses.
    I have been using between one and three buffers of a half second in duration
    through most of this testing. I changed it to 16 just for the purpose of
    testing. So now what happens is that if I'm speaking when continuous starts, I
    can speak with silences of as much as a couple of seconds, and recognition is
    pretty good. If I have silences of longer than that, it will get into the loop
    where it can no longer recognize any speech. So, maybe silence in the middle
    of the buffer file is OK, but silence at the beginning or end is causing
    breakage.

     
  • Halle

    Halle - 2010-04-06

    OK, I do think this is about the construction of my driver, I'm going to work
    on it some more and see where I get. Thanks Nickolay!

     
  • Halle

    Halle - 2010-04-06

    OK, all good now - it was the starting packet offset; it needs to keep moving
    forward the amount that has been read until there's an utterance.
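
    A rough sketch of the idea, not the actual driver: the reader keeps its
    own starting packet and advances it by however many packets each read
    delivered, so the next call does not start from packet 0 again.
    Everything here (packet_source_t, read_packets, offset_read, one 16-bit
    sample per packet) is a hypothetical stand-in for the Core Audio file
    read, and what should happen to the offset once an utterance is detected
    isn't covered here:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the growing audio file:
   one 16-bit sample per packet. */
typedef struct {
    const int16_t *packets;
    uint32_t total;        /* packets written so far */
    int64_t next_packet;   /* starting packet for the next read */
} packet_source_t;

/* Mimics the shape of a packet read: start at `start`, ask for *count
   packets, and report in *count how many were actually delivered. */
static void read_packets(const packet_source_t *s, int64_t start,
                         uint32_t *count, int16_t *out)
{
    uint32_t avail = 0;
    if (start >= 0 && start < (int64_t)s->total)
        avail = (uint32_t)((int64_t)s->total - start);
    if (*count > avail)
        *count = avail;
    if (*count > 0)
        memcpy(out, s->packets + start, *count * sizeof(int16_t));
}

/* ad_read-style call that remembers where the previous read stopped. */
static int32_t offset_read(packet_source_t *s, int16_t *buf, int32_t max)
{
    assert(max >= 0);
    uint32_t count = (uint32_t)max;
    read_packets(s, s->next_packet, &count, buf);
    s->next_packet += count;   /* move forward by the amount actually read */
    return (int32_t)count;
}
```

    Successive calls then return consecutive stretches of audio, and a call
    made when nothing new has been written returns 0 instead of old data.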

     
  • Halle

    Halle - 2010-04-06

    BTW, really appreciate the help getting my two other mistakes fixed, Nickolay
    -- I don't think I would have figured out what was wrong with the starting
    packet if the other causes of weirdness hadn't been fixed first.

     
  • Nickolay V. Shmyrev

    Nice that it's working now. Let's hope such code will land in sphinxbase trunk one
    day.

     
  • Halle

    Halle - 2010-04-07

    Sure thing, once I've gotten my projects out and had some time to standardize
    it a bit.

     
  • Tom Raic

    Tom Raic - 2010-04-27

    Hey, do you think you guys can send me the A/D implementation for CoreAudio?
    Do I only have to re-write that one ad_read function? Trying to port this into
    an iPhone library. It compiles, but obviously I don't have access to the audio
    input devices.

    Any help would be appreciated. Thanks.

    Tom Raic
    tom@whistlebox.com

     