#2378 ZRTP crashes

Status: Fixed
Owner: nobody
Labels: None
Priority: Medium
Type: Defect
Updated: 2013-07-18
Created: 2013-05-28
Creator: Anonymous
Private: No

Originally created by: privus...@gmail.com

What steps will reproduce the problem?
1. Make a ZRTP call between 2 CSipSimple clients running latest nightly build
2. After some random time (from a few seconds to a few minutes - at most) ZRTP crashes and the call hangs up. CS registers again immediately on the PBX.
3. Sometimes I get the infamous PJ_EEOF error too at seemingly random times. This could be related.

What is the expected output? What do you see instead?
I expect the call to continue until I hang it up, not suddenly dropping the call and unregistering then re-registering CS on the PBX.

What version of the product are you using? On what device / operating
system?

Samsung Galaxy S3 LTE (i9305) running LiquidSmooth JB 2.4 Official (Android 4.2.2) with CS latest nightly build (2239). I've also tested with earlier CS builds and the problem persists.

Please provide any additional information below.

Like I said above, I also have the PJ_EEOF error happen randomly, so this ZRTP crash could also be related to that.

I see in the logs tons of "zrtp_android.c  ZRTP warning message: Dropping packet because SRTP replay check failed!" messages until the call eventually drops.
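For context on the warning quoted above: SRTP replay protection (RFC 3711) keeps a sliding window of packet indices that have already been accepted and drops any packet whose index was seen before or has fallen behind the window. The following is a minimal sketch of that idea only. It is not the ZRTPCPP/ZRTP4PJ code; the class name, method name, and set-based bookkeeping are illustrative assumptions.

```python
class ReplayWindow:
    """Illustrative sliding-window replay check (hypothetical names)."""

    def __init__(self, size=64):
        self.size = size      # window width in packets
        self.highest = -1     # highest packet index accepted so far
        self.seen = set()     # accepted indices still inside the window

    def accept(self, index):
        """Return True if the packet is new, False if it must be dropped."""
        if index > self.highest:
            # Newer than anything seen: slide the window forward.
            self.highest = index
            self.seen.add(index)
            low = self.highest - self.size
            self.seen = {i for i in self.seen if i > low}
            return True
        if index <= self.highest - self.size:
            return False      # too old, behind the window
        if index in self.seen:
            return False      # duplicate, treated as a replay
        self.seen.add(index)  # late but still inside the window
        return True
```

Under this scheme, a packet that reaches the check a second time is rejected exactly like a replayed one, so repeated warnings of this kind do not by themselves indicate an attack.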

It's not a network issue as the call works well until the crash and both clients are on the same LAN with a good connection.

This may or may not be due to running a custom ROM (which is very stable) and/or Android 4.2.2, as evidenced in issues 2316 and 2280 (which is now closed?).

I'm running Freeswitch (very recent build) in bypass media mode so FS shouldn't even touch the RTP, so I think that is not the problem.

Here are the logs. http://pastebin.com/bgQiiuaU

Related

Tickets: #2280
Tickets: #2399

Discussion

  • Anonymous

    Anonymous - 2013-07-02

    Originally posted by: r3gis...@gmail.com

    It could be interesting to try the following :

    1) Uninstall video plugin.
    2) Follow the step from comment #21:
    ---
    switch to expert mode (https://code.google.com/p/csipsimple/wiki/ExpertSettingMode?wl=en#General_settings) and go in settings > media > Threads count for media. If you see 2 here (usually means you have a dual core CPU on your phone), change the value to 1.
    ---
    3) follow steps of https://code.google.com/p/csipsimple/wiki/HowToCollectLogs to collect logs and send if the crash can be reproduced once media thread count has been changed to 1.

    Thanks !

     
  • Anonymous

    Anonymous - 2013-07-02

    Originally posted by: q...@mt2014.com

    I will email it to your gmail as a txt

     
  • Anonymous

    Anonymous - 2013-07-02

    Originally posted by: q...@mt2014.com

    Logs enabled on both devices, 2 changed to 1 on both, ZRTP connection online... lets.. wait....

     
  • Anonymous

    Anonymous - 2013-07-02

    Originally posted by: q...@mt2014.com

    Ok, it works. It doesn't crash, but I have not tested voice quality yet. So the change above helps. A proper fix still requires engineering, of course, but at least we have narrowed the problem down to a thread race: the threads are probably not synchronized and access shared resources in the wrong order.

     
  • Anonymous

    Anonymous - 2013-07-03

    Originally posted by: privus...@gmail.com

    I can also confirm that setting the media threads to 1 (default was 2)
    seems to have resolved the ZRTP crash on my quad-core i9305 (S3 LTE).
    I tested making calls through my FS server to a Groundwire client in both
    wifi and 3G, and the call seemed OK.

     
  • Anonymous

    Anonymous - 2013-07-03

    Originally posted by: q...@mt2014.com

    Yes, I tested 1.00.00-trunk today for some time and it looks good. It must be some kind of software race between threads. I think this issue will be solved in a few days, but there is an additional problem: the sound quality.

    1. Noise is very loud and is amplified rather than reduced or suppressed by some algorithm
    2. There is something wrong with the microphone gain
    3. The acoustic echo canceler mostly doesn’t work

    Acoustic echo can slip erratically through the microphone-side software because of problems with data synchronization between threads and with thread execution times. This matters a lot: when the caller, Alice, speaks, Bob’s speaker reproduces her voice. That voice must be recorded within a 20 ms to 100 ms window and stored in Bob’s memory for microphone processing. The sound reaches Bob’s microphone through the air, inside the plastic enclosure, or bounced off a wall or ceiling, so it is delayed according to the distance and the speed of sound. Inside the plastic enclosure the delay can be a millisecond, but in a bigger room it can be 50 ms or more.
    Then the software must subtract the stored samples that were played on Bob’s speaker from the samples from his microphone. Otherwise Alice would hear herself plus Bob’s voice.
    If the software runs inside a thread, or worse in multiple threads, then the execution time and the mutual synchronization are unknown. That can cause total failure of echo and noise cancellation.

    All these software processes must be very precisely clocked and calculated.
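
The delay figures in the comment above follow from the echo path length divided by the speed of sound. A small sketch, assuming 343 m/s (dry air at about 20 °C) and a 16 kHz sampling rate; the function names are illustrative.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 °C


def echo_delay_ms(path_m):
    """Acoustic delay in milliseconds for an echo path of path_m metres."""
    return path_m / SPEED_OF_SOUND_M_S * 1000.0


def delay_in_samples(path_m, sample_rate_hz=16000):
    """The same delay expressed as a whole number of audio samples."""
    return round(path_m / SPEED_OF_SOUND_M_S * sample_rate_hz)
```

For example, a speaker-to-wall-to-microphone path of about 17 m gives a delay near 50 ms, which is roughly 800 samples at 16 kHz.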

     
  • Anonymous

    Anonymous - 2013-07-03

    Originally posted by: q...@mt2014.com

    I don’t know if I can help the author but I have some thoughts about echo processing.

    Imagine you have a chip with ADC and DAC inside. Integrated circuit.
    I will write pseudocode:
    Imagine sampling rate 16kHz 16bit raw data to and from this integrated circuit.

    // i counts the samples collected in the current 20 ms frame
    i = 0;

    playback() {
      DAC = Pbuffer[i];        // play one sample
    }

    recording() {
      Rbuffer[i] = ADC;        // record one sample
    }

    cancel_echo() {
      // for every sample k of the frame
      output_buffer[k] = Rbuffer[k] - Pbuffer[k];
    }

    So first you need to set up a timer at a frequency of 16 kHz, and you must collect 20 ms of data samples, which is 320 samples (20 ms for the ear speaker).

    on16kHzTick() {
      if (i < 320) {
        playback();
        recording();
        i++;
      }
      if (i == 320) {
        cancel_echo();
        i = 0;
      }
    }

    And it must not be divided into 2 threads, because thread execution time is unknown and there is no benefit from it in general.
    It should be done "on the timer thread".
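
The tick-driven loop described in this comment can be written as a runnable sketch. It assumes the two sample streams are already available as equal-rate sequences, and it keeps the comment's naive per-sample subtraction, which is not how a production echo canceller works. The function name and the frame constant are illustrative.

```python
FRAME = 320  # 20 ms of samples at 16 kHz


def process_stream(playback, mic, frame=FRAME):
    """Collect playback and microphone samples in lockstep.

    Each loop iteration stands for one timer tick: one sample played,
    one sample recorded. After every full frame the stored playback
    samples are subtracted from the recorded ones.
    """
    out_frames = []
    p_buf, r_buf = [], []
    for p, r in zip(playback, mic):
        p_buf.append(p)
        r_buf.append(r)
        if len(p_buf) == frame:
            out_frames.append([rv - pv for rv, pv in zip(r_buf, p_buf)])
            p_buf, r_buf = [], []  # start the next frame
    return out_frames  # an incomplete trailing frame is not emitted
```

Because both buffers are filled in the same loop, sample k of the playback frame always lines up with sample k of the recorded frame, which is the property the comment argues would be lost if the two sides ran on separate threads.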

     
  • Anonymous

    Anonymous - 2013-07-03

    Originally posted by: q...@mt2014.com

    Of course echo cancellation is not as simple as above :-) But the idea is good. Low-level echo processing must be done on a separate thread or on the thread of the low-level timer running at the desired sampling rate. Buffers are of course doubled, for buffer flipping between 20 ms chunks of data. In general, echo processing must not be interrupted, delayed, or divided into more threads. Echo cancellation must be conducted on raw voice samples. If somebody uses the wrong data or the wrong buffers for echo processing, or there is jitter between them, then it fails and the echo canceler doesn’t work.

     
  • Anonymous

    Anonymous - 2013-07-03

    Originally posted by: q...@mt2014.com

    ADC samples must match DAC samples. They must be collected at exactly the same time: for each 16-bit sample recorded, one sample must be played back. So 1 sample in and 1 sample out at each tick of the timer. Of course the chip, the integrated circuit, may have a buffer for some samples, so you can reduce the number of ticks of this 16 kHz timer. If the ADC/DAC processor has a buffer for 16 samples in each direction, then you set the timer to 1 kHz.
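
The numbers in this comment and in the earlier pseudocode come from two small calculations, sketched below. The function names are illustrative.

```python
def samples_per_frame(sample_rate_hz, frame_ms):
    """Number of samples in one frame of frame_ms milliseconds."""
    return sample_rate_hz * frame_ms // 1000


def timer_rate_hz(sample_rate_hz, hw_buffer_samples):
    """Tick rate needed when the chip buffers hw_buffer_samples per direction."""
    return sample_rate_hz / hw_buffer_samples
```

At 16 kHz, a 20 ms frame holds 320 samples, and a 16-sample hardware buffer lowers the required tick rate from 16 kHz to 1 kHz.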

     
  • Anonymous

    Anonymous - 2013-07-03

    Originally posted by: q...@mt2014.com

    But if you switch to hands free mode then the sample buffer for the echo processing must be larger. Sometimes 100ms is not enough. Everything depends on the algorithm for the echo cancellation. But the reproduced sound must start in one buffer from its beginning while microphone samples will include this played back sound plus the acoustic delay.

    So the speaker reproduces AAA, and the microphone buffer contains the same AAA, but delayed:
    |--AAA-------------------|  from speaker
    |-------AAA--------------|  bounced from wall to the microphone
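
One common way to find the shift between two such buffers is to slide one over the other and keep the lag with the highest correlation. The sketch below is a generic illustration of that technique, not code from CSipSimple or pjsip. The function name and the 0/1 sample values that mimic the diagram are assumptions.

```python
def estimate_delay(speaker, mic, max_lag):
    """Return the lag (in samples) at which mic best matches speaker."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate speaker[i] with the microphone sample lag ticks later.
        score = sum(
            s * mic[i + lag]
            for i, s in enumerate(speaker)
            if i + lag < len(mic)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# The two 24-sample buffers from the diagram, with "A" as 1 and "-" as 0.
speaker = [0, 0, 1, 1, 1] + [0] * 19
mic = [0] * 7 + [1, 1, 1] + [0] * 14
print(estimate_delay(speaker, mic, max_lag=10))  # 5
```

Real echo cancellers typically track this delay adaptively rather than with a single brute-force search, but the sketch shows why the reference buffer has to cover the whole acoustic delay.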

     
  • Anonymous

    Anonymous - 2013-07-03

    Originally posted by: r3gis...@gmail.com

    * The fix has just been pushed in [r2269], so tonight's automatic build will include it.

    @qtqtqt.. :
    * To see the changelog by version, have a look at the "source" tab > "changes" sub-tab of this website.
    * Thanks for your thoughts about echo processing. However, it's not something managed inside the CSipSimple project. CSipSimple builds on various other open-source components and mainly focuses on the Android side.
    For SIP, the "pjsip" library is used. You can join the pjsip mailing list to dig deeper into the technical parts of this SIP/media stack.
    For ZRTP, it's ZRTPCPP/ZRTP4PJ, as you understood in this thread.
    For echo cancellation, there are various options in the CSipSimple settings that you can select:
       - Simple (a simple implementation from the pjsip developers);
       - Speex (the implementation that comes with the Speex codec, which has a built-in echo canceller);
       - WebRTC (from the WebRTC project, the killer media stack from Google that will be shipped in all web browsers soon, and that is known to be a very advanced echo canceller with a good quality/CPU ratio).
    There are also various other open-source components, but I'll let you read the wiki about those.
    If you have problems with echo cancellation, have a look at issue 119. It has various tips that could be worth trying to find the settings that best fit your device's audio layer constraints.
    If you'd like to contribute an echo canceller implementation, you are welcome. There is a pretty simple interface to implement to be considered an echo canceller for the pjsip/pjmedia stack (that's what I did to integrate the WebRTC echo canceller, and it's not so hard even though I never studied echo cancellers ;) )

     

    Related

    Commit: [r2269]
    Tickets: #119

  • Anonymous

    Anonymous - 2013-07-03

    Originally posted by: q...@mt2014.com

    Yes, but think about what I am saying :-) I can’t analyze / hack your sources. The project is too big for me as a beginner :-) It takes time. But if you mix up the in and out buffers for raw samples, and they are not synchronized bit for bit, then the echo cancellation will not work at all.
    The second buffer is for collecting normal speech plus the bounced, DELAYED echo. The two can’t be processed in separate threads.
    I am not so deep into smartphone hardware because I work on something else, but maybe some ADC/DAC chips already have echo canceling in hardware? Maybe it is enough to enable it and forget about the problem in software?

     
  • Anonymous

    Anonymous - 2013-07-03

    Originally posted by: q...@mt2014.com

    >> or echo cancellation there is various options
    >> in csipsimple settings that you can select...

    OK, OK, OK but you can’t process echo even by speex EC if you don’t do it near the analog to digital converter, near the hardware chip. Do you understand me? If you think that after some thread processing and delays caused by thread execution you can still process echo then you are wrong :-)
    Port of EC to Android must be done in a clever way.

     
  • Anonymous

    Anonymous - 2013-07-03

    Originally posted by: q...@mt2014.com

    Which file in the project contains the sampling and echo canceling code? Please show me the link to the source file. I will analyze it. The project is too big for me to judge where it is located.

    Codecs are another story. They can be applied later, but echo canceling cannot. Echo must be processed immediately as it happens. You play back samples, so you make a copy of them at the same time as you collect samples from the microphone. Once you have collected 320 samples, or more if you need, you call cancel_echo();
    If you fail to do it right, then you have echo.

     
  • Anonymous

    Anonymous - 2013-07-15

    Originally posted by: q...@mt2014.com

    Hi,

    What is the status of the "media threads count" bug in new releases ?
    Should we use 1 or 2 ?

     
  • Anonymous

    Anonymous - 2013-07-15

    Originally posted by: r3gis...@gmail.com

    Normally it can be set to 2 safely now.

     
  • Anonymous

    Anonymous - 2013-07-17

    Originally posted by: privus...@gmail.com

    Yes, I can confirm setting it to 2 works well now.

     
  • Anonymous

    Anonymous - 2013-07-18

    Originally posted by: q...@mt2014.com

    I can't confirm full functionality. After setting from 1 back to 2 there are random problems.

    1. Sometimes I can't establish connection
    2. Sometimes "no reason" disconnections
    3. Noise

    It is much better than it was before, but there is still something to fix.

    I went back to 1 since I don't see the need for 2 threads. It works well with 1 on SGS2 and SGSNote2.

     