During our most recent data-collection, it was interesting to note that there was no issue with the concurrent call handling, rather the problem was that HALEF was crashing and had to be restarted no less than 4 times. In the period July 4 to 6, there were 22 calls terminating with NULL (no random code) for the following reasons:
no reaction at 3rd audio input (see Ticket https://sourceforge.net/p/halef/tickets/55):
b4ff7da5e8cbd682433efb2d8f02c2e7@141.31.8.62
dfab4df655fbaba66cabf8974dffce56@141.31.8.62
recognizer result is null (see Ticket https://sourceforge.net/p/halef/tickets/60):
2615b3aeaa752ab3de8722d314430e84@141.31.8.62
4bdbdee3ec165b69cb4f59d66384f390@141.31.8.62
presumably user hangup
6bb786dba2e3364d69f10b67ce5376ba@141.31.8.62
6911617fb25fb428b35a88f22ed4aa15@141.31.8.62
no RTP stream received (see Ticket https://sourceforge.net/p/halef/tickets/61):
7ff0cd4ff09c9345d9ac1c74e31f889c@141.31.8.62
36a00b646edf8af3786657281598f2bc@141.31.8.62
7bd36fa2780a89275847a8284cf4d2ba@141.31.8.62
8c760ded627b5792a3df96190bcf92a4@141.31.8.62
21b1a8cab630a51f0c139cf284fdb747@141.31.8.62
802116f89f5d5eb6a665403232e1cddd@141.31.8.62
23936f70d4bf8c5a146e62a5902cec0d@141.31.8.62
bbdafef2e6bc79253304773ff9ecd86e@141.31.8.62
60b2fbeea831e8ccf4e1c31edfccb809@141.31.8.62
146585232b569ac56f96dcf567818026@141.31.8.62
196fd6a6e9298fdca580a92e9d7eb60a@141.31.8.62
1f3ca58420df44e0ff71a545ac198660@141.31.8.62
963efac2494a557615c0bd86d63cd756@141.31.8.62
dce5c3d6f5738ed9fe47a0d26fded6c9@141.31.8.62
6a9778e2997922ef458ca87ea5dde6e9@141.31.8.62
457c5548e958c8b618841fc00b098f2f@141.31.8.62
And these were the five restarts that happened in the aforementioned period:
select * from hl where message like '%VoiceXML interpreter%started%' and databasedate between '20150704' and '20150707';
+---------+---------------------+--------------+------------------------------------------+
| ID | databasedate | machineIP | message |
+---------+---------------------+--------------+------------------------------------------+
| 2893284 | 2015-07-05 01:25:43 | 141.31.8.62 | VoiceXML interpreter 0.7.6.EA started !. |
| 2894590 | 2015-07-05 01:27:47 | 141.31.8.62 | VoiceXML interpreter 0.7.6.EA started !. |
| 2964441 | 2015-07-05 06:18:19 | 141.31.8.62 | VoiceXML interpreter 0.7.6.EA started !. |
| 2980252 | 2015-07-05 17:59:59 | 141.31.8.62 | VoiceXML interpreter 0.7.6.EA started !. |
| 3027186 | 2015-07-06 02:44:42 | 141.31.8.62 | VoiceXML interpreter 0.7.6.EA started !. |
+---------+---------------------+--------------+------------------------------------------+
It is interesting to analyze the effectiveness of these restarts, though:
1/2. The issue "no RTP stream received (see Ticket https://sourceforge.net/p/halef/tickets/61)" happened for the first time at 2015-07-04 20:47:49 (call ID 457c5548e958c8b618841fc00b098f2f@141.31.8.62) and kept Halef in a dysfunctional state until the restart at 2015-07-05 01:25:43 followed by another restart right after at 2015-07-05 01:27:47. The late reaction caused 16 calls into Ext 3333 to fail.
Another restart happened at 2015-07-05 06:18:19 with no apparent need (the last call into Ext 3333 at 2015-07-05 05:20:29, call ID f578a9af0cd15b98772981a6f894be46@141.31.8.62, was good).
The restart at 2015-07-05 17:59:59 followed a single NULL call into Ext 3333 at 2015-07-05 10:14:09 (call ID dfab4df655fbaba66cabf8974dffce56@141.31.8.62) of type "no reaction at 3rd audio input (see Ticket https://sourceforge.net/p/halef/tickets/55)".
The same happened again at 2015-07-06 02:44:42 where after a single NULL call into Ext 3333 at 2015-07-06 01:13:42 (call ID b4ff7da5e8cbd682433efb2d8f02c2e7@141.31.8.62) of type "no reaction at 3rd audio input (see Ticket https://sourceforge.net/p/halef/tickets/55)" was followed by a restart.
In conclusion:
What I observed on listening to calls made to ext. 3333 (interview item during recent crowdsourcing data collection) is that this hangup by the system immediately after playing a prompt occurred 5 times (run-length-encoded), as opposed to about 3 times (run-length-encoded) for the case where the system was still waiting for (or timed-out for lack of) user response. Out of these 5 cases, 4 occurred after the 1st prompt.
Diff:
Diff:
Diff:
Diff:
Diff:
Diff:
btw, David and Patrick, I've been trying to replicate the problems described in bugs #54 and #55, with limited or no success so far
[9:49:37 AM] Vikram Ramanarayanan: but i made an observation; not sure how helpful it might be
[9:50:52 AM] Vikram Ramanarayanan: when you hang up on JVXML abruptly in the middle of a call for some reason, you see something like:
[9:50:54 AM] Vikram Ramanarayanan: [java] 2015-07-13 18:30:15,023 DEBUG client.SpeechClientImpl [EventScannerThread]: Stopped Recognition, MRCP response received:
[java] MRCP/2.0 86 1436805012888 200 COMPLETE
[java] Channel-Identifier:14e610fc01c@speechrecog
[java]
[java]
[java] 2015-07-13 18:30:15,023 DEBUG env.StandardEnvironment [7bc3289c-3875-427e-94f9-98219a3ac3a6]: Adding [systemProperties] PropertySource with lowest search precedence
[java] 2015-07-13 18:30:15,030 DEBUG sip.SipAgent [EventScannerThread]: ------------- SENDING A SIP RESPONSE ---------------
[java] Sending a SIP Response. Status: 200, OK
[java] Content-Length: 0
[java]
[java] To: sip:7704@141.31.8.62:5090;tag=1595306267
[java]
[java] Via: SIP/2.0/UDP 141.31.8.161:5060;branch=z9hG4bK68d67dbd
[java]
[java] CSeq: 103 BYE
[java]
[java] Call-ID: 10e6fa945f58b1fe5117c8a3283130aa@141.31.8.161:5060
[java]
[java]
[9:52:00 AM] Vikram Ramanarayanan: followed by some kind of cleanup that JVXML executes so that you finally see something like this in the log:
[9:52:02 AM] Vikram Ramanarayanan: [java] 2015-07-13 18:30:18,151 INFO interpreter.JVoiceXmlSession [7bc3289c-3875-427e-94f9-98219a3ac3a6]: ...session closed
[java] 2015-07-13 18:30:18,157 ERROR mrcpv2.SipCallManager [7bc3289c-3875-427e-94f9-98219a3ac3a6]: no session given. unable to cleanup session
[java] 2015-07-13 18:30:19,957 DEBUG sip.SipListenerImpl [EventScannerThread]: Got a transaction terminated event
[java] 2015-07-13 18:30:21,142 DEBUG sip.SipListenerImpl [EventScannerThread]: Got a transaction terminated event
[java] 2015-07-13 18:30:22,931 INFO sip.SipListenerImpl [EventScannerThread]: Got a dialog terminated event
[java] 2015-07-13 18:30:22,948 INFO sip.SipListenerImpl [EventScannerThread]: Got a dialog terminated event
[java] 2015-07-13 18:30:46,944 DEBUG sip.SipListenerImpl [EventScannerThread]: Transaction Time out
[java] 2015-07-13 18:30:46,950 DEBUG sip.SipServer [EventScannerThread]: Timeout occurred
[java] 2015-07-13 18:30:47,037 DEBUG sip.SipListenerImpl [EventScannerThread]: Got a transaction terminated event
[java] 2015-07-13 18:30:47,442 DEBUG sip.SipListenerImpl [EventScannerThread]: Got a transaction terminated event
[9:53:05 AM] Vikram Ramanarayanan: what i seem to remember is when HALEF crashed due to either bugs #54 or #55, I saw that the last entry in the logs was what I posted earlier, i.e., the JVXML session might not have been closed correctly
[9:53:24 AM] Vikram Ramanarayanan: and that cleanup was not executed no matter what
[9:54:03 AM] Vikram Ramanarayanan: perhaps you already knew this, but i just thought I'd post this in case it was helpful in understanding how to resolve the bugs
[9:54:53 AM] David Suendermann: No, I did not know about this yet.
[9:56:40 AM] Vikram Ramanarayanan: i'm not sure if all these issues are resolved in the latest version of JVXML's mrcpv2 implementation
[9:56:56 AM] Vikram Ramanarayanan: we will at some point need to upgrade to that...
[9:57:13 AM] David Suendermann: That is for sure
[9:57:19 AM] Vikram Ramanarayanan: but then again, i'm not sure the cleanup/session closing might be the issue
[9:57:29 AM] Vikram Ramanarayanan: just an observation.
I do not currently believe that the cleanup issue caused any of the problems we have been seeing. When one looks into how the cleanup issue is related with NULL calls, this is what we have:
select * from hl where message like '%unable to cleanup session%' and databasedate between '20150704' and '20150707';
select * from completeCalls where timestamp > '20150704';
+---------+---------------------+--------------+---------------------------------------------+
| ID | databasedate | machineIP | rand code | extension |
+---------+---------------------+--------------+---------------------------------------------+
| 2848682 | 2015-07-04 18:36:43 | 141.31.8.62 | 6055 | 7709
| 2849903 | 2015-07-04 18:37:25 | 141.31.8.62 | NULL | 7709
| 2851037 | 2015-07-04 18:38:22 | 141.31.8.62 | NULL | 3333
| 2861597 | 2015-07-04 19:48:24 | 141.31.8.62 | 8393 | 3333
| 2876561 | 2015-07-05 01:16:26 | 141.31.8.62 | NULL | 7709
| 2897492 | 2015-07-05 01:28:58 | 141.31.8.62 | NULL | 7709
| 2906304 | 2015-07-05 01:39:37 | 141.31.8.62 | 6010 | 3333
| 2908332 | 2015-07-05 02:16:59 | 141.31.8.62 | NULL | 7709
| 2921184 | 2015-07-05 04:27:46 | 141.31.8.62 | NULL | 3333
| 2930984 | 2015-07-05 04:35:05 | 141.31.8.62 | 4569 | 3333
| 2940212 | 2015-07-05 04:42:27 | 141.31.8.62 | 8409 | 3333
| 2966407 | 2015-07-05 06:19:20 | 141.31.8.62 | NULL | 7709
| 2990515 | 2015-07-05 22:31:23 | 141.31.8.62 | 8367 | 3333
| 3000173 | 2015-07-05 23:05:54 | 141.31.8.62 | 8450 | 3333
| 3037460 | 2015-07-06 03:05:32 | 141.31.8.62 | 9039 | 3333
| 3048671 | 2015-07-06 03:33:04 | 141.31.8.62 | 3872 | 3333
| 3058787 | 2015-07-06 03:49:52 | 141.31.8.62 | 6695 | 3333
| 3068380 | 2015-07-06 05:17:59 | 141.31.8.62 | 5836 | 3333
+---------+---------------------+--------------+------+
I am not sure if 'unable to cleanup session' is the correct search if you want to search for calls which dont clean up properly.