
#434 Java thread sometimes is not returned from AS400.isConnectionAlive(7) method

Connection
open
Zhang Ze
None
5
2020-06-18
2020-06-18
No

Hello guys,

Initially I wanted to describe my problem on a support forum, but unfortunately I have found nothing so far besides this link (https://sourceforge.net/p/jt400/bugs/432/). So I'm writing here.

I use JTOpen in a small project of mine for monitoring System i (https://share.zabbix.com/operating-systems/ibm-i-i5-os-os-400-for-ibm-system-i-as-400/zabbix-agent-emulator-for-as-400-platform).

My program runs directly on System i. It uses the AS400.isConnectionAlive(i) call to check the availability of all services (in a loop, where the variable "i" steps from 0 to 7). The problem is that sometimes (about once per month, more or less) the thread performing these checks just hangs. Using debugging output from my program, I found that the problem occurs every time at "i=7" (i.e. AS400.SIGNON (www.ibm.com)). Debugging output from a different thread shows that Thread.isAlive() for the problem thread returns "true", and the stack trace is the following:

com.ibm.as400.access.DataStream.write(DataStream.java:327)
com.ibm.as400.access.SignonEndServerReq.write(SignonEndServerReq.java:39)
com.ibm.as400.access.AS400NoThreadServer.send(AS400NoThreadServer.java:136)
com.ibm.as400.access.AS400ImplRemote.signonDisconnect(AS400ImplRemote.java:3470)
com.ibm.as400.access.AS400ImplRemote.disconnect(AS400ImplRemote.java:622)
com.ibm.as400.access.AS400.disconnectService(AS400.java:1547)
as400.metric.As400Metrics$10.process(As400Metrics.java:460)
as400.thread.ZabbixAgent.process(ZabbixAgent.java:25)
as400.thread.ActiveCheck.processCommonCheck(ActiveCheck.java:695)
as400.thread.ActiveCheck.processActiveChecks(ActiveCheck.java:730)
as400.thread.ActiveCheck.run(ActiveCheck.java:801)
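
For reference, a stack trace like the one above can be taken from a different thread with the standard Thread.getStackTrace() call. The following is only a minimal sketch of that idea; the class name StackDump and the helper describe() are made up for illustration and are not part of JTOpen or of the monitoring program.

```java
import java.util.concurrent.CountDownLatch;

public class StackDump {

    /** Formats name, liveness, state and current stack frames of a thread. */
    static String describe(Thread t) {
        StringBuilder sb = new StringBuilder();
        sb.append(t.getName())
          .append(" alive=").append(t.isAlive())
          .append(" state=").append(t.getState())
          .append('\n');
        // getStackTrace() returns a snapshot; the target thread keeps running.
        for (StackTraceElement frame : t.getStackTrace()) {
            sb.append("    ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a hung check thread: it waits on a latch nobody releases.
        CountDownLatch never = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                never.await();
            } catch (InterruptedException ignored) {
                // interrupted: just exit
            }
        }, "check-thread");
        worker.setDaemon(true); // do not keep the JVM alive for the demo
        worker.start();

        Thread.sleep(200); // give the worker time to block
        System.out.println(describe(worker));
    }
}
```

Logging the thread state together with the frames may help to tell a thread blocked in native socket I/O (usually reported as RUNNABLE) from one waiting on a lock.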

However, the AS400.isConnectionAlive(i) method never returns.
The only way to recover (at the moment) is to restart the entire Java process.
At the same time, this problem occurs relatively rarely: the check is scheduled every minute, so one hang per month (3-5 weeks) is about 1 hang per 43000 successful checks. The most annoying part is that I cannot reproduce this problem intentionally: I can only wait for the next occurrence.
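
One possible way to keep the monitoring loop running without a full restart might be to run each check on a helper thread and wait for it with a timeout. The sketch below is only an illustration under assumptions: GuardedCheck, Result and check() are hypothetical names, and the BooleanSupplier stands in for a call such as system.isConnectionAlive(AS400.SIGNON) on some AS400 object named system. It does not fix the hang itself. A thread blocked in a socket write will probably not react to the interrupt, so the helper thread may stay stuck, but the calling thread gets control back.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class GuardedCheck {

    enum Result { ALIVE, DEAD, TIMED_OUT, FAILED }

    // Cached pool: a stuck helper thread does not block later checks.
    // Daemon threads: a stuck helper does not prevent JVM shutdown.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "guarded-check");
        t.setDaemon(true);
        return t;
    });

    /** Runs the probe on a helper thread and waits at most timeoutMillis. */
    Result check(BooleanSupplier probe, long timeoutMillis) {
        Callable<Boolean> task = probe::getAsBoolean;
        Future<Boolean> future = pool.submit(task);
        try {
            boolean alive = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return alive ? Result.ALIVE : Result.DEAD;
        } catch (TimeoutException e) {
            // Best effort only: the interrupt may not unblock native socket I/O.
            future.cancel(true);
            return Result.TIMED_OUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Result.FAILED;
        } catch (ExecutionException e) {
            // The probe itself threw an exception.
            return Result.FAILED;
        }
    }

    public static void main(String[] args) {
        GuardedCheck guard = new GuardedCheck();

        // Hypothetical probes standing in for the real isConnectionAlive call.
        BooleanSupplier healthy = () -> true;
        BooleanSupplier stuck = () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // cancelled by the guard
            }
            return false;
        };

        System.out.println(guard.check(healthy, 1000)); // ALIVE
        System.out.println(guard.check(stuck, 200));    // TIMED_OUT
    }
}
```

A TIMED_OUT result could then be logged together with a stack trace of the helper thread, and the number of stuck helper threads should be watched, since each hang leaves one behind.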

What can I do to fix this problem, please?
What other information would be useful to collect next time?

Some additional info about the configuration:

IBM Corporation Java version "1.8.0_221"
Java(TM) SE Runtime Environment (build 8.0.5.40 - pap6480sr5fp40-20190807_01(SR5 FP40))
IBM J9 VM (build 2.9, JRE 1.8.0 OS/400 ppc64-64-Bit Compressed References 20190802_424001 (JIT enabled, AOT enabled)
OpenJ9   - 106f6ce
OMR      - fe07f6f
IBM      - af2a365)
Open Source Software, JTOpen 9.8, codebase 5770-SS1 V7R3M0.00 built=20190304 @Z0
OS Version: V7R3M0

Thank you in advance!

Discussion

  • Constantin Oshmyan

    Sorry, I quoted the wrong stack trace. That stack trace occurred when I tried to call the AS400.disconnectService(7) method from a different thread (hoping to get some exception and then continue). Unfortunately, that does not resolve the problem either: the thread calling AS400.disconnectService(7) just hangs as well.

    The correct stack trace for the initial call of the AS400.isConnectionAlive(7) method is the following:

    java.net.SocketOutputStream.socketWrite0(Native Method)
    java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:122)
    java.net.SocketOutputStream.write(SocketOutputStream.java:154)
    com.ibm.as400.access.DataStream.write(DataStream.java:329)
    com.ibm.as400.access.SignonPingReq.write(SignonPingReq.java:33)
    com.ibm.as400.access.AS400NoThreadServer.send(AS400NoThreadServer.java:136)
    com.ibm.as400.access.AS400NoThreadServer.sendAndDiscardReply(AS400NoThreadServer.java:116)
    com.ibm.as400.access.AS400ImplRemote.isConnectionAlive(AS400ImplRemote.java:2323)
    com.ibm.as400.access.AS400.isConnectionAlive(AS400.java:2695)
    as400.metric.As400Metrics$10.process(As400Metrics.java:438)
    as400.thread.ZabbixAgent.process(ZabbixAgent.java:25)
    as400.thread.ActiveCheck.processCommonCheck(ActiveCheck.java:691)
    as400.thread.ActiveCheck.processActiveChecks(ActiveCheck.java:726)
    as400.thread.ActiveCheck.run(ActiveCheck.java:797)
    
     
  • John Eberhard

    John Eberhard - 2020-06-18
    • assigned_to: Zhang Ze
     
  • John Eberhard

    John Eberhard - 2020-06-18

    Assigning owner.

     
