With the latest 0.2.9 code, the CPU runs at 100% and the JVM then runs out of memory.
3STHSTTYPE 02:03:36:347430000 GMT j9mm.126 - at 326DC760 java/lang/Thread.run()V, jit 31FA6144, pc 31A79B80
3STHSTTYPE 02:03:36:347426000 GMT j9mm.126 - at 37F2EEF8 com/sshtools/j2ssh/transport/TransportProtocolCommon.run()V, jit 320DB310, pc 38807784
3STHSTTYPE 02:03:36:347416000 GMT j9mm.126 - at 37F2F128 com/sshtools/j2ssh/transport/TransportProtocolCommon.startBinaryPacketProtocol()V, jit 31F396B4, pc 319A33F4
3STHSTTYPE 02:03:36:347412000 GMT j9mm.126 - at 37F2F1D8 com/sshtools/j2ssh/transport/TransportProtocolCommon.processMessages()Lcom/sshtools/j2ssh/transport/SshMessage;, jit 31EF9F4C, pc 31921FAC
3STHSTTYPE 02:03:36:347405000 GMT j9mm.126 - at 37F331B8 com/sshtools/j2ssh/transport/TransportProtocolInputStream.readMessage()[B, jit 320DE1DC, pc 3880AE00
3STHSTTYPE 02:03:36:347398000 GMT j9mm.126 - at 3279A7A0 java/io/ByteArrayOutputStream.write([BII)V, jit 31DB7130, pc 3163AA20
3STHSTTYPE 02:03:36:347390000 GMT j9mm.101 - J9AllocateIndexableObject() returning NULL! 1073741840 bytes requested for object of class 326DA4E8 from memory space '' id=00000000
3STHSTTYPE 02:03:36:346716000 GMT j9mm.53 - GlobalGC end: workstackoverflow=0 overflowcount=0 weakrefs=12195 soft=2826 phantom=247 finalizers=491 newspace=60029432/60397568 oldspace=434149576/1006632960 loa=0/0
Here is the fix, in the SshMessageStore.java class.
The method looped endlessly because the timeout passed in was 0: if an EOF occurred while it was trying to get a message, the message could never be retrieved and the loop never exited. That pinned the CPU at 100%, and the server then died with an out-of-memory exception.
This works great!
/**
 * <p>
 * Get a message from the store. This method will block until a message
 * with an id matching the supplied filter arrives, the specified timeout
 * is reached or the message store closes. The message is removed from the
 * store.
 * </p>
 *
 * @param messageIdFilter an array of message ids that are acceptable.
 * @param timeout the maximum number of milliseconds to block before
 *        returning.
 *
 * @return the next available message
 *
 * @throws MessageStoreEOFException if the message store is closed
 * @throws MessageNotAvailableException if the message is not available
 *         after a timeout
 * @throws InterruptedException if the thread is interrupted
 *
 * @since 0.2.0
 */
public synchronized SshMessage getMessage(int[] messageIdFilter, int timeout)
        throws MessageStoreEOFException, MessageNotAvailableException,
               InterruptedException {
    if ((messages.size() <= 0) && isClosed) {
        throw new MessageStoreEOFException();
    }
    if (messageIdFilter == null) {
        return nextMessage();
    }
    // If the timeout is less than 0, set it to 0.
    if (timeout < 0) {
        timeout = 0;
    }
    SshMessage msg = null; // the message
    int tries = 0;         // number of attempts made so far
    int tryLimit = 50;     // maximum number of attempts
    while ((messages.size() > 0) || !isClosed) {
        // Look up the message.
        msg = lookupMessage(messageIdFilter, true);
        if (msg != null) {
            return msg;
        } else {
            // If the number of tries exceeds the limit, throw. We cannot wait
            // forever, or the endless loop brings down the application.
            if (tries > tryLimit) {
                throw new MessageNotAvailableException();
            }
        }
        // Now wait.
        if (!isClosed) {
            // wait() releases this object's monitor while blocked.
            // 'interrupt' is a field of this class (not shown here).
            wait((timeout == 0) ? interrupt : timeout);
            // Yes, this is not the best solution, but it will work.
            // After 25 tries without a message, sleep one second per try for
            // the next 25 tries, giving about 25 more seconds for a response.
            // Note: Thread.sleep() does not release the monitor.
            if (tries > 25) {
                Thread.sleep(1000);
            }
        }
        tries++;
    }
    throw new MessageStoreEOFException();
}
SshMessageStore.java class with fix
Thanks for pointing this out. Indeed, there could be a problem.
How about changing
    if (!firstPass && (timeout > 0)) {
        throw new MessageNotAvailableException();
    }
to
    if (!firstPass) {
        throw new MessageNotAvailableException();
    }
?
Would it solve the issue? Comments are welcome.
I don't want to change the code yet, as I am not sure whether the change would have side effects.
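For comparison, here is a minimal sketch of another way to bound the wait: compute a deadline once and only ever wait for the time that remains, instead of counting tries or tracking a first pass. It is not j2ssh code. BoundedMessageStore, its two exception classes, DEFAULT_TIMEOUT_MILLIS and the use of plain int ids in place of SshMessage are all hypothetical names made up for illustration. It also assumes that a timeout of 0 or less should fall back to a default cap rather than wait indefinitely, which may not match what existing callers expect.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical stand-in for SshMessageStore; messages are reduced to int ids. */
class BoundedMessageStore {
    /** Hypothetical stand-in for MessageNotAvailableException. */
    public static class NotAvailableException extends Exception {}

    /** Hypothetical stand-in for MessageStoreEOFException. */
    public static class StoreClosedException extends Exception {}

    /** Assumed fallback cap used when the caller passes a timeout <= 0. */
    static final long DEFAULT_TIMEOUT_MILLIS = 5000;

    private final Deque<Integer> messages = new ArrayDeque<>();
    private boolean closed = false;

    public synchronized void addMessage(int id) {
        messages.add(id);
        notifyAll(); // wake any thread blocked in getMessage()
    }

    public synchronized void close() {
        closed = true;
        notifyAll(); // let waiters observe the closed flag
    }

    public synchronized int getMessage(int[] filter, long timeoutMillis)
            throws NotAvailableException, StoreClosedException, InterruptedException {
        long limit = (timeoutMillis > 0) ? timeoutMillis : DEFAULT_TIMEOUT_MILLIS;
        long deadline = System.nanoTime() + limit * 1_000_000L;
        while (true) {
            Integer msg = lookup(filter);
            if (msg != null) {
                return msg;
            }
            // Closed and nothing matches: no new message can arrive, so stop
            // here instead of going around the loop again without waiting.
            if (closed) {
                throw new StoreClosedException();
            }
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                throw new NotAvailableException();
            }
            // wait() releases the monitor, so addMessage() and close() can run.
            // remainingMillis is always > 0 here, so this never waits forever.
            wait(remainingMillis);
        }
    }

    /** Removes and returns the first queued id that matches the filter, or null. */
    private Integer lookup(int[] filter) {
        for (Iterator<Integer> it = messages.iterator(); it.hasNext();) {
            int id = it.next();
            for (int wanted : filter) {
                if (id == wanted) {
                    it.remove();
                    return id;
                }
            }
        }
        return null;
    }
}
```

In this sketch every path out of the loop is a match, a closed store or an expired deadline, so a spurious wakeup or a non-matching message only costs one more pass and cannot extend the total wait. Whether falling back to a cap for timeout 0 is acceptable depends on how the callers use it, which is the same side-effect question as above.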