Menu

New user: weird persistent error every 20-21 minutes

Help
jll
2019-03-07
2019-03-18
  • jll

    jll - 2019-03-07

    Hi, just getting started with AS2 as an alternative to my existing OFTP2 setup (using Mendelson OFTP2) and am having some initial success (have yet to get my partner to actually connect to me, but my internal testing looks good with production certs, etc.), but must have fouled something up because whenever I have OpenAS2 running, every 20 or 21 minutes, I'll get this error coming out:

    2019-03-07 10:11:35.611 ERROR AS2SenderModule: Pending information file detected that is past max wait time due to unknown failure: /usr/local/bin/openas2/bin/../config/../data/pendinginfoMDN3/OPENAS2-05032019134450-0600-a452d1bf-8608-45cc-b2c3-d762605e4cf7@local_partner [null]
    2019-03-07 10:11:35.612 ERROR AS2Util: Error moving file to sent folder: Source '/usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@local_partner' does not exist [null]
    2019-03-07 10:11:35.612 ERROR java.io.FileNotFoundException: Source '/usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@local_partner' does not exist
    at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:2995)
    at org.openas2.util.IOUtil.moveFile(IOUtil.java:199)
    at org.openas2.util.AS2Util.cleanupFiles(AS2Util.java:742)
    at org.openas2.processor.sender.AS2SenderModule.detectFailedSentMessages(AS2SenderModule.java:715)
    at org.openas2.processor.sender.AS2SenderModule$1.run(AS2SenderModule.java:732)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    Any idea what might be causing this and how it could be fixed? I suppose it may've happened when I hard broke out of a running session (using ctrl-c, probably) rather than using the 'exit' command from the console that I'm using to run OpenAS2 (back when I was making several configuration changes and needed to keep restarting the application).

    Thanks to the developers and maintainers for sharing this product, and for any help you can offer.

     

    Last edit: jll 2019-03-07
  • Christopher Broderick

    CTRL-C is fine to ext the app - it tries to do a co-ordinated shutdown of active threads to allow them to finish processing either way.
    Can you confirm if there is a file in the folder it references:
    /usr/local/bin/openas2/bin/../config/../data/pendingMDN3/

     

    Last edit: Christopher Broderick 2019-03-07
  • jll

    jll - 2019-03-07

    The directory is empty by the time I check it. I've also tried creating an empty file or one with 'test' in it or something in that path and with the same filename and it does not seem to make any difference.

    I just noticed also on the most recent occurrence, I might see this message just after the above cited messages, as the final one for that 21m block):

    2019-03-07 11:46:19.901 FINE AS2Util: deleted /usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@local_partner [null]

     
  • jll

    jll - 2019-03-07

    What I'm wondering is, what's maintaining the state of this ScheduledFutureTask concept? I checked the only table I found in the H2 database and emptied that at one point thinking it might solve the issue, but it seemed to have no effect.

    Should that technique have maybe worked? Is there some other place i can go to try to figure out why OpenAS2 keeps trying to do something with that file? I'm assuming there's some sort of corruption somewhere due to my early testing and possibly changing of configuration while some file was still in a pending state.

     
  • Christopher Broderick

    The error is occurring because there is an attempt to send a file using ASYNC MDN response mode to your partner and although the partner accepts the file, there is no MDN response received from the partner and eventually the error you are seeing occurs because the system waits for a defined period of time for a response and then eventually times out and tries to move the pending file to an error ditrectory and the error occurs at this point.

    It may be any one of:
    1. your partner actually does not support ASYNC MDN and you should try a synchronous MDN
    2. you need to specify the correct MDN response URL that gets sent to the partner so that your receiver is targeted by your partner
    3. you must enable the Async MDN receiver in your config.

    The error is actually self recovering in that it tries to move the failed sent message to an error folder but because it fails on that task it then deletes the file.

    The key is the 2nd line in the error message you pasted:
    \2019-03-07 10:11:35.612 ERROR AS2Util: Error moving file to sent folder: Source '/usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@local_partner' does not exist [null]

    ... specifically "Error moving file to sent folder:" (though "sent" is misleading as it should actually say "error")

    All that said, it is unclear why it is failing to find the file when it successfully deletes it 6 seconds later and woul probably require adding additional debug logging to figure it out.

     
  • Christopher Broderick

    IOf you enable TRACE level logging it will provide insight into why the error is occurring

     
  • jll

    jll - 2019-03-11

    Thanks Chris, finally turned that on and captured this. Didn't seem to help me much more, but possibly you'll see something I'm missing:

    '#>log setlevel TRACE
    OK:

    '#>2019-03-11 16:37:04.553 FINEST AS2Util: RETRY COUNT from pending info file: 2
    2019-03-11 16:37:04.554 FINEST AS2Util: Data retrieved from Pending info file:
    Original MIC: Y0sCextp4SQtQNU+MSs7SsdxD1W+gfKJtUlEbvZ3i+4=, SHA256
    Retry Count: 2
    Original file name : test1
    Pending message file : /usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER
    Error directory: /usr/local/bin/openas2/bin/../config/../data/toAny/error
    Sent directory:
    Attributes: {pendinginfo=/usr/local/bin/openas2/bin/../config/../data/pendinginfoMDN3/OPENAS2-05032019134320-0600-83c045d0-4a17-4496-90e0-bcaeb28a7ae0@ME_PARTNER, errordir=/usr/local/bin/openas2/bin/../config/../data/toAny/error, filename=test1, destination_ip=as2.partner.com, PRIOR_MSG_ID=OPENAS2-05032019134150-0600-4caba483-0b09-447b-92d7-7deb9d984f0e@ME_PARTNER, destination_port=8443, pendingfilename=/usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER, sentdir=, status=pending} [null]
    2019-03-11 16:37:04.556 ERROR AS2SenderModule: Pending information file detected that is past max wait time due to unknown failure: /usr/local/bin/openas2/bin/../config/../data/pendinginfoMDN3/OPENAS2-05032019134450-0600-a452d1bf-8608-45cc-b2c3-d762605e4cf7@ME_PARTNER [null]
    2019-03-11 16:37:04.556 FINEST AS2Util: Deleting pendinginfo file : /usr/local/bin/openas2/bin/../config/../data/pendinginfoMDN3/OPENAS2-05032019134320-0600-83c045d0-4a17-4496-90e0-bcaeb28a7ae0@ME_PARTNER [null]
    2019-03-11 16:37:04.560 FINEST AS2Util: deleted /usr/local/bin/openas2/bin/../config/../data/pendinginfoMDN3/OPENAS2-05032019134320-0600-83c045d0-4a17-4496-90e0-bcaeb28a7ae0@ME_PARTNER [null]
    2019-03-11 16:37:04.560 FINEST AS2Util: deleted /usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER.object [null]
    2019-03-11 16:37:04.560 FINEST AS2Util: Cleaning up pending file : OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER from pending folder : /usr/local/bin/openas2/bin/../config/../data/pendingMDN3 [null]
    2019-03-11 16:37:04.561 FINEST IOUtil: Atomic move for /usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER failed with [java.nio.file.NoSuchFileException: /usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER -> /usr/local/bin/openas2/bin/../config/../data/toAny/error/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER.507eaa4d-6320-4278-8f26-70a99d03ffd2], trying alternatives to move to /usr/local/bin/openas2/bin/../config/../data/toAny/error/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER.507eaa4d-6320-4278-8f26-70a99d03ffd2
    2019-03-11 16:37:04.561 ERROR AS2Util: Error moving file to sent folder: Source '/usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER' does not exist [null]
    2019-03-11 16:37:04.562 ERROR java.io.FileNotFoundException: Source '/usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER' does not exist
    at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:2995)
    at org.openas2.util.IOUtil.moveFile(IOUtil.java:199)
    at org.openas2.util.AS2Util.cleanupFiles(AS2Util.java:742)
    at org.openas2.processor.sender.AS2SenderModule.detectFailedSentMessages(AS2SenderModule.java:715)
    at org.openas2.processor.sender.AS2SenderModule$1.run(AS2SenderModule.java:732)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

    2019-03-11 16:37:04.563 FINE AS2Util: deleted /usr/local/bin/openas2/bin/../config/../data/pendingMDN3/OPENAS2-05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450@ME_PARTNER [null]

     
  • Christopher Broderick

    Are you using soft links anywhere in that path:

    /usr/local/bin/openas2/bin/../config/../data/pendingMDN3/
    
     
  • jll

    jll - 2019-03-12

    Christopher, you're good. I am using a softlink, the openas2 points to openas2-2.7.0 in the same local/bin directory. I'm curious, how did you guess that? I didn't see any reference to the actual underlying dir in the log output above.

    Is that somehow the source of the problem? Should I not run OpenAS2 from a soft linked directory like that?

    Or was this just a fluke? It's possible, e.g., that when I first started it to do some preliminary testing that I was not running it in the softlinked dir and then later shut it down and did start it in the soft linked one.

     
  • Christopher Broderick

    There were problems with moving files across networked file systems a few years ago that I fixed and was trying to think of possible file sytem "anomolies" that could be stopping the move working and that was the only one I know I have never specifically tested. Currently there is no logic to detect a softlinked path and exchange for the "hard" path to do the move.

    I think that depending on where you start the app from and/or whether you navigated to the startup folder via the softlinked directory before executing the startup script will affect the link used for %home%

     

    Last edit: Christopher Broderick 2019-03-12
  • jll

    jll - 2019-03-12

    Okay, so to see if this would help, I shut it down and restarted from the true path, hoping that might do it, but I still see the error, and I'm seeing it still referencing the previous soft linked path.

    So, short of blowing away my installation and starting again, what's the proper fix to this? How is OpenAS2 maintaining the state that there was this old unresolved file? It doesn't appear to be in the H2 database, or is it?

    Also, would you recommend not running it from a soft linked dir, I assume?

     
  • Christopher Broderick

    You will have to edit the bin/start-openas2.sh file either change the 6th lione down from

    binDir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
    

    to

    binDir="/some/direct/path/to/bin/dir"
    

    Or change theis loine in the same file

    EXTRA_PARMS="$EXTRA_PARMS -Dopenas2.config.file=${binDir}/../config/config.xml"
    

    to

    EXTRA_PARMS="$EXTRA_PARMS -Dopenas2.config.file=/some/path/to/config/config.xml"
    

    If you can tell me exactly how you have set up the soft link(s) I can try to replicate your issue on my system and then see if there is a way to cater for this in code.

     
    • jll

      jll - 2019-03-13

      Christopher, I have OpenAS2 unzipped into:

      /usr/local/bin/openas2server-2.7.0

      The directory listing of it and the symlink look like:

      lrwxrwxrwx  1 root root   20 Mar  5 10:53 openas2 -> openas2server-2.7.0/
      drwxr-xr-x  9 jll  jll  4.0K Mar  8 14:15 openas2server-2.7.0
      

      so I just created it as:

      ln -s openas2server-2.7.0 openas2
      

      from within that directory.

       
  • Christopher Broderick

    The unresolved file should be logged in the H2 database but will be in a status of "awaiting_mdn".
    The next release will fix this to update the status to indicate a failure.

     
  • Christopher Broderick

    I have not been able to replicate on my system.

    Can you confirm the host OS name and version as well as the file system type you are using.

     
  • jll

    jll - 2019-03-13

    Ubuntu 16.04, ext4

     
  • Christopher Broderick

    Can you confirm if changing the config.xml to the actual path worked?

     
    • jll

      jll - 2019-03-14

      I made the change in start-openas2.sh that you suggested earlier, replacing it with the full hardcoded path. I didn't change anything in config.xml to troubleshoot this.

      The error still happened, but now that I've made that change and started it from the true path\bin directory, and not via the symlinked one, I see the true path displayed in the FINE/ERROR messages that reference this missing file.

       
  • Christopher Broderick

    In the case of asynchronous MDN responses, the application will wait for a fixed amount of time (default is 4560 seconds) for the partner to respond with an asynchronous MDN and then produce a fail message and move the sent file to the error directory as specified in the configuration.
    To change the wait time before the application decides the partner will not respond can be done using a property in the config.xml file using the attribute name "as2_mdn_response_max_wait_seconds". For example:
    <properties as2_mdn_response_max_wait_seconds="”600”"></properties>

     
  • jll

    jll - 2019-03-14

    I'm still not sure how to cancel this one file, because I'm assuming that the whole issue may've started due to something unusual I did when I was first installing and conducting tests, that's not likely to come up in production.

    All I'd really like to do is clear this one issue from continually popping up. It seems like it may not be an easy fix to actually track down why OpenAS2 is behaving like this. When I examine MSG_METADATA with this query:

    SELECT * FROM msg_metadata where status = 'awaiting_mdn'
    

    I get no results. There are several entries from my other tests though. When I run this query, to try to find any entry matching the original MSG_ID with the above GUID (?) in it:

    SELECT * FROM msg_metadata where PRIOR_MSG_ID like '%05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450%'
    OR MSG_ID like '%05032019134029-0600-ffa88a2d-1c19-427e-b524-95b453347450%'
    

    I also get no results. I just can't understand where OpenAS2 is finding a reference to this message (from March 5th, it appears).

     
  • Christopher Broderick

    Can you provide the putput for the following 2 commands shortly after sending a file to the partner (before the error occurs):

    ls -l <OpenAS2InstallDir>/Data*
    ls -l <OpenAS2InstallDir>/Data/*
    
     

    Last edit: Christopher Broderick 2019-03-14
  • Christopher Broderick

    ... or at any time for that matter since the OpenAS2 system seems to have a file in the resend folder and and I am guessing no retry limit or there is a hole in the processing logic for Async MDN failures somewhere..

     
  • jll

    jll - 2019-03-18

    Ugh, when I did your above 'ls -l' command, I did notice a file matching the name of the one I've had the ongoing problems with. It was in:

    /usr/local/bin/openas2server-2.7.0/data/pendinginfoMDN3

    I shut down OpenAS2, rm'd it and started back up. I think the recurring message has gone away now. I could've sworn I did that exact test at least once or twice before when seeing this, or rather what I definitely remember doing is creating a file matching the name OpenAS2 was erroring on, hoping that would solve the issue.

    I suppose it's also possible that something about the symlink setup that I was running (and am not now) may've impacted the resolution, but for now it looks like I'm fixed.

    Thanks much for your help and sorry it took so much back and forth.

     

Log in to post a comment.