
wrapper.on_exit behavior

2017-06-28
  • Chase Barrett

    Chase Barrett - 2017-06-28

    Hello,

    I want to prevent the wrapper controller from restarting my app on particular, unrecoverable boot-up errors, and I think the wrapper.on_exit property is the configuration I need. Unfortunately, I can't get it to work the way I'd like.

    Here's what I have in my wrapper.conf:

    wrapper.on_exit.0=SHUTDOWN
    wrapper.on_exit.1=SHUTDOWN
    wrapper.on_exit.2=SHUTDOWN
    wrapper.on_exit.default=RESTART
    

    In my application's main(), this is how I'm detecting my unrecoverable error:

            try {
                ...
            } catch (LicenseException e) {
                log.error(e.getMessage());
                System.exit(1);
            } catch (SomeOtherException e) {
                log.error(e.getMessage());
                System.exit(2);
            }
    

    When I encounter a LicenseException, however, the log shows that it's restarting based on the default exit code rule:

    INFO|7835/0|17-06-27 18:29:00|2017-06-27 18:29:00.629 ERROR ${sys:PID} --- [           main] c.f.b.s.MyApp           : conf/license.lic (No such file or directory)
    INFO|wrapper|17-06-27 18:29:02|exit code linux process 256
    INFO|wrapper|17-06-27 18:29:02|restart process due to default exit code rule
    INFO|wrapper|17-06-27 18:29:02|restart internal RUNNING
    INFO|wrapper|17-06-27 18:29:02|stopping process with pid/timeout 7835 45000
    INFO|wrapper|17-06-27 18:29:02|killing 7835
    INFO|wrapper|17-06-27 18:29:02|process exit code: 256
    

    The wrapper then proceeds to restart five more times before it gives up. Am I using the wrapper.on_exit property correctly?

    BTW, this is on Mac/OSX, but I'm seeing similar behavior on Amazon Linux and RHEL.

    Thanks,
    Chase

     
  • Chase Barrett

    Chase Barrett - 2017-06-28

    I retract my comment about Amazon Linux. After cleaning up my application code, it's working as expected:

    INFO|wrapper|17-06-28 17:10:54|started process 31203
    INFO|wrapper|17-06-28 17:10:54|started process with pid 31203
    INFO|31203/0|17-06-28 17:10:56|[INFO] StandardFileSystemManager - Using "/tmp/vfs_cache" as temporary files store.
    INFO|31203/0|17-06-28 17:10:59|2017-06-28 17:10:59.357 ERROR ${sys:PID} --- [           main] c.f.b.MyApp           : Unable to locate or read the license file conf/license.lic
    INFO|wrapper|17-06-28 17:10:59|waitpid 31203 256
    INFO|wrapper|17-06-28 17:10:59|exit code posix process: 256 application: 0
    INFO|wrapper|17-06-28 17:10:59|restart process due to default exit code rule
    INFO|wrapper|17-06-28 17:10:59|shutdown wrapper due to exit code rule
    INFO|wrapper|17-06-28 17:10:59|shutdown wrapper due to exit code rule
    INFO|wrapper|17-06-28 17:10:59|Shutting down Wrapper
    

    I'll test RHEL, Ubuntu, SUSE, and Windows next, and will report back. The problem remains, however, with Mac/OSX.

     
  • Chase Barrett

    Chase Barrett - 2017-06-30

    I tested RHEL, Amazon Linux, Ubuntu, SUSE, OSX, and Windows. Of the six, only Windows is behaving as I believe it should based on the documentation. I narrowed down the configuration to isolate the issue, and found out that Amazon is not working as expected in spite of my comment above.

    So here's the setup: on all six platforms, I used four different wrapper.conf configurations, and for each one, triggered the license problem illustrated in the Java snippet above to produce a system exit code of 1.

    First configuration:

    # wrapper.on_exit.0=SHUTDOWN
    wrapper.on_exit.default=RESTART
    

    Here, all six platforms performed as I expected, in that they restarted when they processed 1 as the exit code. So far, so good.

    Second configuration. Uncommented the 0 code line, which should not change anything, since I'm exiting with a code of 1:

    wrapper.on_exit.0=SHUTDOWN
    wrapper.on_exit.default=RESTART
    

    Here are the responses for this configuration:

    • RHEL - shutdown (incorrect)
    • Amazon - shutdown (incorrect)
    • SUSE - shutdown (incorrect)
    • Ubuntu - shutdown (incorrect)
    • OSX - restarts
    • Windows - restarts

    Third configuration. Changed the 0 to 1, which is what I believe I need based on the error I'm trying to trap:

    wrapper.on_exit.1=SHUTDOWN
    wrapper.on_exit.default=RESTART
    

    Here are the responses for this configuration:

    • RHEL - restarts (incorrect)
    • Amazon - restarts (incorrect)
    • SUSE - restarts (incorrect)
    • Ubuntu - restarts (incorrect)
    • OSX - restarts (incorrect)
    • Windows - shutdown

    Fourth configuration. Since OSX and the four Linux boxes all make mention of processing a 256 exit code in the wrapper console output, I tried the following configuration:

    wrapper.on_exit.256=SHUTDOWN
    wrapper.on_exit.default=RESTART
    

    Here are the responses for this configuration:

    • RHEL - restarts (incorrect)
    • Amazon - restarts (incorrect)
    • SUSE - restarts (incorrect)
    • Ubuntu - restarts (incorrect)
    • OSX - shutdown (maybe this is correct??)
    • Windows - restarts

    Since the Windows box is behaving predictably, it seems to me we have a bug with at least the OSX and Linux boxes.

    PS - I apologize for posting this in the "Open Discussion" forum. I noticed the "Help" forum only after my initial post. Feel free to move it if you're able.

     

    Last edit: Chase Barrett 2017-06-30
  • rzo

    rzo - 2017-07-08

    hello,

    Which YAJSW release are you using? Which JVM? Which OS versions, 32/64 bit?

    -- Ron

     
  • Chase Barrett

    Chase Barrett - 2017-07-14

    Hi Ron,

    The YAJSW release is 12.08, and all the platforms are 64 bit. Here are the other details on the platforms:

    OS                            OS Version   JVM
    Red Hat Enterprise Linux      7.3          Oracle JRE v8 u131
    Amazon Linux                  2017.03.1    Oracle JRE v8 u131
    Ubuntu Server                 16.04 LTS    Oracle JRE v8 u131
    SUSE Linux Enterprise Server  12 SP2       Oracle JRE v8 u131
    OSX                           10.11.6      Oracle JDK v8 u131
    Windows Professional          10           Oracle JDK v8 u121
     
  • rzo

    rzo - 2017-07-19

    hello,

    thanks for the feedback.
    I will need some time to try to reproduce this.
    Q: how is wrapper.control set?

    -- Ron

     
  • Chase Barrett

    Chase Barrett - 2017-07-25

    I don't have that property set in my wrapper.conf file.

    Thanks,
    Chase

     
  • Brad Hawthorne

    Brad Hawthorne - 2017-08-02

    I've been experiencing what I think is a related problem.

    Our service will occasionally terminate with unexplained 'posix process' codes of 256 that cause the service wrapper to shut down, e.g.:

    17-08-02 14:05:34|waitpid 17533 256
    17-08-02 14:05:34|exit code posix process: 256 application: 0
    17-08-02 14:05:34|restart process due to default exit code rule
    17-08-02 14:05:34|shutdown wrapper due to exit code rule
    17-08-02 14:05:34|shutdown wrapper due to exit code rule
    

    From the source code, 'exit code posix process' is the status value returned by waitpid(). According to the docs, the low 8 bits carry the signal number if the app received a signal, and the next 8 bits carry the app's exit code.

    256 would then describe a non-signalled exit with an exit code of 1. But the service wrapper seems to decode this incorrectly: it assumes exit code 0 (e.g. an intentional shutdown) and does not restart our service.

    PosixProcess.java handles this result code as follows:

                        // Exited Normally
                        if (WIFEXITED(code) != 0)
                            _exitCode = WEXITSTATUS(code);
                        // Exited Ab-Normally
                        else
                            _exitCode = 0;
    

    with WIFEXITED defined as

        public int WIFEXITED(int code) {
            return (code & 0xFF);
        }
    

    which seems incorrect to me. WIFEXITED should return nonzero/true if the low byte is zero. This can be seen in the various implementations of waitstatus.h that I found by Googling.
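
    To make the decoding concrete, here is a minimal sketch that compares the check quoted above with the convention used by common Linux waitstatus.h headers. The class name WaitStatus and its helper names are hypothetical, and the bit layout (low 7 bits for the terminating signal, next byte for the exit code) is an assumption based on those headers, not something taken from the YAJSW source.

```java
// Hypothetical illustration; not YAJSW code.
final class WaitStatus {

    // Check as quoted above: nonzero only when the low byte is nonzero.
    static int wifexitedAsQuoted(int status) {
        return status & 0xFF;
    }

    // Assumed convention: the process exited normally when the
    // terminating-signal bits (low 7 bits) are all zero.
    static boolean wifexited(int status) {
        return (status & 0x7F) == 0;
    }

    // Assumed convention: the exit code sits in bits 8..15.
    static int wexitstatus(int status) {
        return (status >> 8) & 0xFF;
    }

    // Mirrors the if/else quoted above, using the quoted check.
    static int exitCodeAsQuoted(int status) {
        return wifexitedAsQuoted(status) != 0 ? wexitstatus(status) : 0;
    }

    // Same branch, using the assumed convention instead.
    static int exitCodeFixed(int status) {
        return wifexited(status) ? wexitstatus(status) : 0;
    }

    public static void main(String[] args) {
        int status = 256; // waitpid() status seen in the logs for System.exit(1)
        System.out.println("as quoted: " + exitCodeAsQuoted(status)); // as quoted: 0
        System.out.println("fixed:     " + exitCodeFixed(status));    // fixed:     1
    }
}
```

    With the quoted check, a status of 256 falls into the "abnormal" branch and is reported as application exit code 0, which matches the "posix process: 256 application: 0" log line.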

     
  • rzo

    rzo - 2017-09-03

    hello,

    thanks for reporting the issue and taking your time to look into this.
    With the next release I propose to add a new configuration property:

    wrapper.on_signal.default = RESTART
    wrapper.on_signal.9 = SHUTDOWN

    on_exit will use WEXITSTATUS.
    on_signal will use WTERMSIG.

    Thus in your use case the application will be correctly restarted.
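
    One way to picture the proposed split is the sketch below. It is not the YAJSW implementation: the ExitRules class, its Action enum, and its methods are hypothetical, and the status layout (low 7 bits for the signal, next byte for the exit code) is an assumption based on common Linux headers. It only shows a normal exit being looked up in the on_exit rules and a signal termination in the on_signal rules, each with its own default.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of on_exit / on_signal rule lookup; not YAJSW code.
final class ExitRules {
    enum Action { RESTART, SHUTDOWN }

    private final Map<Integer, Action> exitRules = new HashMap<>();
    private final Map<Integer, Action> signalRules = new HashMap<>();
    private final Action exitDefault;
    private final Action signalDefault;

    ExitRules(Action exitDefault, Action signalDefault) {
        this.exitDefault = exitDefault;
        this.signalDefault = signalDefault;
    }

    // Corresponds to a line such as wrapper.on_exit.1=SHUTDOWN.
    ExitRules onExit(int code, Action action) {
        exitRules.put(code, action);
        return this;
    }

    // Corresponds to a line such as wrapper.on_signal.9=SHUTDOWN.
    ExitRules onSignal(int signal, Action action) {
        signalRules.put(signal, action);
        return this;
    }

    // Decides what to do for a raw waitpid() status (assumed layout).
    Action decide(int status) {
        int signal = status & 0x7F;            // WTERMSIG-style extraction
        if (signal == 0) {
            int code = (status >> 8) & 0xFF;   // WEXITSTATUS-style extraction
            return exitRules.getOrDefault(code, exitDefault);
        }
        return signalRules.getOrDefault(signal, signalDefault);
    }

    public static void main(String[] args) {
        ExitRules rules = new ExitRules(Action.RESTART, Action.RESTART)
                .onExit(1, Action.SHUTDOWN)
                .onSignal(9, Action.SHUTDOWN);
        System.out.println(rules.decide(256)); // exit code 1 -> SHUTDOWN
        System.out.println(rules.decide(512)); // exit code 2 -> RESTART
        System.out.println(rules.decide(9));   // signal 9    -> SHUTDOWN
        System.out.println(rules.decide(15));  // signal 15   -> RESTART
    }
}
```

    With only the two defaults set to RESTART and no specific rules, a status of 256 (exit code 1) resolves to RESTART.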

    comments? suggestions?

    -- Ron

     
    • Brad Hawthorne

      Brad Hawthorne - 2017-09-19

      Yes that should work, thanks.

       
