Application not restarted after abort

Help
Daniel
2014-04-01
2014-05-05
  • Daniel
    Daniel
    2014-04-01

    Hello Ron,

    I have developed a Java app that is working fine in Windows, but that is giving me a lot of headhaches in Linux (Ubuntu). After some time, it stops without throwing any exception.

    Apart from using YASW to launch it as a service, I want to use it to control when the application stops, and restart it automatically. At the same time, I would expect to get some information regarding reason my application stops, maybe showing some information on YASW log file (not sure if this is possible).

    I have configured YASW using following wrapper.conf file (all commented lines are removed):

    wrapper.tmp.path = ${jna_tmpdir}
    wrapper.java.app.mainclass=

    wrapper.debug=true

    wrapper.console.loglevel=DEBUG

    wrapper.logfile=${wrapper_home}\/..\/logs\/wrapper.log
    wrapper.logfile.format=LPTM
    wrapper.logfile.loglevel=DEBUG
    wrapper.logfile.maxsize=1m
    wrapper.logfile.maxfiles=25

    wrapper.ntservice.name=autocloudAgent
    wrapper.ntservice.displayname=autocloudAgent
    wrapper.ntservice.description=autocloudAgent

    wrapper.daemon.run_level_dir=${if (new File('\/etc\/rc0.d').exists()) return '\/etc\/rcX.d' else return '\/etc\/init.d\/rcX.d'}

    wrapper.tray = false

    wrapper.on_exit.0=SHUTDOWN
    wrapper.on_exit.1=RESTART
    wrapper.on_exit.2=RESTART
    wrapper.on_exit.3=RESTART
    wrapper.on_exit.default=RESTART

    wrapper.java.monitor.heap = true
    wrapper.java.monitor.heap.threshold.percent = 90
    wrapper.java.monitor.heap.restart=true

    wrapper.java.monitor.gc=MYGC: {0\,number\,integer}\, {1\,number\,integer}\, {2\,number\,integer}
    wrapper.java.monitor.gc.interval=60

    wrapper.java.monitor.deadlock = true
    wrapper.filter.trigger.deadlock.restart=wrapper.java.monitor.deadlock: DEADLOCK IN THREADS:
    wrapper.filter.action.deadlock.restart=RESTART

    wrapper.script.RUN=scripts/sendMail.gv
    wrapper.script.RUN.args=mail@mail.com, Autocloud Agent - Servicio lanzado, Servicio lanzado

    wrapper.script.ABORT=scripts/sendMail.gv
    wrapper.script.ABORT.args = mail@mail.com, Autocloud Agent - Servicio abortado, Servicio abortado

    wrapper.script.STOP=scripts/sendMail.gv
    wrapper.script.STOP.args = mail@mail.com, Autocloud Agent - Servicio parado, Servicio parado

    wrapper.script.RESTART=scripts/sendMail.gv
    wrapper.script.RESTART.args = mail@mail.com, Autocloud Agent - Servicio reiniciado, Servicio reiniciado

    wrapper.script.SHUTDOWN=scripts/sendMail.gv
    wrapper.script.SHUTDOWN.args = mail@mail.com, Autocloud Agent - Wrapper Shutdown, Wrapper shutdown

    wrapper.filter.trigger.exception=Exception
    wrapper.filter.script.exception=scripts/sendMail.gv
    wrapper.filter.script.exception.args= mail@mail.com, Autocloud Agent - Excepción, Detectada excepción en autocloud Agent $process.triggerLine -> restarting
    wrapper.filter.action.exception=RESTART

    placeHolderSoGenPropsComeHere=
    wrapper.java.app.jar = autocloudAgent.jar
    wrapper.java.command = java
    wrapper.java.additional.1 = -Dlog4j.configuration=file:${wrapper_home}\/..\/config\/log4j.properties
    wrapper.java.additional.2 = -Djavax.net.ssl.trustStore="/usr/local/java/jdk1.7.0_51/jre/lib/security/cacerts"
    wrapper.java.additional.3 = -Djavax.net.ssl.trustStorePassword=pwd
    wrapper.java.additional.4 = -XX:-HeapDumpOnOutOfMemoryError
    wrapper.java.additional.5 = -XX:HeapDumpPath=${wrapper_home}\/..\/logs/java.hprof
    wrapper.java.additional.6 = -XX:ErrorFile=${wrapper_home}\/..\/logs\/hs_err.log

    I detect that the application aborts when I get an email about it, associated to property wrapper.script.ABORT. I would expect application restarting after it, but after that mail I get one of Wrapper Shutdown, associated to entry wrapper.script.SHUTDOWN.

    Log file shows following information:

    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|waitpid 4468 36608
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|exit code posix process: 36608 application: 0
    INFO|4468/0|autocloudAgent|14-03-31 21:05:59|Controller State: LOGGED_ON -> WAITING_CLOSED
    INFO|4468/0|autocloudAgent|14-03-31 21:05:59|session closed -> waiting
    INFO|4468/0|autocloudAgent|14-03-31 21:05:59|process terminated
    INFO|4468/0|autocloudAgent|14-03-31 21:05:59|Controller State: WAITING_CLOSED -> PROCESS_KILLED
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|restart process due to default exit code rule
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|shutdown wrapper due to exit code rule
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|giving up after 0 retries
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|set state RUNNING->STATE_ABORT
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|stop from Thread yajsw.controller-2 reason: USER
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|process not in state RUNNING -> Delaying stop
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|set state STATE_ABORT->IDLE
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|shutdown wrapper due to exit code rule
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|stop from Thread yajsw.controller-2 reason: WRAPPER SHUTDOWN
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|set state IDLE->STATE_SHUTDOWN
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|Shutting down Wrapper
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|sending mail to
    INFO|wrapper|autocloudAgent|14-03-31 21:05:59|sending mail to
    INFO|wrapper|autocloudAgent|14-03-31 21:06:02|mail sent
    INFO|wrapper|autocloudAgent|14-03-31 21:06:02|mail sent

    There is one line showing restart process will be done, but just after it there is one showing wrapper will shutdown.

    Is this the correct behavior when application aborts? Or am I configuring something wrong on wrapper.conf file to get the behaviour I'm expecting?

    One more thing: I have set wrapper.debug=true, but there isn't more information of log file. Shouldn't be shown some more information? I was expecting to have some information regarding app abortion on log file, but I'm not sure if it is possible.

    Regards,
    Daniel

     
  • rzo
    rzo
    2014-04-01

    hello,

    try setting:

    wrapper.control = APPLICATION

    and removing:

    wrapper.on_exit.0=SHUTDOWN

    -- Ron

     
  • Daniel
    Daniel
    2014-04-03

    Hello Ron,

    tested and it is working, app is restarted when it aborts, and I could see an uncaught exception message.

    Thanks,
    Daniel

     
  • Daniel
    Daniel
    2014-04-28

    Hello Ron,

    I come back to this post, because I'm experiencing some issues related to app restarting. I use same post, because wrapper.conf is the one I posted originally, just included the change of wrapper.control = APPLICATION you commented.

    The issue is that the application controlled by YASW is randomly restarting (application shouldn't stop ever), and no exception is shown, neither in app logs or in wrapper logs. At the end wrapper logs during restart are attached.

    wrapper.debug is set to TRUE, and memory is controlled, just to check if this is the problem, but nothing is shown.

    OS is Ubuntu 12.04, and Java is Oracle 7.

    Do you see something else that could be set in wrapper to check whether reason stopping the application can be catched?

    Best regards,
    Daniel


    INFO|wrapper|14-04-25 16:48:07|Controller State: LOGGED_ON -> WAITING_CLOSED
    INFO|wrapper|14-04-25 16:48:07|session closed -> waiting
    INFO|wrapper|14-04-25 16:48:07|waitpid 12587 36608
    INFO|wrapper|14-04-25 16:48:07|exit code posix process: 36608 application: 0
    INFO|wrapper|14-04-25 16:48:07|process terminated
    INFO|wrapper|14-04-25 16:48:07|Controller State: WAITING_CLOSED -> PROCESS_KILLED
    INFO|wrapper|14-04-25 16:48:07|restart process due to default exit code rule
    INFO|wrapper|14-04-25 16:48:07|restart internal RUNNING
    INFO|wrapper|14-04-25 16:48:07|set state RUNNING->RESTART
    INFO|wrapper|14-04-25 16:48:07|restarting 1 time
    INFO|wrapper|14-04-25 16:48:07|set state RESTART->RESTART_STOP
    INFO|wrapper|14-04-25 16:48:07|stopping process with pid/timeout 12587 75000
    INFO|wrapper|14-04-25 16:48:07|Controller State: PROCESS_KILLED -> USER_STOP
    INFO|wrapper|14-04-25 16:48:07|killing 12587
    INFO|wrapper|14-04-25 16:48:07|sending mail to xxxxx
    INFO|12587/1|14-04-25 16:48:08| gobler terminated OUTPUT 12587 java.io.IOException: Stream closed
    INFO|12587/1|14-04-25 16:48:08|gobler terminated OUTPUT 12587
    INFO|wrapper|14-04-25 16:48:08|process exit code: 0
    INFO|wrapper|14-04-25 16:48:08|set state RESTART_STOP->RESTART_WAIT
    INFO|12587/1|14-04-25 16:48:08|gobler terminated ERROR 12587
    INFO|wrapper|14-04-25 16:48:09|mail sent xxxxx
    INFO|wrapper|14-04-25 16:48:13|set state RESTART_WAIT->RESTART_START
    INFO|wrapper|14-04-25 16:48:13|starting Process
    INFO|wrapper|14-04-25 16:48:13|starting controller
    INFO|wrapper|14-04-25 16:48:13|Controller State: USER_STOP -> UNKNOWN
    INFO|wrapper|14-04-25 16:48:13|Controller State: UNKNOWN -> WAITING
    INFO|wrapper|14-04-25 16:48:13|binding successfull
    INFO|wrapper|14-04-25 16:48:13|controller started
    INFO|wrapper|14-04-25 16:48:13|_osProcess destroyed
    INFO|wrapper|14-04-25 16:48:13|working dir
    INFO|wrapper|14-04-25 16:48:13|create script: scripts/sendMail.gv
    INFO|wrapper|14-04-25 16:48:13|found script scripts/sendMail.gv
    INFO|wrapper|14-04-25 16:48:13|spawning wrapped process
    INFO|wrapper|14-04-25 16:48:13|exec:java .......
    INFO|wrapper|14-04-25 16:48:13|started process 1875
    INFO|wrapper|14-04-25 16:48:13|started process with pid 1875
    INFO|wrapper|14-04-25 16:48:13|waiting for termination of process
    INFO|wrapper|14-04-25 16:48:13|set state RESTART_START->RUNNING
    INFO|wrapper|14-04-25 16:48:13|sending mail to xxxxxx
    INFO|1875/1|14-04-25 16:48:15|[INFO] StandardFileSystemManager - Using "/tmp/vfs_cache" as temporary files store.
    INFO|1875/1|14-04-25 16:48:15|APP user name
    INFO|1875/1|14-04-25 16:48:15|APP working dir=.....
    INFO|1875/1|14-04-25 16:48:15|APP java version=1.7.0_51
    INFO|1875/1|14-04-25 16:48:15|APP class ....
    INFO|1875/1|14-04-25 16:48:15|APP library path=/usr/java/packages/lib/i386:/lib:/usr/lib
    INFO|1875/1|14-04-25 16:48:15|Application args: no args
    INFO|1875/1|14-04-25 16:48:15|mainClass/jar/script: /autocloudAgent.jar/null
    INFO|1875/1|14-04-25 16:48:15|external stop false
    INFO|1875/1|14-04-25 16:48:15|args:
    INFO|1875/1|14-04-25 16:48:15|monitor deadlock: start
    INFO|1875/1|14-04-25 16:48:15|monitor heap: start
    INFO|1875/1|14-04-25 16:48:15|terminated WrapperManager.init()
    INFO|1875/1|14-04-25 16:48:15|connecting to port 15003
    INFO|wrapper|14-04-25 16:48:15|Controller State: WAITING -> ESTABLISHED
    INFO|wrapper|14-04-25 16:48:15|Controller State: ESTABLISHED -> LOGGED_ON

     
  • rzo
    rzo
    2014-04-28

    hello,

    the log shows that the tcp/ip communication between the spawned process and the wrapper is disconnected. then it logs that the application has terminated. the exit code of the application is 0.

    There is no further information on why the application has stopped.

    Is there anything else in the log, before the disconnect ?

    What yajsw release are you using ?

    -- Ron

     
  • Daniel
    Daniel
    2014-04-29

    Hello Ron,

    There is nothing on the log between the init and the disconnect, just some System.out of the application.

    I'm using release stable-11.11.

    Thanks,
    Daniel

     
  • Daniel
    Daniel
    2014-04-29

    Just one comment:

    There isn't any System.exit(0) in the application, I don't know why exit code 0 is thrown (unless it is thrown by some library used by the application)

     
  • rzo
    rzo
    2014-04-29

    hello,

    there are no known issues for release 11.11

    maybe you should check the syslog for any indication.

    it could be that for some reason the system or another process is sending a INT signal to the application.

    is the application running as daemon or using runConsole ?

    -- Ron

     
  • Daniel
    Daniel
    2014-04-30

    Hello Ron,

    there is nothing on syslog, anyhow I'll wait until it aborts again to check one more time.

    Application is running using runConsole, I'll try as daemon just for double checking.

    Daniel

     
  • rzo
    rzo
    2014-05-03

    hello,

    if you are running it as a console maybe you could add the following configuration:

    wrapper.java.additional.xrs = -Xrs

    this will block signals from stopping your application, this is automatically set for services/daemons.

    -- Ron

     
  • Daniel
    Daniel
    2014-05-05

    Hello Ron,

    I'll try with that option, although I think it won't solve the problem. I have done some tests running yasw as a daemon, and it also stops.

    It seems somehow the problem is related to Ubuntu, as same app has been tested in Windows and Debian, and it is working fine.

    Daniel