Menu

YAJSW occasionally fail to sending command to wrapper manager

Help
bob
2017-01-17
2017-01-22
  • bob

    bob - 2017-01-17

    In my environment(Linux 2.6.32/x86_64), YAJSW wrapper fail to sending command(Such as STOP) to paird wrapper manager, wihch cause by application unable to stop properly.

    in nornally case, when i call stop script, it give me the logs :

    INFO|wrapper|17-01-17 16:43:42|stopping process with pid/timeout 19911 330000
    INFO|19911/0|17-01-17 16:43:42|wrapper manager received stop command
    INFO|wrapper|17-01-17 16:43:53|waitpid 19911 0
    INFO|wrapper|17-01-17 16:43:53|exit code posix process: 0 application: 0
    INFO|wrapper|17-01-17 16:43:53|killing 19911
    INFO|wrapper|17-01-17 16:43:53|process exit code: 0
    INFO|wrapper|17-01-17 16:43:53|shutdown wrapper due to exit code rule
    

    However, somethings it give me such logs(and my application are unable to stop properly):

    INFO|wrapper|17-01-17 16:41:13|stopping process with pid/timeout 18624 330000
    INFO|18624/0|17-01-17 16:41:26|app already logged on -> rejecting new connection
    INFO|18624/0|17-01-17 16:41:31|app already logged on -> rejecting new connection
    INFO|18624/0|17-01-17 16:41:36|app already logged on -> rejecting new connection
    INFO|18624/0|17-01-17 16:41:41|app already logged on -> rejecting new connection
    INFO|18624/0|17-01-17 16:41:46|app already logged on -> rejecting new connection
    INFO|18624/0|17-01-17 16:41:51|app already logged on -> rejecting new connection
    INFO|18624/0|17-01-17 16:41:56|app already logged on -> rejecting new connection
    INFO|18624/0|17-01-17 16:42:01|app already logged on -> rejecting new connection
    INFO|18624/0|17-01-17 16:42:06|app already logged on -> rejecting new connection
    INFO|18624/0|17-01-17 16:42:11|app already logged on -> rejecting new connection
    INFO|18624/0|17-01-17 16:42:16|app already logged on -> rejecting new connection
    

    the logs of wrong case(unable to stop app) in debug mode(wrapper.debug = true):

    **Application started**
    ...
    INFO|16251/0|17-01-17 16:38:52|Controller State: ESTABLISHED -> LOGGED_ON
    INFO|16251/0|17-01-17 16:38:52|Correct key
    INFO|16251/0|17-01-17 16:38:56|Wrong key -> closing session
    INFO|16251/0|17-01-17 16:38:56|Controller State: LOGGED_ON -> WAITING_CLOSED
    INFO|16251/0|17-01-17 16:38:56|session closed -> waiting
    INFO|16251/0|17-01-17 16:38:56|session closed
    INFO|16251/0|17-01-17 16:38:56|try reconnect
    INFO|16251/0|17-01-17 16:39:01|Controller State: WAITING_CLOSED -> ESTABLISHED
    INFO|16251/0|17-01-17 16:39:01|connecting to port 15003
    INFO|16251/0|17-01-17 16:39:01|WrapperManager: channel connected, sending key
    ...(repeated logs of above)
    
    **invoke stop script**
    
    INFO|wrapper|17-01-17 16:39:10|stop from Thread Thread-3 reason: USER
    INFO|wrapper|17-01-17 16:39:10|set state RUNNING->STATE_USER_STOP
    INFO|wrapper|17-01-17 16:39:10|stopping process with pid/timeout 16251 330000
    INFO|16251/0|17-01-17 16:39:10|Controller State: WAITING_CLOSED -> USER_STOP
    INFO|16251/0|17-01-17 16:39:10|Controller State: USER_STOP -> WAITING_CLOSED
    INFO|16251/0|17-01-17 16:39:10|session closed -> waiting
    INFO|16251/0|17-01-17 16:39:11|Controller State: WAITING_CLOSED -> ESTABLISHED
    INFO|16251/0|17-01-17 16:39:11|connecting to port 15003
    INFO|16251/0|17-01-17 16:39:11|WrapperManager: channel connected, sending key
    INFO|16251/0|17-01-17 16:39:12|Controller State: ESTABLISHED -> LOGGED_ON
    INFO|16251/0|17-01-17 16:39:12|Correct key
    INFO|16251/0|17-01-17 16:39:16|Wrong key -> closing session
    INFO|16251/0|17-01-17 16:39:16|session closed
    FINEST|16251/0|17-01-17 16:39:16|try reconnect
    INFO|16251/0|17-01-17 16:39:21|app already logged on -> rejecting new connection
    INFO|16251/0|17-01-17 16:39:21|unexpected connection closed: /127.0.0.1:52058
    FINEST|16251/0|17-01-17 16:39:21|connecting to port 15003
    FINEST|16251/0|17-01-17 16:39:21|WrapperManager: channel connected, sending key
    FINEST|16251/0|17-01-17 16:39:21|session closed
    FINEST|16251/0|17-01-17 16:39:21|try reconnect
    INFO|16251/0|17-01-17 16:39:26|app already logged on -> rejecting new connection
    FINEST|16251/0|17-01-17 16:39:26|connecting to port 15003
    INFO|16251/0|17-01-17 16:39:26|unexpected connection closed: /127.0.0.1:52060
    FINEST|16251/0|17-01-17 16:39:26|WrapperManager: channel connected, sending key
    FINEST|16251/0|17-01-17 16:39:26|session closed
    FINEST|16251/0|17-01-17 16:39:26|try reconnect
    
    **... (repeated util timeout)**
    

    the logs of normal case in debug mode:

    INFO|4503/0|17-01-17 17:09:06|Controller State: WAITING_CLOSED -> ESTABLISHED
    INFO|4503/0|17-01-17 17:09:06|connecting to port 15003
    INFO|4503/0|17-01-17 17:09:06|WrapperManager: channel connected, sending key
    INFO|4503/0|17-01-17 17:09:07|Controller State: ESTABLISHED -> LOGGED_ON
    INFO|4503/0|17-01-17 17:09:07|Correct key
    INFO|wrapper|17-01-17 17:09:08|stop from Thread Thread-3 reason: USER
    INFO|wrapper|17-01-17 17:09:08|set state RUNNING->STATE_USER_STOP
    INFO|wrapper|17-01-17 17:09:08|stopping process with pid/timeout 4503 330000
    INFO|4503/0|17-01-17 17:09:08|Controller State: LOGGED_ON -> USER_STOP
    INFO|4503/0|17-01-17 17:09:08|controller sending a stop command
    INFO|4503/0|17-01-17 17:09:09|controller sending a stop command
    INFO|4503/0|17-01-17 17:09:09|controller sending a stop command
    INFO|4503/0|17-01-17 17:09:09|Controller State: USER_STOP -> WAITING_CLOSED
    INFO|4503/0|17-01-17 17:09:09|session closed -> waiting
    INFO|4503/0|17-01-17 17:09:09|session closed -> waiting
    INFO|4503/0|17-01-17 17:09:09|wrapper manager received stop command
    INFO|wrapper|17-01-17 17:09:19|waitpid 4503 0
    INFO|wrapper|17-01-17 17:09:19|exit code posix process: 0 application: 0
    INFO|wrapper|17-01-17 17:09:19|killing 4503
    INFO|4503/0|17-01-17 17:09:19|process terminated
    INFO|4503/0|17-01-17 17:09:19|Controller State: WAITING_CLOSED -> PROCESS_KILLED
    INFO|wrapper|17-01-17 17:09:19|giving up after 0 retries
    INFO|wrapper|17-01-17 17:09:19|stop from Thread yajsw.controller-1 reason: USER
    INFO|wrapper|17-01-17 17:09:19|process not in state RUNNING -> Delaying stop
    INFO|4503/0|17-01-17 17:09:20|gobler terminated OUTPUT 4503
    INFO|wrapper|17-01-17 17:09:20|process exit code: 0
    INFO|wrapper|17-01-17 17:09:20|set state STATE_USER_STOP->IDLE
    INFO|wrapper|17-01-17 17:09:20|removed lock file /home/path/wrapper/bootstrap.lock
    INFO|wrapper|17-01-17 17:09:20|removed java.pid file /home/path/wrapper/agent.bootstrap.pid
    INFO|wrapper|17-01-17 17:09:20|removed pid file /home/path/wrapper/wrapper.bootstrap.pid
    INFO|wrapper|17-01-17 17:09:20|shutdown wrapper due to exit code rule
    INFO|wrapper|17-01-17 17:09:20|set state IDLE->STATE_SHUTDOWN
    

    It seems sometimes the wrapper is unable to send STOP command to wrapperManager.

    YAJSW: yajsw-stable-12.08
    OS   : Linux/2.6.32-220.23.2.ali878.el6.x86_64/amd64
    JVM  : Oracle Corporation/1.8.0_66
    

    Any help would be appreciated!

     

    Last edit: bob 2017-01-17
  • bob

    bob - 2017-01-17

    BTW: my application is a simply main method with wrapper.exit_on_main_terminate = -1, and using shutdown hook to close.( expect wrapperManager to call system.exit(0) )

     

    Last edit: bob 2017-01-17
  • rzo

    rzo - 2017-01-22

    hello,

    it seems that there are multiple wrapped processes trying to connect to the same wrapper.
    before proceeding make sure that all java applications are stopped.

    did you set the wrapper.control configuration property ? it should be either TIGHT or APPLICATION

    could it be that you are invoking start and stop in parallel ?

    -- Ron

     
  • bob

    bob - 2017-05-02

    Hi Ron, sorry for so lately repsonse.

    i have tried various options for wrapper.control property, but i am sure the property didn't effect the phenomenon, i still face the problem of shutdown processing.

     
  • bob

    bob - 2017-05-02

    And i am sure we call the wrapper only once. i think the repearted reconnect of wrapped processes is caused by client's retries: org.rzo.yajsw.app.WrapperManagerImpl#reconnect

     

    Last edit: bob 2017-05-02

Log in to post a comment.