#395 Linuxcncrsh stop responding

Milestone: 2.6
Status: closed-fixed
Owner: nobody
Labels: linuxcncrsh (1)
Priority: 1
Updated: 2014-11-16
Created: 2014-10-02
Private: No

Linuxcncrsh stops responding after several mode switches.
On 2.6.3 :
- start axis and linuxcncrsh server
- power on machine in Axis and set machine origins
- play this scenario :
> HELLO EMC CLIENT 1.1
HELLO ACK EMCNETSVR 1.1
> SET ECHO OFF
SET ECHO OFF
> SET ENABLE EMCTOO
> SET MODE MDI
SET MODE NAK
> SET MODE MDI
SET MODE NAK
> SET MODE MANUAL
SET MODE NAK
> GET MODE
MODE MANUAL
> SET MODE MANUAL
> SET MODE MDI
> SET MODE MANUAL
> SET MODE MDI
> SET MDI G21 G40 G64 G91
> SET MDI G1 F1522 Y1
> GET MODE
=> NO ANSWER

As you can see, it is impossible to set the mode at the beginning.

And after switching to manual mode, MDI mode does not respond: the two MDI commands are not executed by the server, even though no error is reported.
Linuxcncrsh then stops responding entirely: there is no answer to "GET MODE".

This issue was not present in 2.5.

LinuxCNC is running on Ubuntu 12.04.5 LTS; the CNC machine is a 4-axis router (XYZA).

Discussion

  • Sebastian Kuzminsky

    The change in behavior is apparently intentional, and implemented by
    commit 33aa697a:

     Add a check for certain 'Set' commands.
    
     Add a check for certain 'Set' commands to make sure they cannot be
     issued when the machine state is other than 'ON'.
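
To make the quoted commit message concrete, here is a minimal sketch of such a guard. It is not the code from commit 33aa697a: the enum, the function name `rejectSet`, and the choice of `MODE` as the only guarded command are all hypothetical. `MODE` is used because it is the command that returns NAK in this ticket, while `ESTOP` and `MACHINE` must remain allowed so the machine can be brought to 'ON' in the first place.

```cpp
#include <string>

// Hypothetical machine states for illustration; not LinuxCNC's own enum.
enum class MachineState { Estop, EstopReset, Off, On };

// Returns true when "SET <name> ..." should be answered with NAK.
// Assumption: only MODE is guarded here, because that is the rejection
// shown in this ticket. The real list of guarded commands is not given.
bool rejectSet(const std::string& name, MachineState state) {
    const bool guarded = (name == "MODE");
    return guarded && state != MachineState::On;
}
```

Under these assumptions, `rejectSet("MODE", MachineState::Off)` is true, whereas `rejectSet("ESTOP", MachineState::Estop)` is false, which matches the order of commands in the suggested sequence.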
    

    So try this sequence instead:

     > hello EMC client 1.1
     HELLO ACK i 1.1
     > set echo off
     set echo off
     > set enable EMCTOO
     > set estop off
     > set machine on
     > set mode manual
     > get mode
     MODE MANUAL
     > set mode mdi
     > get mode
     MODE MDI
    
     
    • Sharkillator

      Sharkillator - 2014-10-02

      Thank you for your answer.

      It works until you do :

      set mode manual
      get mode
      MODE MANUAL
      set mode mdi
      get mode
      MODE MDI
      set mode manual
      get mode
      => NO ANSWER

      This test looks a bit stupid written like that, but the same problem occurs when other commands are sent between mode switches, and we use software that needs to send manual and MDI commands very often.

       
  • Sebastian Kuzminsky

    I confirm that mode setting via linuxcncrsh (as described by OP) works
    in 2.5 and does not work in 2.6.

    We have a linuxcncrsh test in our test suite, and it passes on both 2.5
    and 2.6. It does something very similar:

     hello EMC mt 1.0
     set enable EMCTOO
     set set_wait done
     set mode manual
    

    If I remove the 'set set_wait done' line, then the 'set mode manual' on
    the next line fails. Not sure why yet, but I'll look into it. In the
    meantime, try adding 'set set_wait done' after enable and see if that
    works around the problem.

     
    • Sharkillator

      Sharkillator - 2014-10-02

      Same problem here. It works until you do : manual, mdi, manual

       
  • Sharkillator

    Sharkillator - 2014-10-02

    Sorry, message was not where it should have been. corrected ;-)

     

    Last edit: Sharkillator 2014-10-02
  • Jeff Epler

    Jeff Epler - 2014-10-02

    In my testing, I found that when the last 'SET MODE MDI' is issued,
    linuxcncrsh's thread for that connection stops forever in
    emcCommandWaitReceived, similar to the following:

    #0 0x00007ffff6c88e93 in select () at ../sysdeps/unix/syscall-template.S:81
    #1 0x00007ffff7baaa7a in esleep (seconds_to_sleep=0.10000000000000001) at libnml/os_intf/_timer.c:92
    #2 0x00000000004060e6 in emcCommandWaitReceived (serial_number=15) at emc/usr_intf/shcom.cc:298
    #3 0x0000000000406470 in sendManual () at emc/usr_intf/shcom.cc:504
    #4 0x0000000000402fd4 in setMode (context=0x61cdf0, s=0x61ce61 "MANUAL") at emc/usr_intf/emcrsh.cc:891
    #5 commandSet (context=context@entry=0x61cdf0) at emc/usr_intf/emcrsh.cc:1259
    #6 0x0000000000405993 in parseCommand (context=context@entry=0x61cdf0) at emc/usr_intf/emcrsh.cc:2641
    #7 0x0000000000405abe in readClient (arg=0x61cdf0) at emc/usr_intf/emcrsh.cc:2707
    #8 0x00007ffff777c0a4 in start_thread (arg=0x7ffff6ba9700) at pthread_create.c:309
    #9 0x00007ffff6c8fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

    #2 0x00000000004060e6 in emcCommandWaitReceived (serial_number=15) at emc/usr_intf/shcom.cc:298

    (gdb) list
    [inside what is effectively a 'while forever' loop]
    294 if (emcStatus->echo_serial_number == serial_number) {
    295 return 0;
    296 }
    297
    298 esleep(EMC_COMMAND_DELAY);
    299 end += EMC_COMMAND_DELAY;
    (gdb) p emcStatus->echo_serial_number
    $1 = 17

    linuxcncrsh has issued command #15 to switch to manual mode. axis notices the switch and issues
    two commands of its own, making the final serial number be 17 instead of 15:
    Issuing EMC_TASK_SET_MODE -- (+504,+24, +21, +1,)
    Issuing EMC_TASK_PLAN_SYNCH -- (+516,+24, +0,)
    Issuing EMC_TRAJ_SET_TELEOP_ENABLE -- (+230,+24, +20, +0,)

    linuxcnc's design of using the 'echo_serial_number' to determine whether a
    command has been accepted has always been fragile. linuxcncrsh needs to change
    what it's doing here, because when it is not the only UI, the algorithm used in
    emcCommandWaitReceived will hang like this.
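
The hang described above can be reproduced in isolation with a small sketch. This is not the code from shcom.cc and not necessarily how 2.7 addresses the problem: the function names, the `readEcho` callback, and the `maxPolls` bound are hypothetical. The first function keeps the exact-match comparison from the gdb listing but gives up after a bounded number of polls; the second treats any echoed serial at or beyond ours as acknowledged, which only makes sense under the assumption that serial numbers increase monotonically across all UIs.

```cpp
#include <functional>

// Outcome of waiting for a command to be acknowledged.
enum class WaitResult { Received, Timeout };

// Exact-match wait, as in the listing: succeeds only if the echoed serial
// equals ours. Bounded by maxPolls so it cannot spin forever.
WaitResult waitExact(int serial, const std::function<int()>& readEcho,
                     int maxPolls) {
    for (int i = 0; i < maxPolls; ++i) {
        if (readEcho() == serial) return WaitResult::Received;
        // A real implementation would sleep between polls here.
    }
    return WaitResult::Timeout;
}

// Tolerant wait (an assumption, not the project's fix): any echoed serial
// at or past ours counts, so commands from another UI cannot hide ours.
WaitResult waitAtLeast(int serial, const std::function<int()>& readEcho,
                       int maxPolls) {
    for (int i = 0; i < maxPolls; ++i) {
        if (readEcho() >= serial) return WaitResult::Received;
        // A real implementation would sleep between polls here.
    }
    return WaitResult::Timeout;
}
```

With the values from the trace (our serial 15, echoed serial already 17), `waitExact` runs out of polls, which corresponds to the unbounded loop never returning, while `waitAtLeast` returns immediately.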

     
  • Sharkillator

    Sharkillator - 2014-10-03

    I've just tested another thing -> jog
    You can't stop a jog: if you do a "set jog 0 100", for example, the jog starts but linuxcncrsh stops responding after that, so you can't stop it with "set jog_stop 0".
    Jog is done in manual mode.

     
  • Sebastian Kuzminsky

    Jeff, it looks to me like halui and Axis have the same "while echo_serial != my_serial" wait loop (emcWaitCommandReceived() in emcmodule.cc and halui.cc). So this is not exclusively a linuxcncrsh problem, but a race in several (all?) of our UIs.

     
  • Jeff Epler

    Jeff Epler - 2014-11-16
    • status: open --> closed-fixed
     
  • Jeff Epler

    Jeff Epler - 2014-11-16

    We believe this bug will be fixed in linuxcnc 2.7.