linuxcncrsh stops responding after several mode switches.
On 2.6.3:
As you can see, it is impossible to set mode at the beginning.
After switching to manual mode, MDI mode does not respond: the two MDI commands are not executed by the server, even though no error is reported.
linuxcncrsh then stops responding altogether; there is no answer to "GET MODE".
This issue was not present in 2.5.
LinuxCNC is running on Ubuntu 12.04.5 LTS; the CNC machine is a 4-axis router (XYZA).
The change in behavior is apparently intentional, and implemented by
commit 33aa697a:
So try this sequence instead:
Thank you for your answer.
It works until you do:
Written like that the test looks a bit silly, but the same problem occurs when there are commands between the mode switches, and we use software that needs to be able to send manual and MDI commands very often.
I confirm that mode setting via linuxcncrsh (as described by OP) works
in 2.5 and does not work in 2.6.
We have a linuxcncrsh test in our test suite, and it passes on both 2.5
and 2.6. It does something very similar:
If I remove the 'set set_wait done' line, then the 'set mode manual' on
the next line fails. Not sure why yet, but I'll look into it. In the
meantime, try adding 'set set_wait done' after enable and see if that
works around the problem.
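The suggested workaround can be sketched as a small client script. This is only an illustration, not the test from the test suite: it assumes linuxcncrsh is listening on its default port 5007 on localhost with the default passwords (EMC for hello, EMCTOO for enable), and the helper names build_session and send_session are made up for this example. The receive timeout keeps the client from blocking forever if the server stops answering.

```python
import socket


def build_session(hello_pw="EMC", enable_pw="EMCTOO", client="sketch"):
    """Return the command lines for a session that sets set_wait right after enable.

    The passwords are the assumed linuxcncrsh defaults; adjust for your setup.
    """
    return [
        f"hello {hello_pw} {client} 1.0",
        f"set enable {enable_pw}",
        "set set_wait done",  # the workaround suggested above
        "set mode manual",
        "get mode",
    ]


def send_session(commands, host="localhost", port=5007, timeout=2.0):
    """Send each command and collect whatever the server sends back.

    A reply of None means nothing arrived within `timeout` seconds, which is
    how a hung linuxcncrsh connection would show up on the client side.
    """
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for line in commands:
            sock.sendall((line + "\r\n").encode("ascii"))
            try:
                replies.append(sock.recv(4096).decode("ascii", "replace"))
            except socket.timeout:
                replies.append(None)
    return replies
```

With a running LinuxCNC, send_session(build_session()) returns one entry per command. Some "set" commands may legitimately produce no reply, so a None entry is only suspicious for a "get".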
Same problem here. It works until you do: manual, mdi, manual.
Sorry, message was not where it should have been. corrected ;-)
Last edit: Sharkillator 2014-10-02
In my testing, I found that when the last 'SET MODE MDI' is issued,
linuxcncrsh's thread for that connection stops forever in
emcCommandWaitReceived, similar to the following:
#0 0x00007ffff6c88e93 in select () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007ffff7baaa7a in esleep (seconds_to_sleep=0.10000000000000001) at libnml/os_intf/_timer.c:92
#2 0x00000000004060e6 in emcCommandWaitReceived (serial_number=15) at emc/usr_intf/shcom.cc:298
#3 0x0000000000406470 in sendManual () at emc/usr_intf/shcom.cc:504
#4 0x0000000000402fd4 in setMode (context=0x61cdf0, s=0x61ce61 "MANUAL") at emc/usr_intf/emcrsh.cc:891
#5 commandSet (context=context@entry=0x61cdf0) at emc/usr_intf/emcrsh.cc:1259
#6 0x0000000000405993 in parseCommand (context=context@entry=0x61cdf0) at emc/usr_intf/emcrsh.cc:2641
#7 0x0000000000405abe in readClient (arg=0x61cdf0) at emc/usr_intf/emcrsh.cc:2707
#8 0x00007ffff777c0a4 in start_thread (arg=0x7ffff6ba9700) at pthread_create.c:309
#9 0x00007ffff6c8fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
#2 0x00000000004060e6 in emcCommandWaitReceived (serial_number=15) at emc/usr_intf/shcom.cc:298
(gdb) list
[inside what is effectively a 'while forever' loop]
294 if (emcStatus->echo_serial_number == serial_number) {
295 return 0;
296 }
297
298 esleep(EMC_COMMAND_DELAY);
299 end += EMC_COMMAND_DELAY;
(gdb) p emcStatus->echo_serial_number
$1 = 17
linuxcncrsh has issued command #15 to switch to manual mode. Axis notices the switch and issues
two commands of its own, so the final serial number is 17 instead of 15:
Issuing EMC_TASK_SET_MODE -- (+504,+24, +21, +1,)
Issuing EMC_TASK_PLAN_SYNCH -- (+516,+24, +0,)
Issuing EMC_TRAJ_SET_TELEOP_ENABLE -- (+230,+24, +20, +0,)
linuxcnc's design of using the 'echo_serial_number' to determine whether a
command has been accepted has always been fragile. linuxcncrsh needs to change
what it's doing here, because when it is not the only UI, the algorithm used in
emcCommandWaitReceived will hang like this.
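The hang described above can be reproduced in a small simulation. This is a sketch only: the function names are hypothetical, the list stands in for the successive values of echo_serial_number that a polling loop would observe, and the ">=" comparison is just one possible mitigation (it ignores serial number wraparound and is not claimed to be the change LinuxCNC adopts).

```python
def wait_received_exact(echo_serials, serial_number):
    """Mimic the equality test: succeed only if some poll sees exactly our serial."""
    for echoed in echo_serials:
        if echoed == serial_number:
            return True
    # The real loop has no such exit; it would keep sleeping and polling.
    return False


def wait_received_at_least(echo_serials, serial_number):
    """Alternative test: succeed once the echoed serial has reached or passed ours."""
    for echoed in echo_serials:
        if echoed >= serial_number:
            return True
    return False


# Values seen by successive polls: another UI issued two commands between
# polls, so the echoed serial jumps from 14 straight to 17 and 15 is never seen.
polls = [14, 17, 17, 17]

print(wait_received_exact(polls, 15))     # the equality test never matches
print(wait_received_at_least(polls, 15))  # the ordering test matches at 17
```

The equality test works only when no other client advances the serial number between two polls, which is exactly the condition that breaks when Axis is running alongside linuxcncrsh.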
I've just tested another thing: jog.
You can't stop a jog: if you send "set jog 0 100", for example, the jog starts but linuxcncrsh stops responding after that, so you can't stop it with "set jog_stop 0".
Jogging is done in manual mode.
Jeff, it looks to me like halui and Axis have the same "while echo_serial != my_serial" wait loop (emcWaitCommandReceived() in emcmodule.cc and halui.cc). So this is not exclusively a linuxcncrsh problem, but a race in several (all?) of our UIs.
We believe this bug will be fixed in linuxcnc 2.7.