One of our (LibreELEC) users reported that irsend / lircd would sometimes hang. This was with lirc 0.9.4d and I could now also reproduce this on lirc 0.10.1.
I seem to be able to trigger the issue by invoking irsend 2 times immediately after each other in combination with either a repeat setting (min_repeat) in lircd.conf or when using a repeat setting with irsend (eg -# 3).
Here's my test setup:
lircd.conf file (sony-repeat.conf):
begin remote name Sony-RM-U305A bits 12 flags SPACE_ENC|CONST_LENGTH eps 30 aeps 100 header 2400 600 one 1200 600 zero 600 600 gap 45000 toggle_bit_mask 0x0 frequency 38000 min_repeat 3 begin codes KEY_VIDEO 0x441 KEY_POWER 0xA81 KEY_DVD 0xBE1 KEY_VOLUMEUP 0x481 KEY_VOLUMEDOWN 0xC81 KEY_MUTE 0x281 end codes end remote
lircd started with the following options
/opt/lirc-0.10.1/sbin/lircd --driver=default --device=/dev/lirc0 -n -o /tmp/lirc -P /tmp/lircd.pid -Dtrace sony-repeat.conf
running irsend 2 times:
/opt/lirc-0.10.1/bin/irsend -d /tmp/lirc SEND_ONCE Sony-RM-U305A KEY_POWER ; /opt/lirc-0.10.1/bin/irsend -d /tmp/lirc SEND_ONCE Sony-RM-U305A KEY_POWER
At this point the second irsend is hanging and lircd output looks like this:
Warning: cannot open /opt/lirc-0.10.1/etc/lirc/lirc_options.conf lircd-0.10.1[1976]: Trace: lircd: Opening log, level: Trace lircd-0.10.1[1976]: Debug: No systemd fd found lircd-0.10.1[1976]: Trace: started server socket lircd-0.10.1[1976]: Trace: parsing remote lircd-0.10.1[1976]: Info: Using remote: Sony-RM-U305A. lircd-0.10.1[1976]: Trace: lengths: 45000 45000 24000 26400 lircd-0.10.1[1976]: Trace: config file read lircd-0.10.1[1976]: Notice: lircd(default) ready, using /tmp/lirc lircd-0.10.1[1976]: Trace: registering local client lircd-0.10.1[1976]: Notice: accepted new client on /tmp/lirc lircd-0.10.1[1976]: Debug: Cannot open protocol file: /sys/class/rc/rc0/protocols for read lircd-0.10.1[1976]: Info: Cannot configure the rc device for /dev/lirc0 lircd-0.10.1[1976]: Trace: driver supports sending lircd-0.10.1[1976]: Trace: received command: "SEND_ONCE Sony-RM-U305A KEY_POWER" lircd-0.10.1[1976]: Debug: Sending once, msg: SEND_ONCE Sony-RM-U305A KEY_POWER , args: Sony-RM-U305A KEY_POWER, once: 1 lircd-0.10.1[1976]: Trace: alarm in 25626 usecs lircd-0.10.1[1976]: Trace: alarm in 25701 usecs lircd-0.10.1[1976]: Trace: alarm in 25720 usecs lircd-0.10.1[1976]: Debug: Sending success lircd-0.10.1[1976]: Info: removed client lircd-0.10.1[1976]: Trace: registering local client lircd-0.10.1[1976]: Notice: accepted new client on /tmp/lirc lircd-0.10.1[1976]: Debug: Cannot open protocol file: /sys/class/rc/rc0/protocols for read lircd-0.10.1[1976]: Info: Cannot configure the rc device for /dev/lirc0 lircd-0.10.1[1976]: Trace: driver supports sending lircd-0.10.1[1976]: Trace: received command: "SEND_ONCE Sony-RM-U305A KEY_POWER" lircd-0.10.1[1976]: Debug: Sending once, msg: SEND_ONCE Sony-RM-U305A KEY_POWER , args: Sony-RM-U305A KEY_POWER, once: 1 lircd-0.10.1[1976]: Trace: alarm in 10 usecs
The "alarm in 10 usecs" looks rather suspicious to me, delay between repeats should be about 25ms.
When removing "min_repeat" from the config file I can provoke the hang by setting the repeat option on the second irsend invocation (it doesn't need to be set on the first one).
/opt/lirc-0.10.1/bin/irsend -d /tmp/lirc SEND_ONCE Sony-RM-U305A KEY_POWER ; /opt/lirc-0.10.1/bin/irsend -d /tmp/lirc -# 2 SEND_ONCE Sony-RM-U305A KEY_POWER
irsend hangs and I see the same "alarm in 10 usecs" message in lircd output
lircd-0.10.1[2007]: Trace: lircd: Opening log, level: Trace lircd-0.10.1[2007]: Debug: No systemd fd found lircd-0.10.1[2007]: Trace: started server socket lircd-0.10.1[2007]: Trace: parsing remote lircd-0.10.1[2007]: Info: Using remote: Sony-RM-U305A. lircd-0.10.1[2007]: Trace: lengths: 45000 45000 24000 26400 lircd-0.10.1[2007]: Trace: config file read lircd-0.10.1[2007]: Notice: lircd(default) ready, using /tmp/lirc lircd-0.10.1[2007]: Trace: registering local client lircd-0.10.1[2007]: Notice: accepted new client on /tmp/lirc lircd-0.10.1[2007]: Debug: Cannot open protocol file: /sys/class/rc/rc0/protocols for read lircd-0.10.1[2007]: Info: Cannot configure the rc device for /dev/lirc0 lircd-0.10.1[2007]: Trace: driver supports sending lircd-0.10.1[2007]: Trace: received command: "SEND_ONCE Sony-RM-U305A KEY_POWER" lircd-0.10.1[2007]: Debug: Sending once, msg: SEND_ONCE Sony-RM-U305A KEY_POWER , args: Sony-RM-U305A KEY_POWER, once: 1 lircd-0.10.1[2007]: Debug: Sending success lircd-0.10.1[2007]: Info: removed client lircd-0.10.1[2007]: Trace: registering local client lircd-0.10.1[2007]: Notice: accepted new client on /tmp/lirc lircd-0.10.1[2007]: Debug: Cannot open protocol file: /sys/class/rc/rc0/protocols for read lircd-0.10.1[2007]: Info: Cannot configure the rc device for /dev/lirc0 lircd-0.10.1[2007]: Trace: driver supports sending lircd-0.10.1[2007]: Trace: received command: "SEND_ONCE Sony-RM-U305A KEY_POWER 2" lircd-0.10.1[2007]: Debug: Sending once, msg: SEND_ONCE Sony-RM-U305A KEY_POWER 2 , args: Sony-RM-U305A KEY_POWER 2, once: 1 lircd-0.10.1[2007]: Trace: alarm in 10 usecs
I've tested this on a Raspberry Pi 2 running Raspbian Stretch with kernel 4.14 and the pwm-ir-tx driver. Reporter was using a MCEUSB IR transceiver, also on RPi.
SInce this a common function used by many users, I tend to think that this is a problem in the RPi's kernel driver. And, the kernel drivers are not part of lirc....
Have you any chance to test this on some other, non-RPi hardware?
Sorry for long delays. I lost lirc for some time, life requiring other things to be done.
EDIT: To clear things out: Unless proven wrong by a test on non-RPi hardware, this bug really belongs to the kernel (or possibly the distro, if they added their own driver).
Last edit: Alec Leamas 2018-02-11
Please note that the issues were NOT observed with the downstream
lirc_rpi kernel driver but with the upstream mceusb and pwm-ir-tx
drivers from the rc-core subsystem.
As ir-ctl is working fine with these kernel drivers a bug in lircd
seems like a likely possibility (although I won't rule out some
other cause).
ATM I can't test lirc stuff on a non-RPi platform but I redid the
steps and had a closer look with gdb.
irsend seems to be hanging in read():
lircd hangs in poll():
A very wild guess would be a race in socket communication, but I
could be wrong as I'm not really familiar with the code.
If you suspect a race as well drop me a line if you want me to do
more testing. I can reliably reproduce the issue here.
Yes, but we are still talking about the RPi, right. And just as a clarification, the kernel drivers are part of the kernel, not lirc.
So, what's really needed here is doing the same thing on some non-RPi hardware. That should clear out if this this is a kernel or lirc problem. Being the lirc maintainer, of course my fist bet is the kernel...
I could now also reproduce this on x86: Debian Stretch, vanilla upstream 64-bit kernel 4.14.17 using the serial_ir driver.
lircd started as in initial report:
then irsend started in a loop:
After about a minute or two lircd is stuck again in poll() and irsend in read()
OK, so this is about litc, agreed. Just to take the simple parts first: could you please verify the runtime linkage using ldd /opt/...irsend and ldd /opt/...lircd. Just to make sure nothing like /usr/lib/llblirc.so..0 .0.0 is imvolved?
ldd output looks fine to me:
Just noticed that when I run lircd with valgrind the lockup happens quite reproducibly here on my x86 box when running 2 irsend commands right after each other (irsend ... ; irsend ...).
Hope this helps you to reproduce the issue
I have also run into this issue "irsend / lircd hang when using repeats". It's not a irsend/lircd hard hang I see. Rather the IR send stalls and never completes.
I believe I have tracked down the root cause of the issue. lircd uses SIGALRM to time the delay between IR repeats. There are race conditions with the delivery time of sigalrm and where sigalrm interrupts and resumes lircd main code execution, which affect irsend hang/stall. As such, the results of the race(s) are highly dependent on the host platform, as well as other activity, including unrelated actively, on the platform. Problem occurrence would also be random.
The race problems found are:
Function send_core() sets the 1st repeat alarm. The sigalrm handler simply sets the global variable alrm = 1. If this alarm signal fires after checking for alarms in mywaitfordata() at
but before
the poll can block indefinitely.
Function dosigalrm() sets the 2nd and subsequent repeat alarms. If this alarm signal fires before dosigalrm() returns, the alarm gets lost after return at
and the IR repeat stalls permanently. If the alarm otherwise occurs before
the poll can block indefinitely.
One of the race issues is easy to replicate by replacing code in function schedule_repeat_timer():
with
for simulating immediate alarm signal.
One possible fix is to increase the minimum 10 usecs alarm time to an unknown, but larger number. But this alters the timing of IR repeats, and is still not a guarantee to avoid the race conditions seen.
I believe a better solution will be to replace the IR repeat sigalrm timing method with using poll timeout and time of day clock comparison. This is already the method in use in the lircd.cpp code for IR release_time. I will post a patch candidate in a subsequent followup for your consideration.
Last edit: asunxx 2018-04-03
[PATCH] lirc: #315 irsend / lircd hang when using repeats
irsend SEND_ONCE IR may stall forever for lircd.conf remote control buttons
with min_repeat setting 2 or higher. irsend SEND_START IR may stall forever
for any remote control buttons.
The root causes of IR send stall are due to race timing of SIGALRM in
lircd, used for IR repeat timing, with the time necessary for lircd code
execution to reach its main poll system call.
Patch replaces lircd IR repeat sigalrm timing method with poll timeout and
time of day clock comparison. Poll timeout method is already in use for
lircd IR release_time functionality. And is immune from the SIGALRM
method's timing race.
Correct possible premature timeout for mywaitfordata(maxusecs) where
maxusecs > 0. Premature timeout may occur when release_time
(and this patch's repeat_time) delay remaining is less than maxusecs
delay remaining.
Signed-off-by: A Sun asunxx@users.sourceforge.net
daemons/lircd.cpp | 100 +++++++++++++++++++++++++++++++++++-------------------
1 file changed, 66 insertions(+), 34 deletions(-)
Last edit: asunxx 2018-04-04
I don't exactly know how to apply this patch. Is this just
patch < <filepath>
and if so, onto which file I need to apply that? Thx in advance.Last edit: Alex L. 2018-12-29
The 0003-lirc...patch file is a patch formatted with "git format-patch ...", and can be applied with "git apply ..."
If git code manager is not applicable, then
patch lircd.cpp < <filepath>
should work. This patch only affects the lircd.cpp file from lirc 0.10.1.Can someone post step by step instructions on how to apply this patch to Raspbian? I dont understand the PATCH command syntax at all. Thanks in advanced.
I'm also experiencing this bug. Has it been fixed in the development version?
I installed the patch in 0.10.1. Previously I almost always had the timout problem when sending SONY 20-bit codes but with the patch installed it works great.
I recently had need to apply this patch to V0.10.1 and found it quite difficult. However, I did succeed and you can find my notes on the process on my website.
The version of LIRC in the Pi respository is still at this version, hence the need. I hope someone finds this useful. I feel that when someone succeeds in fixing a problem it's of little help saying they succeeded without saying how!
Colin - Thank you for the detailed instructions on your website.
irsend
now works for me with the count flag, which my devices require.A couple of comments, though.
1. the wget string for the patch has the filename truncated to
.pa
, where it should be.patch
. wgetting the .pa file pulls a lot of website-related html code, not the patch.2. the arguments to
configure
did not work:../configure ––prefix=/usr ––sysconfdir=/etc
I had to use alternate flags like this:
../configure --p /usr -sysconfdir /etc
3. Lastly,
cd daemons/.lib
should becd daemons/.libs
Again, Thank you for this. I'm running Debian Buster on a Raspberry Pi 3+
Last edit: Jim Kaiser 2022-04-27
Was this patch merged into v0.10.2? If not will there be a pull request created. I suspect this is related to the issue I'm seeing https://sourceforge.net/p/lirc/tickets/375/
Created merge request of patch for 0.10.2 https://sourceforge.net/p/lirc/git/merge-requests/50/
I also was affected by this bug. To summarize: The sigalarm used for scheduling repeat signals was racy with lircd's main
poll
loop. The symptom was: A 2nd irsend (hot on the heels of the 1st irsend) would hang (gdb showed lircd was inpoll
), until you sent a 3rd irsend (and then both the 2nd & 3rd irsends would be processed). I could only reproduce it with drivers/plugins that block while each signal is being sent, such as the kernel driver (I was using pwm-ir-tx) — I couldn't reproduce it with ftdix (which returns to the lircd main loop as soon as it writes the signal to the USB device, but before the USB device has finished sending it).https://sourceforge.net/p/lirc/git/merge-requests/48/ narrowed the window of this race condition, and https://sourceforge.net/p/lirc/git/merge-requests/50/ fixes it (not tested personally, but we had a similar patch we wrote ourselves). Both are now merged to master.
Last edit: David Röthlisberger 2023-03-12
https://sourceforge.net/p/lirc/git/merge-requests/50/ merged, should be fixed
I see that the Milestone for this fix is Future. However, the changed
daemons/lircd.cpp
file may be found here.I downloaded and replaced the copy in the 0.10.2 distribution tarball with this one and
irsend
no longer hangs when using--count=2
(or greater). Thank you for the fix!