One of our (LibreELEC) users reported that irsend / lircd would sometimes hang. This was with lirc 0.9.4d and I could now also reproduce this on lirc 0.10.1.
I seem to be able to trigger the issue by invoking irsend 2 times immediately after each other in combination with either a repeat setting (min_repeat) in lircd.conf or when using a repeat setting with irsend (eg -# 3).
Here's my test setup:
lircd.conf file (sony-repeat.conf):
begin remote
name Sony-RM-U305A
bits 12
flags SPACE_ENC|CONST_LENGTH
eps 30
aeps 100
header 2400 600
one 1200 600
zero 600 600
gap 45000
toggle_bit_mask 0x0
frequency 38000
min_repeat 3
begin codes
KEY_VIDEO 0x441
KEY_POWER 0xA81
KEY_DVD 0xBE1
KEY_VOLUMEUP 0x481
KEY_VOLUMEDOWN 0xC81
KEY_MUTE 0x281
end codes
end remote
lircd started with the following options
/opt/lirc-0.10.1/sbin/lircd --driver=default --device=/dev/lirc0 -n -o /tmp/lirc -P /tmp/lircd.pid -Dtrace sony-repeat.conf
running irsend 2 times:
/opt/lirc-0.10.1/bin/irsend -d /tmp/lirc SEND_ONCE Sony-RM-U305A KEY_POWER ; /opt/lirc-0.10.1/bin/irsend -d /tmp/lirc SEND_ONCE Sony-RM-U305A KEY_POWER
At this point the second irsend is hanging and lircd output looks like this:
Warning: cannot open /opt/lirc-0.10.1/etc/lirc/lirc_options.conf
lircd-0.10.1[1976]: Trace: lircd: Opening log, level: Trace
lircd-0.10.1[1976]: Debug: No systemd fd found
lircd-0.10.1[1976]: Trace: started server socket
lircd-0.10.1[1976]: Trace: parsing remote
lircd-0.10.1[1976]: Info: Using remote: Sony-RM-U305A.
lircd-0.10.1[1976]: Trace: lengths: 45000 45000 24000 26400
lircd-0.10.1[1976]: Trace: config file read
lircd-0.10.1[1976]: Notice: lircd(default) ready, using /tmp/lirc
lircd-0.10.1[1976]: Trace: registering local client
lircd-0.10.1[1976]: Notice: accepted new client on /tmp/lirc
lircd-0.10.1[1976]: Debug: Cannot open protocol file: /sys/class/rc/rc0/protocols for read
lircd-0.10.1[1976]: Info: Cannot configure the rc device for /dev/lirc0
lircd-0.10.1[1976]: Trace: driver supports sending
lircd-0.10.1[1976]: Trace: received command: "SEND_ONCE Sony-RM-U305A KEY_POWER"
lircd-0.10.1[1976]: Debug: Sending once, msg: SEND_ONCE Sony-RM-U305A KEY_POWER
, args: Sony-RM-U305A KEY_POWER, once: 1
lircd-0.10.1[1976]: Trace: alarm in 25626 usecs
lircd-0.10.1[1976]: Trace: alarm in 25701 usecs
lircd-0.10.1[1976]: Trace: alarm in 25720 usecs
lircd-0.10.1[1976]: Debug: Sending success
lircd-0.10.1[1976]: Info: removed client
lircd-0.10.1[1976]: Trace: registering local client
lircd-0.10.1[1976]: Notice: accepted new client on /tmp/lirc
lircd-0.10.1[1976]: Debug: Cannot open protocol file: /sys/class/rc/rc0/protocols for read
lircd-0.10.1[1976]: Info: Cannot configure the rc device for /dev/lirc0
lircd-0.10.1[1976]: Trace: driver supports sending
lircd-0.10.1[1976]: Trace: received command: "SEND_ONCE Sony-RM-U305A KEY_POWER"
lircd-0.10.1[1976]: Debug: Sending once, msg: SEND_ONCE Sony-RM-U305A KEY_POWER
, args: Sony-RM-U305A KEY_POWER, once: 1
lircd-0.10.1[1976]: Trace: alarm in 10 usecs
The "alarm in 10 usecs" looks rather suspicious to me, delay between repeats should be about 25ms.
When removing "min_repeat" from the config file I can provoke the hang by setting the repeat option on the second irsend invocation (it doesn't need to be set on the first one).
/opt/lirc-0.10.1/bin/irsend -d /tmp/lirc SEND_ONCE Sony-RM-U305A KEY_POWER ; /opt/lirc-0.10.1/bin/irsend -d /tmp/lirc -# 2 SEND_ONCE Sony-RM-U305A KEY_POWER
irsend hangs and I see the same "alarm in 10 usecs" message in lircd output
lircd-0.10.1[2007]: Trace: lircd: Opening log, level: Trace
lircd-0.10.1[2007]: Debug: No systemd fd found
lircd-0.10.1[2007]: Trace: started server socket
lircd-0.10.1[2007]: Trace: parsing remote
lircd-0.10.1[2007]: Info: Using remote: Sony-RM-U305A.
lircd-0.10.1[2007]: Trace: lengths: 45000 45000 24000 26400
lircd-0.10.1[2007]: Trace: config file read
lircd-0.10.1[2007]: Notice: lircd(default) ready, using /tmp/lirc
lircd-0.10.1[2007]: Trace: registering local client
lircd-0.10.1[2007]: Notice: accepted new client on /tmp/lirc
lircd-0.10.1[2007]: Debug: Cannot open protocol file: /sys/class/rc/rc0/protocols for read
lircd-0.10.1[2007]: Info: Cannot configure the rc device for /dev/lirc0
lircd-0.10.1[2007]: Trace: driver supports sending
lircd-0.10.1[2007]: Trace: received command: "SEND_ONCE Sony-RM-U305A KEY_POWER"
lircd-0.10.1[2007]: Debug: Sending once, msg: SEND_ONCE Sony-RM-U305A KEY_POWER
, args: Sony-RM-U305A KEY_POWER, once: 1
lircd-0.10.1[2007]: Debug: Sending success
lircd-0.10.1[2007]: Info: removed client
lircd-0.10.1[2007]: Trace: registering local client
lircd-0.10.1[2007]: Notice: accepted new client on /tmp/lirc
lircd-0.10.1[2007]: Debug: Cannot open protocol file: /sys/class/rc/rc0/protocols for read
lircd-0.10.1[2007]: Info: Cannot configure the rc device for /dev/lirc0
lircd-0.10.1[2007]: Trace: driver supports sending
lircd-0.10.1[2007]: Trace: received command: "SEND_ONCE Sony-RM-U305A KEY_POWER 2"
lircd-0.10.1[2007]: Debug: Sending once, msg: SEND_ONCE Sony-RM-U305A KEY_POWER 2
, args: Sony-RM-U305A KEY_POWER 2, once: 1
lircd-0.10.1[2007]: Trace: alarm in 10 usecs
I've tested this on a Raspberry Pi 2 running Raspbian Stretch with kernel 4.14 and the pwm-ir-tx driver. Reporter was using a MCEUSB IR transceiver, also on RPi.
SInce this a common function used by many users, I tend to think that this is a problem in the RPi's kernel driver. And, the kernel drivers are not part of lirc....
Have you any chance to test this on some other, non-RPi hardware?
Sorry for long delays. I lost lirc for some time, life requiring other things to be done.
EDIT: To clear things out: Unless proven wrong by a test on non-RPi hardware, this bug really belongs to the kernel (or possibly the distro, if they added their own driver).
Last edit: Alec Leamas 2018-02-11
Please note that the issues were NOT observed with the downstream
lirc_rpi kernel driver but with the upstream mceusb and pwm-ir-tx
drivers from the rc-core subsystem.
As ir-ctl is working fine with these kernel drivers a bug in lircd
seems like a likely possibility (although I won't rule out some
other cause).
ATM I can't test lirc stuff on a non-RPi platform but I redid the
steps and had a closer look with gdb.
irsend seems to be hanging in read():
lircd hangs in poll():
A very wild guess would be a race in socket communication, but I
could be wrong as I'm not really familiar with the code.
If you suspect a race as well drop me a line if you want me to do
more testing. I can reliably reproduce the issue here.
Yes, but we are still talking about the RPi, right. And just as a clarification, the kernel drivers are part of the kernel, not lirc.
So, what's really needed here is doing the same thing on some non-RPi hardware. That should clear out if this this is a kernel or lirc problem. Being the lirc maintainer, of course my fist bet is the kernel...
I could now also reproduce this on x86: Debian Stretch, vanilla upstream 64-bit kernel 4.14.17 using the serial_ir driver.
lircd started as in initial report:
then irsend started in a loop:
After about a minute or two lircd is stuck again in poll() and irsend in read()
OK, so this is about litc, agreed. Just to take the simple parts first: could you please verify the runtime linkage using ldd /opt/...irsend and ldd /opt/...lircd. Just to make sure nothing like /usr/lib/llblirc.so..0 .0.0 is imvolved?
ldd output looks fine to me:
Just noticed that when I run lircd with valgrind the lockup happens quite reproducibly here on my x86 box when running 2 irsend commands right after each other (irsend ... ; irsend ...).
Hope this helps you to reproduce the issue
I have also run into this issue "irsend / lircd hang when using repeats". It's not a irsend/lircd hard hang I see. Rather the IR send stalls and never completes.
I believe I have tracked down the root cause of the issue. lircd uses SIGALRM to time the delay between IR repeats. There are race conditions with the delivery time of sigalrm and where sigalrm interrupts and resumes lircd main code execution, which affect irsend hang/stall. As such, the results of the race(s) are highly dependent on the host platform, as well as other activity, including unrelated actively, on the platform. Problem occurrence would also be random.
The race problems found are:
Function send_core() sets the 1st repeat alarm. The sigalrm handler simply sets the global variable alrm = 1. If this alarm signal fires after checking for alarms in mywaitfordata() at
but before
the poll can block indefinitely.
Function dosigalrm() sets the 2nd and subsequent repeat alarms. If this alarm signal fires before dosigalrm() returns, the alarm gets lost after return at
and the IR repeat stalls permanently. If the alarm otherwise occurs before
the poll can block indefinitely.
One of the race issues is easy to replicate by replacing code in function schedule_repeat_timer():
with
for simulating immediate alarm signal.
One possible fix is to increase the minimum 10 usecs alarm time to an unknown, but larger number. But this alters the timing of IR repeats, and is still not a guarantee to avoid the race conditions seen.
I believe a better solution will be to replace the IR repeat sigalrm timing method with using poll timeout and time of day clock comparison. This is already the method in use in the lircd.cpp code for IR release_time. I will post a patch candidate in a subsequent followup for your consideration.
Last edit: asunxx 2018-04-03
[PATCH] lirc: #315 irsend / lircd hang when using repeats
irsend SEND_ONCE IR may stall forever for lircd.conf remote control buttons
with min_repeat setting 2 or higher. irsend SEND_START IR may stall forever
for any remote control buttons.
The root causes of IR send stall are due to race timing of SIGALRM in
lircd, used for IR repeat timing, with the time necessary for lircd code
execution to reach its main poll system call.
Patch replaces lircd IR repeat sigalrm timing method with poll timeout and
time of day clock comparison. Poll timeout method is already in use for
lircd IR release_time functionality. And is immune from the SIGALRM
method's timing race.
Correct possible premature timeout for mywaitfordata(maxusecs) where
maxusecs > 0. Premature timeout may occur when release_time
(and this patch's repeat_time) delay remaining is less than maxusecs
delay remaining.
Signed-off-by: A Sun asunxx@users.sourceforge.net
daemons/lircd.cpp | 100 +++++++++++++++++++++++++++++++++++-------------------
1 file changed, 66 insertions(+), 34 deletions(-)
Last edit: asunxx 2018-04-04
I don't exactly know how to apply this patch. Is this just
patch < <filepath>and if so, onto which file I need to apply that? Thx in advance.Last edit: Alex L. 2018-12-29
The 0003-lirc...patch file is a patch formatted with "git format-patch ...", and can be applied with "git apply ..."
If git code manager is not applicable, then
patch lircd.cpp < <filepath>should work. This patch only affects the lircd.cpp file from lirc 0.10.1.Can someone post step by step instructions on how to apply this patch to Raspbian? I dont understand the PATCH command syntax at all. Thanks in advanced.
I'm also experiencing this bug. Has it been fixed in the development version?
I installed the patch in 0.10.1. Previously I almost always had the timout problem when sending SONY 20-bit codes but with the patch installed it works great.
I recently had need to apply this patch to V0.10.1 and found it quite difficult. However, I did succeed and you can find my notes on the process on my website.
The version of LIRC in the Pi respository is still at this version, hence the need. I hope someone finds this useful. I feel that when someone succeeds in fixing a problem it's of little help saying they succeeded without saying how!
Colin - Thank you for the detailed instructions on your website.
irsendnow works for me with the count flag, which my devices require.A couple of comments, though.
1. the wget string for the patch has the filename truncated to
.pa, where it should be.patch. wgetting the .pa file pulls a lot of website-related html code, not the patch.2. the arguments to
configuredid not work:../configure ––prefix=/usr ––sysconfdir=/etcI had to use alternate flags like this:
../configure --p /usr -sysconfdir /etc3. Lastly,
cd daemons/.libshould becd daemons/.libsAgain, Thank you for this. I'm running Debian Buster on a Raspberry Pi 3+
Last edit: Jim Kaiser 2022-04-27
Was this patch merged into v0.10.2? If not will there be a pull request created. I suspect this is related to the issue I'm seeing https://sourceforge.net/p/lirc/tickets/375/
Created merge request of patch for 0.10.2 https://sourceforge.net/p/lirc/git/merge-requests/50/
I also was affected by this bug. To summarize: The sigalarm used for scheduling repeat signals was racy with lircd's main
pollloop. The symptom was: A 2nd irsend (hot on the heels of the 1st irsend) would hang (gdb showed lircd was inpoll), until you sent a 3rd irsend (and then both the 2nd & 3rd irsends would be processed). I could only reproduce it with drivers/plugins that block while each signal is being sent, such as the kernel driver (I was using pwm-ir-tx) — I couldn't reproduce it with ftdix (which returns to the lircd main loop as soon as it writes the signal to the USB device, but before the USB device has finished sending it).https://sourceforge.net/p/lirc/git/merge-requests/48/ narrowed the window of this race condition, and https://sourceforge.net/p/lirc/git/merge-requests/50/ fixes it (not tested personally, but we had a similar patch we wrote ourselves). Both are now merged to master.
Last edit: David Röthlisberger 2023-03-12
https://sourceforge.net/p/lirc/git/merge-requests/50/ merged, should be fixed
I see that the Milestone for this fix is Future. However, the changed
daemons/lircd.cppfile may be found here.I downloaded and replaced the copy in the 0.10.2 distribution tarball with this one and
irsendno longer hangs when using--count=2(or greater). Thank you for the fix!