Hi,
I've got an opnsense router which is using mpd5. The log shows a fatal error, and the process dies. What can be the cause of this?
2020-08-25T13:08:48 ppp[22102]: process 22102 terminated
2020-08-25T13:08:48 ppp[22102]: [opt1_link0] Link: Shutdown
2020-08-25T13:08:48 ppp[22102]: [opt1] Bundle: Shutdown
2020-08-25T13:08:48 ppp[22102]: [opt1] IPV6CP: state change Stopped --> Closed
2020-08-25T13:08:48 ppp[22102]: [opt1] IPV6CP: Close event
2020-08-25T13:08:48 ppp[22102]: [opt1] IFACE: Rename interface pppoe0 to pppoe0
2020-08-25T13:08:48 ppp[22102]: [opt1] IFACE: Down event
2020-08-25T13:08:48 ppp[22102]: [opt1] IFACE: Delete route 0.0.0.0/0 x.x.x.x failed: No such process
2020-08-25T13:08:48 ppp[22102]: [opt1] IPCP: LayerDown
2020-08-25T13:08:48 ppp[22102]: [opt1] IPCP: SendTerminateReq #4
2020-08-25T13:08:48 ppp[22102]: [opt1] IPCP: state change Opened --> Closing
2020-08-25T13:08:48 ppp[22102]: [opt1] IPCP: Close event
2020-08-25T13:08:48 ppp[22102]: [opt1] IFACE: Close event
2020-08-25T13:08:48 ppp[22102]: fatal error, exiting
2020-08-25T13:08:48 ppp[22102]: 0x449dc90e3c0 <_pthread_sigmask+0x530> at /lib/libthr.so.3
2020-08-25T13:08:48 ppp[22102]: 0x33f04957cd0 <sendsignal+0x50> at /usr/local/sbin/mpd5
2020-08-25T13:08:48 ppp[22102]: 0x33f04958129 <doexit+0x289> at /usr/local/sbin/mpd5
2020-08-25T13:08:48 ppp[22102]: caught fatal signal ABRT</doexit+0x289></sendsignal+0x50>
Best regards,
Maarten
What is base system version? And we need full log, this part is not enough. The process aborted itself when it was terminating already due to some external reason most probably.
Is this problem reproducible?
Hi Eugene,
Thanks for your fast response. Unfortunately I'm unable to reproduce this at will. This is the system I run it on:
OPNsense 20.7.1-amd64
FreeBSD 12.1-RELEASE-p8-HBSD
OpenSSL 1.1.1g 21 Apr 2020
In the log records before what I posted were from a previous boot, so that's of no use. I dont know if there is a more detailed log log somewhere. I'll ask the developer of the router.
( https://forum.opnsense.org/index.php?topic=18746.0 )
Last edit: Maarten Bakker 2020-08-27
Not enough information yet. What was uptime of the system when problem occured? Did it work long time or failed just after start?
It's appears on system boot, think that's the only times I've had this issue. When it's online it seems to be stable.
I just made a change to mpd5 code so it writes additional details to its log in such case of failure. The change is committed to FreeBSD Ports as new revision 5.8_11
Please update your copy of mpd5 package (or ask opnsense maintainers do that for you), so that we have needed details if the problem returns.
ok, thanks, I've copied the new version to the router, and let you know when it happens again.
( https://mirrors.xtom.com/freebsd-pkg/FreeBSD%3A11%3Aamd64/latest/All/mpd5-5.8_11.txz )
Hi Eugene,
The fatal error happened again right after a reboot. This is the ppp log from the mpd5, this is version 5.9 now.
ppps.log
Kernel boot timestamp:
This new log shows thjat mpd aborts itself due to poll() system call returning unexpected error EINVAL. This should not happen in FreeBSD with default settings, so you should show this new log to OPNSense developers and ask them if they have any system changes/tweaks comparing vanilla FreeBSD code that could provoke this. Suspects are possibly changed values for sysctl kern.maxfilesperproc and build-time constant FD_SETSIZE.
What is hardware you are running? Do you run ntpd or some other software to syncronize its clock? If you run ntpd, please show contents of its ntp.drift file after at least 30 minutes of up time.
Can you share your mpd.conf?
Also, show output of the command: sysctl kern.maxfilesperproc
Can the mpd crash at boot because the OS updates its time? When booting the OS timestamp is "2019-10-27T09:32:06" and it looks like the daemon crashes when the OS switches to the correct time "2020-09-29T10:45:11".
It's possible but it would mean a bug in the kernel.
So, can you reproduce this reliably or semi-reliably at least by rebooting the box? What hardware is it?
No it's not at every boot unfortunately, it's a hyperv vm.
If it's a VM guest machine, then why it's emulated RTC (real-time clock chip) contains wrong time at start?
Hmm, I have the time syncronisation integration service disabled, because I want to synchronise it using ntp, and not have the host and the ntpd fight eachother. I guess when booting up, hyperv might supply a wrong RTC or none at all. (I dont know)
I could not reproduce the problem with vanilla FreeBSD system. You should either try running FreeBSD instead of OPNSense, or direct the problem to OPNSense developers. Again, the problem's due to system call poll() returning EINVAL and that should not happen.
Or maybe you could insert "ktrace -if /var/tmp/ktrace.out mpd5 ..." into startup script /usr/local/etc/rc.d/mpd5, wait until problem reproduces and upload contents of ktrace.out (preferably compressed). This should help to see what really happens. If you can do that, also add to /etc/crontab:
@reboot sleep 300 && ktrace -C
This will keep ktrace.out from growing indefinitely and still keep first 5 minutes of debugging data.
Last edit: Eugene Grosbein 2020-10-15
Feedback timeout (4 months).