MPD: FreeBSD PPP daemon / Bugs / #68 fatal error

Eugene Grosbein - 2020-08-27

What is base system version? And we need full log, this part is not enough. The process aborted itself when it was terminating already due to some external reason most probably.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Eugene Grosbein - 2020-08-27

Is this problem reproducible?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maarten Bakker - 2020-08-27

Hi Eugene,

Thanks for your fast response. Unfortunately I'm unable to reproduce this at will. This is the system I run it on:

OPNsense 20.7.1-amd64
FreeBSD 12.1-RELEASE-p8-HBSD
OpenSSL 1.1.1g 21 Apr 2020

In the log records before what I posted were from a previous boot, so that's of no use. I dont know if there is a more detailed log log somewhere. I'll ask the developer of the router.

( https://forum.opnsense.org/index.php?topic=18746.0 )

Last edit: Maarten Bakker 2020-08-27

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Eugene Grosbein - 2020-08-27

Not enough information yet. What was uptime of the system when problem occured? Did it work long time or failed just after start?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maarten Bakker - 2020-08-27

It's appears on system boot, think that's the only times I've had this issue. When it's online it seems to be stable.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Eugene Grosbein - 2020-08-27

I just made a change to mpd5 code so it writes additional details to its log in such case of failure. The change is committed to FreeBSD Ports as new revision 5.8_11

Please update your copy of mpd5 package (or ask opnsense maintainers do that for you), so that we have needed details if the problem returns.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maarten Bakker - 2020-09-01

ok, thanks, I've copied the new version to the router, and let you know when it happens again.

( https://mirrors.xtom.com/freebsd-pkg/FreeBSD%3A11%3Aamd64/latest/All/mpd5-5.8_11.txz )

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Hi Eugene,

The fatal error happened again right after a reboot. This is the ppp log from the mpd5, this is version 5.9 now.

ppps.log

2020-09-29T10:45:11 ppp[31953]  process 31953 terminated
2020-09-29T10:45:11 ppp[31953]  [opt1_link0] Link: Shutdown
2020-09-29T10:45:11 ppp[31953]  [opt1] Bundle: Shutdown
2020-09-29T10:45:11 ppp[31953]  [opt1] IPV6CP: state change Stopped --> Closed
2020-09-29T10:45:11 ppp[31953]  [opt1] IPV6CP: Close event
2020-09-29T10:45:11 ppp[31953]  [opt1] IFACE: Rename interface pppoe0 to pppoe0
2020-09-29T10:45:11 ppp[31953]  [opt1] IFACE: Down event
2020-09-29T10:45:11 ppp[31953]  [opt1] IFACE: Delete route 0.0.0.0/0 x.x.x.x failed: No such process
2020-09-29T10:45:11 ppp[31953]  [opt1] IPCP: LayerDown
2020-09-29T10:45:11 ppp[31953]  [opt1] IPCP: SendTerminateReq #4
2020-09-29T10:45:11 ppp[31953]  [opt1] IPCP: state change Opened --> Closing
2020-09-29T10:45:11 ppp[31953]  [opt1] IPCP: Close event
2020-09-29T10:45:11 ppp[31953]  [opt1] IFACE: Close event
2020-09-29T10:45:11 ppp[31953]  fatal error, exiting
2020-09-29T10:45:11 ppp[31953]  0x71c609393c0 <_pthread_sigmask+0x530> at /lib/libthr.so.3
2020-09-29T10:45:11 ppp[31953]  0x395692ebea0 <SendSignal+0x50> at /usr/local/sbin/mpd5
2020-09-29T10:45:11 ppp[31953]  0x395692ec2f9 <DoExit+0x289> at /usr/local/sbin/mpd5
2020-09-29T10:45:11 ppp[31953]  caught fatal signal ABRT
2020-09-29T10:45:11 ppp[31953]  pevent_ctx_main: poll: Invalid argument
2019-10-27T09:32:08 ppp[31953]  [opt1] IFACE: Rename interface ng0 to pppoe0
2019-10-27T09:32:08 ppp[31953]  [opt1] IFACE: Up event
2019-10-27T09:32:08 ppp[31953]  [opt1]   x.x.x.x -> x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: LayerUp
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: state change Ack-Sent --> Opened
2019-10-27T09:32:08 ppp[31953]  [opt1]   SECDNS x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1]   PRIDNS x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1]   IPADDR x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: rec'd Configure Ack #3 (Ack-Sent)
2019-10-27T09:32:08 ppp[31953]  [opt1]   SECDNS x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1]   PRIDNS x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1]   IPADDR x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: SendConfigReq #3
2019-10-27T09:32:08 ppp[31953]  [opt1]   SECDNS x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1]   PRIDNS x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1]     x.x.x.x is OK
2019-10-27T09:32:08 ppp[31953]  [opt1]   IPADDR x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: rec'd Configure Nak #2 (Ack-Sent)
2019-10-27T09:32:08 ppp[31953]  [opt1] IPV6CP: LayerFinish
2019-10-27T09:32:08 ppp[31953]  [opt1] IPV6CP: state change Req-Sent --> Stopped
2019-10-27T09:32:08 ppp[31953]  [opt1] IPV6CP: protocol was rejected by peer
2019-10-27T09:32:08 ppp[31953]  [opt1_link0] LCP: protocol IPV6CP was rejected
2019-10-27T09:32:08 ppp[31953]  [opt1_link0] LCP: rec'd Protocol Reject #2 (Opened)
2019-10-27T09:32:08 ppp[31953]  [opt1]   SECDNS 0.0.0.0
2019-10-27T09:32:08 ppp[31953]  [opt1]   PRIDNS 0.0.0.0
2019-10-27T09:32:08 ppp[31953]  [opt1]   IPADDR 0.0.0.0
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: SendConfigReq #2
2019-10-27T09:32:08 ppp[31953]  [opt1]   COMPPROTO VJCOMP, 16 comp. channels, no comp-cid
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: rec'd Configure Reject #1 (Ack-Sent)
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: state change Req-Sent --> Ack-Sent
2019-10-27T09:32:08 ppp[31953]  [opt1]   IPADDR x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: SendConfigAck #1
2019-10-27T09:32:08 ppp[31953]  [opt1]     x.x.x.x is OK
2019-10-27T09:32:08 ppp[31953]  [opt1]   IPADDR x.x.x.x
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: rec'd Configure Request #1 (Req-Sent)
2019-10-27T09:32:08 ppp[31953]  [opt1]   PRIDNS 0.0.0.0
2019-10-27T09:32:08 ppp[31953]  [opt1]   COMPPROTO VJCOMP, 16 comp. channels, no comp-cid
2019-10-27T09:32:08 ppp[31953]  [opt1]   IPADDR 0.0.0.0
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: SendConfigReq #1
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: state change Starting --> Req-Sent
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: Up event
2019-10-27T09:32:08 ppp[31953]  [opt1] IPV6CP: LayerStart
2019-10-27T09:32:08 ppp[31953]  [opt1] IPV6CP: state change Initial --> Starting
2019-10-27T09:32:08 ppp[31953]  [opt1] IPV6CP: Open event
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: LayerStart
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: state change Initial --> Starting
2019-10-27T09:32:08 ppp[31953]  [opt1] IPCP: Open event
2019-10-27T09:32:08 ppp[31953]  [opt1] Bundle: Status update: up 1 link, total bandwidth 64000 bps
2019-10-27T09:32:08 ppp[31953]  [opt1_link0] Link: Join bundle "opt1"
2019-10-27T09:32:08 ppp[31953]  [opt1_link0] Link: Matched action 'bundle "opt1" ""'
2019-10-27T09:32:08 ppp[31953]  [opt1_link0] LCP: authorization successful
2019-10-27T09:32:08 ppp[31953]  [opt1_link0]   MESG: Authentication success,Welcome!
2019-10-27T09:32:08 ppp[31953]  [opt1_link0] PAP: rec'd ACK #1 len: 36
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: LayerUp
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] PAP: sending REQUEST #1 len: 12
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] PAP: using authname "kpn"
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: auth: peer wants PAP, I want nothing
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: state change Ack-Sent --> Opened
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   MAGICNUM 0x69cb471a
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   MRU 1492
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: rec'd Configure Ack #2 (Ack-Sent)
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   MAGICNUM 0x69cb471a
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   MRU 1492
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: SendConfigReq #2
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   PROTOCOMP
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: rec'd Configure Reject #1 (Ack-Sent)
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: state change Req-Sent --> Ack-Sent
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   MAGICNUM 0xb92d659c
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   AUTHPROTO PAP
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   MRU 1500
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: SendConfigAck #1
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   MAGICNUM 0xb92d659c
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   AUTHPROTO PAP
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   MRU 1500
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: rec'd Configure Request #1 (Req-Sent)
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   MAGICNUM 0x69cb471a
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   MRU 1492
2019-10-27T09:32:07 ppp[31953]  [opt1_link0]   PROTOCOMP
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: SendConfigReq #1
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: state change Starting --> Req-Sent
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: Up event
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] Link: UP event
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] PPPoE: connection successful
2019-10-27T09:32:07 ppp[31953]  PPPoE: rec'd ACNAME "x.x.x.x"
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] PPPoE: Connecting to ''
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: LayerStart
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: state change Initial --> Starting
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] LCP: Open event
2019-10-27T09:32:07 ppp[31953]  [opt1_link0] Link: OPEN event
2019-10-27T09:32:07 ppp[31953]  [opt1] Bundle: Interface ng0 created
2019-10-27T09:32:07 ppp[31953]  web: web is not running
2019-10-27T09:32:07 ppp[31953]  process 31953 started, version 5.9
2019-10-27T09:32:07 ppp[31953]  
2019-10-27T09:32:07 ppp[31953]  Multi-link PPP daemon for FreeBSD

Kernel boot timestamp:

2019-10-27T09:32:06 kernel  ---<<BOOT>>---

Eugene Grosbein - 2020-10-13

This new log shows thjat mpd aborts itself due to poll() system call returning unexpected error EINVAL. This should not happen in FreeBSD with default settings, so you should show this new log to OPNSense developers and ask them if they have any system changes/tweaks comparing vanilla FreeBSD code that could provoke this. Suspects are possibly changed values for sysctl kern.maxfilesperproc and build-time constant FD_SETSIZE.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Eugene Grosbein - 2020-10-13

What is hardware you are running? Do you run ntpd or some other software to syncronize its clock? If you run ntpd, please show contents of its ntp.drift file after at least 30 minutes of up time.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Eugene Grosbein - 2020-10-13

Can you share your mpd.conf?
Also, show output of the command: sysctl kern.maxfilesperproc

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Can the mpd crash at boot because the OS updates its time? When booting the OS timestamp is "2019-10-27T09:32:06" and it looks like the daemon crashes when the OS switches to the correct time "2020-09-29T10:45:11".

root@OPNsense:~ # uptime
 9:34AM  up 20:33, 1 user, load averages: 0.23, 0.24, 0.19

root@OPNsense:~ # cat /var/db/ntpd.drift
0.465

root@OPNsense:~ # sysctl kern.maxfilesperproc
kern.maxfilesperproc: 28377

root@OPNsense:~ # cat /var/etc/ntpd.conf
#
# Autogenerated configuration file
#

tinker panic 0
# Orphan mode stratum
tos orphan 12


# Upstream Servers
server 0.opnsense.pool.ntp.org iburst maxpoll 9 prefer
server 1.opnsense.pool.ntp.org iburst maxpoll 9
server 2.opnsense.pool.ntp.org iburst maxpoll 9
server 3.opnsense.pool.ntp.org iburst maxpoll 9


disable monitor
statsdir /var/log/ntp
logconfig =syncall +clockall
driftfile /var/db/ntpd.drift
restrict default kod limited nomodify nopeer notrap
restrict -6 default kod limited nomodify nopeer notrap

interface ignore all
interface listen pppoe0
interface listen hn2

Eugene Grosbein - 2020-10-14

It's possible but it would mean a bug in the kernel.

So, can you reproduce this reliably or semi-reliably at least by rebooting the box? What hardware is it?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maarten Bakker - 2020-10-15

No it's not at every boot unfortunately, it's a hyperv vm.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Eugene Grosbein - 2020-10-15
  
  If it's a VM guest machine, then why it's emulated RTC (real-time clock chip) contains wrong time at start?
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Maarten Bakker - 2020-10-15

Hmm, I have the time syncronisation integration service disabled, because I want to synchronise it using ntp, and not have the host and the ntpd fight eachother. I guess when booting up, hyperv might supply a wrong RTC or none at all. (I dont know)

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Eugene Grosbein - 2020-10-15
  
  I could not reproduce the problem with vanilla FreeBSD system. You should either try running FreeBSD instead of OPNSense, or direct the problem to OPNSense developers. Again, the problem's due to system call poll() returning EINVAL and that should not happen.
  
  Or maybe you could insert "ktrace -if /var/tmp/ktrace.out mpd5 ..." into startup script /usr/local/etc/rc.d/mpd5, wait until problem reproduces and upload contents of ktrace.out (preferably compressed). This should help to see what really happens. If you can do that, also add to /etc/crontab:
  
  @reboot sleep 300 && ktrace -C
  
  This will keep ktrace.out from growing indefinitely and still keep first 5 minutes of debugging data.
  
  Last edit: Eugene Grosbein 2020-10-15
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Eugene Grosbein - 2021-02-15

status: open --> closed

Group: -->
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Eugene Grosbein - 2021-02-15

Feedback timeout (4 months).

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

fatal error

FreeBSD Multi PPP daemon

Group

Searches

Help

#68 fatal error

Discussion