Thread: [Linuxptp-users] "poll tx timestamp timeout", problem since v1.2, igb, stmmac
PTP IEEE 1588 stack for Linux
From: Jean-Baptiste M. <jea...@pa...> - 2013-10-25 14:27:04
Hi,

I know this "poll tx timestamp timeout" has been discussed around here[1] and on e1000-devel previously, without a clear conclusion (to me), and I am not sure to which list this should go, but here is my case:

I'm using ptp4l git head with hardware time stamping, on
- Intel i210 with igb driver,
- Intel i210 with igb_avb driver from Open-AVB project[2],
- stmmac.

Intel igb_avb is fine, yet both vanilla igb and stmmac fail with a "poll tx timestamp timeout" 100% reproducible.
Increasing the tx_timestamp_timeout does not help.

Here is a typical trace with the igb driver on an i210 NIC:

$ sudo ./ptp4l -i eth2-igb -m -l 7
ptp4l[457162.487]: selected /dev/ptp0 as PTP clock
ptp4l[457162.487]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000
ptp4l[457162.488]: driver changed our HWTSTAMP options
ptp4l[457162.489]: tx_type 1 not 1
ptp4l[457162.489]: rx_filter 1 not 12
ptp4l[457162.489]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[457162.489]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[457168.489]: port 1: announce timeout
ptp4l[457168.489]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[457168.489]: selected best master clock a0369f.fffe.1c38a8
ptp4l[457168.489]: assuming the grand master role
ptp4l[457168.490]: port 1: master tx announce timeout
ptp4l[457168.490]: port 1: setting asCapable
ptp4l[457169.489]: port 1: master sync timeout
ptp4l[457169.490]: poll tx timestamp timeout
ptp4l[457169.490]: port 1: send sync failed
ptp4l[457169.490]: port 1: MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
ptp4l[457169.491]: waiting 2^{4} seconds to clear fault on port 1
...

My first trial going back to v1.1 was successful, but just by luck: 9 times out of 10 I get a "recvmsg tx timestamp failed: Resource temporarily unavailable".
The same happens when going back to 2ec3829, just before 76e10e9 "ptp4l: Use poll() instead of a try-again loop".
Increasing the tx_timestamp_retries does not help.

On the kernel side, for driver/ptp/ I'm on 0d8c3e7 "ptp_pch: fix error handling in pch_probe()" for stmmac,
Debian 3.10.7-1 for i210.

To this point, I think this is a hardware driver issue (not really ptp4l nor ptp driver, and hoping this is not
a combination of the 3) but any suggestion would be welcome.

[1] https://sourceforge.net/mailarchive/forum.php?thread_name=20130815062826.GB4679%40netboy&forum_name=linuxptp-devel
[2] https://github.com/intel-ethernet/Open-AVB

--
JB
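The two numeric HWTSTAMP lines in the trace can be decoded against the enum values in the kernel header linux/net_tstamp.h. The following is a minimal sketch, not part of ptp4l: the helper name describe_hwtstamp_line is hypothetical, and it assumes the message reads as "value the driver reported, not value ptp4l requested". Under that assumption, "tx_type 1 not 1" is unchanged, and "rx_filter 1 not 12" means the driver widened the requested PTPv2 event filter to "time stamp all packets".

```python
# Enum values as defined in linux/net_tstamp.h.
HWTSTAMP_TX_TYPES = {
    0: "HWTSTAMP_TX_OFF",
    1: "HWTSTAMP_TX_ON",
    2: "HWTSTAMP_TX_ONESTEP_SYNC",
}
HWTSTAMP_RX_FILTERS = {
    0: "HWTSTAMP_FILTER_NONE",
    1: "HWTSTAMP_FILTER_ALL",
    2: "HWTSTAMP_FILTER_SOME",
    3: "HWTSTAMP_FILTER_PTP_V1_L4_EVENT",
    4: "HWTSTAMP_FILTER_PTP_V1_L4_SYNC",
    5: "HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ",
    6: "HWTSTAMP_FILTER_PTP_V2_L4_EVENT",
    7: "HWTSTAMP_FILTER_PTP_V2_L4_SYNC",
    8: "HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ",
    9: "HWTSTAMP_FILTER_PTP_V2_L2_EVENT",
    10: "HWTSTAMP_FILTER_PTP_V2_L2_SYNC",
    11: "HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ",
    12: "HWTSTAMP_FILTER_PTP_V2_EVENT",
    13: "HWTSTAMP_FILTER_PTP_V2_SYNC",
    14: "HWTSTAMP_FILTER_PTP_V2_DELAY_REQ",
}


def describe_hwtstamp_line(line):
    """Decode a 'tx_type A not B' or 'rx_filter A not B' log line.

    Hypothetical helper. Assumes A is what the driver reported back
    and B is what ptp4l originally requested.
    """
    # Drop an optional "ptp4l[123.456]: " prefix.
    text = line.split("]: ", 1)[-1]
    field, got, _not, wanted = text.split()
    table = HWTSTAMP_TX_TYPES if field == "tx_type" else HWTSTAMP_RX_FILTERS
    got_name = table.get(int(got), "unknown")
    wanted_name = table.get(int(wanted), "unknown")
    return f"{field}: driver set {got_name}, ptp4l asked for {wanted_name}"


for log_line in ("ptp4l[457162.489]: tx_type 1 not 1",
                 "ptp4l[457162.489]: rx_filter 1 not 12"):
    print(describe_hwtstamp_line(log_line))
```

Read this way, the "driver changed our HWTSTAMP options" warning only reports a broader receive filter, which by itself would not explain a missing transmit time stamp.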
From: Richard C. <ric...@gm...> - 2013-10-25 15:09:39
On Fri, Oct 25, 2013 at 04:10:36PM +0200, Jean-Baptiste Maillet wrote:
>
> On the kernel side, for driver/ptp/ I'm on 0d8c3e7 "ptp_pch: fix error handling in pch_probe()" for stmmac,
> Debian 3.10.7-1 for i210.

So 0d8c3e7 means a pure mainstream kernel (v3.10-rc5~25^2~36)?

And Debian kernel is from testing or unstable?

I don't really trust the Debian kernels. I would try the igb driver
from a recent mainline kernel. The igb driver really should be working
in mainline Linux.

Regarding the stmmac, although I reviewed the patches, I never tested
them (no hardware to try), and so I would not be surprised if it had
bugs.

> To this point, I think this is a hardware driver issue (not really ptp4l nor ptp driver, and hoping this is not
> a combination of the 3) but any suggestion would be welcome.

If not a hardware issue, then probably a driver/kernel issue. The
ptp4l program hasn't really ever substantially changed the way the
transmit time stamps are read.

Thanks,
Richard
From: Jean-Baptiste M. <jea...@pa...> - 2013-10-25 15:47:33
On 10/25/2013 05:09 PM, Richard Cochran wrote:
> On Fri, Oct 25, 2013 at 04:10:36PM +0200, Jean-Baptiste Maillet wrote:
>>
>> On the kernel side, for driver/ptp/ I'm on 0d8c3e7 "ptp_pch: fix error handling in pch_probe()" for stmmac,
>> Debian 3.10.7-1 for i210.
>
> So 0d8c3e7 means a pure mainstream kernel (v3.10-rc5~25^2~36)?

Nope, the sha is just the reference to where I am regarding ptp compared to upstream.
This is a SoC with specific BSP and platform drivers.

> And Debian kernel is from testing or unstable?

Testing (but at least pure Debian upstream).

> I don't really trust the Debian kernels. I would try the igb driver
> from a recent mainline kernel. The igb driver really should be working
> in mainline Linux.

OK, will do. I need a hardware and upstream reference point known to work.
It's (bad) luck I didn't try the Debian igb driver right away but the highly experimental
igb_avb first. My only suspect was stmmac.

BTW igb supports 2-3 families of chips (82575, 82576, 82580, then i210, i211, and i350, i354):
can any users out there share their experience using it specifically with i210?

> Regarding the stmmac, although I reviewed the patches, I never tested
> them (no hardware to try), and so I would not be surprised if it had
> bugs.
>
>> To this point, I think this is a hardware driver issue (not really ptp4l nor ptp driver, and hoping this is not
>> a combination of the 3) but any suggestion would be welcome.
>
> If not a hardware issue, then probably a driver/kernel issue. The
> ptp4l program hasn't really ever substantially changed the way the
> transmit time stamps are read.

Thanks
--
JB
From: Vick, M. <mat...@in...> - 2013-10-25 16:13:14
On 10/25/13, 8:47 AM, "Jean-Baptiste Maillet" <jea...@pa...> wrote:

>On 10/25/2013 05:09 PM, Richard Cochran wrote:

[...]

>> I don't really trust the Debian kernels. I would try the igb driver
>> from a recent mainline kernel. The igb driver really should be working
>> in mainline Linux.
>
>OK, will do. I need a hardware and upstream reference point known to work.
>It's (bad) luck I didn't try the Debian igb driver right away but the
>highly experimental igb_avb first. My only suspect was stmmac.
>
>BTW igb supports 2-3 families of chips (82575, 82576, 82580, then i210,
>i211, and i350, i354):
>can any users out there share their experience using it specifically
>with i210?

I210 should definitely be working in the upstream Linux kernel. Please let
me know if it is not.

I'm uncertain what the Debian kernel version of igb looks like. It's
possible some critical patch is missing. You can try the latest version
(5.0.6) from e1000.sf.net and compiling with CFLAGS_EXTRA=-DIGB_PTP to see
if that driver works for you.

Cheers,
Matthew

Matthew Vick
Linux Development
Networking Division
Intel Corporation
From: Keller, J. E <jac...@in...> - 2013-10-28 10:10:57
Hi,

> -----Original Message-----
> From: Vick, Matthew [mailto:mat...@in...]
> Sent: Friday, October 25, 2013 9:13 AM
> To: Jean-Baptiste Maillet; lin...@li...
> Subject: Re: [Linuxptp-users] "poll tx timestamp timeout", problem since
> v1.2, igb, stmmac
>
> On 10/25/13, 8:47 AM, "Jean-Baptiste Maillet"
> <jea...@pa...> wrote:
>
> >On 10/25/2013 05:09 PM, Richard Cochran wrote:
>
> [...]
>
> >> I don't really trust the Debian kernels. I would try the igb driver
> >> from a recent mainline kernel. The igb driver really should be working
> >> in mainline Linux.
> >
> >OK, will do. I need a hardware and upstream reference point known to work.
> >It's (bad) luck I didn't try the Debian igb driver right away but the
> >highly experimental igb_avb first. My only suspect was stmmac.
> >
> >BTW igb supports 2-3 families of chips (82575, 82576, 82580, then i210,
> >i211, and i350, i354):
> >can any users out there share their experience using it specifically
> >with i210?
>
> I210 should definitely be working in the upstream Linux kernel. Please let
> me know if it is not.
>
> I'm uncertain what the Debian kernel version of igb looks like. It's
> possible some critical patch is missing. You can try the latest version
> (5.0.6) from e1000.sf.net and compiling with CFLAGS_EXTRA=-DIGB_PTP to see
> if that driver works for you.
>
> Cheers,
> Matthew
>
> Matthew Vick
> Linux Development
> Networking Division
> Intel Corporation
>

I second this. The igb driver has definitely had fixes applied over time, and the Debian kernel may not have picked them all up (or may have picked them up with bugs). My guess is that the avb driver has the proper code to fix this issue.

In general, since we moved to using poll() for a longer period of time, this error indicates a more serious issue, and I am thinking of submitting a patch to update the error text to be more indicative of what happened.

Regards,
Jake
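The poll()-based wait mentioned above can be illustrated with a short sketch. This is not the ptp4l source: the function name wait_for_error_queue is hypothetical, and the sketch covers only the waiting step. A real reader would also enable SO_TIMESTAMPING on the socket and then fetch the time stamp with recvmsg() and MSG_ERRQUEUE. The point is that on Linux a pending transmit time stamp shows up as an error-queue event (POLLERR), so if the driver never delivers one, the poll simply runs out of time no matter how long the timeout is.

```python
import select
import socket


def wait_for_error_queue(sock, timeout_ms):
    """Return True if the socket reports POLLERR within timeout_ms.

    Hypothetical helper: waits once instead of spinning in a
    try-again loop around recvmsg().
    """
    poller = select.poll()
    # An empty event mask is enough: poll() always reports POLLERR.
    poller.register(sock.fileno(), 0)
    events = poller.poll(timeout_ms)
    return any(revents & select.POLLERR for _fd, revents in events)


# A plain UDP socket with nothing sent has an empty error queue,
# so this wait expires, which is the situation behind the
# "poll tx timestamp timeout" message.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as demo_sock:
    if not wait_for_error_queue(demo_sock, 1):
        print("no error-queue event before the timeout")
```

With this structure, raising the timeout only helps when the time stamp is late; it cannot help when the hardware or driver never produces one.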
From: Jean-Baptiste M. <jea...@pa...> - 2013-10-28 16:41:59
On 10/25/2013 06:12 PM, Vick, Matthew wrote:
...
> I210 should definitely be working in the upstream Linux kernel. Please let
> me know if it is not.

I think there is a problem.
ptp4l "poll tx timestamp timeout" 100%
igb driver on i210 "Detected Tx Unit Hang" 100%

Tested here with a freshly built 3.10.2 fetched from kernel.org.

# uname -a
Linux sumo 3.10.2-vanilla-686-pae #2 SMP Mon Oct 28 11:21:13 CET 2013 i686 GNU/Linux

# modinfo /lib/modules/3.10.2-vanilla-686-pae/kernel/drivers/net/ethernet/intel/igb/igb.ko
filename: /lib/modules/3.10.2-vanilla-686-pae/kernel/drivers/net/ethernet/intel/igb/igb.ko
version: 5.0.3-k
license: GPL
description: Intel(R) Gigabit Ethernet Network Driver
author: Intel Corporation, <e10...@li...>
srcversion: 8251E5B658798814C71FD34
...

For ptp4l:

# git log -n 1 --pretty=oneline
52c5e0cfc972c3e2b65ca492eecbff6edb8b2aaf Don't calculate delay with old master's sync time stamp.

Setup:

# rmmod igb
# dmesg -c
# dmesg -n 7
# modprobe igb
(for the brave out there: contact me for the "debug=16" version of the dmesg output below)
# ifconfig eth2-igb 192.168.100.2 up

Then:

# ./ptp4l -i eth2-igb -m -l 7
ptp4l[12963.591]: selected /dev/ptp0 as PTP clock
ptp4l[12963.592]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000
ptp4l[12963.592]: driver changed our HWTSTAMP options
ptp4l[12963.593]: tx_type 1 not 1
ptp4l[12963.593]: rx_filter 1 not 12
ptp4l[12963.593]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[12963.593]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[12969.593]: port 1: announce timeout
ptp4l[12969.593]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[12969.593]: selected best master clock a0369f.fffe.1c38a8
ptp4l[12969.593]: assuming the grand master role
ptp4l[12969.594]: port 1: master tx announce timeout
ptp4l[12969.594]: port 1: setting asCapable
ptp4l[12970.593]: port 1: master sync timeout
ptp4l[12970.594]: poll tx timestamp timeout
ptp4l[12970.594]: port 1: send sync failed
ptp4l[12970.594]: port 1: MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
ptp4l[12970.594]: waiting 2^{4} seconds to clear fault on port 1
^Cptp4l[12972.061]: caught signal 2

# dmesg
[12955.198501] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.0.3-k
[12955.198511] igb: Copyright (c) 2007-2013 Intel Corporation.
[12955.199411] igb 0000:03:00.0: irq 43 for MSI/MSI-X
[12955.199432] igb 0000:03:00.0: irq 44 for MSI/MSI-X
[12955.199450] igb 0000:03:00.0: irq 45 for MSI/MSI-X
[12955.199467] igb 0000:03:00.0: irq 46 for MSI/MSI-X
[12955.199487] igb 0000:03:00.0: irq 47 for MSI/MSI-X
[12955.284305] igb 0000:03:00.0: added PHC on eth0
[12955.284317] igb 0000:03:00.0: Intel(R) Gigabit Ethernet Network Connection
[12955.284326] igb 0000:03:00.0: eth0: (PCIe:2.5Gb/s:Width x1) a0:36:9f:1c:38:a8
[12955.284532] igb 0000:03:00.0: eth0: PBA No: G69016-001
[12955.284539] igb 0000:03:00.0: Using MSI-X interrupts. 4 rx queue(s), 4 tx queue(s)
[12955.311533] systemd-udevd[11676]: renamed network interface eth0 to eth2-igb
[12962.123669] IPv6: ADDRCONF(NETDEV_UP): eth2-igb: link is not ready
[12964.333099] igb: eth2-igb NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[12964.333533] IPv6: ADDRCONF(NETDEV_CHANGE): eth2-igb: link becomes ready
[12966.207446] igb 0000:03:00.0: Detected Tx Unit Hang
[12966.207446] Tx Queue <2>
[12966.207446] TDH <0>
[12966.207446] TDT <1>
[12966.207446] next_to_use <1>
[12966.207446] next_to_clean <0>
[12966.207446] buffer_info[next_to_clean]
[12966.207446] time_stamp <30429e>
[12966.207446] next_to_watch <eedcc000>
[12966.207446] jiffies <304470>
[12966.207446] desc.status <168000>
[12968.208531] igb 0000:03:00.0: Detected Tx Unit Hang
[12968.208531] Tx Queue <2>
[12968.208531] TDH <0>
[12968.208531] TDT <1>
[12968.208531] next_to_use <1>
[12968.208531] next_to_clean <0>
[12968.208531] buffer_info[next_to_clean]
[12968.208531] time_stamp <30429e>
[12968.208531] next_to_watch <eedcc000>
[12968.208531] jiffies <304664>
[12968.208531] desc.status <168000>
[12970.210527] igb 0000:03:00.0: Detected Tx Unit Hang
[12970.210527] Tx Queue <2>
[12970.210527] TDH <0>
[12970.210527] TDT <1>
[12970.210527] next_to_use <1>
[12970.210527] next_to_clean <0>
[12970.210527] buffer_info[next_to_clean]
[12970.210527] time_stamp <30429e>
[12970.210527] next_to_watch <eedcc000>
[12970.210527] jiffies <304858>
[12970.210527] desc.status <168000>
[12972.212578] igb 0000:03:00.0: Detected Tx Unit Hang
[12972.212578] Tx Queue <1>
[12972.212578] TDH <0>
[12972.212578] TDT <1>
[12972.212578] next_to_use <1>
[12972.212578] next_to_clean <0>
[12972.212578] buffer_info[next_to_clean]
[12972.212578] time_stamp <304790>
[12972.212578] next_to_watch <eec1d000>
[12972.212578] jiffies <304a4c>
[12972.212578] desc.status <d8000>
[12972.212594] igb 0000:03:00.0: Detected Tx Unit Hang
[12972.212594] Tx Queue <2>
[12972.212594] TDH <0>
[12972.212594] TDT <1>
[12972.212594] next_to_use <1>
[12972.212594] next_to_clean <0>
[12972.212594] buffer_info[next_to_clean]
[12972.212594] time_stamp <30429e>
[12972.212594] next_to_watch <eedcc000>
[12972.212594] jiffies <304a4c>
[12972.212594] desc.status <168000>
[12974.214274] igb 0000:03:00.0 eth2-igb: Reset adapter
[12978.014992] igb: eth2-igb NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[12982.234731] igb 0000:03:00.0: Detected Tx Unit Hang
[12982.234731] Tx Queue <2>
[12982.234731] TDH <0>
[12982.234731] TDT <1>
[12982.234731] next_to_use <1>
[12982.234731] next_to_clean <0>
[12982.234731] buffer_info[next_to_clean]
[12982.234731] time_stamp <3052a1>
[12982.234731] next_to_watch <eedcc000>
[12982.234731] jiffies <305413>
[12982.234731] desc.status <1ac000>
[12984.236804] igb 0000:03:00.0: Detected Tx Unit Hang
[12984.236804] Tx Queue <2>
[12984.236804] TDH <0>
[12984.236804] TDT <1>
[12984.236804] next_to_use <1>
[12984.236804] next_to_clean <0>
[12984.236804] buffer_info[next_to_clean]
[12984.236804] time_stamp <3052a1>
[12984.236804] next_to_watch <eedcc000>
[12984.236804] jiffies <305607>
[12984.236804] desc.status <1ac000>
[12986.238813] igb 0000:03:00.0: Detected Tx Unit Hang
[12986.238813] Tx Queue <3>
[12986.238813] TDH <0>
[12986.238813] TDT <2>
[12986.238813] next_to_use <2>
[12986.238813] next_to_clean <0>
[12986.238813] buffer_info[next_to_clean]
[12986.238813] time_stamp <3055a0>
[12986.238813] next_to_watch <ef523010>
[12986.238813] jiffies <3057fb>
[12986.238813] desc.status <158200>
[12986.238828] igb 0000:03:00.0: Detected Tx Unit Hang
[12986.238828] Tx Queue <2>
[12986.238828] TDH <0>
[12986.238828] TDT <1>
[12986.238828] next_to_use <1>
[12986.238828] next_to_clean <0>
[12986.238828] buffer_info[next_to_clean]
[12986.238828] time_stamp <3052a1>
[12986.238828] next_to_watch <eedcc000>
[12986.238828] jiffies <3057fb>
[12986.238828] desc.status <1ac000>
[12986.238843] igb 0000:03:00.0: Detected Tx Unit Hang
[12986.238843] Tx Queue <1>
[12986.238843] TDH <0>
[12986.238843] TDT <1>
[12986.238843] next_to_use <1>
[12986.238843] next_to_clean <0>
[12986.238843] buffer_info[next_to_clean]
[12986.238843] time_stamp <3055a2>
[12986.238843] next_to_watch <eec1d000>
[12986.238843] jiffies <3057fb>
[12986.238843] desc.status <f8000>
[12988.240845] igb 0000:03:00.0: Detected Tx Unit Hang
[12988.240845] Tx Queue <2>
[12988.240845] TDH <0>
[12988.240845] TDT <1>
[12988.240845] next_to_use <1>
[12988.240845] next_to_clean <0>
[12988.240845] buffer_info[next_to_clean]
[12988.240845] time_stamp <3052a1>
[12988.240845] next_to_watch <eedcc000>
[12988.240845] jiffies <3059ef>
[12988.240845] desc.status <1ac000>
[12988.240860] igb 0000:03:00.0: Detected Tx Unit Hang
[12988.240860] Tx Queue <3>
[12988.240860] TDH <0>
[12988.240860] TDT <2>
[12988.240860] next_to_use <2>
[12988.240860] next_to_clean <0>
[12988.240860] buffer_info[next_to_clean]
[12988.240860] time_stamp <3055a0>
[12988.240860] next_to_watch <ef523010>
[12988.240860] jiffies <3059ef>
[12988.240860] desc.status <158200>
[12988.240875] igb 0000:03:00.0: Detected Tx Unit Hang
[12988.240875] Tx Queue <1>
[12988.240875] TDH <0>
[12988.240875] TDT <1>
[12988.240875] next_to_use <1>
[12988.240875] next_to_clean <0>
[12988.240875] buffer_info[next_to_clean]
[12988.240875] time_stamp <3055a2>
[12988.240875] next_to_watch <eec1d000>
[12988.240875] jiffies <3059ef>
[12988.240875] desc.status <f8000>
[12988.244531] igb 0000:03:00.0 eth2-igb: Reset adapter

> I'm uncertain what the Debian kernel version of igb looks like. It's
> possible some critical patch is missing. You can try the latest version
> (5.0.6) from e1000.sf.net and compiling with CFLAGS_EXTRA=-DIGB_PTP to see
> if that driver works for you.

Will do.

--
JB
From: Jean-Baptiste M. <jea...@pa...> - 2013-10-28 16:53:40
On 10/28/2013 05:41 PM, Jean-Baptiste Maillet wrote:
> On 10/25/2013 06:12 PM, Vick, Matthew wrote:
> ...
>> I210 should definitely be working in the upstream Linux kernel. Please let
>> me know if it is not.
>
> I think there is a problem.
> ptp4l "poll tx timestamp timeout" 100%
> igb driver on i210 "Detected Tx Unit Hang" 100%
>
> Tested here with a freshly built 3.10.2 fetched from kernel.org.
...
>
>> I'm uncertain what the Debian kernel version of igb looks like. It's
>> possible some critical patch is missing. You can try the latest version
>> (5.0.6) from e1000.sf.net and compiling with CFLAGS_EXTRA=-DIGB_PTP to see
>> if that driver works for you.
>
> Will do.

Done.
Good news: same results.
Should this go to e1000-devel instead?

$ make CFLAGS_EXTRA=-DIGB_PTP
make -C /lib/modules/3.10.2-vanilla-686-pae/build SUBDIRS=/home/jbmaillet/sandbox/igb-5.0.6/src modules
...

$ modinfo /home/jbmaillet/sandbox/igb-5.0.6/src/igb.ko
filename: /home/jbmaillet/sandbox/igb-5.0.6/src/igb.ko
version: 5.0.6
license: GPL
description: Intel(R) Gigabit Ethernet Network Driver
author: Intel Corporation, <e10...@li...>
srcversion: 55D8265EA2BD13346D24336
...

# insmod /home/jbmaillet/sandbox/igb-5.0.6/src/igb.ko

(root@sumo) (/home/jbmaillet/sandbox/linuxptp-git)
# ifconfig eth2-igb 192.168.100.2 up

(root@sumo) (/home/jbmaillet/sandbox/linuxptp-git)
# ./ptp4l -i eth2-igb -m -l 7
ptp4l[14466.673]: selected /dev/ptp0 as PTP clock
ptp4l[14466.673]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000
ptp4l[14466.674]: driver changed our HWTSTAMP options
ptp4l[14466.674]: tx_type 1 not 1
ptp4l[14466.674]: rx_filter 1 not 12
ptp4l[14466.674]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[14466.675]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[14472.675]: port 1: announce timeout
ptp4l[14472.675]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[14472.675]: selected best master clock a0369f.fffe.1c38a8
ptp4l[14472.675]: assuming the grand master role
ptp4l[14472.676]: port 1: master tx announce timeout
ptp4l[14472.676]: port 1: setting asCapable
ptp4l[14473.675]: port 1: master sync timeout
ptp4l[14473.676]: poll tx timestamp timeout
ptp4l[14473.676]: port 1: send sync failed
ptp4l[14473.676]: port 1: MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
ptp4l[14473.676]: waiting 2^{4} seconds to clear fault on port 1
^Cptp4l[14476.358]: caught signal 2

(root@sumo) (/home/jbmaillet/sandbox/linuxptp-git)
# dmesg
[14469.599074] Intel(R) Gigabit Ethernet Network Driver - version 5.0.6
[14469.599085] Copyright (c) 2007-2013 Intel Corporation.
[14469.599969] igb 0000:03:00.0: irq 43 for MSI/MSI-X
[14469.599989] igb 0000:03:00.0: irq 44 for MSI/MSI-X
[14469.682421] igb 0000:03:00.0: added PHC on eth0
[14469.682433] igb 0000:03:00.0: Intel(R) Gigabit Ethernet Network Connection
[14469.682440] igb 0000:03:00.0: eth0: (PCIe:2.5GT/s:Width x1)
[14469.682447] igb 0000:03:00.0: eth0: MAC:
[14469.682451] a0:36:9f:1c:38:a8
[14469.682664] igb 0000:03:00.0: eth0: PBA No: G69016-001
[14469.688917] igb 0000:03:00.0: LRO is disabled
[14469.688931] igb 0000:03:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
[14469.701223] systemd-udevd[13872]: renamed network interface eth0 to eth2-igb
[14474.746458] IPv6: ADDRCONF(NETDEV_UP): eth2-igb: link is not ready
[14477.545681] igb: eth2-igb NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[14477.545956] IPv6: ADDRCONF(NETDEV_CHANGE): eth2-igb: link becomes ready
[14479.744322] igb 0000:03:00.0: Detected Tx Unit Hang
...

--
JB
From: Vick, M. <mat...@in...> - 2013-10-28 17:43:36
On 10/28/13, 9:53 AM, "Jean-Baptiste Maillet" <jea...@pa...> wrote:

>On 10/28/2013 05:41 PM, Jean-Baptiste Maillet wrote:
>> On 10/25/2013 06:12 PM, Vick, Matthew wrote:
>> ...
>>> I210 should definitely be working in the upstream Linux kernel. Please let
>>> me know if it is not.
>>
>> I think there is a problem.
>> ptp4l "poll tx timestamp timeout" 100%
>> igb driver on i210 "Detected Tx Unit Hang" 100%
>>
>> Tested here with a freshly built 3.10.2 fetched from kernel.org.
>...
>>
>>> I'm uncertain what the Debian kernel version of igb looks like. It's
>>> possible some critical patch is missing. You can try the latest version
>>> (5.0.6) from e1000.sf.net and compiling with CFLAGS_EXTRA=-DIGB_PTP to see
>>> if that driver works for you.
>>
>> Will do.
>
>Done.
>Good news: same results.
>Should this go to e1000-devel instead?

That's really strange. It looks like the device's transmit unit is hanging on the first few packets you attempt to send, which is why ptp4l is failing.

Yes, please do send along all of your information to e1000-devel and we'll have some more Intel folks look at it. A few other bits that would be helpful are the lspci -vvv output before and after the error occurs, as well as whether this failure occurs with any other traffic (e.g. ping).

Thank you!

Cheers,
Matthew
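The "hanging on the first few packets" reading can be checked against the numbers in a hang report. The sketch below parses the first "Detected Tx Unit Hang" report posted earlier in the thread. The function name summarize_tx_hang is hypothetical, and two things are assumptions rather than facts from the thread: that all bracketed values are hexadecimal, and that time_stamp holds the jiffies value at which the stuck buffer was queued.

```python
import re

# First hang report from the dmesg output posted earlier in the thread.
SAMPLE = """\
[12966.207446] igb 0000:03:00.0: Detected Tx Unit Hang
[12966.207446] Tx Queue <2>
[12966.207446] TDH <0>
[12966.207446] TDT <1>
[12966.207446] next_to_use <1>
[12966.207446] next_to_clean <0>
[12966.207446] buffer_info[next_to_clean]
[12966.207446] time_stamp <30429e>
[12966.207446] next_to_watch <eedcc000>
[12966.207446] jiffies <304470>
[12966.207446] desc.status <168000>
"""

FIELD = re.compile(r"(Tx Queue|TDH|TDT|time_stamp|jiffies) <([0-9a-f]+)>")


def summarize_tx_hang(report):
    """Summarize one hang report (hypothetical helper).

    Assumption: every value is hexadecimal. time_stamp and jiffies
    clearly are; for the single-digit fields the base does not matter.
    """
    fields = {name: int(value, 16) for name, value in FIELD.findall(report)}
    return {
        "queue": fields["Tx Queue"],
        # Descriptors posted by the driver (tail) but not yet consumed
        # by the NIC (head). Ring wrap-around is ignored here.
        "pending_descriptors": fields["TDT"] - fields["TDH"],
        # How long the oldest buffer has been waiting, in jiffies.
        "jiffies_waiting": fields["jiffies"] - fields["time_stamp"],
    }


print(summarize_tx_hang(SAMPLE))
# {'queue': 2, 'pending_descriptors': 1, 'jiffies_waiting': 466}
```

Under those assumptions, the head pointer is still at 0 with one descriptor posted, so the very first packet on that queue was never transmitted. In the later reports for queue 2 the time_stamp value stays the same while jiffies keeps growing, which is consistent with the same descriptor remaining stuck until the adapter is reset.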
From: Richard C. <ric...@gm...> - 2013-10-30 09:13:14
On Fri, Oct 25, 2013 at 05:47:26PM +0200, Jean-Baptiste Maillet wrote:
> On 10/25/2013 05:09 PM, Richard Cochran wrote:
> > On Fri, Oct 25, 2013 at 04:10:36PM +0200, Jean-Baptiste Maillet wrote:
> >>
> >> On the kernel side, for driver/ptp/ I'm on 0d8c3e7 "ptp_pch: fix error handling in pch_probe()" for stmmac,
> >> Debian 3.10.7-1 for i210.
> >
> > So 0d8c3e7 means a pure mainstream kernel (v3.10-rc5~25^2~36)?
>
> Nope, the sha is just the reference to where I am regarding ptp compared to upstream.
> This is a SoC with specific BSP and platform drivers.
>
> > And Debian kernel is from testing or unstable?
>
> Testing (but at least pure Debian upstream).

Okay, so I just tried linux-image-3.10-0.bpo.2-686-pae from
wheezy-backports, and it is definitely broken WRT igb. Even commands
like "ifdown" and "ifconfig" stall forever.

In contrast, a plain old 3.10.17 kernel works just fine with i210, igb
and ptp4l.

So, based on this thread, other recent threads, and my own past
experience, I can only recommend to avoid vendor kernels like the
plague.

Thanks,
Richard
From: Richard C. <ric...@gm...> - 2013-10-30 09:17:16
On Wed, Oct 30, 2013 at 10:12:55AM +0100, Richard Cochran wrote:
>
> So, based on this thread, other recent threads, and my own past
> experience, I can only recommend to avoid vendor kernels like the
                                            ^^^^^^
Oops, I meant "distro" kernels, but avoid vendor kernels, too.

Thanks,
Richard
From: Richard C. <ric...@gm...> - 2013-10-27 06:35:26
On Fri, Oct 25, 2013 at 04:10:36PM +0200, Jean-Baptiste Maillet wrote:

> Intel igb_avb is fine, yet both vanilla igb and stmmac fail with a
> "poll tx timestamp timeout" 100% reproducible.
> Increasing the tx_timestamp_timeout does not help.

Your command line

> $ sudo ./ptp4l -i eth2-igb -m -l 7

does not show that you used "-f config" to specify a configuration
file. The ptp4l program does not scan any config files by default. It
just uses built in defaults. So, if you did not use the "-f" command
line switch, then you also did not change the tx_timestamp_timeout
value.

Thanks,
Richard
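To make the "-f" point concrete, here is a minimal sketch that renders a configuration file setting tx_timestamp_timeout and builds the matching command line. The helper names, the file name timeout.cfg and the value 10 are all hypothetical choices for illustration; only the option name, the [global] section and the command-line switches come from ptp4l itself.

```python
def render_ptp4l_config(tx_timestamp_timeout_ms):
    """Return the text of a minimal ptp4l configuration file."""
    return "[global]\n" f"tx_timestamp_timeout {tx_timestamp_timeout_ms}\n"


def ptp4l_command(interface, config_path):
    """Build the invocation from the thread, plus -f so the file is read."""
    return ["./ptp4l", "-f", config_path, "-i", interface, "-m", "-l", "7"]


# Hypothetical example: a 10 ms timeout stored in timeout.cfg.
print(render_ptp4l_config(10), end="")
print(" ".join(ptp4l_command("eth2-igb", "timeout.cfg")))
```

Saving the rendered text as timeout.cfg and running the printed command would apply the setting. Without the -f switch the file is never read and the built-in default stays in effect.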
From: Jean-Baptiste M. <jea...@pa...> - 2013-10-28 09:06:39
On 10/27/2013 07:35 AM, Richard Cochran wrote:
> On Fri, Oct 25, 2013 at 04:10:36PM +0200, Jean-Baptiste Maillet wrote:
>
>> Intel igb_avb is fine, yet both vanilla igb and stmmac fail with a
>> "poll tx timestamp timeout" 100% reproducible.
>> Increasing the tx_timestamp_timeout does not help.
>
> Your command line
>
>> $ sudo ./ptp4l -i eth2-igb -m -l 7
>
> does not show that you used "-f config" to specify a configuration
> file. The ptp4l program does not scan any config files by default. It
> just uses built in defaults. So, if you did not use the "-f" command
> line switch, then you also did not change the tx_timestamp_timeout
> value.

Yes, I got that - that was a simplification (rather than explaining "...and then to set up a tx_timestamp_timeout different from the default zero, providing this config file blablabla and then using this command line blablabla").
But thanks for asking, yes, that may have been a mistake.

JB