From: Bart V. A. <bva...@ac...> - 2014-04-28 10:42:26
On 04/28/14 03:56, Vasiliy Tolstov wrote:
> 2014-04-03 20:16 GMT+04:00 Bart Van Assche <bva...@ac...>:
>> Without further information it's hard to tell what's going on. Have you
>> already checked the IB error counters ?
>
> Bart, can you help - what mean this errors:
> -W- lid=0x002e guid=0x00066a00e3003293 dev=29472 Port=36
> Performance Monitor counter : Value
> vl15_dropped : 0x2
> -W- lid=0x0001 guid=0x0008f105001086ac dev=23130 Port=3
> Performance Monitor counter : Value
> port_xmit_discard : 0x4f
> -W- lid=0x0001 guid=0x0008f105001086ac dev=23130 Port=7
> Performance Monitor counter : Value
> port_xmit_discard : 0x3a

The port_xmit_discard counter represents the total number of outbound
packets discarded by a port. Since the port numbers are above 2 I assume
these counters were obtained from an IB switch and not from an HCA ? The
InfiniBand specification defines when an IB switch is allowed to discard
packets. Does the IB switch these counters come from support congestion
control ? IB switches that do not support congestion control discard
packets instead of sending FECNs (Forward Explicit Congestion
Notifications).

Bart.
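
The counter values in the quoted report are hexadecimal, so it can help to convert them before comparing ports. Below is a minimal sketch, assuming the report always has the layout quoted above (a "-W-" header line followed by "name : 0x..." lines). The names REPORT and parse_counters are hypothetical and not part of any IB diagnostic tool.

```python
import re

# Sample text copied from the quoted report; the exact layout is an assumption.
REPORT = """\
-W- lid=0x002e guid=0x00066a00e3003293 dev=29472 Port=36
    Performance Monitor counter : Value
    vl15_dropped : 0x2
-W- lid=0x0001 guid=0x0008f105001086ac dev=23130 Port=3
    Performance Monitor counter : Value
    port_xmit_discard : 0x4f
-W- lid=0x0001 guid=0x0008f105001086ac dev=23130 Port=7
    Performance Monitor counter : Value
    port_xmit_discard : 0x3a
"""

# Header line that identifies the device and port a warning refers to.
HEADER = re.compile(
    r"^-W-\s+lid=(0x[0-9a-fA-F]+)\s+guid=(0x[0-9a-fA-F]+)"
    r"\s+dev=(\d+)\s+Port=(\d+)"
)
# Counter line: a single-word counter name, a colon, and a hex value.
# The "Performance Monitor counter : Value" caption does not match this.
COUNTER = re.compile(r"^\s*(\w+)\s*:\s*(0x[0-9a-fA-F]+)\s*$")


def parse_counters(text):
    """Return one dict per counter line, with the hex value as an int."""
    results = []
    current = None  # (lid, guid, port) of the most recent header line
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = (int(header.group(1), 16), header.group(2),
                       int(header.group(4)))
            continue
        counter = COUNTER.match(line)
        if counter and current is not None:
            lid, guid, port = current
            results.append({
                "lid": lid,
                "guid": guid,
                "port": port,
                "counter": counter.group(1),
                "value": int(counter.group(2), 16),
            })
    return results


if __name__ == "__main__":
    for entry in parse_counters(REPORT):
        print(f"lid={entry['lid']} port={entry['port']} "
              f"{entry['counter']}={entry['value']}")
```

For the quoted report this gives port_xmit_discard values of 79 (port 3) and 58 (port 7) on LID 1. The absolute numbers say little on their own; what matters is whether they keep increasing while the problem is being reproduced.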