
#118 fragment reassembly error?

closed
None
5
2018-05-22
2014-08-14
No

I'm having a hard time figuring out which sites to use to follow tipc work (bug reporting and patches).

I'll try here to see if this sounds familiar. I don't see any fragmentation issues reported here, and many google searches also gave no useful hits.

We get message corruption when there are 3+ fragments. With 3 fragments, we think the message is reassembled as 1/3/2. I think we might have seen 4 fragments reassembled as 1/4/2/3, but I need to improve our test. Maybe the final fragment is being inserted right after fragment 1.

We believe the sender and client must be on different machines; in our case, two ATCA blades.

Underlying MTU is 1500. wireshark reports tipc fragment lengths of 1420 or less. The packets are OK in pcap on the receiver.

The first problem happens at 2817 payload bytes, i.e. 2 * 1408 + 1. In this case, fragment 3 had 1 payload byte.

I just did a quick test varying the payload length from 1 to 32768. There were errors from 2817-3002, 4237-4422, 5657-5842, 7077-7262, 8497-8682... i.e. each failing window starts at 2817 + 1420 * N and covers 186 consecutive lengths. I haven't had time to get pcaps for this group, and I want to improve the test utility.
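The pattern in those numbers can be written down as a small check. This is only a sketch of the observed arithmetic: the helper name `in_observed_failing_window` is made up for illustration, and the three constants are the values measured in this test run, not anything taken from the TIPC source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Constants taken from the test run described above; they are observations,
 * not values derived from the TIPC source. */
#define FIRST_BAD_LEN 2817u  /* first failing payload length (2 * 1408 + 1) */
#define WINDOW_STRIDE 1420u  /* distance between the starts of failing windows */
#define WINDOW_WIDTH   186u  /* number of consecutive failing lengths */

/* Hypothetical helper: true if a payload length falls inside one of the
 * observed failing windows, 2817 + 1420 * N .. 2817 + 1420 * N + 185. */
static bool in_observed_failing_window(size_t payload_len)
{
    if (payload_len < FIRST_BAD_LEN)
        return false;
    return (payload_len - FIRST_BAD_LEN) % WINDOW_STRIDE < WINDOW_WIDTH;
}
```

A test utility could use a check like this to concentrate on the lengths that are expected to fail, for example when choosing which sizes to capture with pcap.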

Earlier, the closest thing I found was http://t154544.network-tipc-general.networkbuzz.info/tipc-fix-message-corruption-bug-fordeferred-packets-t154544.html .
I don't understand how we could get out-of-order delivery with our hardware, but tipc-config -ls did show some non-zero RX "defs".

I'm not certain where I should be looking for discussion, patches, etc. During recent websearches, I have also seen:
http://12.network-tipc-general.networkbuzz.info/
http://www.spinics.net/lists/netdev/

We are currently based on an Ubuntu 12.04 system, using their recently backported kernel 3.13. I think it came from their 14.04 'trusty' development. Sorry, I really don't understand all the kernel and driver versioning, but I can provide more details if you give instructions.

Related

Tasks: #118

Discussion

  • Ying Xue

    Ying Xue - 2014-08-15

    It sounds like Erik also ran into the same issue, so I am forwarding this
    to him and cc'ing the tipc-discussion mailing list.

    Regards,
    Ying


  • Erik Hugne

    Erik Hugne - 2014-08-15

    Hi, we normally use the mailing list for all TIPC issues.
    Your sf email bounced, so I'm pasting the same update here.
    In my case I got a problem similar to yours, with corrupt packets. I found out that it was caused by a missing range check when setting the port importance (which effectively modifies the msg_user field). The test program I used was actually faulty, and passed an uninitialized value to the setsockopt()/TIPC_IMPORTANCE call. The effect was that the port phdr was set to a random type, causing all kinds of errors on the receiving side. I just sent a patch for this to netdev stable, with tipc-discussion cc'ed:

    http://thread.gmane.org/gmane.network.tipc.general/7053
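For application code, the mistake described above can be avoided by always passing an initialized, range-checked value. The following is a minimal sketch, assuming the Linux `<linux/tipc.h>` header is available; the helper names `importance_is_valid` and `set_importance_checked` are hypothetical and are not part of TIPC or of the kernel patch.

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/* Hypothetical helper: accept only the four importance levels that
 * <linux/tipc.h> defines (TIPC_LOW_IMPORTANCE .. TIPC_CRITICAL_IMPORTANCE). */
static int importance_is_valid(unsigned int imp)
{
    return imp <= TIPC_CRITICAL_IMPORTANCE;
}

/* Hypothetical wrapper: reject out-of-range values in user space before
 * they ever reach setsockopt(). Returns 0 on success, -1 with errno set. */
static int set_importance_checked(int sd, unsigned int imp)
{
    if (!importance_is_valid(imp)) {
        errno = EINVAL;
        return -1;
    }
    return setsockopt(sd, SOL_TIPC, TIPC_IMPORTANCE, &imp, sizeof(imp));
}
```

A caller would pass one of the defined levels, for example `set_importance_checked(sd, TIPC_HIGH_IMPORTANCE)`, rather than a variable that may never have been assigned.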

     
  • Ned Kittlitz

    Ned Kittlitz - 2014-08-18

    I'm updating the bug, because I'm not sure how to use the mailing list. I don't know why my sf email appeared to bounce for you. I've received email because of bug 117 and bug 118 updates.

    I got some help from a contractor who had some time to investigate.

    He said he extracted the change below from a larger patch that came from the tipc group. He also said that the patch didn't mention reassembly. It seems to have fixed our problem.

    I'm sorry I can't give better information right now. I'm overwhelmed by not understanding so many things.

    Thanks,
    Ned

    diff -uprN -X linux-3.13.orig/Documentation/dontdiff linux-3.13.orig/net/tipc/link.c linux-3.13-tipc-fixes/net/tipc/link.c
    --- linux-3.13.orig/net/tipc/link.c 2014-08-15 05:46:35.000000000 +0530
    +++ linux-3.13-tipc-fixes/net/tipc/link.c 2014-08-15 06:56:51.000000000 +0530
    @@ -2386,6 +2386,8 @@ int tipc_link_recv_fragment(struct sk_bu
             (*tail)->next = frag;
             *tail = frag;
             (*head)->truesize += frag->truesize;
    +        (*head)->len += frag->len;
    +        (*head)->data_len += frag->len;
         }
         if (fragid == LAST_FRAGMENT) {
             *fbuf = *head;
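To illustrate what the two added lines account for, here is a toy model of the bookkeeping. It is a sketch only: `struct toy_buf` and `toy_chain_fragment` are made-up stand-ins, not the kernel's `sk_buff` or the real `tipc_link_recv_fragment()`, and the linking is simplified to a plain `next` chain.

```c
#include <stddef.h>

/* Toy stand-in for the few sk_buff fields the patch touches; this is NOT the
 * kernel structure, just enough to show the bookkeeping. */
struct toy_buf {
    unsigned int len;       /* total bytes reachable from this buffer */
    unsigned int data_len;  /* bytes held in chained (non-linear) buffers */
    unsigned int truesize;  /* memory accounted to this buffer */
    struct toy_buf *next;
};

/* Hypothetical helper mirroring the patched lines: link frag after *tail and
 * fold its sizes into the head buffer. */
static void toy_chain_fragment(struct toy_buf *head, struct toy_buf **tail,
                               struct toy_buf *frag)
{
    (*tail)->next = frag;
    *tail = frag;
    head->truesize += frag->truesize;
    head->len += frag->len;        /* added by the patch */
    head->data_len += frag->len;   /* added by the patch */
}
```

With fragments of 1408, 1408 and 1 bytes (the 2817-byte case from the report), the head ends up with `len == 2817` and `data_len == 1409`. Without the two added lines, the head's lengths would not grow as fragments are chained on.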

     
    • Erik Hugne

      Erik Hugne - 2014-08-19

      Great to hear that it resolved the problem.
      Should you have future questions, don't hesitate to drop a mail to
      tipc-discussion@lists.sourceforge.net


  • Jon Paul Maloy

    Jon Paul Maloy - 2018-05-22
    • Owner: Anonymous --> Jon Paul Maloy
    • Status: open --> closed
     
