Menu

#13 Bad performance at 15 nodes, please look into this asap

open
nobody
None
5
2013-01-28
2008-09-09
Anonymous
No

This is a message from a developer that is putting this into a popular P2P program. The code is being compiled into an existing P2P system that works OK using TCP. The UDT code is not modified from the latest distribution.

Please see what you can do.

The bad news is that its performance appears to leave a great deal to be desired. In tests it managed to reach about 15 daemon nodes before having serious problems. At this stage the RAM use was roughly 35M per instance and page faults were about 300000 per instance, rising at around 50000 per minute. CPU use was at 100%. This was after about 5 minutes of the network *idling*. Compare this with a non-UDT build in a network of 42 daemon nodes, RAM use is 8M per instance, page faults at ~2000 and holding indefinitely, CPU use: 0%.

This is in addition to a few random internal segfaults, 4 extra threads and the fact that several of it's operations seem to be unconfigurably blocking (notably, the connect).

However it does seem to function if there are less than 5 nodes, apart from the increased resource usage - which is not really acceptable.

Discussion

  • Yunhong Gu

    Yunhong Gu - 2008-09-09

    What is the bandwdith between the 15 nodes? 100% CPU usage looks strange. If the bandwidth is very low, you can use setsockopt to set a smaller buffer size. The defaut buffer size is very large compared to TCP default.

    What is the CPU usage of 5 nodes?

    BTW, connect() does not block forever, it does timeout, but requires 30 seconds.

     
  • Nobody/Anonymous

    Bandwidth, normal home internet connections. But during test it is the local interface, monitoring that it only goes up to 50KB/s max. However, after testing on another machine no segfaults or CPU 100% were observed with 20 connections, so there is a question as to what may be causing this problem, we have one good test and one bad. This is a multi platform program so Linux and windows are involved, with MSVC6 being used on the windows side (due to other windows related problems).
    I am reporting the setsockopt as a different bug, it doesn't seem to change the buffer lower after you call it. When the first socket is created for incoming connections UDT opens it with a big buffer and after that it doesn't change it lower with a setsockopt call. I know because if I open it, call setsockopt (as usual) and then close it and then re-open it at that point UDT uses less memory (from 30M to 10M, which is still too much, my god, what the hell is it doing?).
    In this program we clear the buffer pretty fast so I don't know if it needs a big buffer.
    The temp fix is to set the default values directly in the code so it always starts out small. Hope you will fix this soon. It would be nice to be able to adjust the buffers for different user situations (LAN sharing vs. internet) in a clean way.
    As for the blocking, we don't have 30 seconds to wait if the other end doesn't exist. Connections can easily be in the 100's and if 30 of them are waiting for 30 seconds each you can see the problem.
    On the close, as long as we can check to see if the buffer is COMPLETELY empty and UDT is ready to close quickly, then we could call close and it shouldn't block, right?
    Point being, TCP doesn't block, how do they get away with it? The application here has code that does come back and wait for TCP to complete a connection, the usual TCP way you do things.
    It looks like you wait to transfer settings or something, the temporary work around for us may be to set static settings in the code since other nodes will only be connecting to this same code on the other end. Any suggestions?
    Also, being on the internet it is possible that people could send false settings and cause problems for the other end, you always have to think about that and limit settings etc..
    Have you checked this code carefully for any buffer overflow attack problems? Meaning, do all inputs from the net limit the size of input properly so buffers aren't overflowed?
    I like the code, UDT looks like a lot of work went into it, and it will really save a lot of time trying to code something like this from scratch.
    I will watch this bug report and others for further replies.

     
  • Yunhong Gu

    Yunhong Gu - 2008-09-10

    MSVC6 may cause problems (of segfault and 100% CPU usage) but I am not very sure about it. The code was not developed under VC6 but was tweaked to be compiled under VC6 (you probably did some of your own changes too).

    In UDT, setsockopt only works before the socket is connected ("accepted" sockets inherit the value from listening sockets). The default large buffer is used because UDT was traditionally used in very high bandwidth networks (Gb/s), which requires large buffer. I am working on a buffer-auto-sizing function but hasn't got time to finish it.

    Connection timeout for a regular connect() is 3 seconds, and 30 seconds for rendezvous connect(). TCP uses ICMP to report socket state but UDT cannot read ICMP at user space. Smaller waiting time will cause more false errors (you cannot know if the peer side is waiting for connection until you tried enough times; For TCP, the host will send an ICMP packet telling that no TCP socket is waiting on the port).

    close() should return pretty soon. Did you notice any significant delay on close()?

    In UDT, configuration parameters at both sides are set seperately. Certain parameters will exchange during the connection setup, which is necessary for the protocol to work, but basically you cannot affect the setting of the other end. Meanwhile, if a malicious user sends out illegal packets (incorrect sequence numbers, control information, etc.), either the packet will be dropped (if it contains illegal sequence number) or the attacker's own connection is damaged. This should not affect other connections.

     
  • Nobody/Anonymous

    Wow, thanks for the fast response :)
    MSVC6 is old, I know, but the main developer uses it because it solves some GUI problems. I have been doing further research on that, sorry. It does compile OK on VS2005.
    I should have mentioned that the first socket we create is a server type of socket, it sits and listens and the 30M is used up the second it's opened, no connection on it at all, and I can only call setsockopt if I have a fd, which we do after the socket is opened, if you think about it that's when your variables are changed, then if I close that socket and open another one it uses those variables to make the new buffer (but for some reason still sits at 10M used, probably leftovers from the initial settings).
    If you could point me to the variables and values I can set so it creates about 100KB buffer for each socket as it's opened, that would be great and I think reasonable for P2P uses.
    Do the buffers grow as more connections are made? Or am I missing the point?
    I just started getting used to where things are in the code so anything can help. I also know that if we get to 100 nodes connected (combinations of incoming and outgoing) that they all use the same buffer and incoming UDP socket (I think) so I am not sure how to do all this but the best way to do this so I can use it now would be to set the buffer at what seems like a reasonable figure in your code and leave it alone.
    On the setsockopt issue, on your end I think all you have to do is call setsockopt on the actual UDP socket to change the buffer size and also change the size of your internal UDT buffer, maybe throw an error if someone tries to do that after a connection is made, or only change the size of the internal buffer and only change the UDP one if no connection is made. I looked and it doesn't look like that's being done. I could do it myself but you know this code and the variables much better than me and you may know the right way to do it.

    > Certain parameters will exchange during the connection setup, which is necessary for the protocol to work

    The other nodes will be running the same code on the other end so can't I just set all the parameters right up front so that you don't have to wait for the other end?
    The malicious users part I was talking about is when UDT exchanges those parameters at first connect, although it's nice to know that the regular packets are also checked and no buffer overflows are possible. You see, there's certain greedy corporations pumping money into spying on people who use P2P and they would love to have access through a open port via overflow or other means.
    I want to get this working ASAP so changing values directly in your code is OK for now, just point me in the right direction.
    BTW: Some of this stuff you are typing here should be copied and pasted into the doc section for other people to reference.

     
  • Nobody/Anonymous

    Forgot to say that I wasn't paying attention to the close call. But if there's a reliable non blocking way to tell that UDT has emptied it's buffer and is ready to close, please let me know. I assume that if the buffer is empty then there is no need to delay if I make a close call.

     
  • Yunhong Gu

    Yunhong Gu - 2008-09-11

    There are two levels of buffer in UDT: the UDP socket buffer and the UDT socket buffer. For a single UDT socket, this is straight forward, you can set the buffer sizes after calling "socket" to create a new UDT socket, but before listen/accept/connect calls, because these two parameters will be exchanged between two sides during connection setup. If you set up the buffer size on a listening socket, all accepted sockets will inherit the same buffer size, and cannot be changed later.

    For multiple UDT socket sharing the same UDP socket, the buffer size you set on the first UDT socket is the only one that is effective, and the buffer is shared between all sockets.

    The default buffer sizes are in core.cpp CUDT::CUDT(). Note that the unit of the size values is "packet". But when using setsockopt, the unit is byte.

    For a normal connection, the parameter exchange cannot be eliminated because one socket needs to know some infomation on the peer side. For example, the sequence number of the first packet (UDT starts with a random number). These are required for security reasons. However, if you want to use fixed values at both sides, you can change CUDT::connect() to fix the values and remove the parameter exchange part. I don't recommend this though.

     
  • Nobody/Anonymous

    Just so you know, we will not be using the rendezvous connect.

    referring to the struct below, the only thing I see that would change is m_iISN (random initial sequence number), everything else would stay the same for all connections from the P2P client.

    What should happen is the first packet sent either way contains the first sequence number, start there and use that from now on.

    When you think about it, it's the same thing, just that the first packet coming back doesn't happen at the initial connect(), it happens when the other side sends it's first packet.

    That really is the proper way to do this.

    And, just in case this isn't clear, you take the header info from the first packet that comes in from a new connection and use that data to set up your "new" structures for that connection. You don't have to do it at connect. Connect can simply send out one packet with a header and it's "parameters". You could even call that the connect packet and it could have a larger header if you wanted, you only send one of those per session.

    I'm not sure what to do with the m_iID (socket ID), but you could do the same thing, if this socket ID is a new one, then use that from now on for this connection. But it looks more like the other end (when making outgoing connect) sends that back to me to use, so I really don't need it in the first open.

    I also see that the m_iID is contained in every packet. If I create the connection, shouldn't I assign that number? What if two m_iID's match from different incoming connections? What if someone fakes a m_iID so they can disturb other connections, or flood my incoming port with every combination, sending a close packet to every m_iID ? That would easily shut down my node.

    The cookie looks like it's the same way, it can wait for later. I'm not sure what it's for yet.

    Of course you are thinking that it's going to be a lot of work if you change this. Yea, it will be, but you know it should have been done that way in the first place to do non-blocking properly and consistently.

    As for close, I ask again, how can I check to be sure that the UDT buffer for a particular connection is empty so that close won't block? If I have to make a call into your code that's OK, I can build a wrapper or something.

    struct CHandShake
    {
    int32_t m_iVersion; // UDT version
    int32_t m_iType; // UDT socket type
    int32_t m_iISN; // random initial sequence number
    int32_t m_iMSS; // maximum segment size
    int32_t m_iFlightFlagSize; // flow control window size
    int32_t m_iReqType; // connection request type: 1: regular connection request, 0: rendezvous connection request, -1/-2: response
    int32_t m_iID; // socket ID
    int32_t m_iCookie; // cookie
    };

     
  • Yunhong Gu

    Yunhong Gu - 2008-09-12

    I agree that many of the connection setup technique you mentioned are doable with a variable length packet header, instead of the fixed length packet header that is used in current version.

    The only exception is the cookie, which is used to prevent the kind of syn flood attack.

    I have a planned version 5 to address all these issues, but haven't found time to implement it. However, reducing connection setup does bring some disadvantages. Using variable length header will increase the processing time per packet and also makes the protocol more complicated. I do agree that the advantage is probably bigger than the disadvantage. However, the early versions of UDT were designed for big flows and the connection setup time is nothing compared to the time required for sending 1TB. That's why I didnt implement those techniques in the first place.

    The m_iID is used for UDP multiplexing. It is used to differentiate multiple UDT flows carried by the same UDP port. An attacker would also need to fake source IP:port (in IP header) togeter with the fake m_iID the kind of attack you mentioned. A UDT connection is described by <src IP, src port, src ID, dst ID, dst port, dst IP>. Simply faking either src ID or dst ID or both wouldnt work.

    [that being said, I didnt implement this security check in the current version because of its CPU overhead. It is just several lines of code though (checking the source address of an incoming UDP packet against the address associated with the ID)]

    Regarding the close() question, you need to write a new API to expose CSndBuffer::getCurrBufSize() in buffer.h/cpp. If the return value is 0 (means no data in the buffer), you can close the socket immediately.

     
  • Yunhong Gu

    Yunhong Gu - 2008-09-12

    By the way, I just added that source address check to the code in CVS. The CPU today is probably fast enough so that the overhead neglectable.

     
  • Nobody/Anonymous

    You already have

    // 0: Protocol Connection Handshake
    // Add. Info: Undefined
    // Control Info: Handshake information (see CHandShake)

    So when you detect that bit you then jump to another section of code which would expect a longer header (if you really need that, I'm still not sure since most of the info you want is already in a normal packet). For a normal packet it wouldn't take that special connection path, saving CPU.

    And don't forget you have "Additional Info" and "Reserved" fields. Call the "Additional Info" "multi use field".

    Checking the IP was a good addition, thanks.

    I don't think it would be unacceptable to people if you had a few "outside" calls (like checking the buffer) to the UDT code in the API that weren't TCP like compatible. All it takes is a "if" statement if I am switching between TCP and UDP, and actually, we already do that anyway.

    Don't forget that there is a need to know what "CUDT" thread we are in for any particular "fd" (if I said all that right) so there needs to be a call for that too, otherwise I could be checking the wrong buffer.

     
  • Nobody/Anonymous

    What is normally done with TCP, we call select() over and over to see if the socket is ready, that is where UDT could do the rest of the connection handshake.

    So I am working on a new special select() called select2(), it's not fancy and is really only checking if the handshake is done, but if it works it could be easily added to the regular select(). I'm still having problems, even though it looks correct, some pointers or values are not being set up properly or something. UDP handshake packets go back and forth (watching with tcpdump) but when actual data packets are sent, nothing happens on the other end.

    I'm posting the code so far if you want to look it over. The code you see from connect() is not called anymore in the connect(). I'm going to keep trying but you may see something I am missing.

    // MOD - keep calling this to see if the connection completed
    // we only check the writefds so NULL the rest, and no timeout
    // you should be only checking one socket at a time here, returns 1 if all is OK

    int CUDTUnited::select2(ud_set* readfds, ud_set* writefds, ud_set* exceptfds, const timeval* timeout)
    {

    // initialize results
    int count = 0;
    // set<UDTSOCKET> rs, ws, es;

    // retrieve related UDT sockets
    // vector<CUDTSocket*> ru, wu, eu;
    CUDTSocket* s;
    if (NULL != writefds)
    for (set<UDTSOCKET>::iterator i2 = writefds->begin(); i2 != writefds->end(); ++ i2)
    {
    if (NULL == (s = locate(*i2)))
    throw CUDTException(5, 4, 0);
    else if (NULL == s->m_pUDT)
    throw CUDTException(5, 4, 0);
    else
    {
    // we now have CUDTSocket* s use that to get everything else
    // this code is taken from the initial connect() section

    CPacket response;

    // I don't really like this, possible buffer overflow? when CHandShake > m_iPayloadSize ???

    char* resdata = new char [s->m_pUDT->m_iPayloadSize];
    CHandShake* res = (CHandShake *)resdata;

    // Wait for the negotiated configurations from the peer side.
    response.pack(0, NULL, resdata, sizeof(CHandShake));

    CUDTException e(0, 0);

    if (!s->m_pUDT->m_bClosing)
    {
    response.setLength(s->m_pUDT->m_iPayloadSize);
    if (s->m_pUDT->m_pRcvQueue->recvfrom(s->m_pUDT->m_SocketID, response) > 0)
    {

    if ((1 != response.getFlag()) || (0 != response.getType()))
    response.setLength(-1);
    else count = 1; // indicate that we are ready now
    }
    }

    if (e.getErrorCode() == 0)
    {
    if (s->m_pUDT->m_bClosing) // if the socket is closed before connection...
    e = CUDTException(1);
    else if (1002 == res->m_iReqType) // connection request rejected
    e = CUDTException(1, 2, 0);
    else if ((!s->m_pUDT->m_bRendezvous) && (s->m_pUDT->m_iISN != res->m_iISN)) // secuity check
    e = CUDTException(1, 4, 0);
    }

    if (e.getErrorCode() != 0)
    {
    // connection failure, clean up and throw exception
    delete [] resdata;

    throw e;
    }

    // Got it. Re-configure according to the negotiated values.
    s->m_pUDT->m_iPeerISN = res->m_iISN;
    s->m_pUDT->m_iRcvLastAck = res->m_iISN;
    s->m_pUDT->m_iRcvLastAckAck = res->m_iISN;
    s->m_pUDT->m_iRcvCurrSeqNo = res->m_iISN - 1;
    s->m_pUDT->m_PeerID = res->m_iID;

    delete [] resdata;

    // And, I am connected too.
    s->m_pUDT->m_bConnected = true;

    // register this socket for receiving data packets
    s->m_pUDT->m_pRcvQueue->setNewEntry(s->m_pUDT);

    } // close the else
    } // close the for loop

    return count;
    }

     

Log in to post a comment.