#5 We are seeing a continuous spike in CPU usage using ArpON

3.1-ng
closed
None
2020-12-25
2020-03-02
prakash ks
No

Hi,

We tried to integrate ArpON into one of our devices to protect the ARP cache, but unfortunately we are seeing a continuous spike in CPU usage. Are there any performance optimization flags to reduce the CPU usage?
Just for reference, here is the top output for the arpon process:
10016 root 30 10 36800 16m 16m S 3 2.6 0:00.63 arpon
10016 root 30 10 36800 16m 16m S 4 2.6 0:00.74 arpon
10016 root 30 10 36800 16m 16m S 3 2.6 0:00.84 arpon
10016 root 30 10 36800 16m 16m S 2 2.6 0:00.91 arpon
10016 root 30 10 36800 16m 16m S 3 2.6 0:01.01 arpon
10016 root 30 10 36800 16m 16m S 4 2.6 0:01.12 arpon
10016 root 30 10 36800 16m 16m S 4 2.6 0:01.24 arpon
10016 root 30 10 36800 16m 16m S 3 2.6 0:01.33 arpon
10016 root 30 10 36800 16m 16m S 4 2.6 0:01.46 arpon
10016 root 30 10 36800 16m 16m S 3 2.6 0:01.56 arpon

Any help is much appreciated.
Thanks!
Prakash

Discussion

  • prakash ks

    prakash ks - 2020-03-17

    Hi Andrea,

    I dug further into the code and profiled it to find the functions that are causing the CPU load.

    See the attached diagram. The Intf_capture and select() functions are the main contributors to the performance hit.

    However, I was able to reduce the constant CPU usage from 3-4% to 0.3-0.7% based on the findings below.
    As mentioned earlier, the two functions Intf_capture and select() were contributing to the higher CPU usage.
    Intf_capture is implemented to do a live capture of the I/O ARP packets read from the network traffic of the interface.
    In particular, the following while loop contributes the most:
        while (1) {
            fd_set fdread;

            /* Clear the set. */
            FD_ZERO(&fdread);

            /* Add the pcap capture handle file descriptor to set. */
            FD_SET(fd, &fdread);

            /*
             * Monitor the pcap capture handle file descriptor
             * until it becomes ready for the reading.
             */
            if (select(fd + 1, &fdread, NULL, NULL, &timeout) < 0) {
                /* Interrupt signal? */
                if (errno == EINTR) {
                    /* Loop again with the time remaining calculated by select. */
                    continue;
                }
            …….

            …………..
            /* Re-set the read capture timeout to 1 millisecond. */
            timeout.tv_sec = INTF_SECS; /* No seconds. */
            timeout.tv_usec = INTF_MILLI2MICRO(INTF_READTIMEOUT);
        }

        /* Never reaches here. */
        return NULL;

        }

    I increased the live packet capture timeout: INTF_READTIMEOUT was raised to 10 milliseconds (the default is 1 millisecond). With this change we see a drastic improvement in CPU usage.
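
As a rough illustration of why the timeout value matters, here is a minimal sketch (not ArpON code) that counts how often select() times out on an idle descriptor within a fixed window. The helper names count_idle_wakeups and measure_idle_wakeups are hypothetical, and an idle pipe stands in for the pcap capture descriptor, so this only models the case where no packets arrive.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Count how many times select() times out on an idle descriptor during
 * roughly window_ms milliseconds, using a per-call timeout of timeout_ms.
 * Hypothetical helper for illustration only.
 */
static long count_idle_wakeups(int fd, long timeout_ms, long window_ms)
{
    struct timeval start, now, timeout;
    long wakeups = 0;

    assert(timeout_ms > 0 && timeout_ms < 1000);
    gettimeofday(&start, NULL);

    for (;;) {
        fd_set fdread;
        long elapsed_us;

        FD_ZERO(&fdread);
        FD_SET(fd, &fdread);

        /* select() may modify the timeout, so re-set it on every iteration. */
        timeout.tv_sec = 0;
        timeout.tv_usec = timeout_ms * 1000;

        if (select(fd + 1, &fdread, NULL, NULL, &timeout) == 0)
            wakeups++; /* Timed out with nothing to read. */

        gettimeofday(&now, NULL);
        elapsed_us = (long)(now.tv_sec - start.tv_sec) * 1000000L
                   + (long)(now.tv_usec - start.tv_usec);
        if (elapsed_us >= window_ms * 1000)
            return wakeups;
    }
}

/* Run the count on the read end of an idle pipe; returns -1 on error. */
static long measure_idle_wakeups(long timeout_ms, long window_ms)
{
    int fds[2];
    long n;

    if (pipe(fds) != 0)
        return -1;
    n = count_idle_wakeups(fds[0], timeout_ms, window_ms);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling measure_idle_wakeups(1, 200) and measure_idle_wakeups(10, 200) should show the 1 ms loop waking up several times more often than the 10 ms loop over the same window. Each wakeup costs a system call plus the loop body, which is consistent with the drop in CPU usage reported here.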
    With Change:
    top -b | grep arpon
    15510 root 20 0 177196 2248 2096 S 0.0 2.6 0:00.01 arpon
    15510 root 20 0 177196 2248 2096 S 0.0 2.6 0:00.01 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.02 arpon
    15510 root 20 0 177196 2248 2096 S 0.0 2.6 0:00.02 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.03 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.04 arpon
    15510 root 20 0 177196 2248 2096 S 0.0 2.6 0:00.04 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.05 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.06 arpon
    15510 root 20 0 177196 2248 2096 S 0.0 2.6 0:00.06 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.07 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.08 arpon
    15510 root 20 0 177196 2248 2096 S 0.0 2.6 0:00.08 arpon
    15510 root 20 0 177196 2248 2096 S 0.7 2.6 0:00.10 arpon
    15510 root 20 0 177196 2248 2096 S 0.0 2.6 0:00.10 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.11 arpon
    15510 root 20 0 177196 2248 2096 S 0.0 2.6 0:00.11 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.12 arpon
    15510 root 20 0 177196 2248 2096 S 0.0 2.6 0:00.12 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.13 arpon
    15510 root 20 0 177196 2248 2096 S 0.7 2.6 0:00.15 arpon
    15510 root 20 0 177196 2248 2096 S 0.0 2.6 0:00.15 arpon
    15510 root 20 0 177196 2248 2096 S 0.3 2.6 0:00.16 arpon

    Without Change:
    23280 root 30 10 36492 17184 17024 S 3.2 2.7 0:01.04 arpon
    23280 root 30 10 36492 17184 17024 S 2.9 2.7 0:01.13 arpon
    23280 root 30 10 36492 17184 17024 S 3.2 2.7 0:01.23 arpon
    23280 root 30 10 36492 17184 17024 S 3.9 2.7 0:01.35 arpon
    23280 root 30 10 36492 17184 17024 S 2.5 2.7 0:01.43 arpon
    23280 root 30 10 36492 17184 17024 S 3.6 2.7 0:01.54 arpon
    23280 root 30 10 36492 17184 17024 S 2.9 2.7 0:01.63 arpon
    23280 root 30 10 36492 17184 17024 S 3.2 2.7 0:01.73 arpon
    23280 root 30 10 36492 17184 17024 S 2.9 2.7 0:01.82 arpon
    23280 root 30 10 36492 17184 17024 S 4.9 2.7 0:01.97 arpon
    23280 root 30 10 36492 17184 17024 S 3.9 2.7 0:02.09 arpon
    23280 root 30 10 36492 17184 17024 S 3.2 2.7 0:02.19 arpon
    23280 root 30 10 36492 17184 17024 S 1.6 2.7 0:02.24 arpon
    23280 root 30 10 36492 17184 17024 S 3.8 2.7 0:02.36 arpon
    23280 root 30 10 36492 17184 17024 S 3.3 2.7 0:02.46 arpon
    23280 root 30 10 36492 17184 17024 S 3.9 2.7 0:02.58 arpon
    23280 root 30 10 36492 17184 17024 S 3.3 2.7 0:02.68 arpon
    23280 root 30 10 36492 17184 17024 S 3.5 2.7 0:02.79 arpon
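
To put a single number on each run, the %CPU samples from the two listings above can be averaged. This is only a small sketch: the array names and the mean_cpu helper are made up for illustration, and the values are copied by hand from the top output in this post.

```c
#include <assert.h>
#include <stddef.h>

/* %CPU samples copied from the "With Change" listing (23 samples). */
static const double with_change[] = {
    0.0, 0.0, 0.3, 0.0, 0.3, 0.3, 0.0, 0.3, 0.3, 0.0, 0.3, 0.3,
    0.0, 0.7, 0.0, 0.3, 0.0, 0.3, 0.0, 0.3, 0.7, 0.0, 0.3
};

/* %CPU samples copied from the "Without Change" listing (18 samples). */
static const double without_change[] = {
    3.2, 2.9, 3.2, 3.9, 2.5, 3.6, 2.9, 3.2, 2.9,
    4.9, 3.9, 3.2, 1.6, 3.8, 3.3, 3.9, 3.3, 3.5
};

/* Arithmetic mean of n samples; n must be greater than zero. */
static double mean_cpu(const double *samples, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double)n;
}
```

With these samples the mean comes out at roughly 0.2% with the change and roughly 3.3% without it, in line with the 0.3-0.7 versus 3-4 ranges quoted above.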

    What would be the impact of this change?

     
  • Andrea Di Pasquale

    Hi,

    The timeout of our Poll Mode Capture is absolutely one of the most important parameters for avoiding ARP spoofing attacks. For that reason, we designed and implemented a small timeout in ArpON.

    In Poll Mode Capture it is absolutely normal to have higher CPU consumption. It is constant but low, and it is part of our design.

    Therefore, as your CPU consumption (around 2-3% on 1 CPU core) is absolutely acceptable and low on both UP and SMP CPU architectures, we will not accept this patch.

    Thanks

     
  • Andrea Di Pasquale

    • status: open --> closed
     
