Re: curl-loader https: low performance due to DNS query?
From: Fred H. <di...@gm...> - 2012-11-26 07:08:28
Yes, I will follow up.

One more thing I see: c-ares, as driven by libcurl, sends queries in
req/resp/req/resp/req/resp mode, while adnshost sends queries in
req,req,req/resp,resp,resp/... mode. That is to say, c-ares does not
keep multiple queries outstanding on the wire. The req/resp/req/resp
mode has a limit of roughly 250 to 600 queries/s, while the
req,req,req/resp,resp,resp/... mode can reach 10,000 queries/s.

2012/11/24 Robert Iakobashvili <cor...@gm...>

> Hi Fred,
> I think you can address this suggestion and your observations to the
> lists of c-ares and curl development.
>
> Take care,
> Thanks
> Robert
>
> On Sat, Nov 24, 2012 at 5:00 PM, Fred Huang <di...@gm...> wrote:
> > Hi all,
> >
> > Is it possible to use adns
> > (http://www.chiark.greenend.org.uk/~ian/adns/) to do DNS resolving
> > instead of c-ares? All my test results show that c-ares can do no
> > more than 300 queries per second. However, using
> > 'dig +short -f hostnames.list &>/dev/null' against the same DNS
> > server I can get more than 7500 queries per second. I think c-ares
> > is also limited by the system API of gethostbyname.
> >
> > About the ADNS lib:
> > Many clients for DNS resolution are coded poorly. Most UNIX systems
> > provide an implementation of gethostbyname (the DNS client API,
> > application program interface), which cannot concurrently handle
> > multiple outstanding requests. Therefore, the crawler cannot issue
> > many resolution requests together and poll at a later time for
> > completion of individual requests, which is critical for acceptable
> > performance. Furthermore, if the system-provided client is used,
> > there is no way to distribute load among a number of DNS servers.
> > For all these reasons, many crawlers choose to include their own
> > custom client for DNS name resolution. The Mercator crawler from
> > Compaq System Research Center reduced the time spent in DNS from as
> > high as 87% to a modest 25% by implementing a custom client.
> > The ADNS asynchronous DNS client library is ideal for use in crawlers.
> >
> > In spite of these optimizations, a large-scale crawler will spend a
> > substantial fraction of its network time not waiting for HTTP data
> > transfer, but for address resolution. For every hostname that has
> > not been resolved before (which happens frequently with crawlers),
> > the local DNS may have to go across many network hops to fill its
> > cache for the first time. To overlap this unavoidable delay with
> > useful work, prefetching can be used. When a page that has just been
> > fetched is parsed, a stream of HREFs is extracted. Right at this
> > time, that is, even before any of the corresponding URLs are
> > fetched, hostnames are extracted from the HREF targets, and DNS
> > resolution requests are made to the caching server. The prefetching
> > client is usually implemented using UDP instead of TCP, and it does
> > not wait for resolution to be completed. The request serves only to
> > fill the DNS cache so that resolution will be fast when the page is
> > actually needed later on.
> >
> > ===end
> >
> >
> > 2012/3/5 Fred Huang <di...@gm...>
> >>
> >> test 1:
> >>
> >> dns server: dnsmasq@127.0.0.1, 2,000,000 dns entry cache, resolve *.com to one IP address
> >> number of domain names in all https URLs: 780,000
> >> number of client: 1000
> >> CPU usage: 78%
> >> cpu%  irq%  sirq%  sys%  iowt%  mem_used  buf&cached
> >> 78.8   0.0    0.6   3.8    0.0  3640.9Mb     282.1Mb
> >> SSL TPS: 300
> >> SSL throughput: 200Mbps
> >>
> >> # gprof /usr/bin/curl-loader gmon.out -p | head -50
> >> Flat profile:
> >> Each sample counts as 0.01 seconds.
> >>      % cumulative     self               self    total
> >>   time    seconds  seconds      calls  s/call   s/call  name
> >>  38.55     127.10   127.10     130975    0.00     0.00  Curl_hash_clean_with_criterium
> >>  24.63     208.32    81.22 1247528407    0.00     0.00  hostcache_timestamp_remove
> >>  13.08     251.44    43.12    2791988    0.00     0.00  Curl_hash_pick
> >>   7.30     275.51    24.07     352490    0.00     0.00  Curl_hash_add
> >>   6.01     295.31    19.80  410760770    0.00     0.00  Curl_str_key_compare
> >>   2.95     305.04     9.73     131015    0.00     0.00  create_conn
> >>   0.88     307.94     2.90    4841390    0.00     0.00  dprintf_formatf
> >>   0.74     310.38     2.44     136036    0.00     0.00  ConnectionStore
> >>   0.48     311.96     1.58  814604911    0.00     0.00  ares__is_list_empty
> >>   0.36     313.16     1.20                              locking_function
> >>   0.27     314.06     0.90     261608    0.00     0.00  ares_cancel
> >>   0.22     314.79     0.73     127417    0.00     0.00  curl_multi_remove_handle
> >>   0.19     315.40     0.62    8906301    0.00     0.00  client_tracing_function
> >>   0.17     315.97     0.57     135305    0.00     0.00  Curl_num_addresses
> >>   0.17     316.54     0.57  181937907    0.00     0.00  addbyter
> >>   0.17     317.09     0.55    2374363    0.00     0.00  multi_runsingle
> >>   0.16     317.62     0.53     541954    0.00     0.00  ares__init_list_node
> >>   0.16     318.14     0.52     494412    0.00     0.00  curl_multi_socket_action
> >>   0.14     318.60     0.46   33578897    0.00     0.00  Curl_socket_check
> >>   0.13     319.04     0.44  102951052    0.00     0.00  Curl_raw_toupper
> >>   0.13     319.48     0.44     787558    0.00     0.00  Curl_readwrite
> >>   0.13     319.91     0.44                              id_function
> >>   0.12     320.29     0.38     391390    0.00     0.00  Curl_hash_str
> >>   0.09     320.58     0.29       9216    0.00     0.00  curl_multi_perform
> >>   0.08     320.86     0.28    2350908    0.00     0.00  Curl_raw_equal
> >>   0.08     321.14     0.28    2019415    0.00     0.00  Curl_splay
> >>   0.08     321.40     0.26     520625    0.00     0.00  ossl_connect_common
> >>   0.08     321.65     0.25    3532992    0.00     0.00  Curl_pgrsUpdate
> >>   0.07     321.87     0.22  131778537    0.00     0.00  curl_strequal
> >>   0.05     322.05     0.18    8954118    0.00     0.00  scan_response
> >>   0.05     322.23     0.18     533644    0.00     0.00  ares_expand_name
> >>   0.05     322.40     0.17    2477345    0.00     0.00  fd_key_compare
> >>   0.05     322.57     0.17    2429135    0.00     0.00  Curl_infof
> >>   0.05     322.74     0.17     130966    0.00     0.00  singleipconnect
> >>   0.05     322.90     0.16       9292    0.00     0.00  curl_multi_socket_all
> >>   0.05     323.06     0.16     270137    0.00     0.00  ares__get_hostent
> >>   0.05     323.21     0.15    8943657    0.00     0.00  Curl_debug
> >>   0.04     323.35     0.14     262493    0.00     0.00  Curl_ssl_getsessionid
> >>   0.04     323.48     0.13     566006    0.00     0.00  socket_callback
> >>   0.04     323.61     0.13   33983864    0.00     0.00  curlx_tvdiff
> >>   0.04     323.74     0.13     132853    0.00     0.00  Curl_http_readwrite_headers
> >>   0.04     323.86     0.12     344032    0.00     0.00  epoll_del
> >>   0.04     323.98     0.12     223477    0.00     0.00  Curl_poll
> >>   0.04     324.10     0.12    8526280    0.00     0.00  Curl_raw_nequal
> >>   0.03     324.21     0.11     348839    0.00     0.00  event_del
> >>
> >>
> >> test 2:
> >>
> >> dns server: dnsmasq@127.0.0.1, 2,000,000 dns entry cache, resolve *.com to one IP address
> >> number of domain names in all https URLs: 1
> >> number of client: 1000
> >> CPU usage: 75%
> >> SSL TPS: 1300
> >> SSL throughput: 700Mbps
> >>
> >> # gprof /usr/bin/curl-loader gmon.out -p | head -50
> >> Flat profile:
> >> Each sample counts as 0.01 seconds.
> >>      % cumulative     self               self    total
> >>   time    seconds  seconds      calls  s/call   s/call  name
> >>   8.98       2.45     2.45    5127312    0.00     0.00  dprintf_formatf
> >>   8.90       4.88     2.43 1292796503    0.00     0.00  ares__is_list_empty
> >>   6.12       6.55     1.67     322461    0.00     0.00  create_conn
> >>   4.58       7.80     1.25     419872    0.00     0.00  ares_cancel
> >>   4.10       8.92     1.12    2100162    0.00     0.00  Curl_readwrite
> >>   3.92       9.99     1.07    4527388    0.00     0.00  Curl_hash_pick
> >>   3.43      10.93     0.94                              locking_function
> >>   3.04      11.76     0.83     909759    0.00     0.00  curl_multi_socket_action
> >>   3.00      12.58     0.82      97252    0.00     0.00  Curl_hash_clean_with_criterium
> >>   2.71      13.32     0.74  179704016    0.00     0.00  Curl_raw_toupper
> >>   2.68      14.05     0.73    4646524    0.00     0.00  multi_runsingle
> >>   2.44      14.71     0.67   13862992    0.00     0.00  client_tracing_function
> >>   2.42      15.37     0.66     319735    0.00     0.00  curl_multi_remove_handle
> >>   2.42      16.03     0.66         22    0.03     0.03  ares__init_list_node
> >>   1.94      16.56     0.53    9930244    0.00     0.00  hostcache_timestamp_remove
> >>   1.94      17.09     0.53  169320088    0.00     0.00  addbyter
> >>   1.80      17.58     0.49                              id_function
> >>   1.36      17.95     0.37    7057146    0.00     0.00  Curl_pgrsUpdate
> >>   1.36      18.32     0.37      16835    0.00     0.00  curl_multi_socket_all
> >>   1.25      18.66     0.34      16719    0.00     0.00  curl_multi_perform
> >>   1.21      18.99     0.33     329403    0.00     0.00  Curl_http_readwrite_headers
> >>   1.17      19.31     0.32    4036624    0.00     0.00  Curl_splay
> >>   1.14      19.62     0.31   20838902    0.00     0.00  Curl_raw_nequal
> >>   0.82      19.85     0.23    2764123    0.00     0.00  Curl_infof
> >>   0.81      20.07     0.22    5933857    0.00     0.00  ossl_recv
> >>   0.77      20.28     0.21     323112    0.00     0.00  Curl_splayremovebyaddr
> >>   0.75      20.48     0.21    5943991    0.00     0.00  Curl_read
> >>   0.70      20.67     0.19     425174    0.00     0.00  event_del
> >>   0.70      20.86     0.19      98652    0.00     0.00  Curl_if_is_interface_name
> >>   0.70      21.05     0.19    8677733    0.00     0.00  Curl_socket_check
> >>   0.59      21.21     0.16    4129554    0.00     0.00  fd_key_compare
> >>   0.57      21.37     0.16   14017019    0.00     0.00  Curl_debug
> >>   0.55      21.52     0.15    9376861    0.00     0.00  stat_data_in_add
> >>   0.55      21.67     0.15    6840468    0.00     0.00  Curl_timeleft
> >>   0.55      21.82     0.15    2054602    0.00     0.00  Curl_raw_equal
> >>   0.55      21.97     0.15    1291611    0.00     0.00  Curl_expire
> >>   0.51      22.11     0.14   16570063    0.00     0.00  curlx_tvnow
> >>   0.51      22.25     0.14   13863284    0.00     0.00  scan_response
> >>   0.51      22.39     0.14    5498864    0.00     0.00  Curl_setopt
> >>   0.48      22.52     0.13   97490291    0.00     0.00  curl_strequal
> >>   0.44      22.64     0.12     324253    0.00     0.00  Curl_http
> >>   0.44      22.76     0.12     394386    0.00     0.00  ossl_connect_common
> >>   0.40      22.87     0.11    3581335    0.00     0.00  Curl_getinfo
> >>   0.38      22.97     0.11   26983092    0.00     0.00  alloc_addbyter
> >>   0.37      23.07     0.10    8729044    0.00     0.00  Curl_client_write
> >>
> >
> > _______________________________________________
> > curl-loader-devel mailing list
> > cur...@li...
> > https://lists.sourceforge.net/lists/listinfo/curl-loader-devel
>
> --
> Regards,
> Robert Iakobashvili, Ph.D.
>
> Home: http://www.ghotit.com
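
The req,req,req/resp,resp,resp pattern discussed in this thread can be
sketched in a few lines of Python. This is only an illustration of keeping
several DNS queries in flight on one UDP socket; it is not how adns, c-ares,
or libcurl implement resolution. The names build_query and resolve_pipelined,
the window parameter, and the timeout value are all made up for this example.
Responses are matched to requests solely by the 16-bit DNS query ID, and there
are no retries and no parsing of the answer section.

```python
import collections
import socket
import struct


def build_query(name, qid):
    """Build a minimal DNS A-record query in RFC 1035 wire format."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    # Each label is length-prefixed; a zero byte terminates the name.
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def resolve_pipelined(names, server, window=100, timeout=2.0):
    """Hypothetical helper: keep up to `window` queries outstanding.

    Returns {name: raw response bytes}. Names that get no reply before
    the socket times out are simply missing from the result.
    """
    window = max(1, min(window, 0xFFFF))  # leave at least one free query ID
    todo = collections.deque(names)
    pending = {}   # query ID -> name still waiting for a reply
    answers = {}
    next_id = 0
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        while todo or pending:
            # Fill the window: send without waiting for earlier replies.
            while todo and len(pending) < window:
                while next_id in pending:  # skip IDs that are still in use
                    next_id = (next_id + 1) & 0xFFFF
                name = todo.popleft()
                pending[next_id] = name
                sock.sendto(build_query(name, next_id), server)
                next_id = (next_id + 1) & 0xFFFF
            try:
                data, _ = sock.recvfrom(4096)
            except socket.timeout:
                break  # give up on whatever is still outstanding
            if len(data) < 2:
                continue  # too short to carry a query ID
            (qid,) = struct.unpack("!H", data[:2])
            name = pending.pop(qid, None)
            if name is not None:  # ignore replies we did not ask for
                answers[name] = data
    finally:
        sock.close()
    return answers
```

With the dnsmasq setup from the tests above, a call such as
resolve_pipelined(hostnames, ("127.0.0.1", 53), window=100) would keep up to
100 queries outstanding. Setting window=1 degenerates to the
req/resp/req/resp mode, which makes it easy to compare the two modes against
the same server.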