From: Trevor D. (Twylite) <tw...@cr...> - 2012-11-22 10:16:40
Hi,

On 2012/11/21 04:55 PM, Larry McVoy wrote:
>> I tend to see performance drops when I ...
>>
>> (1) abstract a control idiom into a custom control function (using
>> uplevel).
> Got a benchmark that shows this?  Or can you suggest one?
>> (2) refactor a bloated proc into smaller functions (indicating a
>> function call overhead).
> Same question.  I see tcl about the same as ruby, about 4x slower
> than python.  Is that what you see?
>> (3) add more error handling via try/trap or catch.  I've
>> investigated this somewhat and the main culprit is a finally handler
>> - specifically the way the interp state is serialised and restored.
>> I have a prototype that uses a custom Tcl_ObjType to manage the
>> serialisation and it improves the performance somewhat in
>> micro-benchmarks.
> Details?  Example code?

Micro-benchmark set 1: looking at function call overhead for proc, method,
ensemble.

This is a helper proc that I use to ensure numeric parameters are handled
correctly while keeping the code neat & terse.  The performance impact of
this sort of helper, plus asserts and logging, is the main reason I am
interested in macro support for Tcl.
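For anyone reproducing the numbers that follow: each factor is a command's
µs/iter divided by the µs/iter of the baseline (check_inline in set 1).  A
minimal sketch of a harness that computes this, assuming Tcl 8.5 or later;
bench_factors is a hypothetical helper and not part of the original
benchmark code:

```tcl
# Hypothetical helper, not from the original post.
# scripts is a dict of label -> script; baseline is one of the labels.
# Returns a dict of label -> {usec_per_iter factor}.
proc bench_factors {baseline scripts {iters 100000}} {
    set usec [dict create]
    dict for {label script} $scripts {
        # [time] returns "N microseconds per iteration"; keep only N.
        # uplevel runs the timed script in the caller's scope.
        set result [uplevel 1 [list time $script $iters]]
        dict set usec $label [lindex $result 0]
    }
    set base [dict get $usec $baseline]
    set report [dict create]
    dict for {label t} $usec {
        # Factor is relative to the baseline, so the baseline itself is 1.0.
        dict set report $label [list $t [expr {double($t) / $base}]]
    }
    return $report
}
```

With the check_* procs from the code listing defined, it could be called as:

  bench_factors inline {inline {check_inline 10} proc {check_proc 10}} 1000000

Absolute times will differ between machines and Tcl builds; only the
factors are meant to be comparable.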
Results:

  time { check_inline_fast 10 } 1000000   ;# 1.14 µs/iter, factor 0.89
  time { check_inline 10 } 1000000        ;# 1.28 µs/iter, factor 1.00
  time { check_proc 10 } 1000000          ;# 1.69 µs/iter, factor 1.32
  time { check_obj 10 } 1000000           ;# 1.96 µs/iter, factor 1.53
  time { check_ensemble 10 } 1000000      ;# 2.01 µs/iter, factor 1.57

Code:

proc check_int {name val {min {}} {max {}}} {
  if { ([scan $val %lli%c value dummy] != 1)
       || (($min ne {}) && ($value < $min))
       || (($max ne {}) && ($value > $max)) } {
    error "input '$name' is not an integer or is out of range"
  }
  return $value
}

oo::object create check
oo::objdefine check method int {name val {min {}} {max {}}} {
  if { ([scan $val %lli%c value dummy] != 1)
       || (($min ne {}) && ($value < $min))
       || (($max ne {}) && ($value > $max)) } {
    error "input '$name' is not an integer or is out of range"
  }
  return $value
}

namespace eval checkns {
  namespace export int
  namespace ensemble create
  proc int {name val {min {}} {max {}}} {
    if { ([scan $val %lli%c value dummy] != 1)
         || (($min ne {}) && ($value < $min))
         || (($max ne {}) && ($value > $max)) } {
      error "input '$name' is not an integer or is out of range"
    }
    return $value
  }
}

proc check_inline_fast {val} {
  if { ([scan $val %lli%c value dummy] != 1)
       || ($value < 0) || ($value > 99) } {
    error "input 'val' is not an integer or is out of range"
  }
  incr value 100
}

proc check_inline {val} {
  set min 0
  set max 99
  if { ([scan $val %lli%c val dummy] != 1)
       || (($min ne {}) && ($val < $min))
       || (($max ne {}) && ($val > $max)) } {
    error "input 'val' is not an integer or is out of range"
  }
  incr val 100
}

proc check_proc {val} {
  set val [check_int val $val 0 99]
  incr val 100
}

proc check_obj {val} {
  set val [check int val $val 0 99]
  incr val 100
}

proc check_ensemble {val} {
  set val [checkns int val $val 0 99]
  incr val 100
}

-------------------------

Micro-benchmarks set 2: looking at catch, try/finally, and abstracting these
into a control construct.  See bottom for results.
The last time I checked this against a real world system I got a 2%-3%
performance loss from abstracting a few key [catch] handlers.

# loop_* procs have a tight loop that creates an oo::object, does a simple
# math op, then destroys the object.  They use different types of error handling
# to avoid resource leaks.
#  - loop_catch
#  - loop_catch_ret_opts
#  - loop_try_finally
#  - loop_using1
#  - loop_using2

namespace eval ::tcl::destroy {
  namespace export object
  namespace ensemble create
  proc object {v} { $v destroy }
}

proc ::using1 {type varname value body} {
  upvar 1 $varname var
  set var $value
  tailcall try $body finally [list ::tcl::destroy $type $value]
}

proc ::using2 {type varname value body} {
  upvar 1 $varname var
  set var $value
  set rcode [catch { uplevel 1 $body } result opts]
  set fcode [catch { ::tcl::destroy $type $value } fresult fopts]
  if { $fcode != 0 } {
    if { $rcode == 0 } {
      set result $fresult
      set opts $fopts
    } else {
      dict set opts -during $fopts
    }
  }
  dict incr opts -level
  return -options $opts $result
}

proc loop_catch {count icount} {
  set j 1
  for {set i 0} {$i < $count} {incr i} {
    set obj [oo::object new]
    set r [catch {
      for {set k 1} {$k < $icount} {incr k} {
        set j [expr { ($j * $k) % 0xFFFFFFFB }]
      }
    } e]
    ::tcl::destroy object $obj
    if { $r == 1 } { error $e }
  }
  return $j
}

proc loop_catch_ret_opts {count icount} {
  set j 1
  for {set i 0} {$i < $count} {incr i} {
    set obj [oo::object new]
    catch {
      for {set k 1} {$k < $icount} {incr k} {
        set j [expr { ($j * $k) % 0xFFFFFFFB }]
      }
    } e opts
    ::tcl::destroy object $obj
    return -options $opts $e
  }
  return $j
}

proc loop_try_finally {count icount} {
  set j 1
  for {set i 0} {$i < $count} {incr i} {
    set obj [oo::object new]
    try {
      for {set k 1} {$k < $icount} {incr k} {
        set j [expr { ($j * $k) % 0xFFFFFFFB }]
      }
    } finally {
      ::tcl::destroy object $obj
    }
  }
  return $j
}

proc loop_using1 {count icount} {
  set j 1
  for {set i 0} {$i < $count} {incr i} {
    using1 object obj [oo::object new] {
      for {set k 1} {$k < $icount} {incr k} {
        set j [expr { ($j * $k) % 0xFFFFFFFB }]
      }
    }
  }
  return $j
}

proc loop_using2 {count icount} {
  set j 1
  for {set i 0} {$i < $count} {incr i} {
    using2 object obj [oo::object new] {
      for {set k 1} {$k < $icount} {incr k} {
        set j [expr { ($j * $k) % 0xFFFFFFFB }]
      }
    }
  }
  return $j
}

foreach func {loop_catch loop_catch_ret_opts loop_try_finally loop_using1 loop_using2} {
  puts "$func=[$func 10000 10]"
}

set nouter 10000
set ninner 1
time {loop_catch $::nouter $::ninner} 200            ;# 50570.08 µs/iter, factor 1.00
time {loop_try_finally $::nouter $::ninner} 200      ;# 65637.92 µs/iter, factor 1.30
time {loop_catch_ret_opts $::nouter $::ninner} 200   ;# 66612.16 µs/iter, factor 1.32
time {loop_using2 $::nouter $::ninner} 200           ;# 90361.09 µs/iter, factor 1.79
time {loop_using1 $::nouter $::ninner} 200           ;# 91537.21 µs/iter, factor 1.81

set ninner 50
time {loop_catch $::nouter $::ninner} 200            ;# 341384.62 µs/iter, factor 1.00
time {loop_try_finally $::nouter $::ninner} 200      ;# 356503.04 µs/iter, factor 1.04
time {loop_catch_ret_opts $::nouter $::ninner} 200   ;# 357200.53 µs/iter, factor 1.05
time {loop_using2 $::nouter $::ninner} 200           ;# 494404.39 µs/iter, factor 1.45
time {loop_using1 $::nouter $::ninner} 200           ;# 496856.76 µs/iter, factor 1.46

# return -options and try/finally have similar performance.  A small part of
# this slowdown comes from Tcl_GetReturnOptions(), but most comes from
# Tcl_SetReturnOptions().  Using a custom Tcl_ObjType with InterpState as its
# internal rep we can use Tcl_SaveInterpState()/Tcl_RestoreInterpState() and
# recover most of this performance loss, at a slight expense when you actually
# want to query or modify the options dict.

# uplevel and tailcall have similar performance

Regards,
  Twylite