From: SourceForge.net <no...@so...> - 2009-02-23 13:11:00
Bugs item #2613766, was opened at 2009-02-18 14:46
Message generated for change (Comment added) made by dgp
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=2613766&group_id=10894

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: 43. Regexp
Group: development: 8.6b1.1
Status: Open
Resolution: None
>Priority: 5
Private: No
Submitted By: Don Porter (dgp)
Assigned to: Donal K. Fellows (dkf)
Summary: "RE var backtrack case" lost performance

Initial Comment:
Something since 8.6b1 has dropped the performance of the tclbench
test noted in the summary by about a factor of 2.

----------------------------------------------------------------------

>Comment By: Don Porter (dgp)
Date: 2009-02-23 08:10

Message:
the other knob to twist on this one is to look at the handful of
places that operate on the unicode array rep of a value iff the
value already has the "string" objType. Might make sense to broaden
that net to capture untyped values as well. We still avoid
shimmering away intreps of other types that might have value, and
this test at least suggests that the performance gains of working
with unicode are so significant that they win even if you have to
create the unicode rep from scratch as part of the operation.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2009-02-23 00:31

Message:
Assuming that change isn't feasible, the performance loss comes
down to a difference in the objtype of words produced by
TclSubstTokens and by TclCompileTokens.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2009-02-23 00:00

Message:
The simplest way to restore the lost performance is to add the flag
TCL_EVAL_DIRECT to the TclNREvalObjEx() call at line 1869 of file
tclIOUtil.c

That change also solves the failing tests in info.test (2487771).

Unfortunately, it also causes test source-8.1 to fail.

This raises some questions. Is the TCL_EVAL_DIRECT flag supposed to
be a valid thing to pass to TclNREvalObjEx() ? If so, why is it
causing source-8.1 to fail? If not, why is there still code in that
routine to handle that flag? Also, what about all the callers? Do
they all take care to be sure that flag value doesn't flow through?

----------------------------------------------------------------------

Comment By: Jeffrey Hobbs (hobbs)
Date: 2009-02-22 22:10

Message:
I haven't checked the specifics, but note the a.*b.*c RE case was
made slower by my changes to REToGlob at some point. They sped up
the more common a.*b, but more than one .* shows the RE is still
faster. I never got around to the code that counts the .*'s and
falls back to RE with more than 1 present.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2009-02-22 21:58

Message:
The difference is due to the objtype of $::str when that test runs.
In older Tcl, it's the "string" Tcl_ObjType; in the current HEAD
it's an untyped value. Then the performance difference is the
difference between doing a glob match on a unicode array or on a
utf encoded string.

----------------------------------------------------------------------

Comment By: Donal K. Fellows (dkf)
Date: 2009-02-22 09:43

Message:
That would (probably) indicate a shimmering problem. I didn't do
anything with the RE code in that commit.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2009-02-21 23:59

Message:
some bisection blames the "NRE-enable [source]" commit from
2009-01-05 for the loss.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2009-02-18 21:50

Message:
A bit less cryptically this script demo's the slowdown:

proc demo {s e} {
    regexp $e $s
}
demo a[string repeat b 200] a.*b.*c

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=2613766&group_id=10894
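
For illustration only, here is a minimal sketch of the check Jeffrey Hobbs describes in his comment: count the `.*` sequences in a pattern and prefer the glob path only when there is at most one. The names `count_dot_star` and `prefer_glob` are hypothetical and are not taken from the Tcl sources. The sketch does not reproduce what REToGlob actually does, and it makes simplifying assumptions: a backslash escapes the next character, and bracket expressions are not handled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of Tcl: count unescaped ".*" sequences
 * in a regular expression. Assumes a backslash escapes the following
 * character; bracket expressions such as [.*] are NOT treated specially.
 */
static int count_dot_star(const char *pattern)
{
    int count = 0;

    assert(pattern != NULL);
    for (size_t i = 0; pattern[i] != '\0'; i++) {
        if (pattern[i] == '\\') {
            if (pattern[i + 1] != '\0') {
                i++;            /* skip the escaped character */
            }
            continue;
        }
        if (pattern[i] == '.' && pattern[i + 1] == '*') {
            count++;
            i++;                /* consume the '*' as well */
        }
    }
    return count;
}

/*
 * Hypothetical policy from the thread: use the glob conversion only when
 * the pattern has at most one ".*"; otherwise fall back to the RE engine.
 */
static int prefer_glob(const char *pattern)
{
    return count_dot_star(pattern) <= 1;
}
```

Under these assumptions, `prefer_glob("a.*b")` selects the glob path, while `prefer_glob("a.*b.*c")`, the pattern from the demo script, falls back to the RE engine.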