#11 Fix for incorrect pattern length

open-fixed
None
7
2010-10-27
2010-10-26
Sergei Golovan
No

Hi!

I'd like to propose a patch which fixes a segfault when matching an exact string with non-ASCII characters in it (in fact, whenever the UTF-8 byte length is greater than the Unicode character length).

The bug description with a simple script which triggers it can be found at http://groups.google.com/group/comp.lang.tcl/browse_thread/thread/8ea4b666c2f31cac#

Another bug report is in the Ubuntu tracker (it is incomplete, but the segfault occurs in the same code fragment). See https://bugs.launchpad.net/ubuntu/+source/expect/+bug/608343

The issue seems to be in the matching function, where a UTF-8 pattern is used to match a Tcl_UniChar string. The matching itself is fine, but the length of the matched string segment is calculated incorrectly, as the UTF-8 byte length of the pattern. The attached patch switches to a Tcl_UniChar pattern.
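
To make the mismatch concrete, here is a minimal sketch in plain C. It is not taken from the Expect source: the `UniChar` typedef is only a stand-in for Tcl_UniChar (assumed to be a 16-bit unit here), and the sample word and all function names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for Tcl_UniChar; assumed to be a 16-bit unit for this sketch. */
typedef unsigned short UniChar;

/* "naïve" as UTF-8: the 'ï' (U+00EF) is encoded as two bytes, 0xC3 0xAF. */
static const char utf8_pattern[] = "na" "\xC3\xAF" "ve";

/* The same five characters as a Unicode string, one element per character. */
static const UniChar uni_string[] = { 'n', 'a', 0x00EF, 'v', 'e' };

/* Match length as the faulty calculation reports it: UTF-8 bytes. */
size_t match_length_in_bytes(void) {
    return strlen(utf8_pattern);
}

/* Match length that is valid as an offset into uni_string: characters. */
size_t match_length_in_chars(void) {
    return sizeof uni_string / sizeof uni_string[0];
}

/* Number of elements by which the byte-based length overshoots uni_string. */
size_t overrun(void) {
    return match_length_in_bytes() - match_length_in_chars();
}
```

For this sample the byte-based length is 6 while the Unicode string holds only 5 elements, so an offset computed from the byte length points one element past the end of the matched text. Each additional non-ASCII character widens the gap.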

Discussion

  • Sergei Golovan
    2010-10-26

    Fix for incorrect pattern length

     
    Attachments
  • Alternate patch, smaller.

     
    Attachments
  • Thank you for the investigation. I managed to reproduce the problem here too.
    Attached is my alternate patch for the problem.
    Instead of rewriting the match routines (which do not return the length information in question), I simply convert patLength from a byte count to a character count via Tcl_NumUtfChars().
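
The byte-to-character conversion described here can be sketched without linking against Tcl. The function below is a hypothetical stand-in for Tcl_NumUtfChars(), written only to show the idea. It assumes well-formed UTF-8 and is not the code from either patch.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for Tcl_NumUtfChars(): returns the number of
 * characters encoded in the first `length` bytes of `src`.
 *
 * Every UTF-8 character starts with exactly one byte that is NOT a
 * continuation byte (continuation bytes have the bit pattern 10xxxxxx),
 * so counting the non-continuation bytes counts the characters.
 * Assumes `src` is well-formed UTF-8.
 */
int count_utf8_chars(const char *src, int length) {
    int chars = 0;
    for (int i = 0; i < length; i++) {
        if (((unsigned char)src[i] & 0xC0) != 0x80) {
            chars++;
        }
    }
    return chars;
}
```

With the real Tcl API the adjustment would presumably be a single call along the lines of `patLength = Tcl_NumUtfChars(pattern, patLength);`, where the name `pattern` is an assumption about the surrounding code. The trade-off raised later in the thread is that this keeps the patch small but leaves a second pass over the pattern's UTF-8 bytes.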

     
    • priority: 5 --> 9
    • assigned_to: nobody --> andreas_kupries
     
    • status: open --> closed-fixed
     
  • Committed my patch

     
    • status: closed-fixed --> open-fixed
     
  • Reopening after a talk with teo(petuk) on the chat:

    [08:23] teo1 hi! i'm about your patch to expect (i can't comment in closed bugreports in sf.net). my patch eliminates double conversion between utf-8 and unicode. so it makes search a bit more efficient (it also removes one unused counter)
    [08:23] teo1 though it's more lengthy
    [08:24] aku moin. Ok, I'll have another look and meditation on the various conversions.
    [08:33] teo1 your patch is ok to me as well. it is small enough to push it into debian stable even in freeze time

     
    • priority: 9 --> 7