#52 Unicode support in ns_puts with nsd8x

aolserver3_4
closed-fixed
5
2004-06-29
2001-07-20
No

Under nsd8x, in ADP pages, characters output with
'ns_puts' are not converted back from UTF-8 before
being sent.

Example:

<% ns_puts "éčŕ" %>

will not produce the expected "éčŕ" on the HTML
page, but a string like "ŠÄŠÉŠĎ".

Quick hack to fix the problem:

See the attached file. I did not take the time
to make a clean patch: I did not change the
Makefile to keep compilation working with Tcl 7.6.
(As the function doesn't exist there, it requires
conditional compilation, but there is no flag
telling whether we are compiling for 7.6 or 8.x.)

Performance:

The conversion has an impact on performance, as it
removes the 'direct memcpy' of the string, but
it is the only way to get correct output.
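
To make the cost concrete, here is a minimal C sketch of the kind of per-character conversion that replaces a plain memcpy. It is not the attached patch: the function name utf8_to_latin1 is hypothetical, it handles only ISO-8859-1 as the target, and a real fix inside nsd would more likely go through Tcl's own encoding routines. Replacing code points above U+00FF and malformed sequences with '?' is an assumption of this sketch.

```c
#include <stddef.h>

/*
 * Illustration only (hypothetical helper, not AOLserver code).
 * Converts len bytes of UTF-8 in src to ISO-8859-1 in dst and returns
 * the number of bytes written. dst must have room for at least len bytes.
 * Code points above U+00FF and malformed sequences become '?'.
 */
size_t utf8_to_latin1(const char *src, size_t len, char *dst)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t i = 0, out = 0;

    while (i < len) {
        unsigned char c = s[i];

        if (c < 0x80) {
            /* ASCII byte: identical in both encodings. */
            dst[out++] = (char)c;
            i += 1;
        } else if ((c & 0xE0) == 0xC0 && i + 1 < len
                   && (s[i + 1] & 0xC0) == 0x80) {
            /* Two-byte sequence: rebuild the code point from both bytes. */
            unsigned int cp = ((c & 0x1Fu) << 6) | (s[i + 1] & 0x3Fu);
            dst[out++] = (cp <= 0xFF) ? (char)cp : '?';
            i += 2;
        } else {
            /* Longer or malformed sequence: emit '?' and skip the lead
             * byte plus any continuation bytes that follow it. */
            dst[out++] = '?';
            i += 1;
            while (i < len && (s[i] & 0xC0) == 0x80) {
                i++;
            }
        }
    }
    return out;
}
```

With this, the two UTF-8 bytes C3 A9 for "é" come out as the single byte E9. Because input and output lengths differ, the string can no longer be copied through unchanged.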

Discussion

  • laurent riesterer

    Modified adp.c from aolserver/nsd/

     
  • Nobody/Anonymous

    Logged In: NO

    I can add that sending binary data (for example, a
    dynamically generated image) with ns_puts/ns_write also
    produces an "unexpected" set of characters. I believe
    the problem is of the same kind.

     
  • Kriston Rehberg

    Kriston Rehberg - 2002-02-26
    • assigned_to: nobody --> kriston
     
  • Christian Brechbühler

    Logged In: YES
    user_id=660859

    Also ns_log is affected, the same way as ns_write.
    Internally, Tcl strings appear to be robust and can contain
    "weird" characters like NUL (ASCII 0). The length that
    'string length' reports is accurate. But on output the
    strings get UTF-8 encoded, and NUL becomes the infamous C080
    sequence (looks like A-grave C-A-grave), and you get more
    characters.
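
The C0 80 sequence mentioned above comes from the way Tcl 8.1 and later store strings internally: a modified UTF-8 in which U+0000 is written as two bytes, so that the buffer can still be NUL-terminated. A minimal C sketch of that encoding follows. The function name encode_modified_utf8 is hypothetical, it covers only U+0000 through U+FFFF, and it is an illustration rather than Tcl's actual code.

```c
#include <stddef.h>

/*
 * Illustration only (hypothetical helper, not Tcl source).
 * Encodes one code point in the range U+0000..U+FFFF as modified UTF-8:
 * ordinary UTF-8, except that U+0000 becomes the two bytes C0 80.
 * Returns the number of bytes written to dst (1 to 3), or 0 if cp is
 * out of range. dst must have room for at least 3 bytes.
 */
size_t encode_modified_utf8(unsigned int cp, unsigned char *dst)
{
    if (cp == 0) {
        /* Special case: NUL is stored as an overlong two-byte form. */
        dst[0] = 0xC0;
        dst[1] = 0x80;
        return 2;
    }
    if (cp < 0x80) {
        dst[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        dst[0] = (unsigned char)(0xC0 | (cp >> 6));
        dst[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF) {
        dst[0] = (unsigned char)(0xE0 | (cp >> 12));
        dst[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        dst[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    return 0;
}
```

If these internal bytes are written out without converting to the output encoding, a one-character string holding NUL is sent as two bytes, which matches the extra characters described in this comment.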

     
  • Kriston Rehberg

    Kriston Rehberg - 2003-03-03
    • assigned_to: kriston --> nobody
     
  • Mark Page

    Mark Page - 2003-03-13
    • assigned_to: nobody --> mpagenva
     
  • Dossy Shiobara

    Dossy Shiobara - 2004-06-29
    • milestone: --> aolserver3_4
    • status: open --> closed-fixed
     
  • Dossy Shiobara

    Dossy Shiobara - 2004-06-29

    Logged In: YES
    user_id=21885

    Using the simple test case described, AOLserver 4.0.5 now
    returns the correct 3 bytes when output character set is
    iso-8859-1. Closing this bug.

     
