#131 iso-2022-jp Space After URLs Causes Invalid Anchor Link

Status: closed-fixed
Owner: nobody
Labels: None
Priority: 5
Updated: 2006-05-06
Created: 2005-11-14
Creator: tetsuo13
Private: No

SquirrelMail 1.4.5, PHP 4.4.1.

Plain text messages that are encoded in iso-2022-jp
exhibit an error when translating URLs into anchor
links. If the next character following a URL is a
full-width, iso-2022-jp encoded space, the entire line
gets interpreted as the URL.

For example (I hope SF displays this correctly):

http://www.squirrelmail.org/ クリックして

http://www.squirrelmail.org/ クリックして

The first line has a full-width, iso-2022-jp encoded
space after the trailing slash; in SM, the entire line
ends up as the anchor link. The second line has a
regular iso-8859-1 encoded space instead, and its URL
is parsed correctly.

I believe this is an error in functions/url_parser.php
with the variable $url_parser_poss_ends.
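
The behaviour described above can be reproduced with a minimal Python sketch of a terminator-list scan. This is not the code in functions/url_parser.php: the names `POSS_ENDS`, `find_url_end` and `extract_url` are hypothetical, and the terminator list is an assumed small subset chosen only for illustration.

```python
# Hypothetical subset of URL terminators; the real list in
# functions/url_parser.php ($url_parser_poss_ends) is longer.
POSS_ENDS = [" ", "\n", "\r", "<", ">"]


def find_url_end(body, start, poss_ends):
    """Return the index of the earliest terminator at or after `start`."""
    end = len(body)
    for token in poss_ends:
        pos = body.find(token, start)
        if pos != -1 and pos < end:
            end = pos
    return end


def extract_url(body, poss_ends=POSS_ENDS):
    """Extract the first http:// URL from `body`, or None if there is none."""
    start = body.find("http://")
    if start == -1:
        return None
    return body[start:find_url_end(body, start, poss_ends)]


# U+3000 (ideographic space) follows the URL in the first line,
# an ordinary ASCII space in the second.
line_fullwidth = "http://www.squirrelmail.org/\u3000クリックして"
line_ascii = "http://www.squirrelmail.org/ クリックして"

print(extract_url(line_fullwidth))  # the whole line is taken as the URL
print(extract_url(line_ascii))      # only the URL is taken
```

In this sketch, no terminator matches the full-width space, so the scan runs to the end of the line. Adding "\u3000" to the list makes the first line behave like the second.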

Discussion

  • Tomas Kuliavas
    2005-11-15

    Logged In: YES
    user_id=225877

    Are you using a SquirrelMail translation?

    Do you have iso-2022-jp decoding support in SquirrelMail?

     
  • tetsuo13
    2005-11-15

    Logged In: YES
    user_id=1015520

    PHP is correctly compiled to handle Japanese characters and
    SM correctly displays iso-2022-jp encoded messages, yes.

     
  • tetsuo13
    2005-11-15

    Logged In: YES
    user_id=1015520

    Change the browser's character encoding to UTF-8 to see the
    example I provided. SF served UTF-8 when I created this
    tracker item; it seems to serve iso-8859-1 today.

     
  • Tomas Kuliavas
    2005-11-15

    • labels: 102904 -->
     
  • Tomas Kuliavas
    2005-11-15

    Logged In: YES
    user_id=225877

    The first URL contains an ideographic space, U+3000.

    If the string is decoded, it is displayed as &#12288;. If
    the string is not decoded, the bytes depend on the character
    set in use.

    The character is represented as:
    * \xA1\xA1 in euc-jp, gb2312, cp949
    * \xA1\x40 in big5
    * \x21\x21 in iso-2022-jp (jis0208_1983)

    iso-2022-jp is not used by SquirrelMail in message display.
    If you are using the Japanese translation, you see the
    message in the euc-jp character set. If you need a quick
    fix, add "&#12288;" and all other ideographic space
    variations (except "\x21\x21") to $url_parser_poss_ends.
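
The byte sequences listed above can be checked with Python's standard codecs, and the same data gives a candidate list of extra terminators. This is a sketch only: `CODECS`, `forms` and `extra_url_ends` are hypothetical names, and it assumes the decoded form of U+3000 is the numeric entity &#12288; (12288 is 0x3000 in decimal). The iso-2022-jp form is left out of the terminator list because its payload bytes \x21\x21 are the ASCII characters "!!", which could appear in an ordinary URL or sentence.

```python
IDEOGRAPHIC_SPACE = "\u3000"

# Python codec names for the character sets mentioned in the comment.
CODECS = ["euc_jp", "gb2312", "cp949", "big5", "iso2022_jp"]

# Encoded form of U+3000 in each character set.
forms = {codec: IDEOGRAPHIC_SPACE.encode(codec) for codec in CODECS}


def extra_url_ends(encoded_forms):
    """Collect terminators for U+3000: the assumed HTML entity plus every
    distinct encoded form that contains at least one 8-bit byte."""
    ends = [b"&#12288;"]
    for raw in encoded_forms.values():
        if max(raw) >= 0x80 and raw not in ends:
            ends.append(raw)
    return ends


for codec, raw in forms.items():
    print(f"{codec:12} {raw!r}")

print(extra_url_ends(forms))
```

The iso2022_jp result also carries the escape sequences that switch into and out of JIS X 0208, so the two payload bytes appear between them rather than on their own.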

    I am still thinking about full-width punctuation marks and
    about ending the URL at any 8-bit character or HTML entity.
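
One way to picture the broader idea is a Python sketch that ends a URL at the first character outside printable ASCII or at a numeric character reference. It is not the fix that was committed to SquirrelMail: `URL_END` and `extract_url_strict` are hypothetical names, and limiting the entity check to numeric references (so that a bare "&" in a query string is not treated as an end) is an assumption made for this example.

```python
import re

# Assumed end-of-URL rule: any character outside printable ASCII
# (this covers spaces, control characters and every non-ASCII character),
# or a numeric character reference such as &#12288; or &#x3000;.
URL_END = re.compile(r"[^\x21-\x7e]|&#(?:\d+|[xX][0-9a-fA-F]+);")


def extract_url_strict(body):
    """Extract the first http:// URL, stopping at the assumed end rule."""
    start = body.find("http://")
    if start == -1:
        return None
    match = URL_END.search(body, start)
    end = match.start() if match else len(body)
    return body[start:end]


print(extract_url_strict("http://www.squirrelmail.org/\u3000クリックして"))
print(extract_url_strict("http://www.squirrelmail.org/&#12288;click"))
print(extract_url_strict("http://example.org/?a=1&b=2 next"))
```

A rule like this handles full-width punctuation and every encoding of the ideographic space without listing each variant. The cost is that a URL containing raw non-ASCII characters would be cut short.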

     
  • Tomas Kuliavas
    2006-01-23

    • assigned_to: nobody --> tokul
     
  • Tomas Kuliavas
    2006-03-06

    • assigned_to: tokul --> nobody
     
  • Tomas Kuliavas
    2006-03-06

    Logged In: YES
    user_id=225877

    The issue should be fixed in SquirrelMail 1.5.2cvs,
    functions/url_parser.php v.1.63.

     
  • Tomas Kuliavas
    2006-05-06

    Logged In: YES
    user_id=225877

    Fixed in 1.4.7cvs.

     
  • Tomas Kuliavas
    2006-05-06

    • status: open --> closed-fixed