#906 tagmanager fails on reST files in utf-8 (w/ solution)

Milestone: v1.23
Status: closed-fixed
Found in: v1.22
Priority: 5
Updated: 2013-07-23
Created: 2012-10-18
Private: No

Hi!

I'm using Geany 1.22 on Fedora Linux.

The problem:
Section titles in reST-formatted text that contain non-ASCII characters (e.g. German umlauts)
aren't listed in Geany's "Symbols" window: those titles are missing entirely.

Reason:
vStringLength(name) returns the length of the title in bytes, not in characters. In a multi-byte encoding such as UTF-8 or UTF-16 the two can differ.
Comparing that byte count to the length of the title adornment (the line of underline characters) therefore gives the wrong result.

Solution (at least for UTF-8):
Do not count the UTF-8 continuation bytes when computing the title length.

I attached a simple solution by recalculating name_len.
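
For readers without the attachment, the idea can be sketched roughly as follows. This is only an illustration of "skip the continuation bytes", not the attached diff itself; the helper name `utf8_char_count` is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the attached patch:
 * count the characters in the first `len` bytes of `s`, assuming
 * the bytes are UTF-8. Continuation bytes look like 10xxxxxx,
 * so every byte that does NOT match that pattern starts a character. */
static size_t utf8_char_count(const char *s, size_t len)
{
	size_t count = 0;
	size_t i;

	for (i = 0; i < len; i++)
	{
		if (((unsigned char) s[i] & 0xC0) != 0x80)
			count++;
	}
	return count;
}
```

With something like this, name_len would be recalculated from the title's bytes and its byte length (vStringLength(name)) before it is compared to the adornment length. A title such as "Übersicht" is 10 bytes in UTF-8 but counts as 9 characters this way.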

Since I'm not a C coder, that solution might not be perfect, but it works for me.

Christian

Discussion

  • Diff file for tagmanager/rest.c

     
  • Lex Trotman
    2012-10-18

    Since the text being parsed is always UTF-8, your simple solution should work.

     
  • Lex Trotman
    2012-10-18

    • status: open --> open-accepted
     
    • milestone: --> v1.23
    • assigned_to: nobody --> colombanw
    • status: open-accepted --> closed-fixed
     
  • I fixed it slightly differently in Git for better handling of non-UTF-8 input, but the idea is the same. Thanks for reporting & investigating this!

    @elextr: no, input isn't always UTF-8. It's UTF-8 if we feed the parser with a buffer, but not necessarily if we parse the real file (e.g. with `geany -g`).
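
    To illustrate what handling non-UTF-8 input could mean here, below is a rough sketch that falls back to the plain byte count when the title doesn't look like UTF-8. It is an assumption-laden example, not the code committed to Git; `utf8_len_or_fail` and `title_display_len` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical helper: return the number of characters in the first
 * `len` bytes of `s` if they form plausible UTF-8 (lead bytes and
 * continuation bytes only, no overlong/surrogate checks), or -1. */
static long utf8_len_or_fail(const char *s, size_t len)
{
	const unsigned char *p = (const unsigned char *) s;
	size_t i = 0;
	long count = 0;

	while (i < len)
	{
		size_t n, k;

		if (p[i] < 0x80)                n = 1;
		else if ((p[i] & 0xE0) == 0xC0) n = 2;
		else if ((p[i] & 0xF0) == 0xE0) n = 3;
		else if ((p[i] & 0xF8) == 0xF0) n = 4;
		else return -1;                 /* not a valid lead byte */

		if (n > len - i)
			return -1;                  /* sequence cut off at the end */
		for (k = 1; k < n; k++)
		{
			if ((p[i + k] & 0xC0) != 0x80)
				return -1;              /* missing continuation byte */
		}
		i += n;
		count++;
	}
	return count;
}

/* Hypothetical wrapper: character count for UTF-8 input, otherwise
 * the byte count (which is right for single-byte encodings). */
static size_t title_display_len(const char *s, size_t len)
{
	long n = utf8_len_or_fail(s, len);

	return n < 0 ? len : (size_t) n;
}
```

    With this sketch, a Latin-1 encoded title (as might be read straight from a file) keeps its byte length, while a UTF-8 title is measured in characters.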

     
    • labels: Filetypes --> Filetypes, reStructuredText, parser
    • Found in: --> v1.22