#42 Non-Latin headings are not converted into proper anchor links

Group: Default
Status: closed-duplicate
Owner: nobody
Labels: None
Priority: 5
Updated: 2015-03-23
Created: 2014-04-11
Private: No

Problem

If a heading includes non-Latin characters, they are dropped from the generated anchor link.

If a heading contains no Latin characters at all, a generic anchor link such as "id1" is created. Anchor links like this carry no meaning.

Check live examples:

http://searchanise-supporters-guide.readthedocs.org/ru/latest/widget.html#id1 (from heading “Бесплатный”)

http://searchanise-supporters-guide.readthedocs.org/ru/latest/magento.html#searchanise (from heading “После установки расширения Searchanise админка недоступна”)

http://searchanise-supporters-guide.readthedocs.org/ru/latest/admin.html#id2 (from heading “Клиентская панель управления Searchanise”; note that in this case, for some reason, even the word Searchanise is ignored).
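
The first two examples can be reproduced with a rough sketch. This is not Docutils' actual code: `sketch_make_id` and `sketch_anchor` are hypothetical helpers that only assume ids are limited to the ASCII pattern [a-z](-?[a-z0-9]+)* quoted in the discussion, and that a numbered placeholder is used when nothing survives. The sketch does not model the third example; one possible explanation, not confirmed in this ticket, is that the id "searchanise" was already taken on that page.

```python
import re
import unicodedata

# Pattern for a valid id as quoted in the discussion: [a-z](-?[a-z0-9]+)*
VALID_ID = re.compile(r"[a-z](-?[a-z0-9]+)*\Z")


def sketch_make_id(text: str) -> str:
    """Hypothetical approximation of ASCII-only id generation."""
    # Decompose accented letters, then drop every non-ASCII character.
    ascii_text = (
        unicodedata.normalize("NFKD", text.lower())
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse each run of disallowed characters into a single hyphen.
    candidate = re.sub(r"[^a-z0-9]+", "-", ascii_text)
    # An id must start with a letter and must not end with a hyphen.
    return re.sub(r"^[-0-9]+", "", candidate).rstrip("-")


def sketch_anchor(text: str, counter: int) -> str:
    """Assumed fallback: a numbered placeholder when the id is empty."""
    return sketch_make_id(text) or f"id{counter}"


print(sketch_anchor("Бесплатный", 1))  # id1
print(sketch_anchor(
    "После установки расширения Searchanise админка недоступна", 1
))  # searchanise
```

Under these assumptions, every Cyrillic character is discarded before the id is built, which is why only "searchanise" or a bare counter remains.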

Solution

Cyrillic anchor names are valid and should be used (see Wikipedia for example: http://ru.wikipedia.org/wiki/Pantera#.D0.92.D0.BB.D0.B8.D1.8F.D0.BD.D0.B8.D0.B5_.D0.B8_.D1.82.D0.B5.D0.BD.D0.B4.D0.B5.D0.BD.D1.86.D0.B8.D0.B8).
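
For illustration, the anchor in the Wikipedia URL above appears to be the heading's UTF-8 bytes written in hex, with "." in place of "%" and "_" in place of spaces. The sketch below decodes it and shows how a heading could be encoded the same way. The helper names are hypothetical, and the decoder would misread a heading that itself contains a literal "." or "_", so treat it as a demonstration rather than a proposed implementation.

```python
from urllib.parse import quote, unquote

# Fragment copied from the Wikipedia URL in this ticket.
WIKIPEDIA_ANCHOR = (
    ".D0.92.D0.BB.D0.B8.D1.8F.D0.BD.D0.B8.D0.B5_.D0.B8_"
    ".D1.82.D0.B5.D0.BD.D0.B4.D0.B5.D0.BD.D1.86.D0.B8.D0.B8"
)


def decode_dot_anchor(anchor: str) -> str:
    """Turn '.XX' byte escapes back into text and '_' back into spaces."""
    return unquote(anchor.replace(".", "%")).replace("_", " ")


def encode_dot_anchor(heading: str) -> str:
    """Percent-encode the heading as UTF-8, then swap '%' for '.'."""
    return quote(heading.replace(" ", "_"), safe="").replace("%", ".")


heading = decode_dot_anchor(WIKIPEDIA_ANCHOR)
print(heading)  # Влияние и тенденции
print(encode_dot_anchor(heading) == WIKIPEDIA_ANCHOR)  # True
```

The same `quote` call without the final replacement yields an ordinary percent-encoded fragment, which is another way a non-Latin heading name could be carried in a URL.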

Thanks!

Discussion

  • Günter Milde

    Günter Milde - 2015-02-23

    Ticket moved from /p/docutils/bugs/254/

     
  • Günter Milde

    Günter Milde - 2015-02-23

    The behaviour (unhelpful as it is for languages using a non-Latin script) conforms to the Docutils specification.

    Auto-generated targets internally have both a name (which may contain non-Latin characters) and an id (which conforms to the regular expression [a-z](-?[a-z0-9]+)*).
    You can use the name for referencing in the rST document::

        Schöne Grüße
        ============

        Greek Λογος
        ============

        Mit link zu `schöne Grüße`_ und
        `Секций Б`_.

        Секций Б
        ========


    The HTML writer uses the matching id for the anchor.

    I agree that this is unfortunate for links from external sources to places inside the document. However, any change would require careful consideration of the implications for all supported output formats, as well as of backwards compatibility and stability (cf. the closed ticket https://sourceforge.net/p/docutils/bugs/207/).
    One could consider an option to use the suitably encoded "name" in anchors and references if the output format supports it. This is probably best discussed on the docutils-devel mailing list.

     

    Last edit: Günter Milde 2015-03-23
  • Günter Milde

    Günter Milde - 2015-03-23
    • status: open --> closed-duplicate
    • Group: sandbox --> Default
     
  • Günter Milde

    Günter Milde - 2015-03-23

    A similar request has been treated as "wontfix" for stability reasons but may be revived as an option.

    See https://sourceforge.net/p/docutils/feature-requests/41/

     
