
#169 Hyperlink extension rewriting

None
open
nobody
None
5
2020-12-17
2020-08-12
No

This patch implements a new feature that lets the user request that relative hyperlinks with particular extensions have those extensions mapped to different extensions in output files. See the command-line options and the test cases added by this patch for the ways to activate this feature.

This is intended as one possible implementation of the "adaptable file extensions" TODO list item: https://docutils.sourceforge.io/docs/dev/todo.html#adaptable-file-extensions. It takes the implicit approach rather than the explicit approach (which would involve listing the filenames that we want to rewrite in the output). The implicit approach seems sufficient, but I would be happy to discuss transitioning to an explicit one.
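To make the implicit approach concrete, here is a minimal sketch of the idea. It is not the code from the attached patch: the function name `map_relative_extension` and the dictionary argument are hypothetical, and it uses only the Python 3 standard library.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def map_relative_extension(uri, extension_map):
    """Return `uri` with its file extension swapped according to `extension_map`.

    Hypothetical helper for illustration only. `extension_map` maps
    extensions without the leading dot, e.g. {"rst": "html"}.
    """
    parts = urlsplit(uri)
    # Only relative references are rewritten; anything with a scheme or
    # a network location is left untouched.
    if parts.scheme or parts.netloc:
        return uri
    root, ext = posixpath.splitext(parts.path)
    new_ext = extension_map.get(ext.lstrip("."))
    if new_ext is None:
        return uri
    # Query string and fragment are preserved by urlunsplit.
    return urlunsplit(parts._replace(path=root + "." + new_ext))


mapping = {"rst": "html", "jpg": "png"}
relative_result = map_relative_extension("guide/intro.rst#setup", mapping)
absolute_result = map_relative_extension("https://example.org/a.rst", mapping)
```

In this sketch every relative reference with a listed extension is rewritten, whether or not the target file is part of the same build. That is the trade-off of the implicit approach compared with listing filenames explicitly.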

1 Attachments

Discussion

  • John L. Clark

    John L. Clark - 2020-08-24

    Günter Milde provided some feedback on my initial patch: https://sourceforge.net/p/docutils/mailman/message/37091425/. I have attached a new patch that focuses only on the file extension mapping functionality, leaves off the changes to buildhtml (which I will include in a separate ticket), and uses unit tests for testing.

     
  • Günter Milde

    Günter Milde - 2020-08-24

    Thank you for the updated version.

    I suggest a name change (update -> map(ping)) for the setting and in the prototype docstring, too. Actually, a more descriptive name for the config setting would be even better. Check https://docutils.sourceforge.io/docs/user/config.html for context and examples. (This documentation would need to be updated as well in the final patch. Adding a description may actually help find the best name, but may also be deferred to a later point when there is more clarity about the implementation and how it works.)

    What is your proposed syntax for command line use?

    Please do not include the addition of a title template setting in this patch/ticket.

    Target filename extension mapping should work for all output formats where it makes sense (LaTeX, HTML (all versions), ODT, ?). The setting might be best included in a generic writer section. We might have to find out how to achieve this in a clean way.

    Did you test with Python 2.7 and 3.x?

     
    • John L. Clark

      John L. Clark - 2020-08-28

      I suggest a name change (update -> map(ping)) for the setting and in the
      prototype docstring, too.

      Thanks for helping me make sure all of these are consistent. I also appreciate your original suggestion; I was struggling with what wording would convey the meaning most clearly.

      What is your proposed syntax for command line use?

      I propose a command line such as:

      rst2html --map-extension rst html --map-extension jpg png input-filename.rst
      

      For local references and images, this would map the "rst" extension to "html" and the "jpg" extension to "png". This does make it difficult to include as a config setting, as it doesn't look like INI files quite support this kind of mapping configuration. As a side note, do we want to transition away from INI config files and towards something more expressive like JSON?
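A rough sketch of how a repeatable two-argument option like the one proposed above can be collected. It uses the standard library's `argparse` purely for illustration and is not how the patch or the Docutils front end is wired; the `dest` name `extension_map` is an assumption.

```python
import argparse


def build_parser():
    # Stand-alone illustration; not the Docutils option parser.
    parser = argparse.ArgumentParser(prog="rst2html-sketch")
    parser.add_argument(
        "--map-extension",
        nargs=2,                 # each use takes FROM and TO
        action="append",         # repeated uses accumulate
        default=None,
        metavar=("FROM", "TO"),
        dest="extension_map",    # hypothetical name
    )
    parser.add_argument("source")
    return parser


args = build_parser().parse_args(
    ["--map-extension", "rst", "html",
     "--map-extension", "jpg", "png",
     "input-filename.rst"]
)
# argparse yields a list of two-item lists; turn it into a mapping.
extension_map = dict(args.extension_map or [])
```

Each occurrence of the option contributes one pair, so the parsed value is naturally a list of pairs rather than a single string.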

      Please do not include the addition of a title template setting in this patch/ticket.

      Whoops! Looks like I was actually simultaneously working on three separate tiny features. That gave me a great excuse to learn a lot more about git in order to figure out how to move all my work into three separate branches!

      Target filename extension mapping should be working for all output formats
      where it makes sense[.]

      With my latest patch, attached below, I have added it to the LaTeX and ODT Writers.

      Did you test with Python 2.7 and 3.x?

      I did; tox makes it easy. Still, I appreciate your encouragement, as I had only been testing with Python 3.8, and trying 2.7 revealed a library inconsistency that I was able to work around.

      I have attached the new patch here, which addresses the above issues.

       
      • John L. Clark

        John L. Clark - 2020-11-18

        Günter Milde commented on the latest version of the patch on the mailing list. They suggested combining the comma- and colon-separated list syntax that Docutils has historically used to allow specifying this configuration within configuration files, in addition to on the command line. They also requested that I add documentation about this parameter to 'docs/user/config.txt'. The latest version of the patch includes these changes.
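As an illustration of the combined list syntax mentioned above, the sketch below parses a value in which commas separate entries and a colon separates the old extension from the new one. The exact syntax accepted by the patch is not shown in this ticket, so the format `rst:html, jpg:png` and the function name `parse_extension_map` are assumptions.

```python
def parse_extension_map(value):
    """Parse 'old:new, old:new' into a list of (old, new) pairs.

    Hypothetical helper; the input format is an assumption.
    """
    pairs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and blank entries
        old, sep, new = item.partition(":")
        old, new = old.strip(), new.strip()
        if not sep or not old or not new:
            raise ValueError("expected 'old:new', got %r" % item)
        pairs.append((old, new))
    return pairs


parsed = parse_extension_map("rst:html, jpg:png")
```

A single string value of this shape fits in an INI-style configuration file and can also be passed as one command-line argument.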

         
        • Günter Milde

          Günter Milde - 2020-12-17

          Thank you for the new patch. After a closer look at the patch and revisiting the existing suggestions in the TODO list (https://sourceforge.net/p/docutils/code/HEAD/tree/trunk/docutils/docs/dev/todo.txt) and the referenced docutils-users threads, I suggest implementing the extension rewriting either as a transform or in a post-processing step.
          Also, using regular expression replacement would cover more use cases (e.g. changing the domain part of a URL, keeping the extensions of some text files that do not have an HTML equivalent, ...) and be easier to implement and document than a home-grown extraction and mapping procedure.
          Instead of a list of lists, we actually need a list of pairs (2-tuples) or an "ordered dictionary". Maybe the "field list" syntax :<pattern>: <repl> would be more appropriate to specify the arguments for re.sub().
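A minimal sketch of the regular-expression variant suggested above, assuming the rules arrive as an ordered list of (pattern, replacement) pairs that are passed to `re.sub()` in turn. The function name `rewrite_uri`, the example rules, and the `example.org` hosts are all made up for illustration.

```python
import re


def rewrite_uri(uri, rules):
    """Apply each (pattern, repl) pair to `uri` in order using re.sub()."""
    for pattern, repl in rules:
        uri = re.sub(pattern, repl, uri)
    return uri


# Order matters, which is why a list of pairs (or an ordered
# dictionary) is needed rather than an unordered mapping.
rules = [
    # Swap a trailing ".rst", also when a fragment follows it.
    (r"\.rst(?=#|$)", ".html"),
    # Change the domain part of a URL (hypothetical hosts).
    (r"^http://old\.example\.org/", "https://new.example.org/"),
]

rewritten = rewrite_uri("intro.rst#usage", rules)
```

Files whose names match no pattern, such as a plain `notes.txt`, pass through unchanged, which covers the case of text files without an HTML equivalent.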

           

          Last edit: Günter Milde 2020-12-17
