#113 writers/odf_odt: Use only ASCII filenames in ODF packages

None
closed-fixed
nobody
None
5
2023-04-18
2013-08-05
No

The odf_odt writer embeds images in its output files and uses the original filenames as part of the embedded filenames. Since the OpenDocument standard does not specify the filename charset, recode to ASCII (dropping non-representable characters) to be on the safe side.

The actual reason that brought about this patch is an invalid assumption about character sets in docutils.writers.odf_odt.Writer.store_embedded_files(). This has been reported as Debian bug http://bugs.debian.org/714317.
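The recoding described above can be sketched as follows. This is a minimal illustration and not the attached patch: the helper name `ascii_embedded_name` is hypothetical, and the trailing `.decode("ascii")` is an assumption added so that the result stays a text string on Python 3, where `str.encode()` returns bytes.

```python
import os


def ascii_embedded_name(source, image_count):
    """Return an ASCII-only package path for an embedded image.

    Hypothetical helper for illustration; not part of docutils.
    """
    # Keep only the last path component of the image source.
    filename = os.path.split(source)[1]
    # Encode to ASCII, silently dropping characters that cannot be
    # represented, then decode again so the result is text, not bytes.
    safe = filename.encode("ascii", "ignore").decode("ascii")
    # The running counter keeps names unique even if characters were dropped.
    return "Pictures/1%08x_%s" % (image_count, safe)
```

Because dropped characters can make two different source names collide (or leave only the extension), the counter prefix is what keeps the embedded names distinct in this sketch.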

1 Attachments

Discussion

  • engelbert gruber

    The patch does two things. First, it removes
    decode('latin-1').encode('utf-8') from the filename stored in the zipfile.

    This seems good to me, as the referenced filename should not be
    changed and encoding/decoding should have happened in docutils.io anyway.

    APPLIED in revision 7786

     
  • engelbert gruber

    second::

    @@ -2076,7 +2075,8 @@ def visit_image(self, node):
             else:
                 self.image_count += 1
                 filename = os.path.split(source)[1]
    -            destination = 'Pictures/1%08x%s' % (self.image_count, filename, )
    +            destination = 'Pictures/1%08x_%s' % (self.image_count,
    +                filename.encode("ascii", "ignore"))
                 if source.startswith('http:'):
                     try:

    I do not see why the first part removes an encoding step while the second part adds one.

    NOT APPLIED

     
  • Günter Milde

    Günter Milde - 2023-04-18
    • status: open --> closed-fixed
     
  • Günter Milde

    Günter Milde - 2023-04-18

    The related Debian bug was closed in 2013. Thanks for the patch.

     
