The odf_odt writer embeds images in its output files and uses the original filenames as part of the embedded filenames. Since the OpenDocument standard does not specify the filename charset, recode to ASCII (dropping non-representable characters) to be on the safe side.
The actual reason that brought about this patch is an invalid assumption about character sets in docutils.writers.odf_odt.Writer.store_embedded_files(). This has been reported as Debian bug http://bugs.debian.org/714317.
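The ASCII recoding described above can be sketched as follows. This is only an illustration, not the writer's actual code: the helper names `ascii_safe_name` and `embedded_name` are hypothetical, the sketch assumes Python 3, and the extra `.decode("ascii")` step is an addition that the patch itself does not contain.

```python
import os


def ascii_safe_name(source):
    """Return the base name of `source` reduced to ASCII (hypothetical helper)."""
    filename = os.path.split(source)[1]
    # encode(..., "ignore") drops every character that ASCII cannot represent;
    # decode() turns the resulting bytes back into text (assumption: Python 3).
    return filename.encode("ascii", "ignore").decode("ascii")


def embedded_name(image_count, source):
    """Build a name following the 'Pictures/1%08x_%s' pattern from the patch."""
    return "Pictures/1%08x_%s" % (image_count, ascii_safe_name(source))


# "caf\u00e9.png" contains one non-ASCII character (e with acute accent).
print(embedded_name(1, "images/caf\u00e9.png"))  # Pictures/100000001_caf.png
```

On Python 3, leaving out the decode step would make `%s` format the bytes object's repr (`b'caf.png'`) into the name. Note also that dropping characters can leave two different source names with the same ASCII remainder; the counter in the pattern keeps the embedded names distinct.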
The patch does two things. First:
remove the decode('latin-1').encode('utf-8') applied to
the filename stored in the zipfile.
Seems good to me, as the referenced filename should not be
changed, and encoding/decoding should have happened in docutils.io anyway.
APPLIED in revision 7786
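As a hedged illustration of how such a chain can go wrong: if the filename bytes are already UTF-8 (an assumption made for this example; the report only speaks of an invalid charset assumption), decoding them as Latin-1 and re-encoding as UTF-8 encodes every non-ASCII character a second time. A minimal Python 3 sketch:

```python
name = "caf\u00e9.png"  # hypothetical filename with one non-ASCII character

# Assumption: the bytes at hand are already UTF-8 encoded.
utf8_bytes = name.encode("utf-8")

# The removed chain: interpret the bytes as Latin-1, then encode as UTF-8.
twice = utf8_bytes.decode("latin-1").encode("utf-8")

print(utf8_bytes)  # b'caf\xc3\xa9.png'
print(twice)       # b'caf\xc3\x83\xc2\xa9.png'

# The round trip no longer yields the original name.
assert twice.decode("utf-8") != name
```

The chain is harmless only for pure-ASCII names, where both encodings produce the same bytes.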
Second::

    def visit_image(self, node):
    @@ -2076,7 +2075,8 @@
         else:
             self.image_count += 1
             filename = os.path.split(source)[1]
    -        destination = 'Pictures/1%08x%s' % (self.image_count, filename, )
    +        destination = 'Pictures/1%08x_%s' % (self.image_count,
    +            filename.encode("ascii", "ignore"))
             if source.startswith('http:'):
                 try:

I do not see why the first part removes an encode and the second adds one.
NOT APPLIED
The related Debian bug was closed in 2013. Thanks for the patch.