From: Guenter M. <mi...@us...> - 2017-05-18 19:42:58
Dear Dave,

On 2017-05-16, Dave Kuhlman wrote:

> I did the commit.

> And, before I did so, I took Matěj's suggestion about using
> ``locale.normalize(lang), so now you can use::

>     $ rst2odt.py -l cs somedoc.txt somedoc.odt
>     $ rst2odt.py -l es somedoc.txt somedoc.odt

> to get Czech and Spanish (Spain).

Fine.

> And, of course, you can override the region, for example::

>     $ rst2odt.py -l cs-GB somedoc.txt somedoc.odt
>     $ rst2odt.py -l es-mx somedoc.txt somedoc.odt

> to get British English and Mexican Spanish.

I suggest the patch below to allow for BCP 47 tags like

  de-Latf-AT  # second tag is script, region in 3rd position
  de-latf     # second tag is script, no region given
  de-1901     # second tag is variant (here: spelling), no region given

Further changes:

The RuntimeError if locale.normalize fails to find a region tag is replaced
with a Warning: a missing region tag does not prevent export of a functional
output document.

The RuntimeError for empty "self.visitor.language_code" is removed on the
assumption that if a user calls ``--language=""``, this indicates that no
language should be written into the output --- which is exactly what happens
in this case.

From the function "languages.normalize_language_tag()", we only need the
replacement of "_" by "-". This is better done with a string method.

Günter

Dir: /home/milde/Code/Python/docutils-svn/docutils/docutils/writers/odf_odt/
Index: __init__.py
===================================================================
--- __init__.py (Revision 8069)
+++ __init__.py (Arbeitskopie)
@@ -572,38 +572,35 @@
         s1 = self.get_stylesheet()
         # Set default language in document to be generated.
         # Language is specified by the -l/--language command line option.
-        # Allowed values are "ll", "ll-rr" or "ll_rr", where ll is language
-        # and rr is region. If region is omitted, we use
+        # The format is described in BCP 47. If region is omitted, we use
         # local.normalize(ll) to obtain a region.
         language_code = None
         region_code = None
-        if len(self.visitor.normalized_language_code) > 0:
-            language_ids = self.visitor.normalized_language_code[0].split('-')
-            if len(language_ids) == 2:
-                language_code = language_ids[0]
-                region_code = language_ids[1]
-            elif len(language_ids) == 1:
-                language_code = language_ids[0]
+        if self.visitor.language_code:
+            language_ids = self.visitor.language_code.replace('_','-')
+            language_ids = language_ids.split('-')
+            # first tag is primary language tag
+            language_code = language_ids[0].lower()
+            # 2-letter region subtag may follow in 2nd or 3rd position
+            for subtag in language_ids[1:]:
+                if len(subtag) == 2 and subtag.isalpha():
+                    region_code = subtag.upper()
+                    break
+                elif len(subtag) == 1:
+                    break # 1-letter tag is never before valid region tag
+            if region_code is None:
                 rcode = locale.normalize(language_code)
                 rcode = rcode.split('_')
                 if len(rcode) > 1:
-                    rcode = rcode[1]
-                    rcode = rcode.split('.')
-                    if len(rcode) >= 1:
-                        region_code = rcode[0]
+                    rcode = rcode[1].split('.')
+                    region_code = rcode[0]
                 if region_code is None:
-                    raise RuntimeError(
+                    self.document.reporter.warning(
                         'invalid language-region. '
                         'Could not find region with locale.normalize(). '
                         'If language is supplied, then you must specify '
-                        'both lanauge and region (ll-rr). Examples: '
-                        'es-mx (Spanish, Mexico), en-au (English, Australia).')
-            else:
-                raise RuntimeError(
-                    'invalid language-region. '
-                    'Format must be "ll-rr" or "ll_rr", where ll is language '
-                    'and rr is region. '
-                    'See https://en.wikipedia.org/wiki/IETF_language_tag')
+                        'both language and region (ll-RR). Examples: '
+                        'es-MX (Spanish, Mexico), en-AU (English, Australia).')
         # Update the style ElementTree with the language and region.
         # Note that we keep a reference to the modified node because
         # it is possible that ElementTree will throw away the Python
@@ -888,8 +885,6 @@
         self.language = languages.get_language(
             self.language_code, document.reporter)
-        self.normalized_language_code = languages.normalize_language_tag(
-            self.language_code)
         self.format_map = {}
         if self.settings.odf_config_file:
             from ConfigParser import ConfigParser
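
For trying the subtag scan outside the writer, here is a minimal standalone sketch of the same logic. It is not part of the patch: the function names ``split_language_region`` and ``region_from_locale`` are made up for this illustration, and the real code works on ``self.visitor.language_code`` inside the writer instead.

```python
import locale


def split_language_region(tag):
    """Return (language, region) for a BCP 47-style tag; region may be None.

    Hypothetical helper that mirrors the loop in the patch.
    """
    if not tag:
        # Empty tag: no language is written to the output at all.
        return None, None
    subtags = tag.replace('_', '-').split('-')
    # The first subtag is the primary language subtag.
    language = subtags[0].lower()
    region = None
    # A 2-letter region subtag may follow in 2nd or 3rd position.
    for subtag in subtags[1:]:
        if len(subtag) == 2 and subtag.isalpha():
            region = subtag.upper()
            break
        elif len(subtag) == 1:
            # A 1-letter subtag never precedes a valid region subtag.
            break
    return language, region


def region_from_locale(language):
    """Guess a region via locale.normalize(); None if it offers none."""
    parts = locale.normalize(language).split('_')
    if len(parts) > 1:
        # Strip a trailing encoding such as ".ISO8859-1".
        return parts[1].split('.')[0]
    return None


for tag in ('de-Latf-AT', 'de-latf', 'de-1901', 'es_mx'):
    language, region = split_language_region(tag)
    if region is None:
        # Fall back to the locale alias table, as the writer does.
        region = region_from_locale(language)
    print('%-10s -> language=%s region=%s' % (tag, language, region))
```

Keeping the fallback in a separate function makes it easy to see which region comes from the tag itself and which is only guessed from the locale alias table.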