From: Matěj C. <mc...@ce...> - 2017-05-03 21:01:17
|
If I run

    $ rst2odt -l cs kobylky.rst >kobylky.odt

I would expect that kobylky.odt default style would be in the
Czech language. Except when I open it in OOWriter (or LOWriter
these days), it is still in English. The same result with ``-l
cz_CZ`` (not sure which one is expected).

Any thoughts?

Matěj
--
https://matej.ceplovi.cz/blog/, Jabber: mc...@ce...
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8

Of course I'm respectable. I'm old. Politicians, ugly buildings,
and whores all get respectable if they last long enough.
    --John Huston in "Chinatown."
|
From: David G. <go...@py...> - 2017-05-03 21:25:27
|
I assume you mean -l/--language. The option -p (in the subject)
doesn't exist.

On Wed, May 3, 2017 at 3:43 PM, Matěj Cepl <mc...@ce...> wrote:
> If I run
>
>     $ rst2odt -l cs kobylky.rst >kobylky.odt
>
> I would expect that kobylky.odt default style would be in the
> Czech language.

What do you mean by "default style"?

> Except when I open it in OOWriter (or LOWriter
> these days), it is still in English. The same result with ``-l
> cz_CZ`` (not sure which one is expected).

"cs" is expected for Czech. See the supported languages in
docutils/languages/ (each language has a corresponding Python file
there; Czech's is cs.py). If you try an unsupported language, you'll
get a warning that ``language "x" not supported: Docutils-generated
text will be in English.``

> Any thoughts?

Try ``rst2html.py -l cs kobylky.rst >kobylky.html``; do you get what
you expect?

Looks like there's a bug in the ODT Writer. It doesn't handle
admonition labels properly: it's not looking up the label in the
target language. Please open a bug ticket for this.

David Goodger
<http://python.net/~goodger>
|
From: Matěj C. <mc...@ce...> - 2017-05-03 22:03:17
|
On 2017-05-03, 21:24 GMT, David Goodger wrote:
> On Wed, May 3, 2017 at 3:43 PM, Matěj Cepl <mc...@ce...>
> wrote:
>> If I run
>>
>>     $ rst2odt -l cs kobylky.rst >kobylky.odt
>>
>> I would expect that kobylky.odt default style would be in the
>> Czech language.
>
> What do you mean by "default style"?

In LibreOffice Writer press F11 and select the root style
(called “Default Style” in en_US locale), right-click it and
select “Modify”. In the “Font” tab select the language of the
font.

I see that it is probably not what’s meant to be the result of
rst2odt, but it would be awesome if this value was configurable
somehow, because I write in Czech, but my ODT is always
generated in English.

Matěj
|
From: Dave K. <dku...@da...> - 2017-05-10 23:51:11
|
Matěj,

I believe I've fixed the two issues you have reported below.

1. The image now has, I believe, a reasonable width and height.
   But, you will have to see if you agree.

2. The label of the admonition/note has been translated.

A fixed version of ``docutils/writers/odf_odt/__init__.py`` is attached
to a separate message.

You can use "-l" or "--language" and any of:

    cs-CZ
    cs_CZ
    cs-cz
    cs_cz

They all have the same effect, as far as I can tell. But, "-l cs"
does *not* work. I don't know whether it should or not.

You can try "-l cs", but you end up with the language "cs-US", which
I'm guessing you do not want.

And, thanks for your guidance on this.

Dave

On Wed, May 10, 2017 at 12:24:48AM +0200, Matěj Cepl wrote:
> On Tue, 2017-05-09 at 13:28 -0700, Dave Kuhlman wrote:
> [snip]
> Way better, but not there yet.
>
> When running rst2odt --language=cs_CZ v_chramu.rst v_chramu.odt I
> get the attached result with two problems (PDF generated via
> rst2xetex and xelatex is correct on both counts):
>
> 1. image is completely screwed up
>
> 2. .. note:: doesn’t have the Czech label “Poznámka”
>
> However, styles seem to be correct. Also, I am not sure whether
> I should use cs, cs-CZ, or cs_CZ language code.
>
> Thank you for the great work so far.
>
> Matěj
> [snip]

--
Dave Kuhlman
http://www.davekuhlman.org
|
From: Guenter M. <mi...@us...> - 2017-05-18 19:42:58
|
Dear Dave,

On 2017-05-16, Dave Kuhlman wrote:

> I did the commit.

> And, before I did so, I took Matěj's suggestion about using
> ``locale.normalize(lang)``, so now you can use::

>     $ rst2odt.py -l cs somedoc.txt somedoc.odt
>     $ rst2odt.py -l es somedoc.txt somedoc.odt

> to get Czech and Spanish (Spain).

Fine.

> And, of course, you can override the region, for example::

>     $ rst2odt.py -l cs-GB somedoc.txt somedoc.odt
>     $ rst2odt.py -l es-mx somedoc.txt somedoc.odt

> to get British English and Mexican Spanish.

I suggest the patch below to allow for BCP 47 tags like

  de-Latf-AT  # second tag is script, region in 3rd position
  de-latf     # second tag is script, no region given
  de-1901     # second tag is variant (here: spelling), no region given

Further changes:

The RuntimeError if locale.normalize fails to find a region tag is
replaced with a Warning: a missing region tag does not prevent export
of a functional output document.

The RuntimeError for empty "self.visitor.language_code" is removed on the
assumption that if a user calls ``--language=""``, this indicates that no
language should be written into the output --- which is exactly what happens
in this case.

From the function "languages.normalize_language_tag()", we only need the
replacement of "_" by "-". This is better done with a string method.

Günter

Dir: /home/milde/Code/Python/docutils-svn/docutils/docutils/writers/odf_odt/

Index: __init__.py
===================================================================
--- __init__.py (Revision 8069)
+++ __init__.py (Arbeitskopie)
@@ -572,38 +572,35 @@
         s1 = self.get_stylesheet()
         # Set default language in document to be generated.
         # Language is specified by the -l/--language command line option.
-        # Allowed values are "ll", "ll-rr" or "ll_rr", where ll is language
-        # and rr is region. If region is omitted, we use
+        # The format is described in BCP 47. If region is omitted, we use
         # local.normalize(ll) to obtain a region.
         language_code = None
         region_code = None
-        if len(self.visitor.normalized_language_code) > 0:
-            language_ids = self.visitor.normalized_language_code[0].split('-')
-            if len(language_ids) == 2:
-                language_code = language_ids[0]
-                region_code = language_ids[1]
-            elif len(language_ids) == 1:
-                language_code = language_ids[0]
+        if self.visitor.language_code:
+            language_ids = self.visitor.language_code.replace('_','-')
+            language_ids = language_ids.split('-')
+            # first tag is primary language tag
+            language_code = language_ids[0].lower()
+            # 2-letter region subtag may follow in 2nd or 3rd position
+            for subtag in language_ids[1:]:
+                if len(subtag) == 2 and subtag.isalpha():
+                    region_code = subtag.upper()
+                    break
+                elif len(subtag) == 1:
+                    break # 1-letter tag is never before valid region tag
+            if region_code is None:
                 rcode = locale.normalize(language_code)
                 rcode = rcode.split('_')
                 if len(rcode) > 1:
-                    rcode = rcode[1]
-                    rcode = rcode.split('.')
-                    if len(rcode) >= 1:
-                        region_code = rcode[0]
+                    rcode = rcode[1].split('.')
+                    region_code = rcode[0]
                 if region_code is None:
-                    raise RuntimeError(
+                    self.document.reporter.warning(
                         'invalid language-region. '
                         'Could not find region with locale.normalize(). '
                         'If language is supplied, then you must specify '
-                        'both lanauge and region (ll-rr). Examples: '
-                        'es-mx (Spanish, Mexico), en-au (English, Australia).')
-            else:
-                raise RuntimeError(
-                    'invalid language-region. '
-                    'Format must be "ll-rr" or "ll_rr", where ll is language '
-                    'and rr is region. '
-                    'See https://en.wikipedia.org/wiki/IETF_language_tag')
+                        'both language and region (ll-RR). Examples: '
+                        'es-MX (Spanish, Mexico), en-AU (English, Australia).')
         # Update the style ElementTree with the language and region.
         # Note that we keep a reference to the modified node because
         # it is possible that ElementTree will throw away the Python
@@ -888,8 +885,6 @@
         self.language = languages.get_language(
             self.language_code,
             document.reporter)
-        self.normalized_language_code = languages.normalize_language_tag(
-            self.language_code)
         self.format_map = {}
         if self.settings.odf_config_file:
             from ConfigParser import ConfigParser
|
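The subtag handling in the patch above can be tried outside Docutils.
What follows is a minimal standalone sketch, not the writer's actual
code: the function name ``split_language_region`` is hypothetical, and
the locale.normalize() fallback from the patch is left out.

```python
def split_language_region(tag):
    """Return (language, region) for a BCP 47-style tag.

    Hypothetical helper mirroring the loop in the patch above;
    region is None when no 2-letter region subtag is found.
    """
    if not tag:
        return None, None
    subtags = tag.replace('_', '-').split('-')
    language = subtags[0].lower()
    region = None
    for subtag in subtags[1:]:
        if len(subtag) == 2 and subtag.isalpha():
            region = subtag.upper()
            break
        elif len(subtag) == 1:
            break  # 1-letter subtag: no valid region subtag follows it
    return language, region


for tag in ('cs', 'cs_CZ', 'es-mx', 'de-Latf-AT', 'de-latf', 'de-1901'):
    print(tag, split_language_region(tag))
```

Longer subtags such as a script ("Latf") or a variant ("1901") are
skipped, so a region is picked up whether it sits in second or third
position.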
From: Dave K. <dku...@da...> - 2017-05-19 18:22:57
|
Günter,

I did the following:

1. Merged your changes (patch file below) into my local repository.

2. Did a bit of testing. I had an exception when a document
   contained an admonition. I fixed that.

3. Committed these changes to the central repository.

Thank you for your help with this.

Let me know if/when there is something more I can do.

Dave K

On Thu, May 18, 2017 at 07:42:35PM +0000, Guenter Milde wrote:
> [snip]
|
From: Matěj C. <mc...@ce...> - 2017-05-11 10:01:21
|
On 2017-05-10, 23:30 GMT, Dave Kuhlman wrote:
> I believe I've fixed the two issues you have reported below.

You did, I think this is ready for merge. Thank you!

> 1. The image now has, I believe, a reasonable width and height.
>    But, you will have to see if you agree.

Well, it is not centered and it is not squeezed to 75% as
expected, but it is at least unmangled, so any additional
editing can be done in LOWriter.

> 2. The label of the admonition/note has been translated.

Shouldn’t it be centered as well? Otherwise, no problem.

> They all have the same effect, as far as I can tell. But, "-l
> cs" does *not* work. I don't know whether it should or not.
> You can try "-l cs", but you end up with the language "cs-US",
> which I'm guessing you do not want.

I don’t think such a thing exists (although, there is a rather
crazy mix of English and Czech spoken by some Czecho-Americans),
and it is certainly not what I would like to use. So, noted.

Best,

Matěj
|
From: Guenter M. <mi...@us...> - 2017-05-11 12:40:42
|
Dear David,

On 2017-05-10, Dave Kuhlman wrote:

>> However, styles seem to be correct. Also, I am not sure whether
>> I should use cs, cs-CZ, or cs_CZ language code.

> You can use "-l" or "--language" and any of:

>     cs-CZ
>     cs_CZ
>     cs-cz
>     cs_cz

> They all have the same effect, as far as I can tell. But, "-l cs"
> does *not* work. I don't know whether it should or not.

Generally, Docutils follows the "Best Current Practice" `BCP 47`__, which
says that any specifier (like the region subtag here) is optional.

Czech-language mappings for language-dependent features of Docutils
are in the files

  .../docutils/languages/cs.py
  .../docutils/parsers/rst/languages/cs.py

that are used for any of the above language tags.
In some cases, special files for sub-locales exist, e.g., "pt_br.py" for
Brazilian Portuguese.

The HTML writers also insert the original language tag into the document
header.

The XML writer uses the language tag to support "localized" directive names.
It does, however, not insert a language tag into the output document.
This is IMO a bug.

The LaTeX writer translates the tag into a Babel language name (or with
XeTeX/LuaTeX a Polyglossia name), if the language is supported.
Subvarieties (de-1901, de-CH, de-AT, say) are respected, if supported by
Babel/Polyglossia, e.g.

    'en-AU': 'australian',
    'en-CA': 'canadian',
    'en-GB': 'british',
    'en-NZ': 'newzealand',
    'en-US': 'american',

__ http://www.rfc-editor.org/rfc/bcp/bcp47.txt

> You can try "-l cs", but you end up with the language "cs-US", which
> I'm guessing you do not want.

This seems wrong in any case. It should stay "cs" without the optional
region subtag.

There is a function to iterate over "normalized" language tags:

  utils.normalize_language_tag()

    """Return a list of normalized combinations for a `BCP 47` language tag.

    Example:

    >>> normalize_language_tag('de_AT-1901')
    ['de-at-1901', 'de-at', 'de-1901', 'de']
    """

used, e.g., in the latex writer, the smartquotes transform and
``parsers.rst.languages.get_language(language_code)``.

Maybe this can also help in the odt writer?

Günter
|
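To make the docstring example concrete, here is a small standalone
sketch that reproduces the listed combinations with itertools. It is an
illustration only, not the Docutils implementation: the name
``tag_combinations`` is made up for this example, and special cases such
as 1-letter (singleton) subtags are ignored.

```python
from itertools import combinations


def tag_combinations(tag):
    """List a language tag and its shorter variants, most specific first."""
    subtags = tag.lower().replace('_', '-').split('-')
    base, rest = subtags[0], subtags[1:]
    result = []
    # Drop subtags step by step, keeping the remaining ones in order.
    for n in range(len(rest), 0, -1):
        for combo in combinations(rest, n):
            result.append('-'.join((base,) + combo))
    result.append(base)
    return result


print(tag_combinations('de_AT-1901'))
# ['de-at-1901', 'de-at', 'de-1901', 'de']
```

A writer could walk such a list and use the first entry for which it has
language data, so that "cs-CZ" and "cs" both end up at the "cs" files.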
From: Guenter M. <mi...@us...> - 2017-05-12 12:56:45
|
On 2017-05-11, Matěj Cepl wrote:
> On 2017-05-11, 12:40 GMT, Guenter Milde wrote:
>>   .../docutils/languages/cs.py
>>   .../docutils/parsers/rst/languages/cs.py

> Especially the second file looks quite hopeless :(

>>> You can try "-l cs", but you end up with the language
>>> "cs-US", which
>>> I'm guessing you do not want.

>> This seems wrong in any case. It should stay "cs" without the optional
>> region subtag.

> I would think the default to be cs_CS (as in de_DE, fr_FR,
> es_ES, it_IT etc., which are all “safe, default” dialects of the
> main language)

I agree that the "main country" is a reasonable default for cases where
some back-end application *requires* a region subtag.¹

> which is wrong however, because CS was the
> abbreviation for Czechoslovakia and it does not exist anymore.

Note the difference: "cs" is the tag for the Czech language while the region
tag (ISO 3166-1 alpha-2 code) for the Czech Republic is "CZ".

If a region subtag is required, "cs-CZ" should be used for the Czech
language.

Docutils should accept and work with the generic tag "cs" and expand it to
"cs-CZ" in cases where it would be misinterpreted as "cs-US" by some dumb
back-end.

Günter

¹ However, it is not "safe", as *any* region subtag narrows the
  specification:

    2.1. Syntax

    A language tag is composed from a sequence of one or more "subtags",
    each of which refines or narrows the range of language identified by
    the overall tag.

    -- BCP 47

    2.2.4. Region Subtag

    Region subtags are used to indicate linguistic variations associated
    with or appropriate to a specific country, territory, or region.
    Typically, a region subtag is used to indicate variations such as
    regional dialects or usage, or region-specific spelling conventions.

    ... the region subtag MAY be omitted, as when it adds no
    distinguishing value to the tag.

    -- BCP 47
|
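The expansion of a bare language tag to its "main country" can be
sketched with the standard library's locale.normalize(), which maps a
language code to a full locale name through a built-in alias table. This
is only a sketch under that assumption: the helper name
``default_region`` is hypothetical, and the result depends on the alias
table shipped with Python.

```python
import locale


def default_region(language):
    """Guess a region for a bare language code, or return None.

    Hypothetical helper: relies on locale.normalize() returning
    a name of the form 'll_RR.encoding' for known languages.
    """
    normalized = locale.normalize(language)
    if '_' not in normalized:
        return None  # unknown code: normalize() returned it unchanged
    region = normalized.split('_', 1)[1]
    # Strip the encoding ('.ISO8859-2') and any modifier ('@euro').
    for separator in ('.', '@'):
        region = region.split(separator, 1)[0]
    return region.upper()


for code in ('cs', 'es', 'de'):
    print(code, default_region(code))
```

As the footnote above points out, the guessed region narrows the tag, so
it is best used only where the back-end insists on having one.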
From: Dave K. <dku...@da...> - 2017-05-16 21:34:09
|
Günter and Matěj,

I did the commit.

And, before I did so, I took Matěj's suggestion about using
``locale.normalize(lang)``, so now you can use::

    $ rst2odt.py -l cs somedoc.txt somedoc.odt
    $ rst2odt.py -l es somedoc.txt somedoc.odt

to get Czech and Spanish (Spain).

And, of course, you can override the region, for example::

    $ rst2odt.py -l cs-GB somedoc.txt somedoc.odt
    $ rst2odt.py -l es-mx somedoc.txt somedoc.odt

to get British English and Mexican Spanish.

Thank you for that suggestion, Matěj. I did know about that.

Dave K.

On Tue, May 16, 2017 at 12:10:04PM +0000, Guenter Milde wrote:
> On 2017-05-16, Matej Cepl wrote:
> > On 16/05/17 01:42, Dave Kuhlman wrote:
>
> >> 1. Images that use ":width: xx%" are scaled to the line width.
> ...
>
> >> 2. The header/title of an admonition (for example, a note) now
> >>    follows the style of the admonition header
> ...
>
> >> 3. The unit test error that you found is fixed.
> ...
>
> > Wonderful. Thanks a lot for taking care of this.
>
> >> Please let me know if we are getting closer, what else needs fixing,
> >> etc.
>
> > This works for me. As far as I am concerned, this can be released.
>
> I second this, please commit and then it may be really time for 0.14!
>
>
> I have one remaining question:
>
> I understand that LO/OO expects both the language and the region tag of
> a document to be set.
> I also understand that auto-filling the region tag is tricky.
>
> What happens exactly, if in an rst2odt-exported document the language tag is
> set but the region tag is missing:
>
> a) an error/warning
> b) the region is set to "US"
> c) the region is set according to the user's locale?
>
> What should Docutils do in these cases?
>
> >> By default, it was passing in "en" as the language, whereas now it
> >> needs to be "en-US" or to be omitted. I overrode it with "en-US".
>
> The problem here is that, according to our specs, the default for
> "language-code" is
>
>   Default: English ("en"). Options: ``--language, -l``.
>
> Unless absolutely required otherwise, I suggest passing just "en".
>
> If this is not possible, the odt writer could use a different default
> (this needs to be documented).
>
> Rationale: not only Mexican and Castilian Spaniards disagree about the
> default region tag for a language, narrowing "en" to American English
> must at least be documented.
>
> But this may also be sorted out later.
>
>
> Thanks again,
>
> Günter
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Docutils-users mailing list
> Doc...@li...
> https://lists.sourceforge.net/lists/listinfo/docutils-users
>
> Please use "Reply All" to reply to the list.
|
From: Dave K. <dku...@da...> - 2017-05-16 21:54:04
|
> Thank you for that suggestion, Matěj. I did know about that.

Rats. Meant to say I did *not* know.

And, Matěj, thanks for being patient with me and thank you for all
your help with this.

Dave K.

On Tue, May 16, 2017 at 02:29:00PM -0700, Dave Kuhlman wrote:
> [snip]
|
From: Matěj C. <mc...@ce...> - 2017-05-17 01:01:13
|
On 2017-05-16, 21:48 GMT, Dave Kuhlman wrote:
> And, Matěj, thanks for being patient with me and thank you for
> all your help with this.

You did all the real work. Thank you!

Matěj
|
From: David G. <go...@py...> - 2017-05-03 22:12:48
|
On Wed, May 3, 2017 at 5:02 PM, Matěj Cepl <mc...@ce...> wrote:
> On 2017-05-03, 21:24 GMT, David Goodger wrote:
>> On Wed, May 3, 2017 at 3:43 PM, Matěj Cepl <mc...@ce...>
>> wrote:
>>> If I run
>>>
>>>     $ rst2odt -l cs kobylky.rst >kobylky.odt
>>>
>>> I would expect that kobylky.odt default style would be in the
>>> Czech language.
>>
>> What do you mean by "default style"?
>
> In LibreOffice Writer press F11 and select the root style
> (called “Default Style” in en_US locale), right-click it and
> select “Modify”. In the “Font” tab select the language of the
> font.
>
> I see that it is probably not what’s meant to be the result of
> rst2odt, but it would be awesome if this value was configurable
> somehow, because I write in Czech, but my ODT is always
> generated in English.

I don't think that was ever meant to be a feature of the Docutils ODT
Writer. You can add a Feature Request for it though.

David Goodger
<http://python.net/~goodger>
|
From: Dave K. <dku...@da...> - 2017-05-03 23:26:11
|
An ODF document is a Zip file containing files such as content.xml
and styles.xml. I believe I've found the places in styles.xml that
need to be modified in order to change the language in the "Font"
tab in the "Default Style" that you describe.

So, I'll look into it. I need to study it a bit more.

A question -- How can I learn what effect that change has in
LibreOffice Writer? It would be helpful for me to know that so that
I could test the changes I make.

I'm the original implementer of rst2odt by the way.

Dave K

On Wed, May 03, 2017 at 05:12:01PM -0500, David Goodger wrote:
> [snip]
|
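For testing, the language stored in a generated file can also be read
back without opening LibreOffice. The sketch below assumes that language
and region are kept as ``fo:language`` and ``fo:country`` attributes on
``style:text-properties`` elements inside ``style:default-style`` in
styles.xml, as defined by the OpenDocument format. The function name and
the tiny in-memory sample archive are made up for the example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    'office': 'urn:oasis:names:tc:opendocument:xmlns:office:1.0',
    'style': 'urn:oasis:names:tc:opendocument:xmlns:style:1.0',
    'fo': 'urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0',
}


def default_style_languages(odt_file):
    """Return (language, country) pairs found in the default styles."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read('styles.xml'))
    found = []
    for props in root.iterfind(
            './/style:default-style/style:text-properties', NS):
        found.append((props.get('{%s}language' % NS['fo']),
                      props.get('{%s}country' % NS['fo'])))
    return found


# Minimal stand-in for an .odt file, built in memory (not a full document).
SAMPLE = (
    '<office:document-styles xmlns:office="%(office)s" '
    'xmlns:style="%(style)s" xmlns:fo="%(fo)s"><office:styles>'
    '<style:default-style style:family="paragraph">'
    '<style:text-properties fo:language="cs" fo:country="CZ"/>'
    '</style:default-style></office:styles></office:document-styles>' % NS
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('styles.xml', SAMPLE)
buffer.seek(0)

print(default_style_languages(buffer))
```

With a real file, the function can be called with its path instead, for
example ``default_style_languages('kobylky.odt')``.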
From: Matej C. <mc...@ce...> - 2017-05-03 23:17:22
|
On 04/05/17 00:58, Dave Kuhlman wrote:
> A question -- How can I learn what effect that change has in
> LibreOffice Writer? It would be helpful for me to know that so that
> I could test the changes I make.

I thought I wrote it in my previous message:

> In LibreOffice Writer press F11 and select the root style
> (called “Default Style” in en_US locale), right-click it and
> select “Modify”. In the “Font” tab select the language of the
> font.

Isn’t that it?

Matěj
|
From: Matěj C. <mc...@ce...> - 2017-05-11 15:37:57
|
On 2017-05-11, 12:40 GMT, Guenter Milde wrote:
>   .../docutils/languages/cs.py
>   .../docutils/parsers/rst/languages/cs.py

Especially the second file looks quite hopeless :(

>> You can try "-l cs", but you end up with the language
>> "cs-US", which
>> I'm guessing you do not want.
>
> This seems wrong in any case. It should stay "cs" without the optional
> region subtag.

I would think the default to be cs_CS (as in de_DE, fr_FR,
es_ES, it_IT etc., which are all “safe, default” dialects of the
main language), which is wrong however, because CS was the
abbreviation for Czechoslovakia and it does not exist anymore.

Best,

Matěj
|
From: Guenter M. <mi...@us...> - 2017-05-21 10:23:01
|
Dear Dave,

On 2017-05-19, Dave Kuhlman wrote:

> I did the following:

> 1. Merged your changes (patch file below) into my local repository.

> 2. Did a bit of testing. I had an exception when a document
>    contained an admonition. I fixed that.

> 3. Committed these changes to the central repository.

> Thank you for your help with this.

> Let me know if/when there is something more I can do.

I have one more patch that makes the code failsafe for Python
implementations missing the locale module (e.g. Jython).

If you could add a short summary of the changes to the HISTORY.txt file,
we should be ready for release.

Thanks,

Günter

Index: __init__.py
===================================================================
--- __init__.py (revision 8070)
+++ __init__.py (working copy)
@@ -24,7 +24,10 @@
 import copy
 import urllib2
 import docutils
-import locale
+try:
+    import locale   # module missing in Jython
+except ImportError:
+    pass
 from docutils import frontend, nodes, utils, writers, languages
 from docutils.readers import standalone
 from docutils.transforms import references
@@ -589,7 +592,10 @@
                 elif len(subtag) == 1:
                     break # 1-letter tag is never before valid region tag
             if region_code is None:
-                rcode = locale.normalize(language_code)
+                try:
+                    rcode = locale.normalize(language_code)
+                except NameError:
+                    rcode = language_code
                 rcode = rcode.split('_')
                 if len(rcode) > 1:
                     rcode = rcode[1].split('.')
@@ -596,11 +602,10 @@
                     region_code = rcode[0]
                 if region_code is None:
                     self.document.reporter.warning(
-                        'invalid language-region. '
-                        'Could not find region with locale.normalize(). '
-                        'If language is supplied, then you must specify '
-                        'both language and region (ll-RR). Examples: '
-                        'es-MX (Spanish, Mexico), en-AU (English, Australia).')
+                        'invalid language-region.\n'
+                        ' Could not find region with locale.normalize().\n'
+                        ' Please specify both language and region (ll-RR).\n'
+                        ' Examples: es-MX (Spanish, Mexico), en-AU (English, Australia).')
             # Update the style ElementTree with the language and region.
             # Note that we keep a reference to the modified node because
             # it is possible that ElementTree will throw away the Python
|
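The guarded import and the fallback in the patch above can be sketched as a standalone function. This is only an illustration under stated assumptions, not the code committed to Docutils: `guess_region` is a hypothetical name, and a module-level `locale = None` sentinel stands in for the `NameError` handling used in the patch.

```python
try:
    import locale  # may be missing on some implementations (e.g. Jython)
except ImportError:
    locale = None  # sentinel used by this sketch instead of catching NameError


def guess_region(language_code):
    """Return a region subtag for a bare language code, or None.

    Hypothetical helper: mirrors the patch logic, which asks
    locale.normalize() for a name such as 'cs_CZ.ISO8859-2' and keeps
    the part between '_' and '.'.
    """
    if locale is None:
        normalized = language_code
    else:
        normalized = locale.normalize(language_code)
    parts = normalized.split('_')
    if len(parts) > 1:
        # Drop the encoding suffix, e.g. 'CZ.ISO8859-2' -> 'CZ'
        return parts[1].split('.')[0]
    return None  # caller would emit the "invalid language-region" warning


if __name__ == '__main__':
    for code in ('cs', 'en', 'qqq'):
        print(code, guess_region(code))
```

A code that `locale.normalize()` does not know is returned unchanged, so the function yields `None` for it, which corresponds to the case where the writer issues its warning.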
From: Dave K. <dku...@da...> - 2017-05-22 23:34:14
|
Günter,

Here is an update -- I did the following:

1. Merged your change for Jython import of locale module.

2. Added notes to HISTORY.txt.

3. Committed changes to the central repository.

Dave K

--
Dave Kuhlman
http://www.davekuhlman.org
|
From: Dave K. <dku...@da...> - 2017-05-12 22:56:11
|
Matěj,

I've got more fixes for our issues. You will need to evaluate them yourself.

1. For language and region/country, now, with ODT writer, if you use the -l/--language command line option, then you must specify both language and country, for example, es-mx (Mexican Spanish), en-au (Australian English), and of course cs-cz. This is because the ODT writer needs both, if it changes the language.

   People have been fighting over whose language and/or religion is superior for hundreds of years. I say, let them keep fighting; I'm not going to step in the middle of that. If you go to UAM (Universidad Autónoma Madrid), I'm sure they will tell you that the default should be es-es; if you go to UNAM (Universidad Nacional Autónoma de México) they will tell you the default should be es-mx.

   In a previous email, Günter Milde says:

       Docutils should accept and work with the generic tag "cs" and
       expand it to "cs-CZ" in cases where it would be misinterpreted
       as "cs-US" by some dumb back-end.

   But, I don't know how to do that in the general case.

2. Image size -- I've reworked this quite severely. The ODT writer now tries to achieve a little consistency by converting all units to centimeters. For your image, you might want to start with something like the following and adjust from there::

       .. image:: images/frantisek_xaversky.jpg
           :scale: 15%
           :align: center
           :alt: Svatý František Xaverský v Mikulášském chrámu

   The reason for the small scale factor is that your image is quite large. The size returned by PIL (the Python Imaging Library) for your image (frantisek_xaversky.jpg) is 2448 x 3264. That's pixels, I suppose. The conversion factor that I found at http://www.unitconversion.org/unit_converter/typography-ex.html is 1 px = 0.0264 cm. So, for example, 2448 * 0.0264 * .15 = 9.69 cm, which is a reasonable starting width for your image on a page, and you can adjust from there.

I've attached a new version of docutils/writers/odf_odt/__init__.py to a separate message.

Please let me know what you think.

Dave K.

--
Dave Kuhlman
http://www.davekuhlman.org
|
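The arithmetic in the message above can be checked with a few lines of Python. This is a minimal sketch, not the ODT writer's code: `scaled_size_cm` is a hypothetical helper, and the only figures taken from the message are the 0.0264 cm/px factor, the 2448 x 3264 pixel size, and the 15% scale.

```python
PX_TO_CM = 0.0264  # conversion factor quoted in the message


def scaled_size_cm(width_px, height_px, scale_percent):
    """Convert a pixel size to centimeters and apply a :scale: percentage.

    Hypothetical helper for illustration only.
    """
    factor = scale_percent / 100.0
    return (width_px * PX_TO_CM * factor,
            height_px * PX_TO_CM * factor)


if __name__ == '__main__':
    width_cm, height_cm = scaled_size_cm(2448, 3264, 15)
    print('%.2f cm x %.2f cm' % (width_cm, height_cm))
```

For the image discussed here, this gives a width of about 9.69 cm, matching the figure in the message, and a height of about 12.93 cm.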
From: Dave K. <dku...@da...> - 2017-05-04 00:26:10
|
On Thu, May 04, 2017 at 01:17:13AM +0200, Matej Cepl wrote:
> On 04/05/17 00:58, Dave Kuhlman wrote:
> > A question -- How can I learn what effect that change has in
> > LibreOffice Writer? It would be helpful for me to know that so that
> > I could test the changes I make.
>
> I thought I wrote it in my previous message:
>
> > In LibreOffice Writer press F11 and select the root style
> > (called “Default Style” in en_US locale), click other button of
> > the mouse and select “Modify”. In the “Font” tab select language
> > of the font.
>
> Isn’t that it?

Matěj,

Yes. That was helpful. I'm wondering if you will expect other kinds of styles to be changed also. For example, under the "Tools" menu there are "Language" settings. And, in the "Styles and Formatting" window there are also character styles (when you click on the character icon). The test I made does seem to change those also, which I'm guessing you want to happen.

You've given me enough to start work with. Thanks. I'll report back when I've made a little progress and hopefully have something you can test.

Dave K

--
Dave Kuhlman
http://www.davekuhlman.org
|
From: Matěj C. <mc...@ce...> - 2017-05-04 09:15:52
|
On Wed, 2017-05-03 at 16:58 -0700, Dave Kuhlman wrote:
> Yes. That was helpful. I'm wondering if you will expect other
> kinds of styles to be changed also. For example, under the
> "Tools" menu there are "Language" settings.

That’s just direct formatting overriding the styles. Don’t touch it.

> And, in the "Styles and Formatting" window there are also
> character styles (when you click on the character icon). The
> test I made does seem to change those also, which I'm guessing
> you want to happen.

I believe character styles only cover changes on top of the underlying paragraph style, so these follow "Default Style" as well (or not: for example, "Internet Link" has its language set to "None", which is sensible). At least I see here that the footnote reference is in Czech when I create a footnote, which is what I want.

In short, I believe that changing "Default Style" (or whatever it is called in other locales; I guess the displayed name does not matter inside the ODT file itself, does it?) is enough for now. We may discover later that we have done something wrong, but we will deal with it then, I guess.

I have asked my colleague (I work for Red Hat) about this, and his conclusion was that it should be enough to set the language for style:default-style style:family="paragraph" and style:default-style style:family="graphic" in a fresh styles.xml.

> You've given me enough to start work with. Thanks. I'll
> report back when I've made a little progress and hopefully have
> something you can test.

BTW, I have not told you how grateful I am for having such a well-working rst2odt script. Thank you!

Matěj

--
http://matej.ceplovi.cz/blog/, Jabber: mcepl<at>ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8

Blessed be the God […] who comforts us in all our affliction, so that we may be able to comfort those who are in any affliction, with the comfort with which we ourselves are comforted by God.
-- 2. Corinthians 1:3-4
|
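The suggestion above, setting the language on the paragraph and graphic default styles in styles.xml, can be sketched with the standard library's ElementTree. This is a sketch under stated assumptions rather than the ODT writer's implementation: `set_default_language` and `SAMPLE_STYLES` are hypothetical, the sample is a heavily trimmed stand-in for a real styles.xml, and the language is assumed to live in the `fo:language` and `fo:country` attributes of `style:text-properties`.

```python
import xml.etree.ElementTree as ET

NS = {
    'office': 'urn:oasis:names:tc:opendocument:xmlns:office:1.0',
    'style': 'urn:oasis:names:tc:opendocument:xmlns:style:1.0',
    'fo': 'urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0',
}

# Trimmed, hypothetical stand-in for the styles.xml inside an .odt file.
SAMPLE_STYLES = """\
<office:document-styles
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0">
  <office:styles>
    <style:default-style style:family="paragraph">
      <style:text-properties fo:language="en" fo:country="US"/>
    </style:default-style>
    <style:default-style style:family="graphic">
      <style:text-properties fo:language="en" fo:country="US"/>
    </style:default-style>
  </office:styles>
</office:document-styles>
"""


def set_default_language(root, language, country):
    """Set language/country on paragraph and graphic default styles.

    Returns the number of default styles that were updated.
    """
    updated = 0
    for default_style in root.iter('{%s}default-style' % NS['style']):
        family = default_style.get('{%s}family' % NS['style'])
        if family not in ('paragraph', 'graphic'):
            continue
        props = default_style.find('style:text-properties', NS)
        if props is None:
            # Create the properties element if the style has none yet.
            props = ET.SubElement(
                default_style, '{%s}text-properties' % NS['style'])
        props.set('{%s}language' % NS['fo'], language)
        props.set('{%s}country' % NS['fo'], country)
        updated += 1
    return updated


if __name__ == '__main__':
    root = ET.fromstring(SAMPLE_STYLES)
    print(set_default_language(root, 'cs', 'CZ'), 'default styles updated')
```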
From: Matej C. <mc...@ce...> - 2017-05-13 08:25:45
|
On 13/05/17 00:33, Dave Kuhlman wrote:
> The reason for the small scale factor is that your image is quite
> large. The size returned by PIL (the Python Image Library)
> for your image (frantisek_xaversky.jpg) is 2448 X 3264. That's
> pixels, I suppose. The conversion factor that I found at
> http://www.unitconversion.org/unit_converter/typography-ex.html
> is 1 px = 0.0264 cm. So, for example, 2448 * 0.0264 * .15 =
> 9.69 cm, which is a reasonable starting width for your image on a
> page, and you can adjust from there.

I think we have a misunderstanding here. http://docutils.sourceforge.net/docs/ref/rst/directives.html#image says that the ``width`` option takes a “length or percentage of the current line width”. It seems to me you think that the percentage is a scaling factor. It isn’t. rst2xetex handles it perfectly well (see the PDF I attached to one of my earlier emails to you).

Best,

Matěj
|
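The difference between the two readings can be made concrete: a ``:width:`` percentage is taken relative to the line width, not to the image's native size. The following is a minimal sketch; `width_percent_to_cm` is a hypothetical helper, and the 17 cm line width is an assumed example value (roughly A4 with 2 cm margins), not something stated in the thread.

```python
def width_percent_to_cm(percent, line_width_cm, width_px, height_px):
    """Size an image to a percentage of the line width.

    The pixel dimensions are used only to preserve the aspect ratio;
    they do not influence the resulting width.
    """
    width_cm = line_width_cm * percent / 100.0
    height_cm = width_cm * height_px / float(width_px)
    return width_cm, height_cm


if __name__ == '__main__':
    # Assumed line width of 17 cm; image size as reported in the thread.
    print(width_percent_to_cm(50, 17.0, 2448, 3264))
```

With these assumed numbers, ``:width: 50%`` yields an 8.5 cm wide image regardless of how many pixels the file contains, whereas ``:scale: 50%`` would depend on the pixel size and the px-to-cm conversion.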
From: Dave K. <dku...@da...> - 2017-05-16 00:06:12
|
Matěj,

Here is what I believe I fixed:

1. Images that use ":width: xx%" are scaled to the line width. Note that if, after generating the .odt file, you edit it with LibreOffice Writer, and you use menu item Format/Page to modify the page width, the image size does *not* automatically adjust. Is this a needed feature? If it is, you could try to right-click on the image and edit its properties. But, none of that seemed satisfactory to me.

2. The header/title of an admonition (for example, a note) now follows the style of the admonition header (for example, rststyle-admon-note-hdr, rststyle-admon-warning-hdr, etc). I was mistakenly using the admonition title after it had been translated (for example, to "Poznámka!") to generate the style. And, there is no style named "rststyle-admon-Poznámka!-hdr".

3. The unit test error that you found is fixed. By default, it was passing in "en" as the language, whereas now it needs to be "en-US" or to be omitted. I overrode it with "en-US".

The modified files are attached to a separate message.

Wow. This is turning out to be more fun than I thought it would.

Please let me know if we are getting closer, what else needs fixing, etc.

Dave

--
Dave Kuhlman
http://www.davekuhlman.org
|
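The admonition fix described in item 2 comes down to building the style name from the untranslated admonition type and using the translated label only for the visible title. A minimal sketch, assuming a hypothetical `admonition_header` helper and a sample label table (only the "Poznámka!" label and the style-name pattern come from the message):

```python
# -*- coding: utf-8 -*-

# Sample label table for illustration; only the 'note' entry is taken
# from the message above.
SAMPLE_LABELS_CS = {'note': u'Poznámka!'}


def admonition_header(admonition_type, labels):
    """Return (style_name, title) for an admonition header.

    The style name is derived from the language-independent type,
    so it stays valid whatever language the title is shown in.
    """
    style_name = 'rststyle-admon-%s-hdr' % admonition_type
    title = labels.get(admonition_type, admonition_type.capitalize())
    return style_name, title


if __name__ == '__main__':
    print(admonition_header('note', SAMPLE_LABELS_CS))
    print(admonition_header('warning', SAMPLE_LABELS_CS))
```

Keeping the two lookups separate avoids generating names such as "rststyle-admon-Poznámka!-hdr", for which no style exists.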
From: Matej C. <mc...@ce...> - 2017-05-16 07:55:29
|
On 16/05/17 01:42, Dave Kuhlman wrote:
> 1. Images that use ":width: xx%" are scaled to the line width. Note
>    that if, after generating the .odt file, you edit it with
>    LibreOffice Writer, and you use menu item Format/Page to modify
>    the page width, the image size does *not* automatically adjust.
>    Is this a needed feature? If it is, you could try to right-click
>    on the image and edit its properties. But, none of that seemed
>    satisfactory to me.

I don’t think it is necessary. I believe that the primary use of rst2odt is for sharing the document for review (change tracking and notes work really well with LOWriter) or for providing camera-ready copies of documents for those who prefer a Word-like format. In both cases I just need to be sure that images look the same as when run through the other rst2* programs. If somebody wants to take the .odt for serious editing, she can very well make sure the images work for her.

> 2. The header/title of an admonition (for example, a note) now
>    follows the style of the admonition header (for example,
>    rststyle-admon-note-hdr, rststyle-admon-warning-hdr, etc). I was
>    mistakenly using the admonition title after it had been
>    translated (for example, to "Poznámka!") to generate the style.
>    And, there is no style named "rststyle-admon-Poznámka!-hdr".

Perfect. Thanks.

> 3. The unit test error that you found is fixed. By default, it was
>    passing in "en" as the language, whereas now it needs to be
>    "en-US" or to be omitted. I overrode it with "en-US".

Awesome.

> Wow. This is turning out to be more fun than I thought it would.
>
> Please let me know if we are getting closer, what else needs fixing,
> etc.

This works for me. As far as I am concerned, this can be released.

Thank you very much,

Matěj
|
From: Matěj C. <mc...@ce...> - 2017-05-16 15:01:12
|
On 2017-05-16, 12:10 GMT, Guenter Milde wrote:
> Unless absolutely required otherwise, I suggest passing just
> "en".
>
> If this is not possible, the odt writer could use a different default
> (this needs to be documented).
>
> Rationale: not only Mexican and Castilian Spaniards disagree about the
> default region tag for a language, narrowing "en" to American English
> must at least be documented.

What about locale.normalize() (https://is.gd/dZe2bZ)?

    In [1]: import locale

    In [2]: locale.normalize('en')
    Out[2]: 'en_US.ISO8859-1'

    In [3]: locale.normalize('cs')
    Out[3]: 'cs_CZ.ISO8859-2'

    In [4]: locale.normalize('es')
    Out[4]: 'es_ES.ISO8859-1'

    In [5]: locale.normalize('fr')
    Out[5]: 'fr_FR.ISO8859-1'

(The encoding is bad; I would prefer UTF-8 all the time, but that’s another point.)

Best,

Matěj
|