From: <mi...@us...> - 2014-03-20 10:51:13
Revision: 7747
          http://sourceforge.net/p/docutils/code/7747
Author:   milde
Date:     2014-03-20 10:51:10 +0000 (Thu, 20 Mar 2014)
Log Message:
-----------
Address [ 249 ] and [ 250 ]. CSV table fails on python 2 with unicode.
While the problem of Unicode characters in the options under Python 2
cannot be solved with reasonable effort, the patch improves error
reporting and documents the limitation.

Modified Paths:
--------------
    trunk/docutils/HISTORY.txt
    trunk/docutils/docs/ref/rst/directives.txt
    trunk/docutils/docutils/parsers/rst/directives/tables.py

Modified: trunk/docutils/HISTORY.txt
===================================================================
--- trunk/docutils/HISTORY.txt 2014-02-28 14:28:07 UTC (rev 7746)
+++ trunk/docutils/HISTORY.txt 2014-03-20 10:51:10 UTC (rev 7747)
@@ -16,6 +16,16 @@
 Changes Since 0.11
 ==================
 
+* docs/ref/rst/directives.txt
+
+  - Update "math" and "csv-table" descriptions.
+
+* docutils/parsers/rst/states.py
+
+  - Improve error report when a non-ASCII character is specified as
+    delimiter, quote or escape character under Python 2.
+    Fixes [ 249 ] and [ 250 ].
+
 * docutils/writers/latex2e/__init__.py
 
   - Fix [ 239 ] Latex writer glues paragraphs with figure floats.

Modified: trunk/docutils/docs/ref/rst/directives.txt
===================================================================
--- trunk/docutils/docs/ref/rst/directives.txt 2014-02-28 14:28:07 UTC (rev 7746)
+++ trunk/docutils/docs/ref/rst/directives.txt 2014-03-20 10:51:10 UTC (rev 7747)
@@ -486,7 +486,7 @@
 tokens are stored in nested `inline elements`_ with class arguments
 according to their syntactic category. The actual highlighting requires
 a style-sheet (e.g. one `generated by Pygments`__, see the
-`sandbox/stylesheets`_ for examples).
+`sandbox/stylesheets`__ for examples).
 
 The parsing can be turned off with the syntax_highlight_ configuration
 setting and command line option or by specifying the language as `:class:`_
@@ -515,11 +515,11 @@
 Example::
   The content of the following directive ::
 
-   .. code:: python
+    .. code:: python
 
-     def my_function():
-         "just a test"
-         print 8/2
+      def my_function():
+          "just a test"
+          print 8/2
 
   is parsed and marked up as Python source code.
 
@@ -538,23 +538,28 @@
 
 (New in Docutils 0.8)
 
-The "math" directive inserts block(s) with mathematical content
+The "math" directive inserts blocks with mathematical content
 (display formulas, equations) into the document. The input format is
-*LaTeX math syntax* (see, e.g. the `Short Math Guide`_) with support
-for Unicode symbols, for example::
+*LaTeX math syntax*\ [#math-syntax]_ with support for Unicode
+symbols, for example::
 
   .. math::
 
     α_t(i) = P(O_1, O_2, … O_t, q_t = S_i λ)
 
+Support is limited to a subset of *LaTeX math* by the conversion
+required for many output formats. For HTML, the the `math_output`_
+configuration setting (or the corresponding ``--math-output``
+command line option) select between alternative output formats with
+different subsets of supported elements. If a writer does not
+support math typesetting at all, the content is inserted verbatim.
+
+.. [#math-syntax] The supported LaTeX commands include AMS extensions
+   (see, e.g., the `Short Math Guide`_).
+
+
 For inline math, use the `"math" role`_.
 
-Support for math may be limited by the output format. If a writer does
-not support math typesetting, the content is inserted verbatim.
-For HTML, the output format can be set with the `math_output`_
-configuration setting (or the corresponding ``--math-output`` command
-line option).
-
 .. _Short Math Guide: ftp://ftp.ams.org/ams/doc/amsmath/short-math-guide.pdf
 .. _"math" role: roles.html#math
 .. _math_output: ../../user/config.html#math-output
@@ -803,8 +808,6 @@
 
 Working limitations:
 
-* Whitespace delimiters are supported only for external CSV files.
-
 * There is no support for checking that the number of columns in each
   row is the same. However, this directive supports CSV generators
   that do not insert "empty" entries at the end of short rows, by
@@ -812,6 +815,15 @@
 
 .. Add "strict" option to verify input?
 
+.. [#whitespace-delim] Whitespace delimiters are supported only for external
+   CSV files.
+
+.. [#ASCII-char] With Python 2, the valuess for the ``delimiter``,
+   ``quote``, and ``escape`` options must be ASCII characters. (The csv
+   module does not support Unicode and all non-ASCII characters are
+   encoded as multi-byte utf-8 string). This limitation does not exist
+   under Python 3.
+
 The following options are recognized:
 
 ``widths`` : integer [, integer...]
@@ -841,28 +853,29 @@
     The text encoding of the external CSV data (file or URL).
     Defaults to the document's encoding (if specified).
 
-``delim`` : char | "tab" | "space"
-    A one-character string used to separate fields. Defaults to ``,``
-    (comma). May be specified as a Unicode code point; see the
-    unicode_ directive for syntax details.
+``delim`` : char | "tab" | "space" [#whitespace-delim]_
+    A one-character string\ [#ASCII-char]_ used to separate fields.
+    Defaults to ``,`` (comma). May be specified as a Unicode code
+    point; see the unicode_ directive for syntax details.
 
 ``quote`` : char
-    A one-character string used to quote elements containing the
-    delimiter or which start with the quote character. Defaults to
-    ``"`` (quote). May be specified as a Unicode code point; see the
-    unicode_ directive for syntax details.
+    A one-character string\ [#ASCII-char]_ used to quote elements
+    containing the delimiter or which start with the quote
+    character. Defaults to ``"`` (quote). May be specified as a
+    Unicode code point; see the unicode_ directive for syntax
+    details.
 
 ``keepspace`` : flag
     Treat whitespace immediately following the delimiter as
     significant. The default is to ignore such whitespace.
 
 ``escape`` : char
-    A one-character string used to escape the delimiter or quote
-    characters. May be specified as a Unicode code point; see the
-    unicode_ directive for syntax details. Used when the delimiter is
-    used in an unquoted field, or when quote characters are used
-    within a field. The default is to double-up the character,
-    e.g. "He said, ""Hi!"""
+    A one-character\ [#ASCII-char]_ string used to escape the
+    delimiter or quote characters. May be specified as a Unicode
+    code point; see the unicode_ directive for syntax details. Used
+    when the delimiter is used in an unquoted field, or when quote
+    characters are used within a field. The default is to double-up
+    the character, e.g. "He said, ""Hi!"""
 
     .. Add another possible value, "double", to explicitly indicate
        the default case?

Modified: trunk/docutils/docutils/parsers/rst/directives/tables.py
===================================================================
--- trunk/docutils/docutils/parsers/rst/directives/tables.py 2014-02-28 14:28:07 UTC (rev 7746)
+++ trunk/docutils/docutils/parsers/rst/directives/tables.py 2014-03-20 10:51:10 UTC (rev 7747)
@@ -170,14 +170,14 @@
 
         def __init__(self, options):
             if 'delim' in options:
-                self.delimiter = str(options['delim'])
+                self.delimiter = CSVTable.encode_for_csv(options['delim'])
             if 'keepspace' in options:
                 self.skipinitialspace = False
             if 'quote' in options:
-                self.quotechar = str(options['quote'])
+                self.quotechar = CSVTable.encode_for_csv(options['quote'])
             if 'escape' in options:
                 self.doublequote = False
-                self.escapechar = str(options['escape'])
+                self.escapechar = CSVTable.encode_for_csv(options['escape'])
             csv.Dialect.__init__(self)
 
 
@@ -225,9 +225,12 @@
         except SystemMessagePropagation, detail:
             return [detail.args[0]]
         except csv.Error, detail:
+            message = str(detail)
+            if sys.version_info < (3,) and '1-character string' in message:
+                message += '\nwith Python 2.x this must be an ASCII character.'
             error = self.state_machine.reporter.error(
                 'Error with CSV data in "%s" directive:\n%s'
-                % (self.name, detail), nodes.literal_block(
+                % (self.name, message), nodes.literal_block(
                 self.block_text, self.block_text), line=self.lineno)
             return [error]
         table = (col_widths, table_head, table_body)
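
For illustration only, here is a minimal Python 3 sketch of the limitation the log message describes. It is not part of the commit and does not reproduce `CSVTable.encode_for_csv`, whose body is not shown in this diff. The helper names `is_ascii_char` and `parse_rows` are hypothetical. The sketch shows that a non-ASCII delimiter parses fine under Python 3, and that the same character becomes more than one byte once encoded as UTF-8, which is why a byte-oriented csv module cannot treat it as a one-character string.

```python
import csv
import io


def is_ascii_char(value):
    """Hypothetical check: True if value is exactly one ASCII character."""
    return len(value) == 1 and ord(value) < 128


def parse_rows(text, delimiter=","):
    """Parse CSV text using the given one-character delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Under Python 3 the csv module works on text, so a non-ASCII
# delimiter such as U+00B7 (middle dot) is accepted.
rows = parse_rows("a·b·c\n1·2·3\n", delimiter="·")

# Encoded as UTF-8 the same delimiter is a multi-byte string, so it
# no longer looks like a single character to byte-oriented code.
encoded_length = len("·".encode("utf-8"))

print(rows)
print(is_ascii_char(","), is_ascii_char("·"), encoded_length)
```

A check along the lines of `is_ascii_char` could be applied to the `delim`, `quote`, and `escape` option values before they reach the csv module, but whether the patch does anything similar is an assumption, not something this diff shows.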