[Docutils-checkins] SF.net SVN: docutils:[9947] trunk/docutils

Revision: 9947
          http://sourceforge.net/p/docutils/code/9947
Author:   milde
Date:     2024-10-13 12:36:14 +0000 (Sun, 13 Oct 2024)
Log Message:
-----------
"include" directive: handle duplicate names and IDs in sub-documents

When the :parser: option is specified, the included file is parsed
into a dummy `<document>` and children are appended.
Let internal bookkeeping attributes reference the main document
equivalents, so that element names and IDs share a namespace
with the including document.
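
The effect of sharing the bookkeeping attributes can be illustrated with
a toy model. This is only a sketch, not Docutils code: `ToyDocument`,
`register()` and `clash_messages()` are hypothetical names invented for
illustration, and only the idea of assigning the main document's `ids`
mapping to the dummy document is taken from the change to `misc.py`.

```python
# Toy model (NOT Docutils code): a document keeps an ID registry in a dict.
class ToyDocument:
    def __init__(self):
        self.ids = {}        # maps ID -> element that owns it
        self.messages = []   # collected problem reports

    def register(self, element_id, element):
        # Report a clash instead of silently overwriting the first owner.
        if element_id in self.ids:
            self.messages.append(f'Duplicate ID: "{element_id}"')
        else:
            self.ids[element_id] = element


def clash_messages(shared):
    """Register the same ID in a main document and a dummy sub-document."""
    main = ToyDocument()
    main.register('common-id', '<target>')
    sub = ToyDocument()
    if shared:
        # Same dict object: the sub-document sees the main document's IDs.
        sub.ids = main.ids
    sub.register('common-id', '<strong>')
    return main.messages + sub.messages


print(clash_messages(shared=False))  # separate namespaces: clash goes unnoticed
print(clash_messages(shared=True))   # shared namespace: clash is reported
```

With separate registries the second use of "common-id" is accepted
silently; with a shared registry it is detected while the included
content is processed.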

Modified Paths:
--------------
    trunk/docutils/docs/ref/rst/directives.rst
    trunk/docutils/docs/user/config.rst
    trunk/docutils/docutils/parsers/rst/directives/misc.py
    trunk/docutils/test/test_parsers/test_rst/test_directives/test_include.py

Added Paths:
-----------
    trunk/docutils/test/test_parsers/test_rst/test_directives/includes/include.xml

Removed Paths:
-------------
    trunk/docutils/test/data/duplicate-id.xml

Modified: trunk/docutils/docs/ref/rst/directives.rst
===================================================================

--- trunk/docutils/docs/ref/rst/directives.rst	2024-10-13 00:24:58 UTC (rev 9946)
+++ trunk/docutils/docs/ref/rst/directives.rst	2024-10-13 12:36:14 UTC (rev 9947)
@@ -1603,6 +1603,10 @@
     Parse the included content with the specified parser.
     See the `"parser" configuration setting`_ for available parsers.
 
+    .. Caution::
+       There is no check whether the inserted elements are valid at the
+       point of insertion. It is recommended to validate_ the document.
+
     (New in Docutils 0.17)
 
 ``start-after`` : text_
@@ -2273,6 +2277,7 @@
 .. _"title" configuration setting: ../../user/config.html#title
 .. _toc_backlinks: ../../user/config.html#toc-backlinks
 .. _use_latex_toc: ../../user/config.html#use-latex-toc
+.. _validate: ../../user/config.html#validate
 
 .. _reStructuredText Standard Definition Files: definitions.html
 

Modified: trunk/docutils/docs/user/config.rst
===================================================================
--- trunk/docutils/docs/user/config.rst	2024-10-13 00:24:58 UTC (rev 9946)
+++ trunk/docutils/docs/user/config.rst	2024-10-13 12:36:14 UTC (rev 9947)
@@ -672,10 +672,12 @@
 
 List of "classes" attribute values (comma-separated_).
 Values are appended. [#append-values]_
-Matching elements are removed from the `document tree`_.
+Matching elements are removed from the `document tree`_
+(by the `StripClassesAndElements` transform_).
 
 .. WARNING:: Potentially dangerous: may lead to an invalid document tree
    and subsequent writer errors.  Use with caution.
+   It is recommended to validate_ the document.
 
 *Default*: empty list.  *Option*: ``--strip-elements-with-class``.
 
@@ -777,7 +779,9 @@
 validate
 --------
 
-Validate the parsing result.
+Validate the parsing result.  Report elements that do not comply
+with the restrictions set out in the `Docutils Generic document
+type definition`_.
 
 *Default*: False.  *Options*: ``--validate``, ``--no-validation``.
 
@@ -2497,6 +2501,7 @@
 
 .. References
 
+.. _Docutils Generic document type definition:
 .. _Docutils Document Tree:
 .. _Document Tree: ../ref/doctree.html
 
@@ -2507,6 +2512,8 @@
 .. _publish_string(): ../api/publisher.html#publish-string
 .. _publish_from_doctree(): ../api/publisher.html#publish-from-doctree
 
+.. _transform: ../api/transforms.html
+
 .. _severity level: ../peps/pep-0258.html#error-handling
 
 .. RestructuredText Directives

Modified: trunk/docutils/docutils/parsers/rst/directives/misc.py
===================================================================
--- trunk/docutils/docutils/parsers/rst/directives/misc.py	2024-10-13 00:24:58 UTC (rev 9946)
+++ trunk/docutils/docutils/parsers/rst/directives/misc.py	2024-10-13 12:36:14 UTC (rev 9947)
@@ -217,6 +217,9 @@
         settings._source = self.options['source']
         document = utils.new_document(settings._source, settings)
         document.include_log = self.state.document.include_log
+        document.ids = self.state.document.ids
+        document.nameids = self.state.document.nameids
+        document.nametypes = self.state.document.nametypes
         parser = self.options['parser']()
         parser.parse(text, document)
         self.state.document.parse_messages.extend(document.parse_messages)

Deleted: trunk/docutils/test/data/duplicate-id.xml
===================================================================
--- trunk/docutils/test/data/duplicate-id.xml	2024-10-13 00:24:58 UTC (rev 9946)
+++ trunk/docutils/test/data/duplicate-id.xml	2024-10-13 12:36:14 UTC (rev 9947)
@@ -1,5 +0,0 @@
-<section>
-    <title ids="s4">nice heading</title>
-    <paragraph>Text with <strong ids="s4">strong
-    statement</strong> and more text.</paragraph>
-</section>

Added: trunk/docutils/test/test_parsers/test_rst/test_directives/includes/include.xml
===================================================================
--- trunk/docutils/test/test_parsers/test_rst/test_directives/includes/include.xml	                        (rev 0)
+++ trunk/docutils/test/test_parsers/test_rst/test_directives/includes/include.xml	2024-10-13 12:36:14 UTC (rev 9947)
@@ -0,0 +1,5 @@
+<section>
+    <title names="nice\ heading">nice heading</title>
+    <paragraph>Text with <strong ids="common-id">strong
+    statement</strong> and more text.</paragraph>
+</section>


Property changes on: trunk/docutils/test/test_parsers/test_rst/test_directives/includes/include.xml
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision
\ No newline at end of property
Modified: trunk/docutils/test/test_parsers/test_rst/test_directives/test_include.py
===================================================================
--- trunk/docutils/test/test_parsers/test_rst/test_directives/test_include.py	2024-10-13 00:24:58 UTC (rev 9946)
+++ trunk/docutils/test/test_parsers/test_rst/test_directives/test_include.py	2024-10-13 12:36:14 UTC (rev 9947)
@@ -17,7 +17,7 @@
     # so we import the local `docutils` package.
     sys.path.insert(0, str(Path(__file__).resolve().parents[4]))
 
-from docutils import parsers, utils
+from docutils import core, parsers, utils
 from docutils.frontend import get_default_settings
 from docutils.parsers.rst import Parser
 from docutils.utils import new_document
@@ -25,12 +25,11 @@
 from test.test_parsers.test_rst.test_directives.test_code \
     import PYGMENTS_2_14_PLUS
 
+FILE_DIR = Path(__file__).resolve().parent
+TEST_ROOT = FILE_DIR.parents[2]
 
-TEST_ROOT = Path(__file__).resolve().parents[3]
-
-
 # optional 3rd-party markdown parser
-md_parser_name = 'recommonmark'
+md_parser_name = 'pycmark'
 try:  # check availability
     md_parser_class = parsers.get_parser_class(md_parser_name)
 except ImportError:
@@ -52,6 +51,8 @@
         settings.warning_stream = ''
         settings.halt_level = 5
         for name, cases in totest.items():
+            if name == 'with transforms':
+                continue  # see test_publish() below
             for casenum, (case_input, case_expected) in enumerate(cases):
                 with self.subTest(id=f'totest[{name!r}][{casenum}]'):
                     document = new_document('test data', settings.copy())
@@ -59,7 +60,23 @@
                     output = document.pformat()
                     self.assertEqual(case_expected, output)
 
+    def test_publish(self):
+        # Special case for tests of issue reporting.
+        # To see the system message from the duplicate id, we need transforms.
+        settings = {'_disable_config': True,
+                    'output_encoding': 'unicode',
+                    'warning_stream': '',
+                    }
+        name = 'with transforms'
+        for casenum, (sample, expected) in enumerate(totest[name]):
+            with self.subTest(id=f'totest[{name!r}][{casenum}]'):
+                output = core.publish_string(sample,
+                                             source_path='test data',
+                                             parser=Parser(),
+                                             settings_overrides=settings)
+                self.assertEqual(expected, output)
 
+
 try:
     chr(0x11111111)
 except ValueError as detail:
@@ -68,11 +85,10 @@
     unichr_exception = ''
 
 
-# prepend this directory (relative to the test root):
+# prepend this directory (relative to the cwd):
 def mydir(path):
-    return os.path.relpath(
-        os.path.join(TEST_ROOT, 'test_parsers/test_rst/test_directives', path),
-        os.getcwd()).replace('\\', '/')
+    return os.path.relpath(os.path.join(FILE_DIR, path),
+                           os.getcwd()).replace('\\', '/')
 
 
 include1 = mydir('include1.rst')
@@ -90,7 +106,7 @@
 include16 = mydir('includes/include16.rst')
 include_literal = mydir('include_literal.rst')
 include_md = mydir('include.md')
-include_xml = TEST_ROOT/'data/duplicate-id.xml'
+include_xml = mydir('includes/include.xml')
 include = TEST_ROOT/'data/include.rst'
 latin2 = TEST_ROOT/'data/latin2.rst'
 utf_16_file = TEST_ROOT/'data/utf-16-le-sig.rst'
@@ -833,6 +849,33 @@
                 .. end of inclusion from "{include10}"
 """],
 [f"""\
+Inclusion 1
+===========
+Name clash: The included file uses the same section title.
+
+.. include:: {include1}
+   :parser: rst
+""",
+f"""\
+<document source="test data">
+    <section dupnames="inclusion\\ 1" ids="inclusion-1">
+        <title>
+            Inclusion 1
+        <paragraph>
+            Name clash: The included file uses the same section title.
+        <section dupnames="inclusion\\ 1" ids="inclusion-1-1">
+            <title>
+                Inclusion 1
+            <system_message backrefs="inclusion-1-1" level="1" line="2" source="{include1}" type="INFO">
+                <paragraph>
+                    Duplicate implicit target name: "inclusion 1".
+            <paragraph>
+                This file is used by \n\
+                <literal>
+                    test_include.py
+                .
+"""],
+[f"""\
 Include file with whitespace in the path:
 
 .. include:: {include11}
@@ -1465,32 +1508,6 @@
         File "include15.rst": example of rekursive inclusion.
 """],
 [f"""\
-Include Docutils XML file:
-
-.. include:: {include_xml}
-   :parser: xml
-
-The duplicate id is reported and would be appended
-by the "universal.Messages" transform.
-""",
-"""\
-<document source="test data">
-    <paragraph>
-        Include Docutils XML file:
-    <section>
-        <title ids="s4">
-            nice heading
-        <paragraph>
-            Text with \n\
-            <strong ids="s4">
-                strong
-                statement
-             and more text.
-    <paragraph>
-        The duplicate id is reported and would be appended
-        by the "universal.Messages" transform.
-"""],
-[f"""\
 No circular inclusion.
 
 .. list-table::
@@ -1531,7 +1548,7 @@
 <document source="test data">
     <paragraph>
         Include Markdown source.
-    <section ids="title-1" names="title\\ 1">
+    <section depth="1" ids="section-1">
         <title>
             Title 1
         <paragraph>
@@ -1542,14 +1559,88 @@
                 also emphasis
         <paragraph>
             No whitespace required around a
-            <reference name="phrase reference" refuri="/uri">
+            <reference refuri="/uri">
                 phrase reference
             .
+        <target ids="phrase-reference" names="phrase\\ reference" refuri="/uri">
     <paragraph>
         A paragraph.
 """],
 ]
 
+# Transforms are required for these tests: The system_message about
+# duplicate name/id is appended by the "universal.Messages" transform.
+totest['with transforms'] = [
+[f"""\
+.. _common id:
 
+Include Docutils XML file:
+
+.. include:: {include_xml}
+   :parser: xml
+""",
+f"""\
+<document source="test data">
+    <target refid="common-id">
+    <paragraph ids="common-id" names="common\\ id">
+        Include Docutils XML file:
+    <section>
+        <title names="nice\\ heading">
+            nice heading
+        <paragraph>
+            Text with \n\
+            <strong ids="common-id">
+                strong
+                statement
+             and more text.
+    <section classes="system-messages">
+        <title>
+            Docutils System Messages
+        <system_message level="3" line="3" source="{include_xml}" type="ERROR">
+            <paragraph>
+                Duplicate ID: "common-id" used by <target ids="common-id" names="common\\ id"> and <strong ids="common-id">
+"""],
+[f"""\
+Inclusion 1
+===========
+Name clash: The included file uses the same section title `inclusion 1`_.
+
+.. include:: {include1}
+   :parser: rst
+
+Inclusion 2
+===========
+""",
+"""\
+<document source="test data">
+    <section dupnames="inclusion\\ 1" ids="inclusion-1">
+        <title>
+            Inclusion 1
+        <paragraph>
+            Name clash: The included file uses the same section title \n\
+            <problematic ids="problematic-1" refid="system-message-1">
+                `inclusion 1`_
+            .
+        <section dupnames="inclusion\\ 1" ids="inclusion-1-1">
+            <title>
+                Inclusion 1
+            <paragraph>
+                This file is used by \n\
+                <literal>
+                    test_include.py
+                .
+    <section ids="inclusion-2" names="inclusion\\ 2">
+        <title>
+            Inclusion 2
+    <section classes="system-messages">
+        <title>
+            Docutils System Messages
+        <system_message backrefs="problematic-1" ids="system-message-1" level="3" line="3" source="test data" type="ERROR">
+            <paragraph>
+                Duplicate target name, cannot be used as a unique reference: "inclusion 1".
+"""],
+]
+
+
 if __name__ == '__main__':
     unittest.main()
