#1314 duplicate id in fo output with auto-generated glossary

Oliver Kiddle

I use an auto-generated glossary. Very occasionally, I get errors due to duplicate IDs. Hoping to work around the issue, I tried setting generate.consistent.ids to 1. Instead, this caused the problem to occur every time, which at least made it easier to debug.

The first ID comes from fo/glossary.xsl, lines 372-374:

<fo:block id="{$id}">
  <xsl:call-template name="glossary.titlepage"/>
</fo:block>

The second comes from lines 730-731:

<fo:list-item xsl:use-attribute-sets="glossentry.list.item.properties">
  <xsl:call-template name="anchor">

It seems the problem is that because the glossary is read in using the document() function, elements within the glossary can have the same ID as elements from the source document.

I can work around the problem by overriding object.id and inserting the following test, which detects when the current node is in the glossary so that its generated ID can be prefixed with a G:

<xsl:if test="generate-id(document($glossary.collection, .))=generate-id(/)">

I can send a patch for this, but I suspect there is a better solution. In any case, there may be similar problems wherever else document() is used.
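
For illustration, here is a minimal sketch of what such an object.id override might look like in a customization layer. It is not the patch itself: the import URL, the glossary.xml path, the guard against an empty glossary.collection, and the assumed shape of the stock template (explicit id, then xml:id, then a generated value) are assumptions that should be checked against the stylesheet release in use.

```xml
<?xml version="1.0"?>
<!-- Hypothetical customization layer. The import URL and the layout of the
     stock object.id template are assumptions, not taken from the report. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

  <!-- Assumed location of the shared glossary file. -->
  <xsl:param name="glossary.collection">glossary.xml</xsl:param>

  <xsl:template name="object.id">
    <xsl:param name="object" select="."/>
    <xsl:choose>
      <!-- Explicit ids from the source are passed through untouched. -->
      <xsl:when test="$object/@id">
        <xsl:value-of select="$object/@id"/>
      </xsl:when>
      <xsl:when test="$object/@xml:id">
        <xsl:value-of select="$object/@xml:id"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- Make $object the context node so that "." and "/" in the test
             below refer to it and to the root of its own document. -->
        <xsl:for-each select="$object">
          <!-- Test from the report: true when this node lives in the
               glossary collection rather than in the main document.
               The emptiness guard is an added assumption. -->
          <xsl:if test="$glossary.collection != '' and
                        generate-id(document($glossary.collection, .)) = generate-id(/)">
            <xsl:text>G</xsl:text>
          </xsl:if>
          <xsl:choose>
            <xsl:when test="$generate.consistent.ids != 0">
              <xsl:text>id-</xsl:text>
              <xsl:number level="multiple" count="*"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="generate-id(.)"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The prefix only separates the main document from one glossary file. Any other place where the stylesheets load a second document with document() would need its own distinguishing prefix.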


  • Robert Stayton

    • assigned_to: Robert Stayton
  • Robert Stayton

    If I understand you correctly, this is not an issue of the XSL stylesheets inserting duplicate ids into the output from a single id instance, but an issue of duplicate ids in the source material. In this case, some of the id values used in your glossary data file duplicate id values used in some of your documents. If that is the case, then this falls into the same category as any included material having duplicate id instances, and is not something the stylesheet can easily fix. That is because, if there is a cross reference to that id value, the stylesheet has no way of determining which target is meant.

    The real solution here, as with any included material, is to ensure that your glossary ids are unique across your document set. Adding the "G" prefix to the glossary ids will fix the problem.

    If I'm misunderstanding the situation, let me know.

  • Oliver Kiddle

    No. This very much is an issue of the XSL stylesheets inserting duplicate ids into the output. These IDs occur in a context for which there is no id value specified in the source material.

    Note that the glossary is a separate XML document accessed via document() from the stylesheets. This is done with <glossary role="auto"> in the main source document and my stylesheet specifying the glossary.collection parameter to point to a common glossary.xml file. I don't have any id attribute specified on my <glossterm> elements or the <glossentry>/term/def elements. The XSL stylesheets add ids to the glossary items; I have firstterm.only.link set to 1 and glossterm.auto.link set to 0.
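
    The setup described above can be sketched as a small customization layer. The three parameter names and values are the ones mentioned in this comment; the import URL and the relative path to glossary.xml are assumptions. The main document is assumed to contain only a <glossary role="auto"> placeholder, which the stylesheets fill from the collection file.

    ```xml
    <?xml version="1.0"?>
    <!-- Hypothetical customization layer reproducing the reported setup. -->
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

      <!-- Assumed import location of the stock FO stylesheet. -->
      <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

      <!-- Shared glossary file read in through document(); path is assumed. -->
      <xsl:param name="glossary.collection">glossary.xml</xsl:param>

      <!-- Link settings as stated in the comment. -->
      <xsl:param name="firstterm.only.link" select="1"/>
      <xsl:param name="glossterm.auto.link" select="0"/>

    </xsl:stylesheet>
    ```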

    With generate.consistent.ids set, the ID is derived from the element's position within its input document, so with two input documents the same set of IDs regularly occurs in both. That's why I tried to have it prefix the ID with a G when generating an ID for an element in glossary.xml. In the default case, it uses generate-id() to get IDs. I can't say whether it is a bug that xsltproc gives me occasional duplicates for nodes from different input documents, but it is certainly ugly.
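
    The position-based collision can be shown with a small standalone stylesheet that does not depend on DocBook at all. This is only a sketch: the parameter name other and the file name glossary.xml are hypothetical, and the "id-" plus xsl:number form is assumed to match what the stylesheets do when generate.consistent.ids is set.

    ```xml
    <?xml version="1.0"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:output method="text"/>

      <!-- Hypothetical parameter: a second input document, resolved
           relative to the main input document. -->
      <xsl:param name="other" select="'glossary.xml'"/>

      <xsl:template match="/">
        <!-- Visit the root element of the main document and of the second
             document, and number each by its position in its own tree. -->
        <xsl:for-each select="/* | document($other, /)/*">
          <xsl:text>id-</xsl:text>
          <xsl:number level="multiple" count="*"/>
          <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
      </xsl:template>

    </xsl:stylesheet>
    ```

    Because xsl:number only counts within the tree that holds the current node, both root elements come out as id-1, and the same holds for any pair of elements at matching positions in the two files.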

  • Jirka Kosek

    If generate-id() is used and returns duplicates for nodes from different documents, then it is a bug in the XSLT processor, in this case xsltproc. Please try with Saxon. If it works in Saxon, then please report this bug against xsltproc.
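
    One way to compare processors is a small diagnostic stylesheet, run with the main document as input, that reports any generate-id() value shared between the main document and the glossary. This is a sketch under assumptions: the glossary.xml path is hypothetical, and since the duplicates are reported as occasional, a clean run does not prove they cannot occur.

    ```xml
    <?xml version="1.0"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:output method="text"/>

      <!-- Assumed path, resolved relative to the main input document. -->
      <xsl:param name="glossary.collection" select="'glossary.xml'"/>

      <xsl:template match="/">
        <!-- All elements of the main document. -->
        <xsl:variable name="main" select="//*"/>
        <!-- For each glossary element, look for a main-document element
             that was given the same generated id. Quadratic, so intended
             for testing only. -->
        <xsl:for-each select="document($glossary.collection, /)//*">
          <xsl:variable name="gid" select="generate-id(.)"/>
          <xsl:if test="$main[generate-id(.) = $gid]">
            <xsl:text>duplicate generate-id: </xsl:text>
            <xsl:value-of select="$gid"/>
            <xsl:text>&#10;</xsl:text>
          </xsl:if>
        </xsl:for-each>
      </xsl:template>

    </xsl:stylesheet>
    ```

    A conforming processor should produce no output here, since generate-id() must return different values for different nodes within a single transformation.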