If an element's content model contains two or more
parameter entity references that resolve to empty strings,
each followed by white space, the parser incorrectly
treats the next non-space character it finds (in this
case the 'a' of 'accessrestrict') as a token separator
and reports: "Bad separator in content model: a"
(XpParser.pas line 2177).
Example DTD (eadfrag.dtd):
<!-- EAD fragment to demonstrate XMLPartner bug with
parameter entities -->
<!ENTITY % m.desc.base.dep
''
>
<!ENTITY % m.organization.dep
''
>
<!ELEMENT ead
((%m.desc.base.dep; %m.organization.dep;
accessrestrict)*)
>
<!ATTLIST ead
id ID #IMPLIED
>
Example document:
<?xml version='1.0'?>
<!DOCTYPE ead PUBLIC
  '+//ISBN 1-931666-00-8//DTD ead.dtd (Encoded Archival Description (EAD) Version 2002)//EN'
  'eadfrag.dtd'>
<ead id="test1"/>
As you'll gather from the test data, I actually hit
this bug while trying to parse an EAD (Encoded Archival
Description) document.
The one-level case I posted can be dealt with by changing
lines 2138-9 to:
if TryRead(Xpc_ParamEntity) then begin
  ParseParameterEntityRef(True, False);    {!!.51}
  SkipWhiteSpace(True);                    {!!.59 rbl}
end;
However, adding another level of parameter entity references:
<!-- EAD fragment to demonstrate XMLPartner bug with
parameter entities -->
<!ENTITY % m.desc.base.dep
''
>
<!ENTITY % m.organization.dep
''
>
<!ENTITY % m.desc.base
'%m.desc.base.dep; %m.organization.dep; accessrestrict'
>
<!ENTITY % m.desc.full
'%m.desc.base; | dsc'
>
<!ELEMENT ead
((%m.desc.full;)*)
>
<!ATTLIST ead
id ID #IMPLIED
>
causes the bug to reappear. The fundamental problem, it
seems to me, is that parameter entity references should be
resolved at a lower level in the parsing process, before the
parser starts trying to read tokens at all. What do others think?
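
To make the suggestion above concrete, here is a minimal sketch of "expand first, tokenize second". It is written in Python purely as an illustration and is not XMLPartner code: every function name in it is hypothetical. It assumes the XML 1.0 rules as I understand them: a parameter entity reference inside an entity literal is replaced by the bare replacement text when the entity is declared, while a reference inside a declaration is replaced by the replacement text padded with one leading and one trailing space. It also assumes that every referenced entity has already been declared, and the toy tokenizer only knows names and the separators `( ) | , * + ?` (no `#PCDATA`).

```python
import re

# Matches a parameter entity reference such as %m.desc.base.dep;
PE_REF = re.compile(r"%([A-Za-z_:][\w.:-]*);")


def literal_value(raw, entities):
    """Replacement text for an entity literal.

    References inside the literal are substituted as-is (no padding),
    using the entities declared so far.
    """
    return PE_REF.sub(lambda m: entities[m.group(1)], raw)


def expand_in_declaration(text, entities):
    """Expand references that appear inside a declaration.

    Each replacement text is padded with one space on either side, so an
    empty entity still contributes white space, never a stray token.
    """
    return PE_REF.sub(lambda m: " " + entities[m.group(1)] + " ", text)


def tokenize(model):
    """Toy content-model tokenizer: names and separators, white space ignored."""
    return re.findall(r"[A-Za-z_:][\w.:-]*|[()|,*+?]", model)


def content_model_tokens(pe_declarations, model):
    """Declare the entities in order, expand the model, then tokenize it."""
    entities = {}
    for name, raw in pe_declarations:
        entities[name] = literal_value(raw, entities)
    return tokenize(expand_in_declaration(model, entities))


# One-level case from the original report.
one_level = content_model_tokens(
    [("m.desc.base.dep", ""), ("m.organization.dep", "")],
    "((%m.desc.base.dep; %m.organization.dep;\naccessrestrict)*)",
)
print(one_level)  # ['(', '(', 'accessrestrict', ')', '*', ')']

# Two-level case from the follow-up.
two_level = content_model_tokens(
    [
        ("m.desc.base.dep", ""),
        ("m.organization.dep", ""),
        ("m.desc.base", "%m.desc.base.dep; %m.organization.dep; accessrestrict"),
        ("m.desc.full", "%m.desc.base; | dsc"),
    ],
    "((%m.desc.full;)*)",
)
print(two_level)  # ['(', '(', 'accessrestrict', '|', 'dsc', ')', '*', ')']
```

Because the tokenizer never sees a `%...;` reference, the number of empty entities and the depth of nesting make no difference to it, which is the property the patched `SkipWhiteSpace` call cannot provide on its own.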