TurboPower XML Partner / Bugs / #8 Parsing of successive empty parameter entities

#8 Parsing of successive empty parameter entities

Status: open

Owner: Richard Light

Labels: None

Priority: 5

Updated: 2004-09-01

Created: 2004-08-28

Creator: Richard Light

Private: No

If an element's content model contains two or more
parameter entities which resolve to empty strings, each
followed by white space, then the parser incorrectly
tries to read the next non-space character it finds (in
this case the 'a' of accessrestrict) as a token separator
and fails with: "Bad separator in content model: a"
(XpParser.pas, line 2177).

Example DTD (eadfrag.dtd):

<!ENTITY % m.desc.base.dep
''
>
<!ENTITY % m.organization.dep
''
>

<!ELEMENT ead
((%m.desc.base.dep; %m.organization.dep;
accessrestrict)*)
>

<!ATTLIST ead
id ID #IMPLIED
>

Example document:

<?xml version='1.0'?>
<!DOCTYPE ead PUBLIC '+//ISBN 1-931666-00-8//DTD ead.dtd (Encoded Archival Description (EAD) Version 2002)//EN'
'eadfrag.dtd'>
<ead id="test1"/>

As you'll gather from the test data, I actually hit
this bug while trying to parse an EAD (Encoded Archival
Description) document.

Discussion

Richard Light - 2004-08-28

The one-level case I posted can be dealt with by changing
lines 2138-9 to:

if TryRead(Xpc_ParamEntity) then begin
  ParseParameterEntityRef(True, False); {!!.51}
  SkipWhiteSpace(True); {!!.59 rbl}
end;

However, adding another level of parameter entity references:



<!ENTITY % m.desc.base.dep
''

>
<!ENTITY % m.organization.dep
''

>

<!ENTITY % m.desc.base
'%m.desc.base.dep; %m.organization.dep; accessrestrict'

>

<!ENTITY % m.desc.full
'%m.desc.base; | dsc'

>

<!ELEMENT ead
((%m.desc.full;)*)

>

<!ATTLIST ead
id ID #IMPLIED
>

causes the bug to re-appear. The fundamental problem, it
seems to me, is that parameter entities should be resolved
at a lower level in the parsing process, before you start
trying to read tokens at all. What does anyone else think?
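
To make that suggestion concrete, here is a minimal sketch of expanding parameter entity references as a separate pass, before any content-model tokens are read. It is a standalone Free Pascal/Delphi console program, not XML Partner code: the TParamEntity record, the Entities table, FindEntity, ExpandParamEntities and the nesting limit of 32 are all hypothetical names and values chosen for illustration. It also ignores the XML 1.0 rule that replacement text included in the DTD is padded with one leading and one trailing space.

```pascal
program ExpandPE;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

type
  { Hypothetical record: one declared parameter entity. }
  TParamEntity = record
    Name, Value: string;
  end;

const
  MaxDepth = 32; { assumed guard against recursive declarations }
  { The declarations from the two-level example DTD. }
  Entities: array[0..3] of TParamEntity = (
    (Name: 'm.desc.base.dep'; Value: ''),
    (Name: 'm.organization.dep'; Value: ''),
    (Name: 'm.desc.base';
     Value: '%m.desc.base.dep; %m.organization.dep; accessrestrict'),
    (Name: 'm.desc.full'; Value: '%m.desc.base; | dsc')
  );

{ Looks up a parameter entity by its (case-sensitive) name. }
function FindEntity(const Name: string; out Value: string): Boolean;
var
  I: Integer;
begin
  Result := False;
  Value := '';
  for I := Low(Entities) to High(Entities) do
    if Entities[I].Name = Name then
    begin
      Value := Entities[I].Value;
      Result := True;
      Exit;
    end;
end;

{ Returns S with every %name; reference replaced by its value,
  expanding references inside that value recursively. }
function ExpandParamEntities(const S: string; Depth: Integer): string;
var
  I, J: Integer;
  Name, Value: string;
begin
  if Depth > MaxDepth then
    raise Exception.Create('Parameter entities nested too deeply');
  Result := '';
  I := 1;
  while I <= Length(S) do
  begin
    if S[I] = '%' then
    begin
      { Find the ';' that terminates the reference. }
      J := I + 1;
      while (J <= Length(S)) and (S[J] <> ';') do
        Inc(J);
      if J > Length(S) then
        raise Exception.Create('Unterminated parameter entity reference');
      Name := Copy(S, I + 1, J - I - 1);
      if not FindEntity(Name, Value) then
        raise Exception.Create('Undeclared parameter entity: ' + Name);
      Result := Result + ExpandParamEntities(Value, Depth + 1);
      I := J + 1;
    end
    else
    begin
      { Ordinary character: copy it through unchanged. }
      Result := Result + S[I];
      Inc(I);
    end;
  end;
end;

begin
  { The tokenizer would only ever be handed this expanded string. }
  WriteLn(ExpandParamEntities('((%m.desc.full;)*)', 0));
end.
```

With the two-level DTD above as input, the expanded content model has nothing but white space in front of accessrestrict, so a tokenizer reading it never meets an entity reference, empty or otherwise.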

Richard Light - 2004-09-01

assigned_to: nobody --> richardlight