#7 #FIXED entity ref not expanded in HTML 4.01 DTD

Status: open
Owner: nobody
Labels: dtdparse (5)
Priority: 5
Updated: 2011-05-20
Created: 2011-05-20
Private: No

A parameter entity reference in the default value of a #FIXED attribute is not being expanded. The output is:

<attribute name="version"
type="#FIXED"
value="CDATA"
default="%HTML.Version;"/>

but it should be:

<attribute name="version"
type="#FIXED"
value="CDATA"
default="-//W3C//DTD HTML 4.01 Transitional//EN"/>
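For reference, here is a minimal Python sketch of the expansion I would expect. It is not dtdparse's code (dtdparse is a Perl tool). The entity table, the `expand` helper, and the simplified reference pattern are all assumptions made for illustration; real SGML allows more than this regex covers (for example, omitting the `;` in some contexts).

```python
import re

# Hypothetical entity table mirroring the two declarations in the attached html.dtd.
ENTITIES = {
    "HTML.Version": "-//W3C//DTD HTML 4.01 Transitional//EN",
    "version": "version CDATA #FIXED '%HTML.Version;'",
}

# Simplified pattern for a parameter entity reference such as %HTML.Version;
PE_REF = re.compile(r"%([A-Za-z][A-Za-z0-9._:-]*);")


def expand(text, entities, depth=0):
    """Recursively replace %name; references; unknown names are left as-is."""
    if depth > 32:
        raise ValueError("entity nesting too deep (possible cycle)")

    def repl(match):
        name = match.group(1)
        if name not in entities:
            return match.group(0)
        # The replacement text may itself contain references, so recurse.
        return expand(entities[name], entities, depth + 1)

    return PE_REF.sub(repl, text)


print(expand("%HTML.Version;", ENTITIES))
print(expand("%version;", ENTITIES))
```

Applying the same expansion to the attribute default, and not only to the entity text, would give the expected `default` value shown above.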

I'm attaching:

html.dtd -- a heavily truncated version of the HTML 4.01 DTD containing only what's needed to demonstrate the bug

run.txt -- the complete output of the dtdparse run. Curiously, the entity text is expanded correctly in the <text-expanded> element, but the expanded value doesn't appear in the attribute's default value.

sgml.soc -- the catalog file invoked on the command line in the attached 'run' file. It's very vanilla.

sgml.dcl -- the sgml declaration referenced in sgml.soc. It's very vanilla, too.
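Until this is fixed, a possible workaround is to post-process the XML that dtdparse writes, since the `<text-expanded>` elements already hold the correct values. The following is only a sketch under assumptions: `SAMPLE` is a trimmed excerpt of the output (DOCTYPE and most attributes removed), `patch_defaults` is a hypothetical helper name, and the reference pattern is simplified.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed excerpt of the dtdparse output from this report (assumption: the
# DOCTYPE and unrelated elements are omitted to keep the sample short).
SAMPLE = """<dtd version='1.0'>
<entity name="HTML.Version" type="param">
<text-expanded>-//W3C//DTD HTML 4.01 Transitional//EN</text-expanded>
<text>-//W3C//DTD HTML 4.01 Transitional//EN</text>
</entity>
<attlist name="HTML">
<attribute name="version" type="#FIXED" value="CDATA" default="%HTML.Version;"/>
</attlist>
</dtd>"""

# Simplified pattern for a parameter entity reference such as %HTML.Version;
PE_REF = re.compile(r"%([A-Za-z][A-Za-z0-9._:-]*);")


def patch_defaults(root):
    """Replace %name; in attribute defaults using each entity's <text-expanded>."""
    expanded = {
        e.get("name"): e.findtext("text-expanded", default="")
        for e in root.iter("entity")
        if e.get("type") == "param"
    }
    for attr in root.iter("attribute"):
        default = attr.get("default")
        if default:
            # One pass is enough: <text-expanded> is already fully expanded.
            fixed = PE_REF.sub(
                lambda m: expanded.get(m.group(1), m.group(0)), default
            )
            attr.set("default", fixed)
    return root


root = patch_defaults(ET.fromstring(SAMPLE))
print(root.find("attlist/attribute").get("default"))
```

This only patches the symptom in the generated XML; references to entities that are not declared in the output are left untouched.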

Discussion

  • Steven R. Newcomb

    what the run of dtdparse looked like, complete

     
  • Steven R. Newcomb

    Well, I was only permitted to attach one file to this, so here are the rest of them in a less-convenient form:

    #######################################################
    run.txt
    #######################################################
    /tmp srn@zorba% dtdparse --catalog sgml.soc html.dtd
    Reading sgml.soc...
    Public ID: unknown
    System ID: html.dtd
    SGML declaration: sgml.dcl
    Parse complete.
    <!DOCTYPE dtd PUBLIC "-//Norman Walsh//DTD DTDParse V2.0//EN"
    "dtd.dtd" [
    ]>
    <dtd version='1.0'
    unexpanded='1'
    title="?untitled?"
    namecase-general="1"
    namecase-entity=""
    xml="0"
    system-id="html.dtd"
    public-id=""
    declaration="sgml.dcl"
    created-by="DTDParse V2.00"
    created-on="Fri May 20 11:27:50 2011"
    >
    <entity name="version"
    type="param"
    >
    <text-expanded>version CDATA #FIXED '-//W3C//DTD HTML 4.01 Transitional//EN'</text-expanded>
    <text>version CDATA #FIXED '%HTML.Version;'</text>
    </entity>

    <entity name="HTML.Version"
    type="param"
    >
    <text-expanded>-//W3C//DTD HTML 4.01 Transitional//EN</text-expanded>
    <text>-//W3C//DTD HTML 4.01 Transitional//EN</text>
    </entity>

    <element name="HTML" stagm="O" etagm="O"
    content-type="element">
    <content-model-expanded>
    <empty/>
    </content-model-expanded>
    <content-model>
    <empty/>
    </content-model>
    </element>

    <attlist name="HTML">
    <attdecl>
    %version;
    </attdecl>
    <attribute name="version"
    type="#FIXED"
    value="CDATA"
    default="%HTML.Version;"/>
    </attlist>

    </dtd>
    Done.
    /tmp srn@zorba% dtdparse --version
    Version: dtdparse v2.00

    Usage:
    dtdparse [options] [dtdfile]

    /tmp srn@zorba%

    #######################################################
    html.dtd
    #######################################################
    <!ENTITY % HTML.Version "-//W3C//DTD HTML 4.01 Transitional//EN">

    <!ENTITY % version "version CDATA #FIXED '%HTML.Version;'">

    <!ELEMENT HTML O O EMPTY>
    <!ATTLIST HTML
    %version;
    >

    #######################################################
    sgml.soc
    #######################################################
    SGMLDECL "sgml.dcl"

    #######################################################
    sgml.dcl
    #######################################################
    <!SGML "ISO 8879:1986"
    --
    SGML Declaration for HyperText Markup Language version 4.0

    With support for the first 17 planes of ISO 10646 and
    increased limits for tag and literal lengths etc.
    --

    CHARSET
    BASESET "ISO Registration Number 177//CHARSET
    ISO/IEC 10646-1:1993 UCS-4 with
    implementation level 3//ESC 2/5 2/15 4/6"
    DESCSET 0 9 UNUSED
    9 2 9
    11 2 UNUSED
    13 1 13
    14 18 UNUSED
    32 95 32
    127 1 UNUSED
    128 32 UNUSED
    160 55136 160
    55296 2048 UNUSED -- SURROGATES --
    57344 8191 57344

    CAPACITY SGMLREF
    TOTALCAP 150000
    GRPCAP 150000
    ENTCAP 150000

    SCOPE DOCUMENT
    SYNTAX
    SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
    17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 127
    BASESET "ISO 646IRV:1991//CHARSET
    International Reference Version
    (IRV)//ESC 2/8 4/2"
    DESCSET 0 128 0

    FUNCTION
    RE 13
    RS 10
    SPACE 32
    TAB SEPCHAR 9

    NAMING LCNMSTRT ""
    UCNMSTRT ""
    LCNMCHAR ".-_:"
    UCNMCHAR ".-_:"
    NAMECASE GENERAL YES
    ENTITY NO
    DELIM GENERAL SGMLREF
    SHORTREF SGMLREF
    NAMES SGMLREF
    QUANTITY SGMLREF
    ATTCNT 120 -- increased to 60 and then doubled because htmlloose.dtd uses 62 at one point --
    ATTSPLEN 65536 -- These are the largest values --
    LITLEN 65536 -- permitted in the declaration --
    NAMELEN 65536 -- Avoid fixed limits in actual --
    PILEN 65536 -- implementations of HTML UA's --
    TAGLVL 100
    TAGLEN 65536
    GRPGTCNT 150
    GRPCNT 64

    FEATURES
    MINIMIZE
    DATATAG NO
    OMITTAG YES
    RANK NO
    SHORTTAG YES
    LINK
    SIMPLE NO
    IMPLICIT NO
    EXPLICIT NO
    OTHER
    CONCUR NO
    SUBDOC NO
    FORMAL YES
    APPINFO NONE
    >

     
