#58 Element name storage optimization

closed-accepted
None
5
2002-08-27
2002-08-27
No

Expat stores an element's "raw name"
(the unconverted name) as well as its encoded form.

Storage of the raw name can be optimized by
having the tag->rawName member point directly
into the parse buffer. However, this is currently
done only for the last buffer chunk, since
previous parse buffers are discarded.
So, most of the time, raw names are copied into
a designated buffer, causing expensive memory
allocations.

This can be improved by copying a raw name into
a separate buffer only if its element is still
open when the parse buffer is about to be
discarded. In other words, the raw names of
elements that are opened *and* closed while
the same buffer is parsed are never copied,
since the lifetime of their raw names is shorter
than the lifetime of the parse buffer.
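
The idea above can be sketched in C roughly as follows. This is
not the attached patch and not Expat's actual code: the Tag struct
and the functions open_tag, close_tag and store_raw_names are
hypothetical, simplified names made up for illustration. Only the
rawName member name is taken from the description above. The
sketch assumes the parser calls store_raw_names once, just before
a parse buffer is thrown away.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified record for one open element. */
typedef struct Tag {
    const char *rawName;    /* points into the parse buffer, or at buf */
    size_t rawNameLength;
    char *buf;              /* private copy; NULL until one is needed */
    struct Tag *parent;     /* enclosing open element, NULL at the root */
} Tag;

/* Start tag: only remember where the raw name sits in the parse buffer. */
static Tag *open_tag(Tag *parent, const char *name, size_t len)
{
    Tag *tag = malloc(sizeof *tag);
    if (tag == NULL)
        return NULL;
    tag->rawName = name;
    tag->rawNameLength = len;
    tag->buf = NULL;
    tag->parent = parent;
    return tag;
}

/* End tag: release the record (and its copy, if any); return the parent. */
static Tag *close_tag(Tag *tag)
{
    Tag *parent = tag->parent;
    free(tag->buf);
    free(tag);
    return parent;
}

/* Assumed to be called just before the parse buffer is discarded.
   Copies the raw names of elements that are still open.
   Returns 1 on success, 0 if an allocation fails. */
static int store_raw_names(Tag *innermost)
{
    for (Tag *tag = innermost; tag != NULL; tag = tag->parent) {
        char *copy;
        if (tag->rawName == tag->buf)
            break;  /* saved at an earlier discard, and so were its ancestors */
        copy = malloc(tag->rawNameLength + 1);
        if (copy == NULL)
            return 0;
        memcpy(copy, tag->rawName, tag->rawNameLength);
        copy[tag->rawNameLength] = '\0';
        tag->buf = copy;
        tag->rawName = copy;
    }
    return 1;
}
```

In this sketch an element that is opened and closed within one
buffer never triggers a name allocation, because close_tag runs
before store_raw_names ever sees it. The early break relies on
the fact that an ancestor is always opened before its descendants,
so if a tag was already saved at an earlier discard, its ancestors
were saved then as well.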

The attached patch (xmlparse.c.diff) tries
to achieve that.

Performance benefits should be most noticeable
for large XML documents that are not deeply nested.

Discussion

  • Karl Waclawek

    Karl Waclawek - 2002-08-27

    Patched xmlparse.c

     
  • Fred L. Drake, Jr.

    • assigned_to: nobody --> kwaclaw
    • status: open --> open-accepted
     
  • Fred L. Drake, Jr.

    Works for me -- no objections to checking it in.

     
  • Karl Waclawek

    Karl Waclawek - 2002-08-27

    OK, checked it in.
    Hope you didn't just run tests, but checked
    my code too.

     
  • Karl Waclawek

    Karl Waclawek - 2002-08-27
    • status: open-accepted --> closed-accepted
     
