Expat stores an element's "raw name" (i.e. the
unconverted name) as well as its encoded form.
Storage of the raw name can be optimized by
having the tag->rawName member point directly
into the parse buffer. However, this is currently
done only for the last buffer chunk, since
previous parse buffers are discarded.
So, most of the time, raw names are stored in
a designated buffer, causing expensive memory operations.
One can optimize this by storing in a separate
buffer only the raw names of elements that
are still open when the parse buffer is about to
be discarded. In other words, the raw names of
elements that are opened *and* closed while
the same buffer is parsed are never stored,
since the lifetime of their raw names is shorter
than the lifetime of the parse buffer.
The attached patch (xmlparse.c.diff) tries
to achieve that.
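For readers without the patch at hand, the idea can be sketched
roughly as follows. This is not the code from xmlparse.c.diff, only a
minimal illustration: the Tag struct is a simplified stand-in for
Expat's tag record, and open_tag, close_tag and store_raw_names are
hypothetical helper names chosen for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for Expat's tag record. */
typedef struct Tag {
    const char *rawName;    /* points into the parse buffer or into copy */
    size_t rawNameLength;
    char *copy;             /* owned storage; NULL until a copy is needed */
    struct Tag *parent;     /* enclosing open element, or NULL */
} Tag;

/* Open an element: only reference the name inside the parse buffer. */
Tag *open_tag(Tag *parent, const char *nameInBuffer, size_t len) {
    assert(nameInBuffer != NULL && len > 0);
    Tag *tag = malloc(sizeof *tag);
    if (tag == NULL)
        return NULL;
    tag->rawName = nameInBuffer;
    tag->rawNameLength = len;
    tag->copy = NULL;
    tag->parent = parent;
    return tag;
}

/* Close an element and return its parent. If the element was opened and
   closed within the same buffer, no copy was ever made. */
Tag *close_tag(Tag *tag) {
    Tag *parent = tag->parent;
    free(tag->copy);        /* free(NULL) is a no-op */
    free(tag);
    return parent;
}

/* Call right before the parse buffer is discarded. Copies the raw names
   of all still-open elements that reference the buffer. Returns the
   number of names copied, or -1 if an allocation fails. */
int store_raw_names(Tag *innermost) {
    int copied = 0;
    for (Tag *tag = innermost; tag != NULL; tag = tag->parent) {
        /* Ancestors are opened before their descendants, so once a tag
           with a stored name is found, all remaining ancestors were
           stored at an earlier discard (assuming this function runs
           before every discard). */
        if (tag->copy != NULL)
            break;
        char *copy = malloc(tag->rawNameLength + 1);
        if (copy == NULL)
            return -1;
        memcpy(copy, tag->rawName, tag->rawNameLength);
        copy[tag->rawNameLength] = '\0';
        tag->copy = copy;
        tag->rawName = copy;
        copied++;
    }
    return copied;
}
```

In this sketch the cost of copying is paid once per element that is
still open when the buffer is discarded, so it grows with the nesting
depth at that point and not with the number of elements parsed.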
Performance benefits should be most noticeable
for large XML documents that are not deeply nested.