#144 Request for skippedEntity handler

Feature Request
closed-accepted
None
5
2002-06-13
2002-05-04
No

It would be very useful if Expat reported skipped
entities, like in the SAX2 specification.

I have identified the following situations for that:

A) External Entities are reported as skipped:
- if no external entity ref handler is set
- if the entity ref handler returns a special value
(e.g. we can define 2 as meaning: "skip this one")

B) Internal Entities are reported as skipped:
- SetDefaultHandler was called (which turns off
expansion of internal general entities)

C) Any entity reference is reported as skipped
- if no declaration is found & that is not an error
(otherwise return a well-formedness error)

Karl

Discussion

  • Karl Waclawek

    Karl Waclawek - 2002-05-09

    Logged In: YES
    user_id=290026

    I propose the following signature for the handler:

    enum XML_Skip_Reason {
      XML_SKIP_UNDEFINED,
      XML_SKIP_NOHANDLER,
      XML_SKIP_REQUESTED
    };

    typedef void (*XML_SkippedEntityHandler)
      (void *userData,
       const XML_Char *entityName,
       int is_parameter_entity,
       const XML_Char *systemId,
       const XML_Char *publicId,
       enum XML_Skip_Reason skipReason);

    where the values of skipReason have the following meanings:

    - XML_SKIP_UNDEFINED: entity was skipped because no
    declaration was found, and this was not an error
    - XML_SKIP_NOHANDLER: entity was skipped because there was
    no ExternalEntityRefHandler installed
    - XML_SKIP_REQUESTED: the ExternalEntityRefHandler returned
    a value of 2, which means the handler requested the
    entity to be skipped
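
To make the three reasons concrete, here is a minimal sketch of a handler written against the signature proposed in this comment. The enum and signature are copied from the proposal and are not taken from expat.h; `XML_Char` is a local stand-in (assuming a plain `char` build), and `skip_reason_text` and `report_skipped` are hypothetical names used only for illustration.

```c
#include <stdio.h>

/* Stand-in for Expat's XML_Char; assumes a non-UTF-16 build. */
typedef char XML_Char;

/* Copied from the proposal above; hypothetical, not part of expat.h. */
enum XML_Skip_Reason {
  XML_SKIP_UNDEFINED,
  XML_SKIP_NOHANDLER,
  XML_SKIP_REQUESTED
};

/* Hypothetical helper: maps a skip reason to a human-readable phrase. */
static const char *skip_reason_text(enum XML_Skip_Reason reason)
{
  switch (reason) {
  case XML_SKIP_UNDEFINED:
    return "no declaration found";
  case XML_SKIP_NOHANDLER:
    return "no ExternalEntityRefHandler installed";
  case XML_SKIP_REQUESTED:
    return "skip requested by ExternalEntityRefHandler";
  }
  return "unknown reason";
}

/* Hypothetical handler matching the proposed signature.
   userData is assumed to be the FILE * to report to. */
static void report_skipped(void *userData,
                           const XML_Char *entityName,
                           int is_parameter_entity,
                           const XML_Char *systemId,
                           const XML_Char *publicId,
                           enum XML_Skip_Reason skipReason)
{
  FILE *out = (FILE *)userData;
  (void)systemId;  /* unused in this sketch */
  (void)publicId;  /* unused in this sketch */
  fprintf(out, "skipped %s entity %s: %s\n",
          is_parameter_entity ? "parameter" : "general",
          entityName, skip_reason_text(skipReason));
}
```
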

    I hope this makes sense. Comments welcome!

    Karl

     
  • Fred L. Drake, Jr.

    Logged In: YES
    user_id=3066

    This feature description and proposed callback interface
    sound good to me. We might want to think about how such a
    handler would interact with (or be combined with) a handler
    so that defined general entities (including "standard" ones
    like &lt; and friends) can be reported, for applications
    that need to produce output with minimal changes. (This is
    commonly needed if the output is going to land in front of a
    human rather than another processing tool.)

    Let's target this for 1.95.4. Assigning to Karl since he's
    indicated specific interest. ;-)

     
  • Fred L. Drake, Jr.

    • milestone: --> Feature Request
    • assigned_to: nobody --> kwaclaw
     
  • Karl Waclawek

    Karl Waclawek - 2002-05-22

    Logged In: YES
    user_id=290026

    Thinking some more about it, I believe that the signature
    I proposed is overkill, and we can get away with this:

    typedef void (*XML_SkippedEntityHandler)
      (void *userData,
       const XML_Char *entityName,
       int is_parameter_entity);

    Reasons:

    In the old proposal there were two cases when PublicId
    and SystemId would have been reported:

    1) The application decided to skip the entity and passed
    a return value of 2 from the ExternalEntityRefHandler
    2) No ExternalEntityRefHandler was installed

    I think neither of them needs a skippedEntityHandler,
    because:

    For 1) It is of no particular usefulness if the application
    code in the ExternalEntityRefHandler delegates the
    skip-notification back to Expat. This can be done directly
    from the handler at least as easily and efficiently, and
    Expat itself does not need this information, since the
    very fact of nothing being parsed is all that is important
    to it.

    For 2) If no ExternalEntityRefHandler is installed, then why
    install a skippedEntityHandler? They would have
    essentially the same signature, and in the end that would
    mean the same as in 1) - telling Expat we want to skip the
    entity. Again, that can already be easily achieved with the
    existing API.

    So, which events then remain that would require
    a skippedEntityHandler? Only when entity refs are
    encountered for which no declaration was read, *and*
    when this is not an error.
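
A minimal sketch of what an application-side handler for the simplified signature might look like. `XML_Char` is a local stand-in (assuming a plain `char` build), and `skipped_log`, `MAX_SKIPPED` and `record_skipped` are hypothetical names. The name is copied because the sketch does not assume the pointer stays valid after the callback returns.

```c
#include <string.h>

/* Stand-in for Expat's XML_Char; assumes a non-UTF-16 build. */
typedef char XML_Char;

#define MAX_SKIPPED 16  /* hypothetical capacity for this sketch */

struct skipped_entry {
  char name[64];
  int is_parameter_entity;
};

struct skipped_log {
  size_t count;
  struct skipped_entry entries[MAX_SKIPPED];
};

/* Hypothetical handler with the simplified three-argument signature.
   userData is assumed to point at a struct skipped_log. */
static void record_skipped(void *userData,
                           const XML_Char *entityName,
                           int is_parameter_entity)
{
  struct skipped_log *log = (struct skipped_log *)userData;
  struct skipped_entry *entry;

  if (log->count >= MAX_SKIPPED)
    return;  /* log is full; later entries are dropped */

  entry = &log->entries[log->count++];
  /* Copy the name, truncating to the buffer size if necessary. */
  strncpy(entry->name, entityName, sizeof entry->name - 1);
  entry->name[sizeof entry->name - 1] = '\0';
  entry->is_parameter_entity = is_parameter_entity;
}
```

In Expat releases that include this feature, a handler of this shape would be registered with XML_SetSkippedEntityHandler, with the log passed through XML_SetUserData.
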

    Now, as far as Fred's suggestion of combining this
    with some InternalEntityRefHandler is concerned:

    In that case we should also report the entity value.
    Would we not be mixing two different problems here?

    Karl

     
  • Karl Waclawek

    Karl Waclawek - 2002-05-22

    Logged In: YES
    user_id=290026

    I forgot case B) from the initial request.
    This would, of course, still be valid,
    but would also not require more than
    the simple callback interface I proposed.

    Karl

     
  • Karl Waclawek

    Karl Waclawek - 2002-05-24

    Logged In: YES
    user_id=290026

    Have a look at patch # 559910, where the latest, simplified
    proposal is implemented.

    Karl

     
  • Rolf Ade

    Rolf Ade - 2002-06-08

    Logged In: YES
    user_id=13222

    The longer I think about it, the more I believe it was a
    mistake to change the reporting of undeclared entities
    along the lines described in bug 544679 without also
    adding a skippedEntity handler.

    (I already mentioned my objection in the discussion of
    544679, but maybe I wasn't loud enough.)

    Please consider adding the skippedEntity handler, as
    described by Karl.

    Without a skippedEntity handler, it isn't possible to
    detect a mistyped internal entity if your document has an
    external subset or external parameter entities, even if you
    parse all external entities.
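
As a rough illustration of this use case, an application could treat any skipped general entity as a probable typo and fail after parsing. This is only a sketch under assumptions: `XML_Char` is a local stand-in (assuming a plain `char` build), `typo_check` and `flag_undeclared` are hypothetical names, and ignoring parameter entities is a choice made for this example rather than something stated in the thread.

```c
#include <string.h>

/* Stand-in for Expat's XML_Char; assumes a non-UTF-16 build. */
typedef char XML_Char;

/* Hypothetical application state: remembers the first suspicious name. */
struct typo_check {
  int failed;
  char first_name[64];
};

/* Hypothetical handler with the simplified three-argument signature.
   Flags the first skipped general entity; parameter entities are
   ignored here (an assumption of this sketch). */
static void flag_undeclared(void *userData,
                            const XML_Char *entityName,
                            int is_parameter_entity)
{
  struct typo_check *check = (struct typo_check *)userData;

  if (is_parameter_entity || check->failed)
    return;

  check->failed = 1;
  strncpy(check->first_name, entityName, sizeof check->first_name - 1);
  check->first_name[sizeof check->first_name - 1] = '\0';
}
```

After the parse finishes, the application would check `failed` and report `first_name` as an undeclared (possibly mistyped) entity.
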

    This may break existing applications (well, it breaks at
    least one of mine), and should have been mentioned in the
    announcement (even if the new behaviour is more correct,
    according to the _letters_ of the XML rec.)

    And I think it was a bad idea to fix 544679 without adding
    a skippedEntity handler at the same time.

    rolf

     
  • Karl Waclawek

    Karl Waclawek - 2002-06-09

    Logged In: YES
    user_id=290026

    Rolf,

    stop twisting my arm - I checked the patch in. :-)
    It may be necessary to make changes to it
    when we add the InternalEntityRefHandler.

    Karl

     
  • Fred L. Drake, Jr.

    Logged In: YES
    user_id=3066

    Closed since this has already been checked in. If it needs
    tweaking, that's either a bug report or a request for more
    performance or whatever (a feature request). Since this
    doesn't seem like a performance-relevant feature, I'm not
    going to expect the latter.

     
  • Fred L. Drake, Jr.

    • status: open --> closed-accepted
     
