XML Diff / Discussion / Open Discussion: XML parser

Sean McCombe - 2000-10-04

We will need to source an XML parser (there must be one out there somewhere) so that we can parse the HTML documents.

This means we must be able to convert the retrieved/stored HTML documents into syntactically correct XML documents that follow an HTML schema. We would need to write a design for this conversion, including heuristics for the cases where the change required to the HTML document is not obvious (for example, where it is unclear where a missing HTML end tag should go).
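
As a rough illustration of the kind of end-tag heuristic meant here, the sketch below keeps a stack of open elements and closes them where HTML leaves the end tag out. Everything in it is an assumption made for illustration: the class name `TagBalancer`, the short lists of void and implicitly closed elements, and the regex-based tag scan, which ignores comments, scripts and attribute values containing `>`. A real design would need a proper tokenizer and a much fuller rule set.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
public class TagBalancer {
    // Elements that never have an end tag in HTML (illustrative subset).
    private static final Set<String> VOID =
            Set.of("br", "hr", "img", "input", "meta", "link");
    // Heuristic (illustrative subset): a new <p> or <li> closes an open one.
    private static final Set<String> IMPLICIT_CLOSE = Set.of("p", "li");
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*)>");

    public static String balance(String html) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>(); // stack of open element names
        Matcher m = TAG.matcher(html);
        int pos = 0;
        while (m.find()) {
            out.append(html, pos, m.start()); // copy the text between tags
            pos = m.end();
            boolean closing = !m.group(1).isEmpty();
            String name = m.group(2).toLowerCase();
            String attrs = m.group(3);
            if (closing) {
                if (!open.contains(name)) {
                    continue; // stray end tag with no matching start: drop it
                }
                // Close everything up to and including the matching element.
                String top;
                do {
                    top = open.pop();
                    out.append("</").append(top).append('>');
                } while (!top.equals(name));
            } else if (VOID.contains(name) || attrs.endsWith("/")) {
                // Emit void or self-closed elements in XML empty-element form.
                String a = attrs.endsWith("/")
                        ? attrs.substring(0, attrs.length() - 1) : attrs;
                out.append('<').append(name)
                   .append(a.replaceAll("\\s+$", "")).append("/>");
            } else {
                if (IMPLICIT_CLOSE.contains(name) && name.equals(open.peek())) {
                    out.append("</").append(open.pop()).append('>');
                }
                open.push(name);
                out.append('<').append(name).append(attrs).append('>');
            }
        }
        out.append(html.substring(pos)); // trailing text after the last tag
        while (!open.isEmpty()) {        // close whatever is still open
            out.append("</").append(open.pop()).append('>');
        }
        return out.toString();
    }
}
```

For example, `TagBalancer.balance("<p>one<p>two<br>")` returns `<p>one</p><p>two<br/></p>`. The non-obvious cases mentioned above are exactly the ones where a fixed rule like "a new `<p>` closes the previous one" may guess wrong, which is why the rules would need to be written down in the design.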

- Anonymous - 2000-10-05
  
  I think we should use a SAX or DOM parser. I know of two that are
  free and written in Java:
  
  1. Apache Xerces: see http://xml.apache.org/
  (SAX 2, DOM 1 and DOM 2 (beta))
  
  2. James Clark's XP: see http://www.jclark.com/ (SAX only?)
  
  Isn't there an Oracle one too?
  
  martin
  
  - Sean McCombe - 2000-10-09
    
    I like the look of Xerces at xml.apache.org. It appears to be fairly complete and sophisticated, and of course it comes from the Apache group, so it can't be too bad to use.
    
    I'm all for adopting it as our standard XML parsing component, and by the looks of it, so are you, Martin. I'd say Chris would be too if he weren't so busy at Uni right now :)
    
    I'll add a link to the home page just now.
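
To make the SAX versus DOM choice discussed in this thread concrete, here is a minimal sketch that counts the elements of a small document both ways. It is written against the standard JAXP interfaces (`javax.xml.parsers`), which are an assumption on my part rather than anything the thread settled on; a standalone Xerces build can be used behind the same interfaces. The class name `ParserDemo` and both method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Xerces or of this project.
public class ParserDemo {

    /** SAX: stream through the document and count start-element events. */
    public static int countElementsSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    /** DOM: build the whole tree in memory, then query it. */
    public static int countElementsDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("*").getLength();
    }
}
```

Both methods throw on input that is not well-formed XML, which is why the HTML documents would have to be converted first. SAX never holds the full document in memory, while DOM gives a tree that can be walked and compared, which may suit a diff tool better.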
    