[Ebiness-crawler] Parser: Use lex in Linux or UNIX
Status: Alpha
Brought to you by:
o3dozone
From: Mari K. <gmk...@ya...> - 2001-05-31 13:19:47
ebi...@li... wrote:

> Today's Topics:
>
>    1. Re: decent Html parser? (Allan Reffson Granja Lima)
>
> Message: 1
> Date: Mon, 28 May 2001 11:13:12 -0300 (BRT)
> From: Allan Reffson Granja Lima
> To: Mike Davis
> cc: ebi...@li...
> Subject: Re: [Ebiness-crawler] decent Html parser?
>
> Can it be a template in C++ to recognize a "regular expression" like
> '<a href="">'? Looking for "regex++"...
>
> --------------------------------------------
> Allan Reffson Granja Lima (al...@li...)
> ICQ: 34004301
> Master's in Computer Science
>
> On Sat, 26 May 2001, Mike Davis wrote:
>
> > Hi guys,
> >
> > Well, I've already written a parser that simply extracts '<a href="">'
> > tags from Html for the crawler, but was wondering if anyone knows of a
> > good, stable parser that will generically parse Html? I think this would
> > be far superior.
> >
> > I think the 'expat' library (hosted on SF - http://expat.sourceforge.net/)
> > is the standard for Xml parsing, so we can use that for any Xml/xHtml we
> > come across.
> >
> > Mike

It is easy to write parsers in lex, the lexical analyser generator that comes with Linux and UNIX. Development is quick, and the output is a C program that meets the parser requirements. I can do this if you tell me exactly what has to be parsed.

mari.

"When you do nothing, nothing works."

_______________________________________________
Ebiness-crawler mailing list
Ebi...@li...
http://lists.sourceforge.net/lists/listinfo/ebiness-crawler
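
For readers following the thread: lex turns pattern rules into a C scanner. The sketch below is not lex output and not code from the project. It is a small hand-written C routine, under stated assumptions, showing roughly the kind of matching such a scanner would perform for the link extraction Mike describes. The function name extract_hrefs, the HREF_MAX limit, and the choice to accept only quoted attribute values are all hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define HREF_MAX 256  /* assumed maximum stored URL length, including NUL */

/* Returns 1 if the four characters at p spell "href" in any letter case. */
static int at_href(const char *p)
{
    static const char kw[] = "href";
    for (size_t i = 0; i < 4; i++) {
        if (tolower((unsigned char)p[i]) != kw[i])
            return 0;  /* a terminating NUL also fails here, so no overrun */
    }
    return 1;
}

/* Advances past any whitespace and returns the new position. */
static const char *skip_space(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

/* Hypothetical helper: copies the value of each quoted href attribute in
   `html` into `out`, storing at most `max_links` values.
   Returns the number of values stored. */
size_t extract_hrefs(const char *html, char out[][HREF_MAX], size_t max_links)
{
    assert(html != NULL && out != NULL);

    size_t count = 0;
    const char *p = html;

    while (*p != '\0' && count < max_links) {
        if (!at_href(p)) {
            p++;
            continue;
        }
        p = skip_space(p + 4);          /* step over "href" */
        if (*p != '=')
            continue;                   /* the word href without a value */
        p = skip_space(p + 1);
        if (*p != '"' && *p != '\'')
            continue;                   /* unquoted values are ignored here */

        char quote = *p++;
        const char *end = strchr(p, quote);
        if (end == NULL)
            break;                      /* unterminated value: stop scanning */

        size_t len = (size_t)(end - p);
        if (len >= HREF_MAX)
            len = HREF_MAX - 1;         /* truncate over-long values */
        memcpy(out[count], p, len);
        out[count][len] = '\0';
        count++;
        p = end + 1;
    }
    return count;
}
```

A caller would declare something like `char links[16][HREF_MAX];` and pass it with the page text. A real lex specification would express the same match as a pattern rule and let the tool generate the scanning loop, which is the convenience Mari is pointing at.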