[Htmlparser-developer] RE: question about using HTMLParser in Apache JMeter
From: Derrick O. <der...@au...> - 2003-09-29 20:00:13
Peter,

Yes, you have permission. In fact we would be honoured and endeavor to assist you in any way necessary.

It's funny you should mention images and DOM. The latest versions of htmlparser include an example application that does a very similar task: getting the images behind thumbnails (see lib/thumbelina.jar or package org.htmlparser.lexerapplications.thumbelina). It uses the low-level Lexer package to avoid having to form the entire document model. I would check to see if something like this meets your needs.

If you need more than that (i.e. table parsing, balancing end tags, etc.) you'll have to go with the full parser. Unfortunately, the Lexer hasn't been completely integrated into the parser yet and the current CVS snapshot is a bit of a mess. With a bit of patience, this too will come to pass.

As far as performance comparisons go, I've only heard anecdotal evidence that htmlparser is faster. I suppose this could be an area of investigation.

Derrick

-----Original Message-----
From: peter lin [mailto:jmw...@ya...]
Sent: September 29, 2003 8:53 AM
To: Derrick Oswald
Subject: question about using HTMLParser in Apache JMeter

Hi Derrick,

I am a committer on Apache's Jakarta JMeter project. I was wondering if we can get permission to use HTMLParser. Since the Apache foundation can't use LGPL code without permission, I'm hoping you're open to the idea.

Here is a quick description of how I want to use it. JMeter currently is a load testing tool for HTTP, FTP, JDBC and Java. The HTTP plugin uses JTidy to parse the HTML and extract the images for download.

Test plans with more than 20 clients perform poorly because of the high cost of DOM. JTidy generates DOM documents. One trick is to turn off image downloading in JMeter, but that doesn't solve the real problem. I want to replace JTidy with HTMLParser. I haven't done any performance comparison yet, but I'm guessing it should use less memory.
Has anyone done a performance comparison between JTidy and HTMLParser?

peter lin