Re: [Htmlparser-user] Hints on how to change image tag locations and write out document
From: Raghavender S. <kin...@ho...> - 2002-05-10 19:32:47
Hi Somik,

I mentioned the out-of-memory error earlier. Last time, on every iteration of the for loop I was appending the whole page to my string buffer, and that caused the out-of-memory error. I have removed that, and it worked fine until yesterday. Now I am seeing the error again. This time it has nothing to do with the string buffer, and it looks like a real problem. I can send you the main class and the yahoo.txt I have, so you can try running it.

Thanks,
Raghav

>From: "Somik Raha" <so...@ya...>
>Reply-To: htm...@li...
>To: <htm...@li...>
>Subject: Re: [Htmlparser-user] Hints on how to change image tag locations andwriteoutdocument
>Date: Fri, 10 May 2002 00:43:19 +0900
>
>Hi Raghav,
>    On analyzing yahoo.txt, I found that you have incorrect html. There is a script tag that has not been closed. So naturally the script scanner goes bonkers. Rename the extension to .html, and open this file in IE, and you will find that IE also cant handle this.
>    I verified from www.yahoo.com, and found that they do have the correct </script> tag provided. So I guess your yahoo.txt file is faulty.
>
>Regards,
>Somik
>  ----- Original Message -----
>  From: Raghavender Srimantula
>  To: htm...@li...
>  Sent: Thursday, May 09, 2002 4:53 AM
>  Subject: Re: [Htmlparser-user] Hints on how to change image tag locations andwriteoutdocument
>
>
>  Hi Somik,
>  I was using the 1.1 version of htmlparser. I save the www.yahoo.com content in a flat file yahoo.txt. and I run the parser against this. throws a nullpointerexception in HTMLScriptScanner. this seems to be a new addition for 1.1. I will send the stacktrace, the main program and the yahoo.txt. actually I cannot send the stacktrace. I made some changes and the line numbers dont match. but if you run this program you would see the nullpointerexception.
>  Thanks,
>  Raghav
>
>
>  >From: "Somik Raha" <so...@ya...>
>  >Reply-To: htm...@li...
>  >To: <htm...@li...>
>  >Subject: Re: [Htmlparser-user] Hints on how to change image tag locations and writeoutdocument
>  >Date: Mon, 6 May 2002 13:59:11 +0900
>  >
>  >Hi Raghav,
>  >    I sent another mail sometime back to you -
>  >
>  >"HTMLLinkTag.linkData() - this gives you an enumeration - and in the enumeration will be your HTMLImageTag."
>  >HTMLNode node;
>  >HTMLImageTag imageTag;
>  >for (Enumeration e = linkTag.linkData(); e.hasMoreElements();) {
>  >    node = (HTMLNode)e.nextElement();
>  >    if (node instanceof HTMLImageTag) {
>  >        imageTag = (HTMLImageTag)node;
>  >        // your code here
>  >    }
>  >}
>  >
>  >Regards,
>  >Somik
>  >----- Original Message -----
>  >From: "Raghavender Srimantula" <kin...@ho...>
>  >To: <htm...@li...>
>  >Sent: Monday, May 06, 2002 10:43 AM
>  >Subject: Re: [Htmlparser-user] Hints on how to change image tag locations and writeoutdocument
>  >
>  >
>  > > Hi Somik,
>  > > this question is regarding "not all images are being retrieved". I mean the images under <a tag. I did try to open the attachment you sent me. I could not find anything. but seeing the previous mails I could read that it is not a bug. but still if I do want to retrieve all the images how do I do it.
>  > > Thanks,
>  > > Raghav
>  > >
>  > >
>  > > >From: "Somik Raha" <so...@ya...>
>  > > >Reply-To: htm...@li...
>  > > >To: <htm...@li...>
>  > > >Subject: Re: [Htmlparser-user] Hints on how to change image tag locations and write outdocument
>  > > >Date: Tue, 30 Apr 2002 11:37:26 +0900
>  > > >
>  > > >Hi Raghav,
>  > > >    Ah - this was a question by Annette Doyle (titled "Not all image tags are returned"). I am attaching my reply.
>  > > >
>  > > >Regards
>  > > >Somik
>  > > >
>  > > >----- Original Message -----
>  > > >From: "Raghavender Srimantula" <kin...@ho...>
>  > > >To: <htm...@li...>
>  > > >Sent: Tuesday, April 30, 2002 11:16 AM
>  > > >Subject: Re: [Htmlparser-user] Hints on how to change image tag locations and write outdocument
>  > > >
>  > > >
>  > > > > hi Somik,
>  > > > > I found one more interesting thing here. when I am trying to get all the images the image scanner would give me images
>  > > > > <img src="http://us.i1.yimg.com/us.yimg.com/i/mntl/sh/mom02/title4.gif" width=296 height=27 border=0 usemap=#tm>
>  > > > > so if I do a imagetag.getImageLocation(), I would get
>  > > > > http://us.i1.yimg.com/us.yimg.com/i/mntl/sh/mom02/title4.gif
>  > > > >
>  > > > > but is the html content is like this
>  > > > > <a href=s/6006><img src=http://us.i1.yimg.com/us.yimg.com/i/us/hj/hjys.gif border=0 width=70 height=22></a>
>  > > > > which starts with <a and ends with </a>, then the image scanner will not give me http://us.i1.yimg.com/us.yimg.com/i/us/hj/hjys.gif when I do a imagetag.getImageLocation(). this is not even classified as an ImageTag. this is classified as LinkTag. how to get this image.
>  > > > >
>  > > > > the above content is from www.yahoo.com. on the netscape browser if you goto view-->pageinfo, you will see a bunch of images.
>  > > > > but when you run the htmlparser you can get only one image.
>  > > > >
>  > > > > Thanks,
>  > > > > Raghav
>  > > > >
>  > > > >
>  > > > > >From: "Somik Raha" <so...@ya...>
>  > > > > >Reply-To: htm...@li...
>  > > > > >To: <htm...@li...>
>  > > > > >Subject: Re: [Htmlparser-user] Hints on how to change image tag locations and write outdocument
>  > > > > >Date: Tue, 30 Apr 2002 09:15:38 +0900
>  > > > > >
>  > > > > >Can you describe your application ? Was it parsing a single page when the problem occurred ?
>  > > > > >
>  > > > > >Regards,
>  > > > > >Somik
>  > > > > >----- Original Message -----
>  > > > > >From: "Raghavender Srimantula" <kin...@ho...>
>  > > > > >To: <htm...@li...>
>  > > > > >Cc: <htm...@li...>
>  > > > > >Sent: Tuesday, April 30, 2002 8:36 AM
>  > > > > >Subject: Re: [Htmlparser-user] Hints on how to change image tag locations and write outdocument
>  > > > > >
>  > > > > >
>  > > > > > > Hi Somik,
>  > > > > > > I encountered a strange problem today. while I was running htmlparser...I got a java.lang.OutOfMemoryError. seems that lot of objects are being allocated. where exactly is this happening. I mean could you give me an idea where or in which file the potential problem could be.
>  > > > > > > Raghav
>  > > > > > >
>  > > > > > >
>  > > > > > > >From: "Somik Raha" <so...@ya...>
>  > > > > > > >Reply-To: htm...@li...
>  > > > > > > >To: <htm...@li...>
>  > > > > > > >CC: <htm...@li...>
>  > > > > > > >Subject: Re: [Htmlparser-user] Hints on how to change image tag locations and write out document
>  > > > > > > >Date: Sat, 27 Apr 2002 18:22:34 +0900
>  > > > > > > >
>  > > > > > > >Hi Annette,
>  > > > > > > >    Pls find attached a program to get you started. This program will do what you want - you will need to modify the construct that checks for the image tag - and replace it with the location of your choice.
>  > > > > > > >    Also - I found one bug thanks to this requirement - image tags params were not being correctly put in. Though it needs a deeper look, I have done a quick fix for now, and all test cases are passing (with one test case in HTMLImageScannerTest trapping this bug).
>  > > > > > > >    Please check out the latest html parser source code from CVS.
>  > > > > > > >
>  > > > > > > >Regards,
>  > > > > > > >Somik
>  > > > > > > >
>  > > > > > > >  ----- Original Message -----
>  > > > > > > >  From: Doyle, Annette
>  > > > > > > >  To: htm...@li...
>  > > > > > > >  Sent: Friday, April 26, 2002 10:08 PM
>  > > > > > > >  Subject: [Htmlparser-user] Hints on how to change image tag locations and write out document
>  > > > > > > >
>  > > > > > > >
>  > > > > > > >  Could you please give me some hints as how to change only image tag locations and then, (or at the same time) write out the html document to file (with new image tag locations)?
>  > > > > > > >
>  > > > > > > >  Thanks-
>  > > > > > > >
>  > > > > > > >  Annette Doyle
>  > > > > > > >
>  > > > > > > ><< ImageTagRetriever.java >>
>  > > > > > >
>  > > > > > > _______________________________________________
>  > > > > > > Htmlparser-user mailing list
>  > > > > > > Htm...@li...
>  > > > > > > https://lists.sourceforge.net/lists/listinfo/htmlparser-user
>  > > ><< [Htmlparser-developer]Re_[Htmlparser-user]Notallimagetagsarereturned[NotaBug].eml >>
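
A note on the unclosed-script diagnosis quoted above: a saved page can be pre-checked for a <script> tag that is never closed before it is handed to the parser. The sketch below is a plain-Java heuristic and does not use the htmlparser API. The class name ScriptTagCheck and the method hasUnclosedScript are made up for illustration, and a script body that contains the text "<script" or "</script" inside a string literal can fool it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of htmlparser: reports whether the input
// ends while still inside a <script> element.
final class ScriptTagCheck {

    // Matches an opening <script ...> or closing </script> tag, ignoring case.
    // Group 1 is "/" for a closing tag and empty for an opening tag.
    private static final Pattern SCRIPT_TAG =
            Pattern.compile("<(/?)script\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    private ScriptTagCheck() {
    }

    static boolean hasUnclosedScript(String html) {
        boolean insideScript = false;
        Matcher m = SCRIPT_TAG.matcher(html);
        while (m.find()) {
            boolean isClosingTag = !m.group(1).isEmpty();
            if (isClosingTag) {
                // A stray </script> outside a script is simply ignored.
                insideScript = false;
            } else {
                // An opening tag seen while already inside a script is
                // treated as script text, so the state stays "inside".
                insideScript = true;
            }
        }
        return insideScript;
    }
}
```

To try it on a saved file such as yahoo.txt, read the file into a String first (for example with java.nio.file.Files.readAllBytes on Java 7 or later) and pass it to ScriptTagCheck.hasUnclosedScript. A result of true only suggests where to look; it is not proof that the parser will fail.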