From: Robert R. <ro...@fr...> - 2002-03-16 08:06:17
Hello, I'm attaching two files that I use to retrieve pages. First I fetch them to local disk with wget under Windows (this also works on Linux), and then you can parse them with other tools. The first file is the .pl script; the second is the input file for wget listing which sites to download and mirror locally. Hope this helps a little.

Regards,

Robert Rozman

----- Original Message -----
From: "Eric Azinger" <zi...@pa...>
Cc: "Mr. House" <mis...@li...>
Sent: Saturday, March 16, 2002 4:12 AM
Subject: [misterhouse-users] Grabbing pages off the net

> Does anyone know if there's already a script that pulls down a page and strips
> unwanted data?
>
> I'm seeing a lot of great content I'd like to stick into my own UI, and I would love
> to build a script that I can give the URL of a page: it would download the page, strip what
> I don't want, and then write what I do want to a file I insert via SSI later.
>
> There are weather and traffic related sites that have amazing maps that might change
> every day, but are nested between predictable comments. I'd like to build a way to
> strip up to the desired content, then a method to strip after it, and then write
> what I want to an include file.
>
> Do you think it's possible to do something like this in Perl, and on an NT4 box?
>
> Thanks,
>
> Eric
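For what it's worth, the wget side of this boils down to an input file with one URL per line plus a mirroring invocation. A hypothetical example (the file name, URLs, and options below are only an illustration, not the contents of the attached files):

http://www.example.com/weather/
http://www.example.com/traffic/

saved as sites.txt, and then:

wget -i sites.txt --mirror --convert-links

--mirror turns on recursion and timestamping, and --convert-links rewrites the links so the local copy can be browsed offline; both options are available in the Windows build of wget as well.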
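As for doing it in Perl on NT4: yes, this is straightforward with LWP and a regular expression keyed to those predictable comments. Here is a minimal sketch (not the attached script); the URL, output file name, and marker strings are made-up placeholders, so treat it as a starting point rather than a finished tool.

#!/usr/bin/perl
# Sketch: fetch a page, keep only the part between two predictable HTML
# comments, and write it to a file that can be pulled in via SSI later.
# The URL, file name, and marker strings are placeholders -- adjust to taste.
use strict;
use warnings;
use LWP::Simple;

my $url = 'http://www.example.com/trafficmap.html';   # page to grab (placeholder)
my $out = 'trafficmap.inc';                            # include file to write (placeholder)

my $html = get($url);
defined $html or die "Could not fetch $url\n";

# Keep only the chunk between the two marker comments (/s lets . span newlines).
if ($html =~ /<!--\s*BEGIN MAP\s*-->(.*?)<!--\s*END MAP\s*-->/s) {
    open my $fh, '>', $out or die "Could not write $out: $!\n";
    print $fh $1;
    close $fh;
} else {
    warn "Marker comments not found in $url\n";
}

The generated .inc file can then be referenced from a normal SSI include directive, and the script can be run from the Windows scheduler (or from within MisterHouse) as often as the source page changes.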