I'm new to this list, and I'm stuck on a project I have to do for class. I have to write a program that reads a list of web addresses, connects to each one, and gets the title of the web page and its file size. The program also has to decide whether each web address is in the correct format. I got it to read the first web address and get the file size, but I don't know how to make it go through the whole list.

This is the official assignment:

Web Page Tester and Title Locator

A file is available with a listing of web page URLs. This file is titled "addresses.txt" (Doc Sharing).
Write a program to read these URLs, load the web page associated with each URL, determine the title of
the web page and write the title of the web page to a data file, "titles.txt".
Also, create an HTML output file ("output.html") that shows a listing of each URL and the associated web page title. Also, list the file size of the valid HTML pages in kilobytes (kB).
The following is an example of the format:


Web Page Title Detector

URL                     Web Page Title             File Size (kB)

xxxxx.xxx.xxx           xxxxxxxxxxx                128.8
xxxxx.xxx.xxx           xxxxxxxxxxx                33.2
xx.x.xx                 Invalid URL                0.0
Etc...
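
As an illustration of the title and file-size columns above, here is a minimal sketch in plain Python syntax. The names `extract_title` and `size_in_kb` are made up for this example, the sketch assumes the page has already been downloaded into a text string, and it assumes 1 kB means 1024 bytes (the assignment does not say which definition to use).

```python
import re

# Assumption: 1 kB = 1024 bytes. Use 1000.0 if the class defines kB that way.
BYTES_PER_KB = 1024.0

# Matches the contents of the first <title>...</title> pair, ignoring case
# and allowing the title to span several lines.
TITLE_PATTERN = re.compile(r"<title\b[^>]*>(.*?)</title>",
                           re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the page title found in html, or "" if there is none."""
    match = TITLE_PATTERN.search(html)
    if match is None:
        return ""
    # Collapse newlines and repeated spaces inside the title.
    return " ".join(match.group(1).split())


def size_in_kb(html):
    """Return the length of html in kB, rounded to one decimal place.

    This counts characters, which equals bytes only for plain ASCII pages.
    """
    return round(len(html) / BYTES_PER_KB, 1)
```

A regular expression is only one option here; an HTML parser would be more robust against unusual markup.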


Note: Due to transfer corruption, several web page URLs in "addresses.txt" may be invalid, so each address should be
checked for at least the following conditions:
(A) Incorrect protocol type (must be http://)
(B) Malformed URL address (not of the format XXX.XXXX.XXX)
(C) A domain other than .com, .net, .gov, .edu, or .org
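
Conditions (A) through (C) could be checked along the following lines. This is only a sketch: the function name `check_url` and the reason strings are invented for the example, and it reads "XXX.XXXX.XXX" as "exactly three non-empty parts separated by dots", which is one possible interpretation of the assignment.

```python
import re

# The five domains the assignment allows.
ALLOWED_DOMAINS = ("com", "net", "gov", "edu", "org")

# Assumption: each dot-separated part may contain letters, digits and hyphens.
PART_PATTERN = re.compile(r"^[A-Za-z0-9-]+$")


def check_url(url):
    """Return None if url passes checks (A)-(C), else a short reason string."""
    url = url.strip()

    # (A) The protocol must be exactly http://
    prefix = "http://"
    if not url.startswith(prefix):
        return "incorrect protocol"

    # Keep only the host name, dropping any path after it.
    host = url[len(prefix):].split("/")[0]

    # (B) The host must have exactly three non-empty parts: XXX.XXXX.XXX
    parts = host.split(".")
    if len(parts) != 3:
        return "malformed address"
    for part in parts:
        if not PART_PATTERN.match(part):
            return "malformed address"

    # (C) The last part must be one of the allowed domains.
    if parts[2].lower() not in ALLOWED_DOMAINS:
        return "unsupported domain"

    return None
```

With this reading, `http://www.google.com` passes, `htdp://www.google.com` fails check (A), `http://..msn.com` fails check (B), and `http://www.google.del` fails check (C).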

NOTES:
(1) Also, use one API module or function from one of the available Jython resources which has not been presented in class.
(2) This project is to be completed with pair programming (remember the precepts of pair programming).
(3) Make sure that you use functional decomposition. Which tasks should be broken down and done within a function? Now, build, test, and use that function!

Turn In:
(a) A printout of the program properly commented
(b) A copy of the output file  "titles.txt"; print your name, date and problem number at the top of the output file 
(c) A copy of the output file  "output.html"
(d) Zip all three files and place in the Drop Box for HW-8

And this is the web page list:

http://www.google.com
htdp://www.google.com
http://www.google.del
http://www.yahoo.con
http://www.yahoo.drt.con
http://www.yahoo.com.dx
http://www.msn.com
http://..msn.com
http://www.msn.hvt
http://www.loc.gov
http://www.gov.loc
ftp://www.loc.gov
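
For reference, going through every address in a list like the one above usually comes down to a `for` loop over the lines of the file. The sketch below is an assumption about how that might be structured, not part of the assignment: `process_addresses` and `handle_one` are hypothetical names, and `handle_one` stands for whatever function already works for a single address.

```python
def process_addresses(lines, handle_one):
    """Call handle_one(url) for each non-blank line and collect the results.

    lines can be any sequence of strings, including an open file, since
    looping over a file yields one line at a time.
    """
    results = []
    for line in lines:
        url = line.strip()      # drop the trailing newline and stray spaces
        if not url:
            continue            # skip blank lines between addresses
        results.append((url, handle_one(url)))
    return results
```

It could be called as `process_addresses(open("addresses.txt"), handle_one)`. Each entry in the returned list pairs a URL with the result for that URL, which keeps the output rows in the same order as the input file.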

I am not asking anyone to write my code, but I would appreciate some help heading in the right direction.

Thank you,

Justin Van Schuyver