HTML saving

Help
angelo78
2005-02-19
2013-04-27
  • angelo78

    angelo78 - 2005-02-19

    Hi,
    I want to extract the links from a specific URL and
    save its HTML to a file.
    I can't find a way to do both while downloading the HTML only once.
    Can someone please tell me how to do that?
    I want the program to download the page only once,
    then save the HTML and extract the links
    (with the prefix of the URL I downloaded it from).
    Thanks

     
    • Derrick Oswald

      Derrick Oswald - 2005-02-20

      There is an example of what you want to do in org.htmlparser.parserapplications.SiteCapturer.

      That example uses custom tags, but the principle of printing out a list of nodes with toHtml() is the same in any case:

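      import java.io.FileNotFoundException;
      import java.io.FileOutputStream;
      import java.io.PrintWriter;
      import org.htmlparser.util.NodeList;

      // get a node list somehow, either iterating or with a filter
      NodeList list = parser.parse (null);
      // try-with-resources closes the writer even if printing a node fails
      try (PrintWriter out = new PrintWriter (new FileOutputStream (file)))
      {
          // write each top-level node back out as HTML
          for (int i = 0; i < list.size (); i++)
              out.print (list.elementAt (i).toHtml ());
      }
      catch (FileNotFoundException fnfe)
      {
          fnfe.printStackTrace ();
      }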
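
For comparison, here is a rough library-free sketch of the same "download once, then save and extract" idea. It does not use the HTML Parser API at all: the class name SinglePassCapture and its methods are made up for illustration, the page is assumed to be UTF-8, and the regular expression is a naive stand-in for real link extraction. It assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of org.htmlparser.
class SinglePassCapture {
    // Naive pattern for href="..." or href='...'; a real HTML parser is more robust.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);

    /** Downloads the page once and returns its body (assumed to be UTF-8). */
    static String fetch(String url) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Writes the already-downloaded HTML to a file; no second request is made. */
    static void save(String html, Path file) throws IOException {
        Files.write(file, html.getBytes(StandardCharsets.UTF_8));
    }

    /** Collects the raw href values from the same in-memory HTML. */
    static List<String> extractHrefs(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }
}
```

Calling fetch once and passing the returned string to both save and extractHrefs keeps the network traffic to a single request. The extracted values are returned as written in the page, so relative links still need to be resolved against the page URL.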

       
    • angelo78

      angelo78 - 2005-02-21

      Can I parse a file but give the parser or a linkTag
      the URL prefix I want?
      Thanks

       
      • Derrick Oswald

        Derrick Oswald - 2005-02-21

        I believe the SiteCapturer example program does what you want, except the prefix is set for local storage. You should be able to modify this.
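
If modifying SiteCapturer is more than you need, the prefixing step on its own can be sketched with the standard java.net.URI class. This is only an illustration under assumptions: LinkPrefixer and resolveAll are made-up names, not HTML Parser classes, and the links are assumed to be the raw href strings already extracted from the saved file.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of org.htmlparser.
class LinkPrefixer {
    /**
     * Resolves each link against the chosen base URL.
     * Relative links gain the base as a prefix; absolute links are returned unchanged.
     * URI.create throws IllegalArgumentException for malformed input such as embedded spaces.
     */
    static List<String> resolveAll(String baseUrl, List<String> links) {
        URI base = URI.create(baseUrl);
        List<String> resolved = new ArrayList<>();
        for (String link : links) {
            resolved.add(base.resolve(link.trim()).toString());
        }
        return resolved;
    }
}
```

With a base of http://example.com/docs/index.html, a link such as img/a.png resolves to http://example.com/docs/img/a.png, while /top.html resolves to http://example.com/top.html.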

         
