
Manipulating the Nodes

Help
Ryan Smith
2005-12-05
2013-04-27
  • Ryan Smith

    Ryan Smith - 2005-12-05

    I'm trying to parse HTML and change certain values in it, such as image, form, and link attributes, but there doesn't appear to be a clear way to do this. Ideally, I'd like to parse the HTML into a Node, search that Node for the things I need to change, and then convert it back to a String.

    Anybody doing this? Thanks.

     
    • Derrick Oswald

      Derrick Oswald - 2005-12-06

      Most tags have such functionality; for example, ImageTag has setImageURL(String url). After modifying the tag, call toHtml() to convert it back to a string.

       
    • Ryan Smith

      Ryan Smith - 2005-12-06

      Thanks for the reply. I'm actually doing that, but how do I get the entire HTML, not just the changed tag? I want to start with an entire HTML page, change aspects of it, and output the entire page. When I run my code, nothing is output because there are no elements in the parser after calling the parse method (or so it appears).

      Here is what I have:

      Parser parser = new Parser();
      parser.setInputHTML(responseBody);

      // prefix all <img src=""> URLs
      TagNameFilter imgFilter = new TagNameFilter("img");
      NodeList imgNodeList = parser.parse(imgFilter);
      for (int i=0; i<imgNodeList.size(); i++) {
          System.out.println("Processing <img>");
          TagNode imgNode = (TagNode) imgNodeList.elementAt(i);
          String imgSrc = imgNode.getAttribute("src");
          imgNode.setAttribute("src", urlPrefix + imgSrc);
      }
      parser.reset();

      // write the output
      NodeIterator nodeItr = parser.elements();
      while (nodeItr.hasMoreNodes()) {
          Node node = nodeItr.nextNode();
          System.out.println("Rendering - " + node.toHtml());
          response.getOutputStream().print(node.toHtml());
      }
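
One hazard in the loop above: getAttribute("src") may return null for an <img> that has no src attribute, and string concatenation would then store a value ending in "null". Below is a minimal, stdlib-only sketch of a guard. SrcPrefixer and prefixSrc are hypothetical names, not part of HTML Parser, and skipping absolute URLs is an assumption about the intent that the thread does not state.

```java
final class SrcPrefixer {
    private SrcPrefixer() {}

    /**
     * Returns the value to store back into the src attribute.
     * Null and empty values are returned unchanged so that no
     * "prefixnull" string is ever written.
     */
    static String prefixSrc(String urlPrefix, String src) {
        if (src == null || src.isEmpty()) {
            return src; // nothing to rewrite
        }
        // Assumption: values that carry a scheme ("http://...", "data:...")
        // or are protocol-relative ("//host/...") should not be prefixed.
        if (src.startsWith("//") || src.matches("^[A-Za-z][A-Za-z0-9+.-]*:.*")) {
            return src;
        }
        return urlPrefix + src;
    }

    public static void main(String[] args) {
        System.out.println(prefixSrc("/proxy", "/images/logo.png"));
        System.out.println(prefixSrc("/proxy", "http://example.com/a.png"));
        System.out.println(prefixSrc("/proxy", null));
    }
}
```

In the loop, the call would become something like imgNode.setAttribute("src", SrcPrefixer.prefixSrc(urlPrefix, imgSrc)), applied only when the returned value is non-null.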

       
      • Derrick Oswald

        Derrick Oswald - 2005-12-06

        You need to gather all the nodes first, then apply the filter to that list, process the matches, and finally print out the full list:

        NodeList all = parser.parse (null);
        // 'true' makes the search recursive; <img> tags usually sit inside other tags
        NodeList imgNodeList = all.extractAllNodesThatMatch (imgFilter, true);
        // ... processing as above
        System.out.println (all.toHtml ());
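
Why the single-list approach works can be shown without the library. The following is a toy model, not HTML Parser itself: ToyNode and all of its methods are hypothetical stand-ins for a parsed node tree. It sketches the idea that the nodes returned by a search are the same objects held by the full tree, so edits made through the search result appear when the full tree is rendered. Re-parsing the input, which is presumably what happens after parser.reset() in the earlier code, would build fresh nodes without those edits.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a parsed tag; NOT an HTML Parser class.
final class ToyNode {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final List<ToyNode> children = new ArrayList<>();

    ToyNode(String name) {
        this.name = name;
    }

    ToyNode add(ToyNode child) {
        children.add(child);
        return this;
    }

    ToyNode attr(String key, String value) {
        attributes.put(key, value);
        return this;
    }

    /** Recursively collects this node and every descendant with the given tag name. */
    List<ToyNode> findAll(String tagName) {
        List<ToyNode> found = new ArrayList<>();
        collect(tagName, found);
        return found;
    }

    private void collect(String tagName, List<ToyNode> found) {
        if (name.equals(tagName)) {
            found.add(this); // same object as in the tree, not a copy
        }
        for (ToyNode child : children) {
            child.collect(tagName, found);
        }
    }

    /** Serializes the whole subtree. Simplification: childless nodes get no end tag. */
    String toHtml() {
        StringBuilder sb = new StringBuilder("<").append(name);
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            sb.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
        }
        sb.append('>');
        if (children.isEmpty()) {
            return sb.toString();
        }
        for (ToyNode child : children) {
            sb.append(child.toHtml());
        }
        return sb.append("</").append(name).append('>').toString();
    }

    public static void main(String[] args) {
        // "Parse" once and keep the full tree.
        ToyNode page = new ToyNode("body").add(new ToyNode("img").attr("src", "/a.png"));
        // Edit the matches in place.
        for (ToyNode img : page.findAll("img")) {
            img.attr("src", "/proxy" + img.attributes.get("src"));
        }
        // Render the full tree, which now contains the edits.
        System.out.println(page.toHtml());
    }
}
```

The design point is the same as in the reply above: parse once, keep the complete list, edit the matching nodes in place, and render the complete list rather than asking the parser for its elements again.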

         
