I am in the process of developing a script that will take an HTML file and generate a JSP.  Certain HTML tags will be replaced with JSP tags from a tag library based on IDs found on the HTML tags.

 

I thought that I could use the wildcard to get all the tags in the HTML and process each tag in turn.  If a tag needs to be replaced, the script replaces it; otherwise it just echoes the tag back unchanged.

 

The problem I am having is that I seem to get duplicate tags when using the wildcard construct.  Here is a sample test query:

 

<result>
  {
    for $e in input()//HTML//*
    return $e
  }
</result>

When applied against the following HTML:

 

<!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<HTML>
  <HEAD>
    <TITLE>test</TITLE>
  </HEAD>

  <BODY>
    <H1>test</H1>
    <TABLE>
      <TR ID="dataRow">
 <TD>
   cell1
 </TD>
      </TR>
      <TR>
 <TD>
   cell2
 </TD>
      </TR>
    </TABLE>
  </BODY>
</HTML>

it results in the following:

 

<?xml version="1.0" encoding="UTF-8"?>
<result>
   <HEAD>

      <TITLE>test</TITLE>

   </HEAD>
   <TITLE>test</TITLE>
   <BODY>

      <H1>test</H1>

      <TABLE>

         <TR ID="dataRow">

            <TD>
          cell1
        </TD>

         </TR>

         <TR>

            <TD>
          cell2
        </TD>

         </TR>

      </TABLE>

   </BODY>
   <H1>test</H1>
   <TABLE>
      <TR ID="dataRow">

         <TD>
          cell1
        </TD>
      </TR>
      <TR>

         <TD>
          cell2
        </TD>
      </TR>
    </TABLE>
   <TR ID="dataRow">

      <TD>
          cell1
        </TD>
      </TR>
   <TD>
          cell1
        </TD>
   <TR>

      <TD>
          cell2
        </TD>
      </TR>
   <TD>
          cell2
        </TD>
</result>

 

My question is this:  Is there a way that I can set up the wildcards to get only the "top-level" tags (HEAD, BODY) and then recurse through each of those?  It seems like the results are echoing back each element in full (the complete HEAD, then the TITLE, then the complete BODY, then the H1, then the TABLE, then the individual TRs, etc.), so every nested tag shows up more than once.
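
A minimal, untested sketch of the kind of recursion being asked about, written in standard XQuery 1.0 syntax. It rests on several assumptions: the processor supports user-defined functions, typeswitch and computed element constructors (the dialect behind input() may spell these differently); $doc is an inline stand-in for input(); local:convert and the jsp-row element are hypothetical names, not part of any real tag library. The duplication in the output above is consistent with //* selecting every descendant element while "return $e" copies each one together with its whole subtree, so the sketch uses plain child steps (/HTML/*) and lets a function walk the children itself.

```xquery
xquery version "1.0";

(: Inline stand-in for the HTML document; in the original setup this would be input(). :)
declare variable $doc := document {
  <HTML>
    <HEAD><TITLE>test</TITLE></HEAD>
    <BODY>
      <H1>test</H1>
      <TABLE>
        <TR ID="dataRow"><TD>cell1</TD></TR>
        <TR><TD>cell2</TD></TR>
      </TABLE>
    </BODY>
  </HTML>
};

(: Hypothetical helper: rebuilds one node. An element whose ID matches is
   swapped for a placeholder element; any other element is copied with its
   attributes, and its children are processed recursively. :)
declare function local:convert($n as node()) as node() {
  typeswitch ($n)
    case element() return
      if ($n/@ID = "dataRow")
      then
        (: "jsp-row" is a made-up placeholder, not a real tag library element :)
        <jsp-row>{ for $c in $n/node() return local:convert($c) }</jsp-row>
      else
        element { node-name($n) } {
          $n/@*,
          for $c in $n/node() return local:convert($c)
        }
    (: text nodes, comments, etc. are echoed back as they are :)
    default return $n
};

<result>
  {
    (: child steps only: selects HEAD and BODY, not every descendant :)
    for $e in $doc/HTML/*
    return local:convert($e)
  }
</result>
```

With this shape each element should be emitted once, inside its parent, because only the two top-level elements are selected and everything below them is reached through the recursive call rather than through the descendant wildcard.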

 

TIA,

Phil