    I'm building a web scraper for certain apartment sites, and I've run into a slight problem. I was able to get an XML web scraper working on a more simply formatted HTML source, namely one that was nested properly. Now I've hit a page where the format is more difficult: there are multiple span references, so I cannot simply iterate over a single span or class. Can anyone help or suggest an alternative? Below is the code, followed by the desired output:

        <file action="write" path="edr/edrxml.xml">
            <template><![CDATA[ <extracted time="${sys.datetime("dd.MM.yyyy, HH:mm:ss")}" > ]]></template>
            <xquery>
                <xq-param name="doc">
                    <http url=""/>
                </xq-param>
                <xq-expression>
                    declare variable $doc as node() external;
                    for $allListings in $doc//div[@id="floorPlanOneMatrix"]
                    for $plantitle in $allListings//span[@class="firstFloorPlanTitle"]
                    let $pt := $plantitle//span[@style="color: rgb(0,56,147)"]
                    let $pr := $allListings//span[@class="firstFloorPlanRates"]
                    return
                        <price><header>100 Midtown</header><pt>{normalize-space(data($pt[1]))}</pt><pr>{normalize-space(data($pr[1]))}</pr></price>
                </xq-expression>
            </xquery>
            <![CDATA[ </extracted> ]]>
        </file>

      Disregard the previous code. I now know what the problem is, but I'm still a bit short on the syntax.

      The problem: I'm trying to read through the same source twice, that is, to create two separate XML readers over it.

      My code:

          <xq-expression>
              declare variable $doc as node() external;
              declare variable $doc2 as node() external;
              for $allListings in $doc//div[@id="floorPlanOneMatrix"]
              for $planrates in $allListings//span[@class="firstFloorPlanRates"]
              let $pr := $planrates
              for $allListings1 in $doc2//div[@id="floorPlanOneMatrix"]
              for $plantitle in $allListings1//span[@class="firstFloorPlanTitle"]
              let $pt := $plantitle//span[@style="color: rgb(0,56,147)"]
              return
                  <price><header>100 Midtown</header><pt>{normalize-space(data($pt[1]))}</pt><pr>{normalize-space(data($pr[1]))}</pr></price>
          </xq-expression>

      This creates a "squared effect", where every title is paired with every rate. I know the order would be fine if the titles and rates were each read separately into their own array, so I think the fix involves breaking out of the for loop (the one pulling each rate) and then making a separate pass over the titles.

      Can anyone suggest syntax that would let me run two separate readers here? Thanks.
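
One possible way to avoid the "squared effect" described above is to pair titles and rates by position instead of nesting one loop inside the other. The sketch below is untested against the actual page and rests on an assumption: within each `floorPlanOneMatrix` div, the titles and rates appear in the same order and in equal numbers, so the i-th title belongs to the i-th rate. It uses the standard XQuery positional variable (`at $i`), and it only needs the single `$doc` variable rather than a second reader.

```xquery
declare variable $doc as node() external;

for $allListings in $doc//div[@id="floorPlanOneMatrix"]
(: collect both sequences once, in document order :)
let $titles := $allListings//span[@class="firstFloorPlanTitle"]
let $rates  := $allListings//span[@class="firstFloorPlanRates"]
(: walk the titles, tracking the 1-based position in $i :)
for $plantitle at $i in $titles
let $pt := $plantitle//span[@style="color: rgb(0,56,147)"]
(: assumption: the rate at the same position belongs to this title :)
let $pr := $rates[$i]
return
  <price>
    <header>100 Midtown</header>
    <pt>{normalize-space(data($pt[1]))}</pt>
    <pr>{normalize-space(data($pr))}</pr>
  </price>
```

If the page ever lists a title without a rate (or the reverse), the positions would drift out of step, so it may be worth comparing `count($titles)` and `count($rates)` before relying on this pairing.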



