Thanks again, Michael.

I was thinking that it was wasteful to recreate new versions of simplex each time, but figured it was not a significant cause of performance issues. It is helpful to read your post, though, because I had never actually used the "xsl:perform-sort" instruction to do anything. Since most of the performance issues I have had with XSL have been fixable by either saxon:discard-document or by use of keys, I have been a bit lazy about going with a solution I know works rather than the best one, and this has (obviously) dissuaded me from learning new ways to do things.

Thanks again, and I hope to contribute in the future to the XSL community.
-David


On Sun, Feb 3, 2013 at 1:14 PM, Michael Kay <mike@saxonica.com> wrote:
Yes, I repeated your experiment and got the same result. The Java
profile confirms (as does running in the debugger) that there is less
string-to-double conversion going on, but this is not having any impact
on the bottom line. One possibility is that there are adverse effects
elsewhere because the types of other nodes, notably the
simplex structure, are not known. I tried changing $simplex from a
document node to element(p, xs:untyped)* wherever it appears, but this
had no noticeable effect.

Then I noticed that having made this change, you don't need to
reconstruct the $simplex elements just in order to sort them. You can
instead do this:

         <xsl:variable name="sorted.simplex" as="element(p, xs:untyped)*">
             <xsl:perform-sort select="$simplex">
                 <xsl:sort select="xs:double(@WSSR)"/>
             </xsl:perform-sort>
         </xsl:variable>

Equally, $new.simplex no longer needs to make copies of elements; it can
simply reference the existing elements. This does appear to make a small
but useful difference.
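
As a rough illustration of referencing rather than copying, here is a minimal self-contained sketch. The sample <p> elements, the template name "main", and the rule used to build $new.simplex (dropping the last vertex after sorting) are assumptions for illustration only, not taken from the actual stylesheet. Because the variables declare an "as" type, they hold the original nodes rather than copies, which the final "is" comparison checks.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Run with a named initial template, e.g. -it:main in Saxon. -->
  <xsl:template name="main">
    <!-- Hypothetical sample data standing in for the real simplex. -->
    <xsl:variable name="simplex.doc">
      <p WSSR="7.5"/>
      <p WSSR="3.2"/>
      <p WSSR="5.1"/>
    </xsl:variable>
    <xsl:variable name="simplex" as="element(p)*" select="$simplex.doc/p"/>

    <!-- Sorting selects the existing elements; nothing is rebuilt. -->
    <xsl:variable name="sorted.simplex" as="element(p)*">
      <xsl:perform-sort select="$simplex">
        <xsl:sort select="xs:double(@WSSR)"/>
      </xsl:perform-sort>
    </xsl:variable>

    <!-- Assumed update rule: keep every vertex except the last one. -->
    <xsl:variable name="new.simplex" as="element(p)*"
        select="$sorted.simplex[position() lt last()]"/>

    <!-- Node identity test: is the first kept vertex the same node
         as the second element of the original sequence? -->
    <xsl:value-of select="$new.simplex[1] is $simplex[2]"/>
  </xsl:template>
</xsl:stylesheet>
```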

Then I tried converting the $simplex structures to a sequence of maps
(that is, map(string, double)*) and this got the run-time down to 14.4 secs.
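
For readers unfamiliar with maps, a sketch of the general shape follows. It assumes the XSLT 3.0 / XPath 3.1 map syntax, which may differ from the map syntax available in the Saxon release discussed here; the keys 'a' and 'b' and all of the values are hypothetical, and only WSSR comes from the stylesheet above.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Run with the default initial template, e.g. -it in Saxon. -->
  <xsl:template name="xsl:initial-template">
    <!-- Each vertex is a map from string to double; no nodes are built. -->
    <xsl:variable name="simplex" as="map(xs:string, xs:double)*"
        select="map{'a': 1.0e0, 'b': 2.0e0, 'WSSR': 7.5e0},
                map{'a': 1.5e0, 'b': 1.0e0, 'WSSR': 3.2e0},
                map{'a': 0.5e0, 'b': 3.0e0, 'WSSR': 5.1e0}"/>

    <!-- The sort key is already a double, so no string conversion is needed. -->
    <xsl:variable name="sorted.simplex" as="map(xs:string, xs:double)*">
      <xsl:perform-sort select="$simplex">
        <xsl:sort select="?WSSR"/>
      </xsl:perform-sort>
    </xsl:variable>

    <!-- Print the WSSR of each vertex in sorted order, one per line. -->
    <xsl:for-each select="$sorted.simplex">
      <xsl:value-of select="?WSSR"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```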

I haven't tried the next stage of converting the $data structure to a
sequence of maps, because the changes are fairly extensive, but I'm sure
it would give another useful saving.

Michael Kay
Saxonica


On 03/02/2013 06:32, David Rudel wrote:
Michael,

To practice use of type validation, I decided to see if it made any
difference if the elements in the data set were validated as integers or
not. I may not have done this properly, but I did not see any improvement.

I realize you did not guarantee any improvement from this, but I also
want to make sure I'm doing this right. If I may impose on your good
will, could you see if I did this properly?

Modified version is attached.

Here are the changes I made:
A. I added <<default-validation="preserve">> to the xsl:stylesheet tag.
I don't think this is actually necessary.

B. I declared an inline schema as:

     <xsl:import-schema>
         <xs:schema>
         <xs:element name="point" type="data-point"/>
         <xs:complexType name="data-point">
             <xs:attribute name="x" type="xs:integer"/>
             <xs:attribute name="ys" type="xs:integer"/>
             <xs:attribute name="yc" type="xs:integer"/>
             <xs:attribute name="yg" type="xs:integer"/>
             <xs:attribute name="z" type="xs:integer"/>
         </xs:complexType>
         </xs:schema>
     </xsl:import-schema>


C. When loading values into my $data variable, I used
<<validation="strict">> like so:

         <xsl:variable name="data" as="schema-element(point)*">
             <xsl:for-each select="$PaG/student[@user.ID=current()/@user.ID]/point">
                 <xsl:copy-of select="." validation="strict"/>
             </xsl:for-each>
         </xsl:variable>


D. When indicating parameters for the called template, I declared
"$data" as having type point:

<xsl:template name="EPSolver">
         <xsl:param name="simplex"/>
         <xsl:param name="data" as="schema-element(point)*"/>
         <xsl:param name="count"/>
         <xsl:param name="tolerance" as="xs:double"/>


E. Similarly, when calling the template I used "as" to specify the type:
         <xsl:call-template name="EPSolver">
             <xsl:with-param name="simplex" select="$simplex1"/>
             <xsl:with-param name="data" select="$data" as="schema-element(point)*"/>
             <xsl:with-param name="count" select="1"/>
             <xsl:with-param name="tolerance" select="0.001"/>
         </xsl:call-template>

Is there something else I should do validation-wise to try to cut down
on the cost of converting strings to integers?

Thanks for all your help,
David



_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
saxon-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/saxon-help



--

"A false conclusion, once arrived at and widely accepted is not dislodged easily, and the less it is understood, the more tenaciously it is held." - Cantor's Law of Preservation of Ignorance.