Thread: [Kernowforsaxon-help] Average timing
Brought to you by:
ajwelch
From: Florent G. <dar...@ya...> - 2007-06-23 19:25:45
|
Andrew,

About the average timing feature, I thought about the following points. You can see a screenshot of what the option frame could look like at:

    http://www.fgeorges.org/tmp/kernow-timing-options.jpg

You can enable/disable the timing. Then you can set the number of times the transformation should be performed before measuring time (to initialize the environment, load all classes, etcetera), and the number of times to execute the transformation to actually measure time. And finally you can choose whether you want to measure only the compilation phase, or the transformation phase, or both.

In addition, this does not have to be restricted to XSLT transformations. When enabled, the feature can measure the average time of evaluating an XQuery expression, or even of validating a document.

What do you think about those thoughts?

Regards,

--drkm
From: Andrew W. <and...@gm...> - 2007-06-25 08:49:08
|
> About the average timing feature, I thought about the
> following points. You can see a screenshot of what the
> option frame could look like at:
>
>     http://www.fgeorges.org/tmp/kernow-timing-options.jpg
>
> You can enable/disable the timing. Then you can set the
> number of times the transformation should be performed before
> measuring time (to initialize the environment, load all
> classes, etcetera), and the number of times to execute the
> transformation to actually measure time. And finally you
> can choose whether you want to measure only the compilation
> phase, or the transformation phase, or both.
>
> In addition, this does not have to be restricted to XSLT
> transformations. When enabled, the feature can measure the
> average time of evaluating an XQuery expression, or even of
> validating a document.
>
> What do you think about those thoughts?

That looks good, although there is no initializing of the environment, loading of classes etc as the JVM is already running (it's not like running Saxon from the command line).

The only real "startup" cost is creating the Transformer (or Templates when compiling the stylesheet), and Kernow doesn't start the timing until after that point - see SingleFileTransformer - this is the class that's used for both "single file" and "standalone" transforms.

I would suggest that the "ignore initialization" options aren't needed, but apart from that it looks good.

A bit of refactoring will need to be done before it's possible to run the transform n times and store the time for each run (it will give me a chance to tidy the code a bit and add some comments :)

Hopefully I'll do it today, but if not tomorrow....
From: Florent G. <dar...@ya...> - 2007-06-25 10:57:41
|
Andrew Welch wrote:

Hi

> That looks good, although there is no initializing of the
> environment, loading of classes etc as the JVM is already
> running (it's not like running Saxon from the command line)

Two points here:

  1/ I have always, always heard that you should do a few initializing passes when doing timing. The computer architecture is very complex, and there is a lot of caching and lazy computing from the processor core up to the application level. IMHO this is good practice, even if you think you can argue you don't need it in a specific case.

  2/ I wonder whether the class loader loads all classes before main() is called, or the first time a class is used. I think it is the second case.

Anyway, I did a simple test to see if the first timing is quite different from the others:

    import java.util.Calendar;
    import javax.xml.transform.Source;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;
    import net.sf.saxon.TransformerFactoryImpl;

    public class Main
    {
        public static void main(String[] args)
                throws Throwable
        {
            TransformerFactory factory;
            Transformer trans;
            Calendar before;
            Calendar after;
            long time;
            for ( int i = 0; i < 10; ++i ) {
                // A fresh factory on each pass, created outside the timed region.
                factory = new TransformerFactoryImpl();
                before = Calendar.getInstance();
                // Timed region: compile the stylesheet, then run the transform.
                trans = factory.newTransformer(
                    new StreamSource("timing.xsl") );
                trans.transform(new StreamSource("../hamlet.xml"),
                                new StreamResult(System.out));
                after = Calendar.getInstance();
                time = after.getTimeInMillis() - before.getTimeInMillis();
                System.err.println("ms: " + time);
            }
        }
    }

The stylesheet is an identity transform and the input doc is from http://www.cafeconleche.org/examples/shakespeare/ (very useful for doing some tests).
The result of running it twice is:

    (drkm)[12] ~/java/tests/timing$ make > /dev/null
    ms: 1141
    ms: 181
    ms: 220
    ms: 170
    ms: 140
    ms: 141
    ms: 90
    ms: 160
    ms: 150
    ms: 150
    (drkm)[13] ~/java/tests/timing$ make > /dev/null
    ms: 1211
    ms: 191
    ms: 240
    ms: 180
    ms: 150
    ms: 151
    ms: 90
    ms: 160
    ms: 160
    ms: 150

So I think it is worth adding this possibility (maybe defaulted to 0). IMHO interesting default values are 2 and 5. This should be enough for most timings. What do you think?

Regards,

--drkm
From: Andrew W. <and...@gm...> - 2007-06-25 11:15:34
|
> > That looks good, although there is no initializing of the
> > environment, loading of classes etc as the JVM is already
> > running (it's not like running Saxon from the command line)
>
> Two points here:
>
>   1/ I have always, always heard that you should do a few
>      initializing passes when doing timing. The computer
>      architecture is very complex, and there is a lot of
>      caching and lazy computing from the processor core up to
>      the application level. IMHO this is good practice,
>      even if you think you can argue you don't need it in
>      a specific case.
>
>   2/ I wonder whether the class loader loads all classes
>      before main() is called, or the first time a class is
>      used. I think it is the second case.

[snip]

> So I think it is worth adding this possibility (maybe
> defaulted to 0). IMHO interesting default values are 2 and
> 5. This should be enough for most timings. What do you
> think?

I agree, however in Kernow after the first run everything is initialized - so on the second run the first iteration won't be an outlier.

I don't see any reason not to include the feature though - something like "ignore first n runs" should be simple enough.

I've pretty much done the refactoring now so I'll commit the code...
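As an illustration of the "ignore first n runs" idea discussed above, here is a minimal sketch. It is not Kernow's code: the class name `WarmupTimer`, its `time` method and the stand-in task are hypothetical, and it uses `System.nanoTime()` rather than the `Calendar` calls from the test program.

```java
import java.util.concurrent.TimeUnit;

/** Sketch: run a task several times, discard the first runs, time the rest. */
final class WarmupTimer {

    /**
     * Runs the task (ignoredRuns + measuredRuns) times and returns the
     * elapsed wall-clock time of each measured run, in milliseconds.
     */
    static long[] time(Runnable task, int ignoredRuns, int measuredRuns) {
        // Untimed passes: give class loading and caches a chance to settle.
        for (int i = 0; i < ignoredRuns; i++) {
            task.run();
        }
        long[] millis = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        }
        return millis;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a transformation.
        Runnable task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };
        // 2 ignored runs and 5 measured runs, the defaults suggested above.
        for (long ms : time(task, 2, 5)) {
            System.out.println("ms: " + ms);
        }
    }
}
```

Keeping the per-run times in an array, rather than only a running total, leaves the caller free to report an average, a minimum, or the individual runs.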
From: Florent G. <dar...@ya...> - 2007-06-25 12:49:02
|
Andrew Welch wrote:

Hi

> I've pretty much done the refactoring now so I'll
> commit the code...

Great! I'm sure that will save time when someone (for instance me :-p) wants to compare two constructs in terms of efficiency.

Thanks,

--drkm
From: Andrew W. <and...@gm...> - 2007-06-25 14:32:37
|
> > I've pretty much done the refactoring now so I'll
> > commit the code...
>
> Great! I'm sure that will save time when someone (for instance me :-p)
> wants to compare two constructs in terms of efficiency.

Ok, all checked in. Please take a look and see what you think.

I think I may need to modify the way the times are displayed - currently milliseconds are only displayed for times under 3 seconds... but for performance testing it's good to see these (and it looks odd when the average time doesn't quite equal total / (numOfRuns - numToIgnore), because the ms aren't displayed for the total).

Anyway, let me know if this is what you had in mind.

cheers
andrew
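The average mentioned above, total / (numOfRuns - numToIgnore), can be sketched as follows. This is only an illustration, assuming that the total covers just the runs that are not ignored; the class `AverageTiming` and its method are hypothetical names, and the sample data is the first set of timings posted earlier in the thread.

```java
/** Sketch: average run time after skipping the first numToIgnore runs. */
final class AverageTiming {

    static double average(long[] runMillis, int numToIgnore) {
        if (numToIgnore < 0 || numToIgnore >= runMillis.length) {
            throw new IllegalArgumentException("numToIgnore out of range");
        }
        long total = 0;
        // Only the runs after the ignored ones contribute to the total.
        for (int i = numToIgnore; i < runMillis.length; i++) {
            total += runMillis[i];
        }
        return (double) total / (runMillis.length - numToIgnore);
    }

    public static void main(String[] args) {
        long[] runs = {1141, 181, 220, 170, 140, 141, 90, 160, 150, 150};
        System.out.println(average(runs, 0)); // prints 254.3
        System.out.println(average(runs, 2)); // prints 152.625
    }
}
```

With this data the slow first run alone moves the average from about 153 ms to about 254 ms, which is the effect the "ignore" setting is meant to remove.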
From: Florent G. <dar...@ya...> - 2007-06-26 22:15:48
|
Andrew Welch wrote:

Hi

> Ok, all checked in. Please take a look and see what you
> think.

Perfect! Some possible enhancements...

  1/ Allow compile time and run time to be measured separately.

  2/ Perform timing for actions other than a single transform, for example XQuery and, why not, validation.

> I think I may need to modify the way the times are
> displayed - currently milliseconds are only displayed for
> times under 3 seconds...
> but for performance testing it's good to see these (and it
> looks odd when the average time doesn't quite equal total
> / (numOfRuns - numToIgnore), because the ms aren't
> displayed for the total).

Yes. Maybe you can remove the test within Timer.getDurationInWords() and always display milliseconds?

Thanks for the feature. Regards,

--drkm
From: Florent G. <dar...@ya...> - 2007-06-26 23:25:16
|
Florent Georges wrote:

> Perfect! Some possible enhancements...

I forgot one thing. When I tested the timing, I used a simple identity transform with a huge input file. The timing was really different depending on whether I ran it with the output going to the Kernow window or to a file. I guess this is because of the huge text display in Swing.

But this is a case where we don't really want the output, and using a file for that is too much work ;-) Maybe an "ignore output" option on the transform panel could be interesting?

Regards,

--drkm
From: Andrew W. <and...@gm...> - 2007-06-27 09:14:35
|
On 6/27/07, Florent Georges <dar...@ya...> wrote:

> I forgot one thing. When I tested the timing, I used a
> simple identity transform with a huge input file. The
> timing was really different depending on whether I ran it
> with the output going to the Kernow window or to a file.
> I guess this is because of the huge text display in Swing.

Yes - writing the output to the JTextArea takes a long time. In practice though, you would always be writing to a file and just checking the output window for any notifications like xsl:message output.

> But this is a case where we don't really want the output,
> and using a file for that is too much work ;-) Maybe an
> "ignore output" option on the transform panel could be
> interesting?

Good idea, I'll add it to the TODO list.

--
http://andrewjwelch.com
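One way an "ignore output" option could work is to hand the transformer a result that throws the serialized bytes away. The sketch below is an assumption, not what Kernow does: `DiscardOutput` and `CountingSink` are hypothetical names, and it uses the JDK's default JAXP identity transformer instead of Saxon so that it runs without extra libraries.

```java
import java.io.OutputStream;
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: run a transform while discarding its serialized output. */
final class DiscardOutput {

    /** An OutputStream that counts the bytes it receives and drops them. */
    static final class CountingSink extends OutputStream {
        long bytes;

        @Override
        public void write(int b) {
            bytes++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            bytes += len;
        }
    }

    /** Transforms the given XML and returns how many bytes were discarded. */
    static long transformAndDiscard(String xml) throws Exception {
        // newTransformer() without a stylesheet gives the JAXP identity transform.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        CountingSink sink = new CountingSink();
        identity.transform(new StreamSource(new StringReader(xml)),
                           new StreamResult(sink));
        return sink.bytes;
    }

    public static void main(String[] args) throws Exception {
        long n = transformAndDiscard("<play><title>Hamlet</title></play>");
        System.out.println("discarded bytes: " + n);
    }
}
```

The transform still pays for serialization here, so the measured time covers everything except the display or the disk write.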
From: Andrew W. <and...@gm...> - 2007-06-27 09:10:11
|
> Perfect! Some possible enhancements...
>
>   1/ Allow compile time and run time to be measured separately.

Good point - I'll add that.

>   2/ Perform timing for actions other than a single transform,
>      for example XQuery and, why not, validation.

Yes, it's on the TODO list :)

> Yes. Maybe you can remove the test within
> Timer.getDurationInWords() and always display milliseconds?

It's certainly something that needs looking at, it's not quite right at the moment. I can't just make the time all milliseconds because some transforms that I've measured in the past (like the Sudoku solver) took up to a minute. I'll think about it.
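Separating compile time from run time, as suggested in point 1/, can be sketched with the standard JAXP `Templates` interface: `newTemplates` compiles the stylesheet, and the `Transformer` obtained from it runs the transformation. This is an illustration only; `CompileVsRun` is a hypothetical class, and it relies on the JDK's built-in XSLT 1.0 processor rather than Saxon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: time stylesheet compilation and transformation separately. */
final class CompileVsRun {

    static final String IDENTITY_XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Returns { compileNanos, runNanos } for one compile and one run. */
    static long[] time(String xsl, String xml) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();

        long t0 = System.nanoTime();
        // Compilation phase: stylesheet source to reusable Templates.
        Templates compiled = factory.newTemplates(
            new StreamSource(new StringReader(xsl)));
        long t1 = System.nanoTime();

        // Transformation phase: apply the compiled stylesheet to the input.
        Transformer transformer = compiled.newTransformer();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(new StringWriter()));
        long t2 = System.nanoTime();

        return new long[] { t1 - t0, t2 - t1 };
    }

    public static void main(String[] args) throws Exception {
        long[] t = time(IDENTITY_XSL, "<play><title>Hamlet</title></play>");
        System.out.println("compile ns: " + t[0]);
        System.out.println("run ns:     " + t[1]);
    }
}
```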
From: Florent G. <dar...@ya...> - 2007-06-27 09:17:14
|
Andrew Welch wrote:

Hi

> > Yes. Maybe you can remove the test within
> > Timer.getDurationInWords() and always display
> > milliseconds?

> It's certainly something that needs looking at, it's not
> quite right at the moment. I can't just make the time all
> milliseconds because some transforms that I've measured in
> the past (like the Sudoku solver) took up to a minute.

Just to be sure: I meant to always display the ms *part*, not the whole time only in ms. Even if you have minutes and seconds, you can still have ms; that's not annoying IMHO, and that should be a pretty simple solution.

Regards,

--drkm
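The suggestion of always showing the millisecond part next to minutes and seconds could look like the sketch below. It is not the actual Timer.getDurationInWords() from Kernow; `DurationFormat`, its `format` method and the exact wording of the output are assumptions made for illustration.

```java
/** Sketch: format a duration so the millisecond part is always shown. */
final class DurationFormat {

    static String format(long totalMillis) {
        long minutes = totalMillis / 60_000;
        long seconds = (totalMillis / 1000) % 60;
        long millis = totalMillis % 1000;

        StringBuilder sb = new StringBuilder();
        if (minutes > 0) {
            sb.append(minutes).append(" min ");
        }
        if (minutes > 0 || seconds > 0) {
            sb.append(seconds).append(" s ");
        }
        // The ms part is appended unconditionally, whatever the total is.
        return sb.append(millis).append(" ms").toString();
    }

    public static void main(String[] args) {
        System.out.println(format(152));    // prints "152 ms"
        System.out.println(format(3_456));  // prints "3 s 456 ms"
        System.out.println(format(61_005)); // prints "1 min 1 s 5 ms"
    }
}
```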