From: Rui L. <ru...@il...> - 2013-04-26 14:34:20
Forwarding to the list for archival purposes... Hopefully useful for other users.

Thanks,
Rui

-------- Original Message --------
Subject: RE: [PerfSuite-users] psrun for daemon applications
Date: Wed, 24 Apr 2013 19:46:01 +0000
From: Hassan, Ahmad <ahm...@sa...>
To: Rui Liu <ru...@il...>

Hi Rui,

This works amazingly well. Thanks for providing the metrics.xml template.

Kind Regards, Ahmad
SAP

-----Original Message-----
From: Rui Liu [mailto:ru...@il...]
Sent: 23 April 2013 21:53
To: Hassan, Ahmad
Subject: Re: [PerfSuite-users] psrun for daemon applications

Hi Ahmad,

Thanks a lot for trying out different approaches and reporting the results! Glad to know it's getting closer. Now you are becoming an advanced user of PerfSuite. :-)

> But the output is somewhat different from what I want. I am looking for aggregated values of all four attributes that are present in individual XML files so that I can calculate CPI, Total LLC misses and LLC accesses.
> Any ideas?

You can try the user metrics definition feature of psprocess. I created a user metrics def file for your goal and attached it. Then do:

psprocess -b -M . -m metrics.xml combined.xml -o psout.txt

The "metrics.xml" file contains the definitions of 5 customized metrics: CPI, the counts of PAPI_TOT_CYC, PAPI_TOT_INS, PAPI_L3_TCA, and PAPI_L3_TCM. In the 2nd half of the output -- "Aggregate Statistics", the last column, named "Sum", contains the sum of all the numbers for the metric in "combined.xml".

The above will give you the total LLC misses and LLC accesses. For CPI, you'll have to manually divide the values of the PAPI_TOT_CYC "sum" by the PAPI_TOT_INS "sum".

The CPI metric definition included in "metrics.xml" is just for illustration purposes. It calculates CPI for the particular XML file (in this case for the particular thread), so the sum of all of them probably does not make sense. :-) That's why I wrote above to manually calculate it using sum of PAPI_TOT_CYC/sum of PAPI_TOT_INS.
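A minimal Python sketch of that manual calculation, assuming each per-thread XML file carries hwpcevent elements like the ones quoted in this thread. The helper names sum_events and aggregate_cpi, the <report> wrapper element, and the second thread's counts are made up for illustration; none of this is part of PerfSuite.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def sum_events(xml_documents):
    """Sum each hwpcevent counter across several XML documents (given as strings)."""
    totals = Counter()
    for text in xml_documents:
        root = ET.fromstring(text)
        for elem in root.iter():
            # Strip any XML namespace so "{ns}hwpcevent" would also match.
            if elem.tag.rsplit("}", 1)[-1] == "hwpcevent":
                totals[elem.get("name")] += int(elem.text)
    return totals


def aggregate_cpi(totals):
    """CPI over all threads: sum of cycles divided by sum of instructions."""
    return totals["PAPI_TOT_CYC"] / totals["PAPI_TOT_INS"]


# Hypothetical two-thread input. The first thread reuses counts quoted in
# this thread; the second thread's numbers are invented for illustration.
thread_a = """<report>
<hwpcevent type="preset" name="PAPI_TOT_CYC" derived="no">127709818</hwpcevent>
<hwpcevent type="preset" name="PAPI_TOT_INS" derived="no">162427885</hwpcevent>
</report>"""
thread_b = """<report>
<hwpcevent type="preset" name="PAPI_TOT_CYC" derived="no">1000</hwpcevent>
<hwpcevent type="preset" name="PAPI_TOT_INS" derived="no">2000</hwpcevent>
</report>"""

totals = sum_events([thread_a, thread_b])
print("cycles:", totals["PAPI_TOT_CYC"])
print("instructions:", totals["PAPI_TOT_INS"])
print("aggregate CPI:", aggregate_cpi(totals))
```

With real data, the strings would be read from the per-thread XML files that "psrun -p" writes. Whether the combined.xml written by "psprocess -c" uses the same element layout is not confirmed here.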
And even when calculated that way, I'm unsure how useful it is, since each thread might be doing different things, and some threads (such as the main thread) might be idling. But maybe in your case it could be useful, as a general indication of program behavior.

The output of psprocess mixes the user metrics with the system ones, sorting them alphabetically by the descriptions. I used a prefix string "user- " for the user-defined ones, so they appear together at the end. This is not required; it just made it easier to find them in the output.

Examine the attached "metrics.xml" file and the PerfSuite system default one for PAPI at $PREFIX/share/perfsuite/xml/pshwpc/PAPI_metrics.xml, and you will understand the syntax. Then you can customize it to your needs.

Thanks,
Rui

On 04/23/2013 12:34 PM, Hassan, Ahmad wrote:
> Hi Rui,
>
> Okay, I have tried the psprocess utility on various XML files as follows:
>
> The individual XML files have the following four fields:
> <hwpcevent type="preset" name="PAPI_TOT_CYC" derived="no">127709818</hwpcevent>
> <hwpcevent type="preset" name="PAPI_TOT_INS" derived="no">162427885</hwpcevent>
> <hwpcevent type="preset" name="PAPI_L3_DCA" derived="yes">734563</hwpcevent>
> <hwpcevent type="preset" name="PAPI_L3_TCM" derived="no">303576</hwpcevent>
>
> psprocess -c *.xml -o combined.xml
> psprocess -b combined.xml -o psout.txt
>
> But the output is somewhat different from what I want. I am looking for aggregated values of all four attributes that are present in individual XML files so that I can calculate CPI, Total LLC misses and LLC accesses.
>
> Any ideas?
>
> Kind Regards, Ahmad
> SAP
>
> -----Original Message-----
> From: Rui Liu [mailto:ru...@il...]
> Sent: 23 April 2013 15:36
> To: Hassan, Ahmad
> Subject: Re: [PerfSuite-users] psrun for daemon applications
>
> Hi Ahmad,
>
> Glad to know that XML files are generated now. :-)
>
>> ...
>> I ran a few benchmarks on a 10G dataset and the queries touch around 2G of data, but the numbers that I am getting from psrun are very small. In addition to that, the numbers are more or less the same for all queries and for different dataset sizes. This gives me the impression that the pthreads that are being created by the database are not being monitored. I am using the following command:
>> psrun -f -c /tmp/perfsuite-1.1.2/perf.xml -o 01 -F text -d all -S 12 -X ./rundb.sh
>
> If you want to monitor the spawned threads, in addition to the forked processes, you need to give the "-p" (standing for "pthreads") option to psrun as well. That is, in your case, do:
> psrun -f -p -c /tmp/perfsuite-1.1.2/perf.xml ...
> The "-p" option will cause psrun to generate one XML file per thread; otherwise, only one file is generated, containing the results for just the main thread -- this is likely what happened in your case.
>
> Thanks,
> Rui
>
> On 04/23/2013 06:12 AM, Hassan, Ahmad wrote:
>> Hi Rui,
>>
>> I found the XML files for the database now. It turned out to be a very basic mistake. The location of the generated XML file is the relevant location where the database stores its backup files. Thanks for your help. I ran a few benchmarks on a 10G dataset and the queries touch around 2G of data, but the numbers that I am getting from psrun are very small. In addition to that, the numbers are more or less the same for all queries and for different dataset sizes. This gives me the impression that the pthreads that are being created by the database are not being monitored. I am using the following command:
>>
>> psrun -f -c /tmp/perfsuite-1.1.2/perf.xml -o 01 -F text -d all -S 12 -X ./rundb.sh
>>
>> I have attached the results of one query that I executed. The query finishes in milliseconds because it is an in-memory database. Any ideas why the numbers are very low and insensitive to dataset size, please?
>>
>> Thanks.
>>
>> Kind Regards, Ahmad
>> SAP
>>
>>
>> -----Original Message-----
>> From: Rui Liu [mailto:ru...@il...]
>> Sent: 22 April 2013 22:37
>> To: Hassan, Ahmad
>> Subject: Re: [PerfSuite-users] psrun for daemon applications
>>
>> Hi Ahmad,
>>
>> I temporarily removed the mailing list so we don't cause too many mails for them. :-)
>>
>>> 1) There is a DB script which runs the database process as a daemon, so if I do that then 'psrun' immediately creates the XML after starting the database process. So this XML file doesn't show the numbers for the running database because the DB is still in the running phase. If I stop the database, the XML file stays the same:
>>> psrun -f DB_exec start
>>> 2) In the second case, instead of using the wrapper script 'DB_exec' for starting the database process, I started the database binary myself and put '&' at the end:
>>> psrun -f run_db params &
>>> In the second case, no XML file is generated even after I kill the db process through SIGTERM or SIGINT or SIGKILL.
>>
>>
>>> One thing I noticed is that if I kill the 'top' process using SIGKILL, then no XML is created. The XML is only created for the 'top' test case if I use SIGTERM or SIGINT.
>>> The database has its own signal handlers for different signals; can this be a potential cause of what we are seeing?
>>> One possible solution would be to trigger 'psrun' to generate the XML file while the application is running. Would that be possible? Or is there some other good solution that you would suggest?
>>
>> Thanks a lot for trying different ways and reporting the results!
>>
>> Yes, my understanding is that SIGKILL is brute force: it won't call the exit handler, so no PerfSuite XML file will be written.
>>
>> psrun seems to have been designed with this need in mind. :-) There is an option "-S" in psrun that allows just the solution you suggested -- allowing a user to specify a signal number to trigger the writing of the XML file. Details of this option are available by doing "man psrun". "man 7 signal" will show you the values of the signals.
>> In my text below, I used "16" -- the value for SIGUSR1 on some platforms (signal numbers are platform-dependent; on x86 Linux, for example, SIGUSR1 is 10, so check "man 7 signal" on your system).
>>
>> Could you please try:
>> 1) write a simple script, such as "/tmp/do.sh", containing something similar to:
>> ----------
>> #!/bin/sh
>>
>> run_db params &
>> ----------
>> 2) run it with "psrun -f -S <a_signal_number> /tmp/do.sh", then do "ps uf" to find the PID, then "kill -<a_signal_number> <PID>".
>>
>> I tried the signal number 16: "psrun -f -S 16 /tmp/do.sh" and "kill -16 <PID>", and a valid XML file was generated. :-)
>>
>> Thanks,
>> Rui
>>
>>
>> On 04/22/2013 04:05 PM, Hassan, Ahmad wrote:
>>> Hi Rui,
>>>
>>> Another strange observation. Even if I don't put the database process in the background, but use Ctrl-C to stop the process, even then the XML file is not generated for the database. For example, the use case I am running is:
>>>
>>> psrun -f run_db_binary param1 param2
>>>
>>> press Ctrl-C
>>>
>>> No XML file. So it seems that there is some kind of dependency on the application here?
>>>
>>> Best Regards, Ahmad
>>> SAP
>>>
>>>
>>> -----Original Message-----
>>> From: Rui Liu [mailto:ru...@il...]
>>> Sent: 22 April 2013 20:48
>>> To: Hassan, Ahmad
>>> Cc: per...@li...
>>> Subject: Re: [PerfSuite-users] psrun for daemon applications
>>>
>>> Hi Ahmad,
>>>
>>> Thanks a lot for your interest in using PerfSuite!
>>>
>>> What you saw might be the expected behavior -- you might just have to wait for the database daemon process to stop, either a normal finish or being killed, for the XML file to be generated by psrun.
>>>
>>> I tried a simple test to mimic the behavior you saw: a file "/tmp/do.sh", which contains just:
>>> --------------------
>>> #!/bin/sh
>>>
>>> top -b -n30 > /tmp/top.txt &
>>> --------------------
>>> The "&" character makes the "top" process run in the background, similar to a daemon process.
>>>
>>> After I ran "psrun -f /tmp/do.sh", psrun returned immediately, just as you said "psrun returns as soon as the database starts and goes into daemon mode".
>>> However, "top" kept running, and "psrun" kept measuring it. In the meantime, "top" was running the 30 iterations we asked for and no PerfSuite XML file was generated -- this seemed to be what you observed. After "top" successfully finished, psrun generated a valid file with the count values.
>>>
>>> In another test, a valid XML file was also generated when I used "kill" to stop the "top" process. I used "ps uf" to find the PID of the "top" process, then used "kill <pid>". If your db process is a long-running one (say it will take days for it to finish), or it never finishes by itself, you could try to kill it.
>>>
>>> The reason for the behavior is that "psrun" installs an exit handler function for a process it monitors, and when the process exits (either by itself, or by receiving a signal such as the one "kill" sends to it), the exit handler is called and writes out the XML file.
>>>
>>> Could you please try to let your db daemon process exit normally or kill it (try the default signal "SIGTERM" first), check whether a PerfSuite XML file is generated and let us know the result? Please allow the process to run for some time, otherwise most counts could be zero.
>>>
>>> Thanks,
>>> Rui
>>>
>>> Rui Liu
>>> NCSA
>>> current PerfSuite maintainer
>>>
>>> On 04/22/2013 01:56 PM, Hassan, Ahmad wrote:
>>>> Hi Team,
>>>>
>>>> I have installed perfsuite v1.1.2. I want to profile the whole database and measure PAPI events. As the database runs as a daemon process, I couldn't figure out how to use PerfSuite to profile the running database daemon until the daemon stops. I tried the following:
>>>>
>>>> psrun -f -c papi_sandybridge.xml db_exec start
>>>>
>>>> But in the above case, psrun returns the output as soon as the database starts and goes into daemon mode. Any suggestions, please?
>>>>
>>>> Thanks.
>>>>
>>>> Kind Regards, Ahmad
>>>>
>>>> SAP
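
On the signal number passed to "-S" and to "kill" earlier in this thread: numeric signal values differ between platforms, so it may be safer to look the number up on the machine in question than to hard-code it. A small Python sketch of such a lookup follows; signal_number is a hypothetical helper name, and the list of names is simply the signals mentioned in this thread.

```python
import signal


def signal_number(name):
    """Return the numeric value of a signal name such as "SIGUSR1" on this platform."""
    return signal.Signals[name].value


# Print the local numbers for the signals discussed in this thread.
for name in ("SIGUSR1", "SIGUSR2", "SIGTERM", "SIGINT", "SIGKILL"):
    print(name, signal_number(name))
```

The number printed for the chosen signal is what would go into "psrun -f -S <a_signal_number> ..." and "kill -<a_signal_number> <PID>" on that system.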