Re: [Quexs-discuss] Performance issues
Web based system for Computer Assisted Telephone Interviewing (CATI)
Brought to you by:
azammitdcarf
|
From: Adam Z. <ada...@ac...> - 2010-03-16 04:20:07
|
Dear Saulius, The first query you list that inspects approx 2,000,000 rows. I think this is due to the questionnaire_sample_quota_row_exclude table. I have had another look at it, and one possible issue that I have identified is that the sample questionnaire_id and sample_id could be listed multiple times if they apply to multiple quotas. This is set up so that it is easy to remove quota_exclude rows when removing a quota. The issue would appear this way. For example, if there are 10,000 sample records for this questionnaire You also say there are 150 quota_rows If a sample record is excluded in 100 of those 150 quota rows, then it will be JOINed 100 times (if this applies to 50% of the sample, then the SQL will return 50 * 10,000 = 500,000 rows) It would be good if you could confirm this by running the following SQL: SELECT count(*) FROM questionnaire_sample as qs JOIN sample as s ON (s.import_id = qs.sample_import_id) LEFT JOIN questionnaire_sample_quota_row_exclude AS qsqre ON (qsqre.questionnaire_id = qs.questionnaire_id AND qsqre.sample_id = s.sample_id) WHERE qs.questionnaire_id = 1 (Please replace the last line: WHERE qs.questionnaire_id = 1 with the questionnaire_id of the questionnaire you are running - see the questionnaire table to confirm this) Regards, Adam Zammit On 14 March 2010 10:49, Saulius <zi...@ce...> wrote: > Hi Adam, > > > Before I optimized (removed old values) Quexs tables had following number or records: > > Sample ~150 000 ( each sample record contains around 9 columns / it depends on a questionnaire ) > Sample_var ~ 850 000 > Questionnaire_sample_quota_row ~ 3140 ( 150 related to this particular questionnaire) > Call ~ 69000 > Case ~ 41000 > Operator ~ 358 ( though around 10 related to the questionnaire). > > Please let me know if you need more information. > > Regards, > Saulius > > > > Nuo: "Adam Zammit" > Siųsta: 2010 03 11 05:07 > Tema: Re: [Quexs-discuss] Performance issues > > > Dear Saulius,These queries are taking a long time.Can you let me know how many records and columns your sample containsand also how many quota's you have set up? These all effect how manyrows have to be analysed to provide the next case.One thing that may assist is to confirm that all possible fields thatare being analysed are indexed. I would have tried to do this whendesigning the database, but may have left out an important field whichcould slow it down.Regards,Adam ZammitOn 10 March 2010 09:06, Saulius wrote:>> Hi Chris,>> The graph is multiplying load information by 100. On the server load average now looks something like this: 0.81, 0.64, 0.53.>> So if on the graph is 200 ... this > would be equivalent to load of 2.0.>> Regards,> Saulius>>>> Nuo: "Chris Mylonas"> Siųsta: 2010 03 09 05:00> Tema: Re: Re: Re: [Quexs-discuss] Performance issues>> Hmmm..... your graph actually says 50.0 - 200.0 for load average. Not a %, nor a whole number like 0.50 - 2.0 ->> If you did the top command from the console (bash), and just watch it for 5 minutes. Could you confirm the load-average value. If it's >50.0, then you've got something a little more serious because that's a lot more than "300%">> 1. What processes are eating up the CPU/MEM - here httpd is taking up 7% per process - I'd be guessing your MySQL would be consuming a lot - unless > conference rooms are doing a lot of audio stream mixing?> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND> 6390 apache 15 0 72880 39m 7016 S 0.0 7.6 2:08.88 httpd> 6394 apache 17 0 76584 38m 6988 S 0.0 7.5 2:28.22 httpd> 6392 apache 15 0 72204 36m 6920 S 0.0 7.2 2:23.38 httpd> 11197 apache 16 0 69484 34m 6900 S 0.0 6.8 2:05.74 httpd> 6391 apache 15 0 66288 33m 6956 S 0.0 6.5 2:34.36 httpd> 13423 apache 15 0 70292 28m 6888 S 0.0 5.5 1:36.22 httpd> 25070 apache 15 0 64428 27m 7060 S 0.0 5.4 0:11.19 httpd> 6822 > apache 16 0 71112 26m 6960 S 0.0 5.2 2:27.80 httpd> 25072 apache 16 0 63512 26m 7204 S 0.0 5.2 0:12.51 httpd> 25074 apache 16 0 65276 25m 6896 S 0.0 5.0 0:12.10 httpd> 25071 apache 15 0 61380 24m 7040 S 0.0 4.8 0:12.89 httpd> 9270 apache 16 0 72904 23m 6980 S 0.0 4.7 1:15.86 httpd> 6393 apache 17 0 70636 23m 7076 S 0.0 4.7 2:21.22 httpd> 6397 apache 15 0 74720 22m 7044 S 0.0 4.5 2:17.07 httpd> 6395 apache 16 0 71900 21m 6944 S 0.0 4.3 2:25.41 httpd> 25075 apache 16 0 58468 21m 7084 S 0.0 4.2 0:13.05 > httpd>>> 2. What state is the CPU in most of the time?> Cpu(s): 1.3%us, 0.0%sy, 0.0%ni, 54.0%id, 44.7%wa, 0.0%hi, 0.0%si, 0.0%st>> If you've got a value that is high for "wa" then your bottleneck is I/O - the CPU is hungry waiting for data, and it can't get enough of it quick enough.>>>>>>>> On Tue, Mar 9, 2010 at 11:13 AM, Saulius zi...@ce...> wrote:> Hi Chris,>> I think I unintentionally misled you. The load reaches 3 (10min average) ..so I assumed that this is for all processors, but you are right if we have 4 proc. cores - 300% server load would be equivalent to 12.>> I have attached 30min average graph which shows some of the peeks (though not as big as 10min > average).>> When the load shoots up the system cannot take a next case ... operators have to wait for few minutes to get a case assigned to them. I tried to tune the database by removing old sample variables and the performance improved. The sample table is growing fast so im afraid i will run into the same issues again.>> BTW I noticed that Quexs system refreshes screen every few seconds. Is it really necessary? Would be more efficient only to refresh when the system tries to assign a case?>>> Regards,> Saulius>>> Nuo: "Chris Mylonas"> Siųsta: 2010 03 08 01:45> Tema: Re: Re: [Quexs-discuss] Performance issues>>>>> Thanks,>> So it had a load-average of about 12.> Out of interest, > was it affecting the quality of the calls?>> Cheers> Chris>>>>> On Mon, Mar 8, 2010 at 4:39 AM, Saulius zi...@ce...> wrote:> Hi Chris,>> The figure I mentioned is load average during 10min ( http://en.wikipedia.org/wiki/Load_%28computing%29)>>> Regards,> Saulius>>> Nuo: "Chris Mylonas"> Siųsta: 2010 03 07 05:26> Tema: Re: [Quexs-discuss] Performance issues>>> Hi,>> That seems very excessive. When you say 300%, where are you getting this figure from?>> Chris>>>>> On Sun, Mar 7, 2010 at 7:28 AM, Saulius zi...@ce...> wrote:> Hello,>> We have started a project using Quexs. I noticed that our server's load shoots up to 300% but only 10 operators are working at the same > time. The server specs are as follows ( dedicated server with 4 x Xeon processors & 8 GB RAM). I checked MySQL slow sql log and it shows that following SQLs are causing the server load.>>> It seems that the main "offender" is the case SQL as it tries to join so many tables using costly LEFT JOINs. Would be possible to tune this SQL or apply different strategy to select a next available case? Perhaps Operators, Operator skills, shifts, call restrictions checks can be done outside this SQL?>> -------------------------------------------------------> # Query_time: 27 Lock_time: 0 Rows_sent: 1 Rows_examined: 2203024> SET timestamp=1267623502;> SELECT c.case_id as > caseid,s.*,apn.*,a.*,sh.*,op.*,cr.*,si.*,CONVERT_TZ(NOW(), 'System' , s.Time_zone_name) as resptime>> FROM `case` as c> LEFT JOIN `call` as a on (a.call_id = c.last_call_id)> JOIN (sample as s, sample_import as si) on (s.sample_id = c.sample_id and si.sample_import_id = s.import_id)> JOIN (questionnaire_sample as qs, operator_questionnaire as o, questionnaire as q, operator as op, outcome as ou) on (c.questionnaire_id = q.questionnaire_id and op.operator_id = '216' and > qs.sample_import_id = s.import_id and o.operator_id = op.operator_id and o.questionnaire_id = qs.questionnaire_id and q.questionnaire_id = o.questionnaire_id and ou.outcome_id = c.current_outcome_id)> LEFT JOIN shift as sh on (sh.questionnaire_id = q.questionnaire_id and (CONVERT_TZ(NOW(),'System','UTC') >= sh.start) AND (CONVERT_TZ(NOW(),'System','UTC') CONVERT_TZ(NOW(),'System','UTC')))> LEFT JOIN appointment as apn on (apn.case_id = c.case_id AND apn.completed_call_id is NULL AND (CONVERT_TZ(NOW(),'System','UTC') >= apn.start) > AND (CONVERT_TZ(NOW(),'System','UTC') = cr.start and TIME(CONVERT_TZ(NOW(), 'System' , s.Time_zone_name)) Nėra pakitimų pranešimų: pridėti pranešimą') ELSE CONCAT('Rodyti pakitimų pranešimus') END AS link, c.completions as completions, CONCAT('Rodyti operatoriaus darbą') as operform> FROM `shift` as s> JOIN operator as o on (o.operator_id = '1')>> LEFT JOIN shift_report as sr on (sr.shift_id = s.shift_id)> LEFT JOIN ( SELECT count(*) as completions,sh.shift_id> FROM `call` as a, `case` as b, > shift as sh> WHERE a.outcome_id = '10'>> AND a.case_id = b.case_id> AND b.questionnaire_id = '1'>> AND sh.start = a.start>> GROUP BY sh.shift_id) as c on (s.shift_id = c.shift_id)> WHERE s.questionnaire_id = '1'>> GROUP BY shift_id> ORDER BY s.start ASC;>> ----------------------------->> # Query_time: 32 Lock_time: 0 Rows_sent: 1 Rows_examined: 905808> SET timestamp=1267518450;> SELECT s.sample_id as > sample_id,c.case_id as case_id,qs.questionnaire_id as questionnaire_id,CONVERT_TZ(NOW(), 'System' , s.Time_zone_name) as resptime, q.testing as testing> FROM sample as s> JOIN (questionnaire_sample as qs, operator_questionnaire as o, questionnaire as q, operator as op, sample_import as si, operator_skill as os) on (op.operator_id = '200' and qs.sample_import_id = s.import_id and o.operator_id = op.operator_id and o.questionnaire_id = qs.questionnaire_id and q.questionnaire_id = o.questionnaire_id and si.sample_import_id = s.import_id and os.operator_id = > op.operator_id and os.outcome_type_id = 1)>> LEFT JOIN `case` as c on (c.sample_id = s.sample_id and c.questionnaire_id = qs.questionnaire_id)> LEFT JOIN call_restrict as cr on (cr.day_of_week = DAYOFWEEK(CONVERT_TZ(NOW(), 'System' , s.Time_zone_name)) and TIME(CONVERT_TZ(NOW(), 'System' , s.Time_zone_name)) >= cr.start and TIME(CONVERT_TZ(NOW(), 'System' , s.Time_zone_name)) = sh.start) AND (CONVERT_TZ(NOW(),'System','UTC') http://p.sf.net/sfu/intel-sw-dev>>>> _______________________________________________> Quexs-discuss mailing > list> Que...@li...> https://lists.sourceforge.net/lists/listinfo/quexs-discuss>>>> --> ======================>> Chris Mylonas>> Web Systems Administrator> Omnium Research Group> College Of Fine Arts> University of NSW> www.omnium.net.au>> Communications, Internetworking, Software Design & Prototyping> Consultant> www.mrvoip.com.au>> Open Source Software Projects> Founder> www.opencsta.org> www.nursepaging.com>> ======================>>>>>>>> --> ======================>> Chris Mylonas>> Web Systems Administrator> Omnium Research Group> College Of Fine Arts> University of NSW> www.omnium.net.au>> Communications, Internetworking, Software Design & Prototyping> Consultant> > www.mrvoip.com.au>> Open Source Software Projects> Founder> www.opencsta.org> www.nursepaging.com>> ======================>>>> dfghdg>>> --> ======================>> Chris Mylonas>> Web Systems Administrator> Omnium Research Group> College Of Fine Arts> University of NSW> www.omnium.net.au>> Communications, Internetworking, Software Design & Prototyping> Consultant> www.mrvoip.com.au>> Open Source Software Projects> Founder> www.opencsta.org> www.nursepaging.com>> ======================>>>> dfghdg>> ------------------------------------------------------------------------------> Download Intel® Parallel Studio Eval> Try the new software tools for yourself. Speed compiling, find bugs> > proactively, and fine-tune applications for parallel performance.> See why Intel Parallel Studio got high marks during beta.> http://p.sf.net/sfu/intel-sw-dev> _______________________________________________> Quexs-discuss mailing list> Que...@li...> https://lists.sourceforge.net/lists/listinfo/quexs-discuss> > > > |