From: SourceForge.net <no...@so...> - 2004-12-26 05:28:24
|
Support Requests item #1090907, was opened at 2004-12-24 12:15
Message generated for change (Comment added) made by rmeden
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=424136&aid=1090907&group_id=39046

Category: tv_grab_na_dd
Group: None
Status: Open
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Robert Eden (rmeden)
Summary: tv_grab_na_dd uses 100% cpu doing nothing..

Initial Comment:
downloaded: xmltv-0.5.37.tar.bz2, make install...
signed up on labs.zap2it.com
ran tv_grab_na_dd --configure
entered name/pass
ran tv_grab_na_dd:

using config filename /home/x/.xmltv/tv_grab_na_dd.conf
WARNING: Password in config file, protect as required
Fetching from DataDirect:
Fetched 23443 k/bytes in 86 seconds
##################################################
loading data: ###############################

And it sits there using 100% cpu... doing nothing.

----------------------------------------------------------------------

>Comment By: Robert Eden (rmeden)
Date: 2004-12-25 23:28

Message:
Logged In: YES
user_id=270469

I just tried it on ExpressVu on my system...

"Fetching from data direct". Downloaded 27MB of data in 73 seconds. That's 375 Kbytes/sec. Looks reasonable to me (my DSL is about 120 Kbytes/sec, but the data is compressed). During the fetch there is no status bar because it's a single call to the SOAP module.. no way to do it. There is a quick status bar afterwards. I actually didn't add that, it was added during the GUI migration... it really doesn't serve any purpose, but I think they wanted it.

"Loading Data". This is the first pass through the data, building cross-reference data structures. 100% CPU. Memory grew to 188M Virtual, 184 Resident (Resident = Virtual means no paging). Status updated every 10 seconds or so.

"Writing Schedule". Second pass through the data and writing of the programs. Status bar updated every 20 seconds or so. 100% CPU.

For me, there were 92k programs processed in 621 seconds.
(Pentium 4, 3 GHz)

The raw DD file has 812k records to index and report. It's a lot of work. The system never appeared to hang; the status bars were updating.

My suggestion is to reduce the number of channels (and hence the amount of data) at the DataDirect web site, or simply wait.

I know the XML libraries used aren't the most CPU efficient, but I do try to be CPU efficient in my code. I'm open to suggestions of course.

Robert

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2004-12-25 17:27

Message:
Logged In: NO

I have most of my ram free when running it.. it isn't swapping.

=== from top ===
Mem:  643136k av, 631664k used, 11472k free, 0k shrd, 229028k buff
      363300k actv, 17976k in_d, 9420k in_c
Swap: 128512k av, 22128k used, 106384k free 110716k cached

  PID USER PRI NI  SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
28594 user  25  0 98812 96M  2352 R    99.4 15.3 5:17   0 tv_grab_na_dd
===

When I run it, it says:

Fetching from DataDirect:

and sits there. Then, when it is done:

Fetched 15744 k/bytes in 41 seconds
##################################################

The progress bar just spits out at once at the end? Not really representing progress during the operation.

Then, it sits there again seemingly doing nothing at 100% cpu... Then it spits out:

loading data: ###################

Then sits there again... hard to tell if there is progress. Maybe one "#" is added every minute or so but hard to notice..

Anyway, I can compile gcc faster than this.. If grabbing the data from the service is so quick, processing it really shouldn't take 30 minutes at 100% cpu. I really wonder about this.

I tried "strace -p <pid>" to see what it was doing.. it is just reading really slowly (or seemed slow since my cpu is pegged). I don't understand how it can be this slow.. if I had more time I'd look into it some more.

Have you tried this data direct service on linux?
Sign up for bell expressvu (canada satellite) and all the channels and maybe try it and see.

I have used all the recommended (latest) perl modules required by xmltv.

----------------------------------------------------------------------

Comment By: Robert Eden (rmeden)
Date: 2004-12-24 22:37

Message:
Logged In: YES
user_id=270469

hmmm that doesn't sound excessive. There were memory issues on Windows, but I haven't heard of a lot of thrashing under Linux.

How much free memory did your system have while it was running? Was there a lot of paging going on?

23MB is a pretty huge pull from Data Direct. No telling how long it would have taken you with the old grabber!

At what point did it appear to hang? I can't control the fetch (it's in the SOAP modules), but I'm pretty sure I update the status bars for everything else. I'm concerned there could be a bug somewhere...

Robert

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2004-12-24 19:00

Message:
Logged In: NO

I have 640 megs of ram on this p3-450 running linux (used as a router). It only used 120 megs according to 'top'.

I cut down the number of channels, and tried again. It took 30 minutes at full 100% cpu use. The .xml file it output was 27 megs.

I'll try to split it up next time, but this still seems quite extreme-- and I had no real indication of progress for a long time so I guess I thought it was locked up.
Fetching from DataDirect:
Fetched 15759 k/bytes in 45 seconds
##################################################
loading data: ##################################################
WARNING: multiple channel mappings for 'APTN'
WARNING: Multiple channel mappings found, please adjust DataDirect lineup
Writing schedule: ##################################################
MESSAGE: Your subscription will expire: 2005-03-24T17:37:40Z
Downloaded 46008 programs in 1818 seconds

----------------------------------------------------------------------

Comment By: Robert Eden (rmeden)
Date: 2004-12-24 12:22

Message:
Logged In: YES
user_id=270469

how much memory do you have? what OS are you running?

*WOW* 23MB pulled from data direct? That's going to take a *LOT* of memory to process.

You may want to try breaking it up into multiple days, reducing the number of data direct channels.

Robert

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=424136&aid=1090907&group_id=39046
|