From: Jason H. <jh...@ap...> - 2010-01-08 04:05:24
|
Sounds like SQLite isn't playing well with multiple unicorn processes. I've only used SQLite for development, but I did a bit of reading about using SQLite in production. The basic recommendation seems to be to increase the timeout setting in the production section of database.yml. Only one process can have the database file open for writing at one time; any other process trying to open it for writing has to wait, and gets the BusyException if the timeout expires first. The default is 5000 ms (5s). You might try cranking it up to 15000 or 20000 (15-20s) and see if that helps.

Folks generally seem to think SQLite can handle a fair bit of traffic, but if bumping the timeout up doesn't work you might consider switching to MySQL or the like. We use MySQL. Obviously it's not quite as trivial to set up as SQLite, but it has worked pretty flawlessly for us.

Jason

On Jan 6, 2010, at 11:30 AM, Kenneth Williams wrote:

> I'm seeing a ton of "SQLite3::BusyException" errors followed by a 500 internal server error in the logs. Nothing else that stands out though, unless I'm missing something. Would you like to see the trace output that follows this error? Also, I'm curious if sqlite is a good option long term? I've never used it before, usually sticking to MySQL or Oracle instead. Thanks again for your help on this.
>
> /usr/lib/ruby/site_ruby/1.8/facter/ec2.rb:8: warning: method redefined; discarding old can_connect?
> /usr/lib/ruby/site_ruby/1.8/facter/ec2.rb:16: warning: method redefined; discarding old metadata
> SQLite3::BusyException: database is locked: UPDATE "facts" SET "value" = '1235105', "updated_at" = '2010-01-06 01:50:03' WHERE "id" = 2336
> 500 "Internal Server Error"
> Error submitting results:
> <html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <title>Action Controller: Exception caught</title>
> <style>
> body { background-color: #fff; color: #333; }
>
> body, p, ol, ul, td {
> font-family: verdana, arial, helvetica, sans-serif;
> font-size: 13px;
> line-height: 18px;
> }
>
> pre {
> background-color: #eee;
> padding: 10px;
> font-size: 11px;
> }
>
> a { color: #000; }
> a:visited { color: #666; }
> a:hover { color: #fff; background-color:#000; }
> </style>
> </head>
> <body>
>
> <h1>
> ActiveRecord::StatementInvalid
>
> in ResultsController#create
>
> </h1>
> <pre>SQLite3::BusyException: database is locked: UPDATE "clients" SET "updated_at" = '2010-01-06 01:50:03' WHERE "id" = 35</pre>
>
>
> On Tue, Jan 5, 2010 at 5:55 PM, Jason Heiss <jh...@ap...> wrote:
> Ah, right, sorry. If the client's status is good (0) and nothing needed to be changed then there will also be 0 results.
>
> Jason
>
> On Jan 5, 2010, at 5:38 PM, Kenneth Williams wrote:
>
>> I was under the impression that sometimes 0 results was okay if, for instance, there were no files that needed to be updated? Like this host for example has a status of 0, even though it received 0 results on the last run:
>>
>> Client: web073
>>
>> Name: web073
>> Status: 0
>>
>>      Time                     Hours Ago  # of Results  Total Message Size
>> View 2010-01-06 01:30:36 UTC  0          0             0
>> View 2010-01-06 00:30:36 UTC  1          2             2754
>> View 2010-01-05 23:30:36 UTC  2          0             0
>>
>> Cron is logging on 10 hosts, I'll check back on it after it's had time to gather some logs. Thanks for the help!
>>
>> On Tue, Jan 5, 2010 at 5:19 PM, Jason Heiss <jh...@ap...> wrote:
>> Yeah, so that confirms a connection error.
>> Since the client couldn't connect to the server it did not receive any configuration for any files, and thus submitted 0 results, just an overall status message indicating the failure. Hopefully capturing the output from the cron job will be informative.
>>
>> Jason
>>
>> On Jan 5, 2010, at 4:20 PM, Kenneth Williams wrote:
>>
>>> Yeah the status value is 1 for the broken clients. What's really strange is the most recent entries in the timeline links are empty or successful, no failures. For example, here's the most recent one...
>>>
>>> Client: web077
>>>
>>> Name: web077
>>> Status: 1
>>>
>>>      Time                     Hours Ago  # of Results  Total Message Size
>>> View 2010-01-05 23:41:44 UTC  0          0             0
>>> View 2010-01-05 22:41:44 UTC  1          0             0
>>> View 2010-01-05 21:41:44 UTC  2          1             1226
>>> View 2010-01-05 20:41:44 UTC  3          2             3294
>>> View 2010-01-05 19:41:44 UTC  4          0             0
>>> View 2010-01-05 18:41:44 UTC  5          0             0
>>>
>>> If I click the view link for hours 0 or 1 I get "We could not find any results in the system for that search." The only time something shows up is when I put a change on my etch server, like with hours 2 or 3, and those look fine:
>>>
>>> Results
>>> View all these combined
>>>      Client  File                           Time                     Success  Message Size
>>> View web077  /etc/httpd/conf.d/vhosts.conf  2010-01-05 21:20:50 UTC  true     1226
>>>
>>> I'll change my crontab to log output instead of /dev/null and see what I get.
>>>
>>> Thanks for the additional info about how yours is set up, it's helpful to know I've got this mostly set up the way it should be ;)
>>>
>>> On Mon, Jan 4, 2010 at 1:37 PM, Jason Heiss <jh...@ap...> wrote:
>>> What status value do the broken clients report? 1?
>>>
>>> If you're looking at a broken client in the web UI, is the "Message" field empty? If so, click the "24 hrs" timeline link, then the "0" hours ago "View" link, do any of the files show "false" in the "Success" column?
>>>
>>> The client will return a non-zero status to the server if it encounters any form of Ruby exception while processing.
>>> This would be a failure to connect to the etch server or some error processing the configuration data sent by the server. In your case some sort of error connecting to the server seems most likely, although interestingly those clients are able to connect to report their results. Looking over the code, it seems like currently the message associated with any sort of connection error is printed to stderr, but not sent to the server. In which case you'd have "broken" clients with a status of 1 but no message. Is your cron job sending stdout/stderr to /dev/null? You might try letting a few clients email that to you or dump it to a file to see if you can catch the error.
>>>
>>> I'll modify the client code to add the exception message to the message sent to the server.
>>>
>>> FWIW, we run unicorn with 20 workers in our production environment. Behind nginx, although as you indicated the front-end web proxy doesn't seem to make a difference.
>>>
>>> I concur that the warning from facter is likely unrelated.
>>>
>>> Jason
>>>
>>> On Dec 30, 2009, at 2:39 PM, Kenneth Williams wrote:
>>>
>>>> Hi all!
>>>>
>>>> I've started moving out of my test environment and beginning to move to production use. As part of that I've gone from using unicorn with one worker to testing four workers and an Apache proxy. Everything seems to work, and scales better when deploying to more hosts as you'd expect, but the etch dashboard reports hosts as broken using this setup. I've tested it in various combinations, using just unicorn without apache and multiple workers directly, and with apache using multiple masters with only one worker. The only setup I can get working without hosts being listed as broken is one master with one worker. Unfortunately, and as you could probably guess, it takes an eternity to push changes using only one worker once you throw in more than just a couple hosts...
>>>> Apache as a proxy does not seem to make a difference; accessing unicorn through its own port or through the Apache proxy makes no noticeable change in the number of broken hosts. In the end I'd like Apache to proxy to multiple unicorn masters on different hosts, but right now I'd settle for being able to have more than one worker running ;)
>>>>
>>>> The list of "broken" hosts steadily increases over the day at around the ten minute interval when etch client kicks off from cron. It starts off with just a few in a pool of 40 hosts listed as broken and goes up from there by one or two hosts every ten minutes. It seems to stop around 25 +/- 3 "broken" hosts, and the hosts will alternate at the ten minute interval. If I put a change in my etch source directory it does get pushed out to the hosts, even the ones listed as broken, and if I log into a broken host and run etch manually it runs fine, except for two warnings. When running etch client manually it removes the host from the broken list, only to add it back in later. I've always ignored the warning because it did not seem to have any impact under the previous test setup. It seemed to have cropped up when I upgraded from 3.11 to the ruby gem 3.13 version. There are two hosts still running the 3.11 client that don't produce this warning, but they're also subject to being listed as broken along with the others. Just in case it's important, the warning is:
>>>>
>>>> /usr/lib/ruby/site_ruby/1.8/facter/ec2.rb:8: warning: method redefined; discarding old can_connect?
>>>> /usr/lib/ruby/site_ruby/1.8/facter/ec2.rb:16: warning: method redefined; discarding old metadata
>>>>
>>>> I don't think this is related to my problem though. The etch client command I'm running that produces this is:
>>>>
>>>> /usr/bin/etch --generate-all --server http://etch:8080/
>>>>
>>>> Otherwise there are no errors produced by the etch client.
>>>> Port 8080 is running through the Apache proxy, behind it is currently only one unicorn master with 20 workers. I'm running etch client version 3.13 on the nodes, and on the server I'm running 3.11. Please let me know if you need any additional details, any help is truly appreciated. Thanks!!
>>>>
>>>> --
>>>> Kenneth Williams
>>>> _______________________________________________
>>>> etch-users mailing list
>>>> etc...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/etch-users
>>>
>>> --
>>> Kenneth Williams
>>
>> --
>> Kenneth Williams <www.krw.info>
>> No man's life, liberty, or property are safe while the legislature is in session. - Mark Twain
>
> --
> Kenneth Williams <www.krw.info>
> No man's life, liberty, or property are safe while the legislature is in session. - Mark Twain
|
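
For anyone who wants to see the effect of the timeout advice at the top of this message in isolation, here is a minimal sketch using Python's standard sqlite3 module rather than the Rails/etch stack. It is an illustration under assumptions, not how etch or ActiveRecord is implemented: the `facts` table name is borrowed from the quoted error, while `hold_write_lock`, `try_write`, `demo`, and the file name are hypothetical. One connection holds the write lock for half a second; a second writer is tried once with a short busy timeout and once with a long one.

```python
import os
import sqlite3
import tempfile
import threading
import time


def hold_write_lock(db_path, hold_s, ready):
    """Stand-in for a busy worker: take SQLite's write lock, keep it, release it."""
    conn = sqlite3.connect(db_path, isolation_level=None)  # manual transactions
    conn.execute("BEGIN IMMEDIATE")  # acquires the write lock right away
    ready.set()                      # tell the caller the lock is now held
    time.sleep(hold_s)
    conn.execute("COMMIT")           # releases the lock
    conn.close()


def try_write(db_path, timeout_s):
    """Attempt one UPDATE; return False if the database stayed locked too long."""
    # Python's timeout is in seconds; it sets SQLite's busy timeout.
    conn = sqlite3.connect(db_path, timeout=timeout_s)
    try:
        conn.execute("UPDATE facts SET value = 'new' WHERE id = 1")
        conn.commit()
        return True
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc):     # "database is locked"
            return False
        raise
    finally:
        conn.close()


def demo():
    """Run the same contended write with a short and a long busy timeout."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.sqlite3")  # hypothetical file name
        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE facts (id INTEGER PRIMARY KEY, value TEXT)")
        setup.execute("INSERT INTO facts (id, value) VALUES (1, 'old')")
        setup.commit()
        setup.close()

        for label, timeout_s in (("short", 0.05), ("long", 5.0)):
            ready = threading.Event()
            holder = threading.Thread(
                target=hold_write_lock, args=(db_path, 0.5, ready)
            )
            holder.start()
            ready.wait()             # make sure the lock is held before writing
            results[label] = try_write(db_path, timeout_s)
            holder.join()
    return results


if __name__ == "__main__":
    print(demo())
```

With the lock held for 0.5s, the short timeout should give up with "database is locked" while the long one should wait and succeed. Note the unit difference: the database.yml value discussed above is in milliseconds, whereas Python's `timeout` argument is in seconds. A longer timeout only helps while writers finish within it; it does not add write concurrency.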