From: Sandeep G. <gup...@gm...> - 2014-02-05 00:20:18
Hi Koichi,

Thank you for suggesting these parameters. Initially we did play around with
these. However, we used significantly higher values, such as
checkpoint_timeout=30min. Essentially we were trying parameters so as to
avoid interference from autovacuum in the first place. The reason for using
low values was to recreate the problem in the test setup.

I ran the regression tests with the new settings and it is certainly better.
It does crash, but not as often. I will try these settings in the main
application and see if it runs.

Also, I am running the same tests over a standalone PG (version 9.3, I
believe) and so far it has crashed. I haven't been careful to use exactly
the same values for the checkpoint parameters.

In my next email I will attach log files for review.

Thanks.
Sandeep

On Tue, Feb 4, 2014 at 8:34 AM, Koichi Suzuki <koi...@gm...> wrote:

> I looked at the log at datanode and found checkpoint is running too
> frequently. Default checkpoint timeout is 5min. In your case,
> checkpoint runs almost every five seconds (not minutes) in each
> datanode. It is extraordinary.
>
> Could you try to tweak each datanode's postgresql.conf as follows?
>
> 1. Longer period for checkpoint_timeout. Default is 5min. 15min
> will be okay.
> 2. Larger value for checkpoint_completion_target. Default is 0.5.
> It should be okay. Larger value, such as 0.7, will make
> checkpoint work more smoothly.
> 3. Larger value of checkpoint_segments. Default is 3. Because your
> application updates the database very frequently, this number of
> checkpoint segments will be exhausted very easily. Increase this to,
> say, 30 or more. Each checkpoint segment (in fact, WAL file)
> consumes 16MB of your file space. I hope this is no problem to you at
> all.
>
> I'm afraid too frequent checkpoint causes this kind of error (even
> with vanilla PostgreSQL) and this situation is what you should avoid
> both in PG and XC.
>
> Would like to know if things are improved.
>
> Best;
> ---
> Koichi Suzuki
>
>
> 2014-02-04 Sandeep Gupta <gup...@gm...>:
> > Hi Koichi,
> >
> > Just wanted to add that I have sent across the datanode and coordinator log
> > files in my previous email. My hope is that it may give some insights into
> > what could be amiss and any ideas for a workaround.
> >
> >
> > Thanks.
> > Sandeep
> >
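
For reference, the three settings suggested in the quoted message could be written in each datanode's postgresql.conf roughly as below. This is only a sketch using the example values from that message (15min, 0.7, 30); the values that actually suit a given workload and disk are an assumption to be checked by testing. Note that the parameter is spelled checkpoint_segments (plural) in PostgreSQL releases of this era; it was replaced by max_wal_size in 9.5.

```conf
# Sketch only: example values taken from the suggestion quoted above.
checkpoint_timeout = 15min            # default is 5min
checkpoint_completion_target = 0.7    # default is 0.5
checkpoint_segments = 30              # default is 3; each WAL segment file is 16MB
```

These three parameters do not require a full restart; reloading the configuration on each datanode (for example with pg_ctl reload) should be enough for them to take effect.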