Re: [Postgres-xc-general] tracing cause of datanode crash

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Hi Koichi,

 Thanks for taking a look. I did some further debugging and I won't rule
out the possibility of the the OS killing some postgres processes which
leads to copy execution reaching in a inconsistent state.
Will do some further tests to confirm this as well.

-Sandeep

On Mon, Nov 18, 2013 at 1:06 AM, Koichi Suzuki <koi...@gm...>wrote:

> This happens when the message is not null-terminated properly.   Most
> case is a bug.   In this case, different from vanilla PostgreSQL, copy
> data is involved in.    Please let me find a time to look into it.
>
> Regards;
> ---
> Koichi Suzuki
>
>
> 2013/11/16 Sandeep Gupta <gup...@gm...>:
> > Forgot to mention  that the error is still the same "invalid string in
> > message".
> >
> > -Sandeep
> >
> >
> >
> > On Fri, Nov 15, 2013 at 8:27 PM, Sandeep Gupta <gup...@gm...>
> > wrote:
> >>
> >> Hi Koichi,
> >>
> >>  We tried the  copy with the patch but it did not fix the problem for
> us.
> >> If you need any additional details then let me know.
> >>
> >> Best,
> >> -Sandeep
> >>
> >>
> >> On Fri, Nov 15, 2013 at 6:13 AM, Sandeep Gupta <gup...@gm...
> >
> >> wrote:
> >>>
> >>> Hi Koichi,
> >>>
> >>>  Many thanks for sharing the patch. We would test this in our
> environment
> >>> and would report back on the result.
> >>>
> >>> Regarding the very high memory consumption, it is not the coordinator
> >>> that sees the high memory usage.
> >>> The setup is following: We have one coordinator and 4 datanodes. All
> the
> >>> datanodes run on separate machine
> >>> then the coordinator.  It is the machine where the datanodes are
> running
> >>> that have 48GB memory usage.
> >>> The other caveat is that that "top" doesn't show that it is the
> datanodes
> >>> processes using the memory.
> >>>
> >>> In any case, the last run of copy even with the high usage went through
> >>> fine.
> >>>
> >>>
> >>> Best,
> >>> Sandeep
> >>>
> >>>
> >>>
> >>> On Fri, Nov 15, 2013 at 2:29 AM, Koichi Suzuki <koi...@gm...>
> >>> wrote:
> >>>>
> >>>> Yes, I've already found this kind of failure.   When disk write is
> >>>> quite slower than read, coordinator overshoots.  However, in this
> >>>> case, I thought we will see that palloc() overflows.   If each piece
> >>>> of COPY TO text data is less than 1GB, then each COPY TO succeeds.
> >>>> If the write is much slower than read,  each coordinator will allocate
> >>>> memory for each COPY TO cache going to datanodes, ending up with
> >>>> enormous amount of memory consumption and have a good chance to run
> >>>> OOM killer.    Your memory usage looks enormous.    Maybe you have
> >>>> many COPY TO instances each of which allocates too much memory.
> >>>>
> >>>> Attached patch fixes COPY TO.   Because we have a chance to have
> >>>> similar overshoots in COPY FROM, I'd like to write a patch to fix this
> >>>> too before committing.   This is for the current master but will work
> >>>> with 1.1.x (even with 1.0.x hopefully).
> >>>>
> >>>> The patch looks to be fine.   I asked to test it in such an
> >>>> environment and they reported the patch fixed their issue.
> >>>>
> >>>> I'm not sure but hope this works in your case.   May reduce memory
> >>>> consumption drastically.   Please let me know the result.
> >>>>
> >>>> I have to allocate some of my resource to fix the same issue in COPY
> >>>> FROM.    I have another request to fix this so if the patch works for
> >>>> you, I may be able to commit only COPY TO fix first.
> >>>>
> >>>> Regards;
> >>>> ---
> >>>> Koichi Suzuki
> >>>>
> >>>>
> >>>> 2013/11/15 Sandeep Gupta <gup...@gm...>:
> >>>> > Hi Koichi,
> >>>> >
> >>>> >  I am using the copy command over the text file to populate a table.
> >>>> >
> >>>> > Sorry for the confusion. I meant 628 -- 638 lines in the pqformat.c.
> >>>> > Let me explain this a bit more. I get  error in some of the datanode
> >>>> > logfiles
> >>>> > as "invalid string in message". The message is mentioned only in
> >>>> > pqformat.c
> >>>> > at line 628. Although I am not able to trace back as to where  error
> >>>> > may
> >>>> > have first ordered in the datanode that triggered this error.
> >>>> >
> >>>> > In order to debug this I started printing out the rows the
> coordinator
> >>>> > is
> >>>> > sending to the datanode (in function DataNodeCopyIn). For some
> reason
> >>>> > the
> >>>> > error invalid string in messge error has  not happened so far.  It
> >>>> > seems
> >>>> > there is some issue with the flow control of message between
> >>>> > coordinator and
> >>>> > datanode.
> >>>> > Hope this helps in narrowing down what is causing the problem in the
> >>>> > first
> >>>> > place.
> >>>> >
> >>>> >
> >>>> > However, the new problem I see is that the memory usage goes very
> >>>> > high. Of
> >>>> > the 48 GB almost all of the memory is getting used. Is this normal?
> >>>> > If not can you give suggestion as to how to keep the memory
> >>>> > consumption
> >>>> > low.
> >>>> >
> >>>> > -Sandeep
> >>>> >
> >>>> >
> >>>> >
> >>>> > On Thu, Nov 14, 2013 at 8:57 PM, Koichi Suzuki <
> koi...@gm...>
> >>>> > wrote:
> >>>> >>
> >>>> >> Could you let me know what "line 682 -- 637" means?
> >>>> >>
> >>>> >> Because any communication between coordinator and datanode is done
> >>>> >> using libpq, I don't think XC populates information using raw
> >>>> >> stringinfo.   In fact, from datanodes, coordinator is just a client
> >>>> >> except for supplying XID and snapshot.   However,  COPY TO uses
> >>>> >> send()/receive(), not through libpq, to send data to datanode.  In
> >>>> >> this case, if incoming data file contains NULL in the data, such
> data
> >>>> >> may be sent via send() to datanodes.   COPY TO assumes incoming
> data
> >>>> >> in a text format.   Do you think you have such situations?
> >>>> >>
> >>>> >> Analysis may take a bit but let me try.
> >>>> >>
> >>>> >> Regards;
> >>>> >> ---
> >>>> >> Koichi Suzuki
> >>>> >>
> >>>> >>
> >>>> >> 2013/11/15 Sandeep Gupta <gup...@gm...>:
> >>>> >> > Hi Koichi, Masataka,
> >>>> >> >
> >>>> >> >  Thanks for taking a look. I double checked both memory usage and
> >>>> >> > the
> >>>> >> > dmesg
> >>>> >> > log. The kill is not happening due to the out of memory. The
> kernel
> >>>> >> > doesn't
> >>>> >> > show any processes being killed.
> >>>> >> >
> >>>> >> > My current suspicion is that something amiss is happening over
> the
> >>>> >> > network.
> >>>> >> > This happens consistently when performing copy over large
> datasets.
> >>>> >> > Few
> >>>> >> > of
> >>>> >> > datanodes end up with "invalid string in message".
> >>>> >> > Going over the code in pqformat.c this happens because the
> >>>> >> > invariant
> >>>> >> > "StringInfo is guaranteed to have a trailing null byte" is not
> >>>> >> > valid
> >>>> >> > (line
> >>>> >> > 682 -- 637). I am not sure why this happens.
> >>>> >> >
> >>>> >> >
> >>>> >> > After this the datanodes do something illegal (probably buffer
> >>>> >> > overflow
> >>>> >> > on
> >>>> >> > the port..just a guess) that the OS decides to issue a sigkill.
>  I
> >>>> >> > am
> >>>> >> > also
> >>>> >> > not sure why the OS is not logging this killed operation.
> >>>> >> >
> >>>> >> >
> >>>> >> > Please let me know of  suggestions you may have regarding the
> >>>> >> > "invalid
> >>>> >> > string in message". In particular, where in the code is
> StringInfo
> >>>> >> > msg
> >>>> >> > is
> >>>> >> > getting populated i.e. which code reads the message over the
> >>>> >> > network and
> >>>> >> > keeps it ready for the copy operation on the datanode. Any other
> >>>> >> > suggestion
> >>>> >> > would also be helpful to trace this bug.
> >>>> >> >
> >>>> >> > Best,
> >>>> >> > Sandeep
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> > On Thu, Nov 14, 2013 at 10:25 AM, Masataka Saito <
> pg...@gm...>
> >>>> >> > wrote:
> >>>> >> >>
> >>>> >> >> Hi,
> >>>> >> >>
> >>>> >> >> I think I may be onto something.
> >>>> >> >>
> >>>> >> >> When the memory is exhausted, the Linux kernel kills process
> >>>> >> >> randomly
> >>>> >> >> by SIGKILL.
> >>>> >> >> The mechanism is called OOM(Out Of Memory) killer.
> >>>> >> >> OOM killer logs its activity to kernel mesage buffer.
> >>>> >> >>
> >>>> >> >> Could you check display message (dmesg) and memory resource?
> >>>> >> >>
> >>>> >> >> Regards.
> >>>> >> >>
> >>>> >> >> On Tue, Nov 12, 2013 at 3:55 AM, Sandeep Gupta
> >>>> >> >> <gup...@gm...>
> >>>> >> >> wrote:
> >>>> >> >> > Hi Koichi,
> >>>> >> >> >
> >>>> >> >> >    It is a bit of mystery  because it does not happen
> >>>> >> >> > consistently.
> >>>> >> >> > Thanks
> >>>> >> >> > for the clarification though. It is indeed helpful.
> >>>> >> >> >
> >>>> >> >> > -Sandeep
> >>>> >> >> >
> >>>> >> >> >
> >>>> >> >> >
> >>>> >> >> >
> >>>> >> >> > On Mon, Nov 11, 2013 at 1:58 AM, 鈴木 幸市
> >>>> >> >> > <ko...@in...>
> >>>> >> >> > wrote:
> >>>> >> >> >>
> >>>> >> >> >> Someone is sending SIGKILL to coordinator/datanode backend
> >>>> >> >> >> processes.
> >>>> >> >> >> Although XC (originally PG) code has a handler for SIGKILL, I
> >>>> >> >> >> didn’t
> >>>> >> >> >> found
> >>>> >> >> >> any code in XC sending SIGKILL to other processes.   I’m
> afraid
> >>>> >> >> >> there
> >>>> >> >> >> could
> >>>> >> >> >> be another process sending SIGKILL to them.
> >>>> >> >> >>
> >>>> >> >> >> Best Regards;
> >>>> >> >> >> ---
> >>>> >> >> >> Koichi Suzuki
> >>>> >> >> >>
> >>>> >> >> >> 2013/11/09 2:54、Sandeep Gupta <gup...@gm...>
> のメール：
> >>>> >> >> >>
> >>>> >> >> >> > I am running single instance of postgres-xc with  several
> >>>> >> >> >> > datanode
> >>>> >> >> >> > Each data node runs on its own machine.
> >>>> >> >> >> > After instantiating the cluster the database sits idle. I
> do
> >>>> >> >> >> > not
> >>>> >> >> >> > perform
> >>>> >> >> >> > any create table or insert operations. However, after some
> >>>> >> >> >> > time
> >>>> >> >> >> > all
> >>>> >> >> >> > the
> >>>> >> >> >> > datanode  automatically shuts down (log messages attached).
> >>>> >> >> >> >
> >>>> >> >> >> > Any pointers as to why this maybe happening would be very
> >>>> >> >> >> > useful.
> >>>> >> >> >> >
> >>>> >> >> >> > -Sandeep
> >>>> >> >> >> >
> >>>> >> >> >> > The log files across all datanode contain various error
> >>>> >> >> >> > messages.
> >>>> >> >> >> > These
> >>>> >> >> >> > include:
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> > LOG:  statistics collector process (PID 23073) was
> terminated
> >>>> >> >> >> > by
> >>>> >> >> >> > signal
> >>>> >> >> >> > 9: Killed
> >>>> >> >> >> > LOG:  WAL writer process (PID 23071) was terminated by
> signal
> >>>> >> >> >> > 9:
> >>>> >> >> >> > Killed
> >>>> >> >> >> > LOG:  terminating any other active server processes
> >>>> >> >> >> > WARNING:  terminating connection because of crash of
> another
> >>>> >> >> >> > server
> >>>> >> >> >> > process
> >>>> >> >> >> > DETAIL:  The postmaster has commanded this server process
> to
> >>>> >> >> >> > roll
> >>>> >> >> >> > back
> >>>> >> >> >> > the current transaction and exit, because another server
> >>>> >> >> >> > process
> >>>> >> >> >> > exited
> >>>> >> >> >> > abnormally and possibly corrupted shared memory.
> >>>> >> >> >> > HINT:  In a moment you should be able to reconnect to the
> >>>> >> >> >> > database
> >>>> >> >> >> > and
> >>>> >> >> >> > repeat your command.
> >>>> >> >> >> > LOG:  all server processes terminated; reinitializing
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> > LOG:  checkpointer process (PID 21406) was terminated by
> >>>> >> >> >> > signal 9:
> >>>> >> >> >> > Killed
> >>>> >> >> >> > LOG:  terminating any other active server processes
> >>>> >> >> >> > WARNING:  terminating connection because of crash of
> another
> >>>> >> >> >> > server
> >>>> >> >> >> > process
> >>>> >> >> >> > DETAIL:  The postmaster has commanded this server process
> to
> >>>> >> >> >> > roll
> >>>> >> >> >> > back
> >>>> >> >> >> > the current transaction and exit, because another server
> >>>> >> >> >> > process
> >>>> >> >> >> > exited
> >>>> >> >> >> > abnormally and possibly corrupted shared memory.
> >>>> >> >> >> > LOG:  statistics collector process (PID 28881) was
> terminated
> >>>> >> >> >> > by
> >>>> >> >> >> > signal
> >>>> >> >> >> > 9: Killed
> >>>> >> >> >> > LOG:  autovacuum launcher process (PID 28880) was
> terminated
> >>>> >> >> >> > by
> >>>> >> >> >> > signal
> >>>> >> >> >> > 9: Killed
> >>>> >> >> >> > LOG:  terminating any other active server processes
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> ------------------------------------------------------------------------------
> >>>> >> >> >> > November Webinars for C, C++, Fortran Developers
> >>>> >> >> >> > Accelerate application performance with scalable
> programming
> >>>> >> >> >> > models.
> >>>> >> >> >> > Explore
> >>>> >> >> >> > techniques for threading, error checking, porting, and
> >>>> >> >> >> > tuning. Get
> >>>> >> >> >> > the
> >>>> >> >> >> > most
> >>>> >> >> >> > from the latest Intel processors and coprocessors. See
> >>>> >> >> >> > abstracts
> >>>> >> >> >> > and
> >>>> >> >> >> > register
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> >>>> >> >> >> >
> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk_______________________________________________
> >>>> >> >> >> > Postgres-xc-general mailing list
> >>>> >> >> >> > Pos...@li...
> >>>> >> >> >> >
> >>>> >> >> >> >
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
> >>>> >> >> >>
> >>>> >> >> >
> >>>> >> >> >
> >>>> >> >> >
> >>>> >> >> >
> >>>> >> >> >
> >>>> >> >> >
> ------------------------------------------------------------------------------
> >>>> >> >> > November Webinars for C, C++, Fortran Developers
> >>>> >> >> > Accelerate application performance with scalable programming
> >>>> >> >> > models.
> >>>> >> >> > Explore
> >>>> >> >> > techniques for threading, error checking, porting, and tuning.
> >>>> >> >> > Get
> >>>> >> >> > the
> >>>> >> >> > most
> >>>> >> >> > from the latest Intel processors and coprocessors. See
> abstracts
> >>>> >> >> > and
> >>>> >> >> > register
> >>>> >> >> >
> >>>> >> >> >
> >>>> >> >> >
> >>>> >> >> >
> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk
> >>>> >> >> > _______________________________________________
> >>>> >> >> > Postgres-xc-general mailing list
> >>>> >> >> > Pos...@li...
> >>>> >> >> >
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
> >>>> >> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> >
> ------------------------------------------------------------------------------
> >>>> >> > DreamFactory - Open Source REST & JSON Services for HTML5 &
> Native
> >>>> >> > Apps
> >>>> >> > OAuth, Users, Roles, SQL, NoSQL, BLOB Storage and External API
> >>>> >> > Access
> >>>> >> > Free app hosting. Or install the open source package on any LAMP
> >>>> >> > server.
> >>>> >> > Sign up and see examples for AngularJS, jQuery, Sencha Touch and
> >>>> >> > Native!
> >>>> >> >
> >>>> >> >
> >>>> >> >
> http://pubads.g.doubleclick.net/gampad/clk?id=63469471&iu=/4140/ostg.clktrk
> >>>> >> > _______________________________________________
> >>>> >> > Postgres-xc-general mailing list
> >>>> >> > Pos...@li...
> >>>> >> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
> >>>> >> >
> >>>> >
> >>>> >
> >>>
> >>>
> >>
> >
>