#72 DDL and DML commands hang forever; rebooting the cluster does not help

Undecided
open
None
1
None
Nikhil Sontakke
2016-10-21
2016-10-21
No

Hi,

We are using PGXL 9.5 R1.3. During a database restore, we observed that the postgres connections are in state (idle in transaction) on the datanode and (UPDATE WAITING) on the coordinator, and they wait indefinitely on specific tables. Even after rebooting the PGXL cluster, every DML operation except "SELECT" hangs forever on those tables. We see the same behavior when creating partition tables while the Postgres-XL database is under high load.

Is there any workaround to get out of this state? Any help would be appreciated.

Below are the backtraces for the coordinator and the datanodes:

Thanks,
Srinivas

Coordinator:
(gdb) bt
#0  0x0000000800f6f61c in poll () from /lib/libc.so.7
#1  0x0000000800dcb92e in poll () from /lib/libthr.so.3
#2  0x00000000006247e4 in pgxc_node_receive (conn_count=2, connections=0x886e5c1f8, timeout=0x0) at pgxcnode.c:527
#3  0x000000000062dac8 in ExecRemoteUtility (node=0x886e54a80) at execRemote.c:3489
#4  0x000000000072fe6c in ExecUtilityStmtOnNodes (queryString=0x801256980 "DROP TABLE trigger_detail;", nodes=0x0, sentToRemote=<value optimized out>,
    force_autocommit=0 '\0', exec_type=EXEC_ON_ALL_NODES, is_temp=<value optimized out>) at utility.c:4256
#5  0x000000000073009e in ExecDropStmt (stmt=0x801256bb8, queryString=0x801256980 "DROP TABLE trigger_detail;", sentToRemote=<value optimized out>,
    isTopLevel=<value optimized out>) at utility.c:2559
#6  0x00000000007307ea in ProcessUtilitySlow (parsetree=0x801256bb8, queryString=0x801256980 "DROP TABLE trigger_detail;",
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=<value optimized out>, sentToRemote=<value optimized out>, completionTag=0x7fffffffd470 "")
    at utility.c:2397
#7  0x0000000000732ccc in standard_ProcessUtility (parsetree=0x801256bb8, queryString=0x801256980 "DROP TABLE trigger_detail;",
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x801256ff8, sentToRemote=<value optimized out>, completionTag=0x7fffffffd470 "") at utility.c:1239
#8  0x000000000072b9aa in PortalRunUtility (portal=0x886e3d038, utilityStmt=0x801256bb8, isTopLevel=1 '\001', dest=0x801256ff8,
    completionTag=0x7fffffffd470 "") at pquery.c:1712
#9  0x000000000072cc00 in PortalRunMulti (portal=0x886e3d038, isTopLevel=1 '\001', dest=0x801256ff8, altdest=0x801256ff8, completionTag=0x7fffffffd470 "")
    at pquery.c:1859
#10 0x000000000072dc3b in PortalRun (portal=0x886e3d038, count=9223372036854775807, isTopLevel=<value optimized out>, dest=0x801256ff8,
    altdest=0x801256ff8, completionTag=0x7fffffffd470 "") at pquery.c:1159
#11 0x00000000007284d0 in exec_simple_query (query_string=<value optimized out>) at postgres.c:1355
#12 0x000000000072a86a in PostgresMain (argc=1, argv=<value optimized out>, dbname=0x801210cc0 "mpsdb", username=0x801210ca0 "pgxl") at postgres.c:4666
#13 0x00000000006c67c7 in ServerLoop () at postmaster.c:4477
#14 0x00000000006ca22a in PostmasterMain (argc=5, argv=0x7fffffffea38) at postmaster.c:1409
#15 0x000000000063c497 in main (argc=5, argv=0x7fffffffea38) at main.c:228
(gdb)

DATANODES:

pgxl 75966 0.0 1.3 2273500 98784 ?? Ss 6:03PM 0:00.08 postgres: pgxl mpsdb 10.106.100.10(52502) DROP TABLE waiting (postgres)

(gdb) bt
#0  0x0000000800f6f61c in poll () from /lib/libc.so.7
#1  0x0000000800dcb92e in poll () from /lib/libthr.so.3
#2  0x00000000006b7450 in WaitLatchOrSocket (latch=0x8860c9218, wakeEvents=<value optimized out>, sock=-1, timeout=0) at pg_latch.c:333
#3  0x0000000000716754 in ProcSleep (locallock=0x80125d3e8, lockMethodTable=<value optimized out>) at proc.c:1143
#4  0x000000000071412c in WaitOnLock (locallock=0x80125d3e8, owner=0x8013cbbc8) at lock.c:1742
#5  0x000000000071583a in LockAcquireExtendedXC (locktag=0x7fffffffce00, lockmode=8, sessionLock=<value optimized out>, dontWait=0 '\0',
    reportMemoryError=1 '\001', only_increment=<value optimized out>) at lock.c:1031
#6  0x000000000071193b in LockRelationOid (relid=40339, lockmode=8) at lmgr.c:112
#7  0x000000000051bd7a in RangeVarGetRelidExtended (relation=0x801386688, lockmode=8, missing_ok=1 '\001', nowait=0 '\0',
    callback=0x5c4770 <RangeVarCallbackForDropRelation>, callback_arg=0x7fffffffcea0) at namespace.c:390
#8  0x00000000005c4614 in RemoveRelations (drop=0x801256bb8) at tablecmds.c:943
#9  0x00000000007300fd in ExecDropStmt (stmt=0x801256bb8, queryString=0x801256568 "DROP TABLE trigger_detail;", sentToRemote=<value optimized out>,
    isTopLevel=<value optimized out>) at utility.c:2537
#10 0x00000000007307ea in ProcessUtilitySlow (parsetree=0x801256bb8, queryString=0x801256568 "DROP TABLE trigger_detail;",
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=<value optimized out>, sentToRemote=<value optimized out>, completionTag=0x7fffffffd470 "")
    at utility.c:2397
#11 0x0000000000732ccc in standard_ProcessUtility (parsetree=0x801256bb8, queryString=0x801256568 "DROP TABLE trigger_detail;",
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x801256ff8, sentToRemote=<value optimized out>, completionTag=0x7fffffffd470 "") at utility.c:1239
#12 0x000000000072b9aa in PortalRunUtility (portal=0x801351038, utilityStmt=0x801256bb8, isTopLevel=1 '\001', dest=0x801256ff8,
    completionTag=0x7fffffffd470 "") at pquery.c:1712
#13 0x000000000072cc00 in PortalRunMulti (portal=0x801351038, isTopLevel=1 '\001', dest=0x801256ff8, altdest=0x801256ff8, completionTag=0x7fffffffd470 "")
    at pquery.c:1859
#14 0x000000000072dc3b in PortalRun (portal=0x801351038, count=9223372036854775807, isTopLevel=<value optimized out>, dest=0x801256ff8,
    altdest=0x801256ff8, completionTag=0x7fffffffd470 "") at pquery.c:1159
#15 0x00000000007284d0 in exec_simple_query (query_string=<value optimized out>) at postgres.c:1355
#16 0x000000000072a86a in PostgresMain (argc=1, argv=<value optimized out>, dbname=0x801210d58 "mpsdb", username=0x801256868 '\177' <repeats 200 times>...)
    at postgres.c:4666
#17 0x00000000006c67c7 in ServerLoop () at postmaster.c:4477
#18 0x00000000006ca22a in PostmasterMain (argc=5, argv=0x7fffffffea30) at postmaster.c:1409
#19 0x000000000063c497 in main (argc=5, argv=0x7fffffffea30) at main.c:228
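
Note on reading the datanode backtrace: the LockRelationOid frame carries the relation OID and a numeric lock mode. The sketch below is only an illustration of how that frame might be decoded. The helper name `decode_lock_frame` and the regular expression are assumptions made for this example, and the number-to-name table follows the standard PostgreSQL lock mode constants, which Postgres-XL is assumed to share.

```python
import re

# Standard PostgreSQL lock mode numbers (assumed unchanged in Postgres-XL).
LOCK_MODES = {
    1: "AccessShareLock",
    2: "RowShareLock",
    3: "RowExclusiveLock",
    4: "ShareUpdateExclusiveLock",
    5: "ShareLock",
    6: "ShareRowExclusiveLock",
    7: "ExclusiveLock",
    8: "AccessExclusiveLock",
}

# Hypothetical pattern for a gdb frame such as:
#   ... in LockRelationOid (relid=40339, lockmode=8) at lmgr.c:112
FRAME_RE = re.compile(r"LockRelationOid \(relid=(\d+), lockmode=(\d+)\)")


def decode_lock_frame(backtrace):
    """Return (relid, lock mode name) for the first LockRelationOid frame, or None."""
    match = FRAME_RE.search(backtrace)
    if match is None:
        return None
    relid = int(match.group(1))
    mode = int(match.group(2))
    return relid, LOCK_MODES.get(mode, "unknown ({})".format(mode))


frame = "0x000000000071193b in LockRelationOid (relid=40339, lockmode=8) at lmgr.c:112"
print(decode_lock_frame(frame))  # (40339, 'AccessExclusiveLock')
```

Read this way, the datanode backend is waiting for an AccessExclusiveLock on relation OID 40339, which is the lock DROP TABLE requests. Running `SELECT 40339::regclass;` on that datanode should map the OID back to a table name, assuming the OID is still valid there.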
