From: rohit d. <roh...@gm...> - 2011-11-24 16:22:11
|
Hi, I tried to make changes to my client code and also tried running the transactions in serializable isolation level. Although some of the transaction were getting aborted whenever there were conflicts, but in the end it still got stuck again. I have set up the deadlock timeout in postgres but still it doesn't seem to work. I tried running this with different versions of postgres (postgres 7.2, 7.3, 8.0 and 9.0) but no use. So I guess it's probably because of the bug in Sequoia. It can not handle the SELECT...FOR UPDATE statements properly and it needs to be taken care of. Does anyone know how to report the bug for sequoia? You can check this problem by running BenchmarkSql-2.3 ( http://sourceforge.net/projects/benchmarksql/) with Sequoia and two postgres servers. Please let me know if there is any progress in this regard. thank you very much for all the help. Regards, Rohit On Wed, Nov 23, 2011 at 5:28 PM, rohit dhamane <roh...@gm...>wrote: > Hi Emmanuel, > Thanks a lot. That was really helpful reply. I will try to fix the issue > with my client application and see is it works. > > Regards, > Rohit > > > On Wed, Nov 23, 2011 at 5:21 PM, Emmanuel Cecchet <ce...@gm...>wrote: > >> Hi, >> >> Thank you very much for your quick reply emmanuel. I have already tried >> setting the timeout in load balancer but it doesn't work for me I guess :( >> I have been facing this problem with the SELECT.. FOR UPDATE statements. >> While going through the source code of sequoia I found this in >> >> sequoia-2.10.10-src/src/org/continuent/sequoia/controller/requests/SelectRequest.java >> >> public class SelectRequest extends AbstractRequest implements >> Serializable >> { >> private static final long serialVersionUID = 6498520472410320514L; >> >> /** >> * Set to true if this SelectRequest must be broadcasted on the cluster >> * (useful for queries like SELECT FOR UPDATE >> */ >> private boolean mustBroadcast = false; >> <Line 76> >> >> Do I have to change mustBroadcast to true and recompile the sequoia to >> enable it to do SELECT...FOR UPDATE ? >> >> No this value is automatically set by the parser. When the select >> statement is parsed, it is considered as a write on the table and is >> broadcast to all databases to acquire a write lock. >> >> >> Also, here's the request log from a similar test (it again got stuck at >> same point) . >> I apologize for copy pasting the entire log here. >> >> 06:42:10,221 DEBUG controller.core.PingResponder Response to ping sent >> to /127.0.0.1 >> 06:42:11,207 DEBUG controller.core.PingResponder Received ping data from / >> 127.0.0.1 >> 06:42:11,207 DEBUG controller.core.PingResponder Response to ping sent to >> /127.0.0.1 >> 06:42:12,351 DEBUG sequoia.controller.loadbalancer Building wait-for >> graph... >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer warehouse locked by 13 >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer new_order locked by 12 >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer district locked by 13 >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer Walking wait-for >> graph... >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer 17 waits for new_order >> locked by 12 >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer 12 waits for >> district locked by 13 >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer 15 waits for new_order >> locked by 12 >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer 12 waits for >> district locked by 13 >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer 14 waits for warehouse >> locked by 13 >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer Choosing victim node... >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer 0 cycles detected >> 06:42:12,352 DEBUG sequoia.controller.loadbalancer No victim >> >> Looks like transaction 13 is not completing. I have reduced your log to >> the incriminating transactions below: >> >> >> *Request.log* >> >> 06:41:26,450 tpcw B 12 >> 06:41:26,450 tpcw S 175 12 SELECT c_discount, c_last, c_credit, w_tax >> FROM customer, warehouse WHERE w_id = ? AND w_id = c_w_id AND c_d_id = ? >> AND c_id = ?/<!%I|1|!%><!%I|4|!%><!%I|1258|!%> >> 06:41:26,664 tpcw S 178 12 SELECT d_next_o_id, d_tax FROM district WHERE >> d_id = ? AND d_w_id = ? FOR UPDATE/<!%I|4|!%><!%I|1|!%> >> 06:41:26,673 tpcw W 179 12 INSERT INTO NEW_ORDER (no_o_id, no_d_id, >> no_w_id) VALUES ( ?, ?, ?)/<!%I|3001|!%><!%I|4|!%><!%I|1|!%> >> 06:41:26,917 tpcw W 181 0 UPDATE stock SET s_quantity = ? , s_ytd = s_ytd >> + ?, s_remote_cnt = s_remote_cnt + ? WHERE s_i_id = ? AND s_w_id = >> ?/<!%I|71|!%><!%I|2|!%><!%I|0|!%><!%I|92253|!%><!%I|2|!%> >> 06:41:27,633 tpcw B 13 >> 06:41:27,633 tpcw W 188 13 UPDATE warehouse SET w_ytd = w_ytd + ? WHERE >> w_id = ?/<!%F|56.68|!%><!%I|1|!%> >> 06:41:27,680 tpcw S 190 13 SELECT w_street_1, w_street_2, w_city, >> w_state, w_zip, w_name FROM warehouse WHERE w_id = ?/<!%I|1|!%> >> 06:41:27,681 tpcw W 191 13 UPDATE district SET d_ytd = d_ytd + ? WHERE >> d_w_id = ? AND d_id = ?/<!%F|56.68|!%><!%I|1|!%><!%I|4|!%> >> 06:41:28,664 tpcw B 14 >> 06:41:28,664 tpcw W 204 14 UPDATE warehouse SET w_ytd = w_ytd + ? WHERE >> w_id = ?/<!%F|4539.02|!%><!%I|2|!%> >> 06:41:30,433 tpcw B 15 >> 06:41:30,433 tpcw S 226 15 SELECT c_discount, c_last, c_credit, w_tax >> FROM customer, warehouse WHERE w_id = ? AND w_id = c_w_id AND c_d_id = ? >> AND c_id = ?/<!%I|3|!%><!%I|2|!%><!%I|2751|!%> >> 06:41:30,500 tpcw S 231 15 SELECT d_next_o_id, d_tax FROM district WHERE >> d_id = ? AND d_w_id = ? FOR UPDATE/<!%I|2|!%><!%I|3|!%> >> 06:41:30,512 tpcw W 233 15 INSERT INTO NEW_ORDER (no_o_id, no_d_id, >> no_w_id) VALUES ( ?, ?, ?)/<!%I|3001|!%><!%I|2|!%><!%I|3|!%> >> 06:41:32,424 tpcw B 17 >> 06:41:32,424 tpcw S 249 17 SELECT c_discount, c_last, c_credit, w_tax >> FROM customer, warehouse WHERE w_id = ? AND w_id = c_w_id AND c_d_id = ? >> AND c_id = ?/<!%I|2|!%><!%I|2|!%><!%I|1676|!%> >> 06:41:32,487 tpcw S 250 17 SELECT d_next_o_id, d_tax FROM district WHERE >> d_id = ? AND d_w_id = ? FOR UPDATE/<!%I|2|!%><!%I|2|!%> >> 06:41:32,489 tpcw W 251 17 INSERT INTO NEW_ORDER (no_o_id, no_d_id, >> no_w_id) VALUES ( ?, ?, ?)/<!%I|3001|!%><!%I|2|!%><!%I|2|!%> >> >> >> I found the problem. The lock that transaction 12 acquired on the >> district table (in its select for update) has not been properly added and >> the deadlock has not been detected. >> So on one side, Sequoia has a bug with not properly updating its lock >> table on this select for update (the query parsing code might have an >> issue). >> On the other side, your transactions are not serializable and they >> deadlock (the application should be fixed). By serialiazing writes per >> table, Sequoia exposes these bugs faster than if you were running directly >> on top of the database (but you should be able to obtain the same deadlock >> directly on the database). Actually is stuck on the database deadlock. I >> don't remember if the version of Postgres you are using has a global >> transaction timeout that you can set so that Postgres would kill the >> deadlocked transactions itself (quick fix). >> >> Hope this helps >> Emmanuel >> >> >> >> On Wed, Nov 23, 2011 at 3:09 PM, Emmanuel Cecchet <ce...@gm...>wrote: >> >>> Hi, >>> >>> From your log it looks like everyone is waiting for that transaction 65 >>> to either commit or rollback. >>> If your thread has crashed and there is no rollback or connection >>> closing in a catch block, all the locks are going to be retained forever >>> (thought there is a possibility to set timeout on transactions in the load >>> balancer settings if I recall). >>> Without the log of the exact requests, it's hard to tell you more than >>> that. >>> >>> Keep us posted with your findings, >>> Emmanuel >>> >>> >>> On 11/23/2011 06:30, rohit dhamane wrote: >>> >>> Hi, >>> >>> I have been facing some problem with sequoia. I am trying to run >>> BenchmarkSql-2.3 (http://sourceforge.net/projects/benchmarksql/) with >>> sequoia over two postgresql-7.3 servers but it seems to get stuck >>> indefinitely at the same point. I think it's happening because of the >>> SELECT...FOR UPDATE but not really sure how :( It seems that the >>> transactions are getting into a deadlock. Here's the out put when I turned >>> on the debug mode. >>> >>> 02:56:49,388 DEBUG controller.core.PingResponder Received ping data >>> from /127.0.0.1 >>> 02:56:49,388 DEBUG controller.core.PingResponder Response to ping sent >>> to /127.0.0.1 >>> 02:56:49,775 DEBUG sequoia.controller.loadbalancer Building wait-for >>> graph... >>> 02:56:49,776 DEBUG sequoia.controller.loadbalancer new_order locked by 64 >>> 02:56:49,777 DEBUG sequoia.controller.loadbalancer district locked by 65 >>> 02:56:49,777 DEBUG sequoia.controller.loadbalancer warehouse locked by 65 >>> 02:56:49,777 DEBUG sequoia.controller.loadbalancer Walking wait-for >>> graph... >>> 02:56:49,778 DEBUG sequoia.controller.loadbalancer 64 waits for district >>> locked by 65 >>> 02:56:49,779 DEBUG sequoia.controller.loadbalancer 68 waits for >>> new_order locked by 64 >>> 02:56:49,779 DEBUG sequoia.controller.loadbalancer 64 waits for >>> district locked by 65 >>> 02:56:49,779 DEBUG sequoia.controller.loadbalancer 67 waits for >>> warehouse locked by 65 >>> 02:56:49,779 DEBUG sequoia.controller.loadbalancer 66 waits for >>> new_order locked by 64 >>> 02:56:49,779 DEBUG sequoia.controller.loadbalancer 64 waits for >>> district locked by 65 >>> 02:56:49,779 DEBUG sequoia.controller.loadbalancer Choosing victim >>> node... >>> 02:56:49,779 DEBUG sequoia.controller.loadbalancer 0 cycles detected >>> 02:56:49,779 DEBUG sequoia.controller.loadbalancer No victim >>> 02:56:50,389 DEBUG controller.core.PingResponder Received ping data from >>> /127.0.0.1 >>> 02:56:50,389 DEBUG controller.core.PingResponder Response to ping sent >>> to /127.0.0.1 >>> >>> >>> >>> Here's my configuration file.. >>> >>> <?xml version="1.0" encoding="UTF-8"?>^M >>> <!DOCTYPE SEQUOIA PUBLIC "-//Continuent//DTD SEQUOIA @VERSION@//EN" " >>> http://sequoia.continuent.org/dtds/sequoia-@VERSION@.dtd">^M >>> ^M >>> <SEQUOIA>^M >>> ^M >>> <VirtualDatabase name="tpcw">^M >>> ^M >>> <AuthenticationManager>^M >>> <Admin>^M >>> <User username="postgres" password=""/>^M >>> </Admin> ^M >>> <VirtualUsers>^M >>> <VirtualLogin vLogin="postgres" vPassword=""/>^M >>> </VirtualUsers>^M >>> </AuthenticationManager>^M >>> ^M >>> <DatabaseBackend name="localhost" driver="org.postgresql.Driver" >>> driverPath="/home/rohit/Software/CJDBC/sequoia-2.10.10-bin/drivers/postgresql7.3.jar" >>> url="jdbc:postgresql://localhost:6551/tpcw" connectionTestStatement="select >>> now()">^M >>> >>> <ConnectionManager vLogin="postgres" rLogin="postgres" >>> rPassword="">^M >>> <VariablePoolConnectionManager initPoolSize="20" minPoolSize="5" >>> maxPoolSize="0" idleTimeout="180" waitTimeout="0"/>^M >>> </ConnectionManager>^M >>> </DatabaseBackend>^M >>> ^M >>> <DatabaseBackend name="Remote" driver="org.postgresql.Driver" >>> driverPath="/home/rohit/Software/CJDBC/sequoia-2.10.10-bin/drivers/postgresql7.3.jar" >>> url="jdbc:postgresql://192.168.164.136:6552/tpcw" >>> connectionTestStatement="select now()">^M >>> <ConnectionManager vLogin="postgres" rLogin="postgres" >>> rPassword="">^M >>> <VariablePoolConnectionManager initPoolSize="20" minPoolSize="5" >>> maxPoolSize="0" idleTimeout="180" waitTimeout="0"/>^M >>> </ConnectionManager>^M >>> </DatabaseBackend>^M >>> <RequestManager>^M >>> <RequestScheduler>^M >>> <RAIDb-1Scheduler level="passThrough"/>^M >>> </RequestScheduler>^M >>> ^M >>> <RequestCache>^M >>> <MetadataCache/>^M >>> <ParsingCache/> ^M >>> <ResultCache granularity="table" />^M >>> </RequestCache>^M >>> ^M >>> <LoadBalancer transactionIsolation="serializable">^M >>> <RAIDb-1>^M >>> <WaitForCompletion policy="all" >>> deadlockTimeoutInMs="30000"/>^M >>> <RAIDb-1-RoundRobin/>^M >>> </RAIDb-1>^M >>> </LoadBalancer>^M >>> </RequestManager>^M >>> ^M >>> </VirtualDatabase>^M >>> ^M >>> </SEQUOIA>^M >>> >>> If anybody has anu idea why this happens please let me know. I would >>> appreciate it a lot. Thanks in advance. >>> >>> Regards, >>> Rohit >>> >>> >>> ------------------------------------------------------------------------------ >>> All the data continuously generated in your IT infrastructure >>> contains a definitive record of customers, application performance, >>> security threats, fraudulent activity, and more. Splunk takes this >>> data and makes sense of it. IT sense. And common sense.http://p.sf.net/sfu/splunk-novd2d >>> >>> >>> >>> _______________________________________________ >>> Sequoiadb-discuss mailing lis...@li...https://lists.sourceforge.net/lists/listinfo/sequoiadb-discuss >>> >>> >>> >> >> > |