From: Ted Dunning (JIRA) <jira@os...> - 2009-06-04 22:23:35
|
Katta is much too sensitive to recoverable KeeperExceptions
-----------------------------------------------------------

Key: KATTA-69
URL: http://oss.101tec.com/jira/browse/KATTA-69
Project: Katta
Issue Type: Bug
Reporter: Ted Dunning

If you get a ConnectionLossException when trying to deploy an index, the entire index is marked as ERROR and can never recover.
At the least, Katta should handle these situations more gracefully.
For instance, ZkClient.exists just blows out, transforming the recoverable exception into a non-recoverable KattaException.
It is dangerous to change too many of these, but some can probably be fixed.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://oss.101tec.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
From: Ted Dunning (JIRA) <jira@os...> - 2009-06-04 22:30:28
|
[ http://oss.101tec.com/jira/browse/KATTA-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=10142#action_10142 ]

Ted Dunning commented on KATTA-69:
----------------------------------

But watch out for this. I can see how to change "exists", but many others are more delicate because I can't tell if they are idempotent or not.
From: Ted Dunning (JIRA) <jira@os...> - 2009-06-04 22:32:34
|
[ http://oss.101tec.com/jira/browse/KATTA-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=10142#action_10142 ]

Ted Dunning edited comment on KATTA-69 at 6/4/09 10:30 PM:
-----------------------------------------------------------

But watch out for this.

http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling

I can see how to change "exists", but many others are more delicate because I can't tell if they are idempotent or not.
From: Ted Dunning (JIRA) <jira@os...> - 2009-06-04 22:40:28
|
[ http://oss.101tec.com/jira/browse/KATTA-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning updated KATTA-69:
-----------------------------

Attachment: retry_for_exists.patch

Here is a patch for one problem at least.
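The attached retry_for_exists.patch is not reproduced in this archive. As a rough illustration only, retrying a read-only call such as "exists" after a connection loss might look like the sketch below. Everything named here is hypothetical: RetrySketch, ZkRead, retryOnConnectionLoss, maxAttempts, and backoffMillis are made up for the example, and the nested ConnectionLossException is a stand-in for ZooKeeper's own exception so that the block compiles without the ZooKeeper jar. Retrying is only safe here because a read is idempotent; writes need the case-by-case care mentioned in the comments above.

```java
final class RetrySketch {
    /** Hypothetical stand-in for ZooKeeper's connection-loss exception. */
    static final class ConnectionLossException extends Exception {
        ConnectionLossException(String message) {
            super(message);
        }
    }

    /** A read-only (idempotent) operation that may fail with a recoverable connection loss. */
    interface ZkRead<T> {
        T call() throws ConnectionLossException, InterruptedException;
    }

    /**
     * Runs the read up to maxAttempts times, sleeping a little longer after each
     * connection loss. The last exception is rethrown if every attempt fails, so
     * the caller still sees the original, recoverable exception type.
     */
    static <T> T retryOnConnectionLoss(ZkRead<T> read, int maxAttempts, long backoffMillis)
            throws ConnectionLossException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ConnectionLossException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return read.call();
            } catch (ConnectionLossException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff; an assumption, not taken from the patch.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        throw last;
    }
}
```

A caller would wrap only the read itself, for example retryOnConnectionLoss(() -> client.exists(path), 3, 100L), where client and path are whatever the surrounding code already has.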
From: Jason Venner (JIRA) <jira@os...> - 2009-06-30 19:02:11
|
[ http://oss.101tec.com/jira/browse/KATTA-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=10170#action_10170 ]

Jason Venner commented on KATTA-69:
-----------------------------------

I have applied this patch to trunk 473, and the unit tests hang in the MasterTest.
The Master.shutdown is hanging in the join with the DistributeShardsThread, which is in _updateLock.getUpdatedCondition().await();
I am guessing that await is not honoring the interrupt.

Jstack follows:

Attaching to process ID 82990, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 11.3-b02-83
Deadlock Detection:

No deadlocks found.

Thread t@...: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Interpreted frame)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=158 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=1925 (Interpreted frame)
 - net.sf.katta.master.DistributeShardsThread.run() @bci=190, line=141 (Interpreted frame)

Thread t@...: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Interpreted frame)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=158 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=1925 (Interpreted frame)
 - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=358 (Interpreted frame)
 - org.apache.zookeeper.ClientCnxn$EventThread.run() @bci=4, line=355 (Interpreted frame)

Thread t@...: (state = IN_NATIVE)
 - sun.nio.ch.KQueueArrayWrapper.kevent0(int, long, int, long) @bci=0 (Compiled frame; information may be imprecise)
 - sun.nio.ch.KQueueArrayWrapper.poll(long) @bci=12, line=136 (Compiled frame)
 - sun.nio.ch.KQueueSelectorImpl.doSelect(long) @bci=46, line=69 (Compiled frame)
 - sun.nio.ch.SelectorImpl.lockAndDoSelect(long) @bci=37, line=69 (Compiled frame)
 - sun.nio.ch.SelectorImpl.select(long) @bci=30, line=80 (Compiled frame)
 - org.apache.zookeeper.ClientCnxn$SendThread.run() @bci=192, line=852 (Compiled frame)

Thread t@...: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Interpreted frame)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=158 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=1925 (Interpreted frame)
 - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=358 (Interpreted frame)
 - org.apache.zookeeper.ClientCnxn$EventThread.run() @bci=4, line=355 (Interpreted frame)

Thread t@...: (state = IN_NATIVE)
 - sun.nio.ch.KQueueArrayWrapper.kevent0(int, long, int, long) @bci=0 (Compiled frame; information may be imprecise)
 - sun.nio.ch.KQueueArrayWrapper.poll(long) @bci=12, line=136 (Compiled frame)
 - sun.nio.ch.KQueueSelectorImpl.doSelect(long) @bci=46, line=69 (Compiled frame)
 - sun.nio.ch.SelectorImpl.lockAndDoSelect(long) @bci=37, line=69 (Compiled frame)
 - sun.nio.ch.SelectorImpl.select(long) @bci=30, line=80 (Compiled frame)
 - org.apache.zookeeper.ClientCnxn$SendThread.run() @bci=192, line=852 (Interpreted frame)

Thread t@...: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Interpreted frame)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=158 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=1925 (Compiled frame)
 - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=358 (Compiled frame)
 - org.apache.zookeeper.server.PrepRequestProcessor.run() @bci=4, line=97 (Interpreted frame)

Thread t@...: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Interpreted frame)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=158 (Compiled frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=1925 (Compiled frame)
 - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=358 (Compiled frame)
 - org.apache.zookeeper.server.SyncRequestProcessor.run() @bci=16, line=71 (Interpreted frame)

Thread t@...: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - org.apache.zookeeper.server.SessionTrackerImpl.run() @bci=36, line=124 (Interpreted frame)

Thread t@...: (state = IN_NATIVE)
 - sun.nio.ch.KQueueArrayWrapper.kevent0(int, long, int, long) @bci=0 (Compiled frame; information may be imprecise)
 - sun.nio.ch.KQueueArrayWrapper.poll(long) @bci=12, line=136 (Compiled frame)
 - sun.nio.ch.KQueueSelectorImpl.doSelect(long) @bci=46, line=69 (Compiled frame)
 - sun.nio.ch.SelectorImpl.lockAndDoSelect(long) @bci=37, line=69 (Compiled frame)
 - sun.nio.ch.SelectorImpl.select(long) @bci=30, line=80 (Compiled frame)
 - org.apache.zookeeper.server.NIOServerCnxn$Factory.run() @bci=20, line=142 (Interpreted frame)

Thread t@...: (state = BLOCKED)

Thread t@...: (state = BLOCKED)

Thread t@...: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.ref.ReferenceQueue.remove(long) @bci=44, line=116 (Interpreted frame)
 - java.lang.ref.ReferenceQueue.remove() @bci=2, line=132 (Interpreted frame)
 - java.lang.ref.Finalizer$FinalizerThread.run() @bci=3, line=159 (Interpreted frame)

Thread t@...: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Object.wait() @bci=2, line=485 (Interpreted frame)
 - java.lang.ref.Reference$ReferenceHandler.run() @bci=46, line=116 (Interpreted frame)

Thread t@...: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.Thread.join(long) @bci=38, line=1167 (Interpreted frame)
 - java.lang.Thread.join() @bci=2, line=1220 (Interpreted frame)
 - net.sf.katta.master.Master.shutdown() @bci=11, line=118 (Interpreted frame)
 - net.sf.katta.AbstractKattaTest$MasterStartThread.shutdown() @bci=4, line=277 (Interpreted frame)
 - net.sf.katta.master.MasterTest.testRebalanceIndexAfterNodeCrash() @bci=371, line=211 (Interpreted frame)
 - sun.reflect.NativeMethodAccessorImpl.invoke0(java.lang.reflect.Method, java.lang.Object, java.lang.Object[]) @bci=0 (Interpreted frame)
 - sun.reflect.NativeMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=87, line=39 (Interpreted frame)
 - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=25 (Interpreted frame)
 - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=161, line=597 (Interpreted frame)
 - junit.framework.TestCase.runTest() @bci=96, line=154 (Interpreted frame)
 - junit.framework.TestCase.runBare() @bci=5, line=127 (Interpreted frame)
 - junit.framework.TestResult$1.protect() @bci=4, line=106 (Interpreted frame)
 - junit.framework.TestResult.runProtected(junit.framework.Test, junit.framework.Protectable) @bci=1, line=124 (Interpreted frame)
 - junit.framework.TestResult.run(junit.framework.TestCase) @bci=17, line=109 (Interpreted frame)
 - junit.framework.TestCase.run(junit.framework.TestResult) @bci=2, line=118 (Interpreted frame)
 - junit.framework.TestSuite.runTest(junit.framework.Test, junit.framework.TestResult) @bci=2, line=208 (Interpreted frame)
 - junit.framework.TestSuite.run(junit.framework.TestResult) @bci=31, line=203 (Interpreted frame)
 - org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run() @bci=431, line=420 (Interpreted frame)
 - org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(org.apache.tools.ant.taskdefs.optional.junit.JUnitTest, boolean, boolean, boolean, boolean, boolean, boolean) @bci=39, line=911 (Interpreted frame)
 - org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(java.lang.String[]) @bci=741, line=768 (Interpreted frame)
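This thread does not establish why the join hung. For reference, java.util.concurrent.locks.Condition.await() is documented to throw InterruptedException when the waiting thread is interrupted, so a loop that catches that exception and goes back to waiting would never exit. The sketch below shows one defensive shape for such a worker: a stop flag checked under the lock, an interrupt that ends the loop, and a bounded join. It is not Katta's DistributeShardsThread; InterruptibleWorkerSketch and all of its members are hypothetical names.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class InterruptibleWorkerSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition updated = lock.newCondition();
    private volatile boolean stopped = false;
    private final Thread worker = new Thread(this::runLoop, "worker-sketch");

    void start() {
        worker.start();
    }

    private void runLoop() {
        lock.lock();
        try {
            while (!stopped) {
                try {
                    // Releases the lock while waiting; wakes on signal or interrupt.
                    updated.await();
                } catch (InterruptedException e) {
                    // Treat the interrupt as a request to stop instead of swallowing it.
                    Thread.currentThread().interrupt();
                    return;
                }
                // Real work on the updated state would go here.
            }
        } finally {
            lock.unlock();
        }
    }

    /** Asks the worker to stop and waits at most timeoutMillis; returns true if it exited. */
    boolean shutdown(long timeoutMillis) throws InterruptedException {
        stopped = true;
        lock.lock();
        try {
            updated.signalAll();
        } finally {
            lock.unlock();
        }
        worker.interrupt();
        worker.join(timeoutMillis);
        return !worker.isAlive();
    }
}
```

The bounded join means a test that hits this path fails with a clear result instead of hanging the whole test run.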
From: Jason Venner (JIRA) <jira@os...> - 2009-06-30 19:21:25
|
[ http://oss.101tec.com/jira/browse/KATTA-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=10171#action_10171 ]

Jason Venner commented on KATTA-69:
-----------------------------------

This failure may be intermittent. The next time I ran the test set, the test completed normally.

java -version
java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03-211)
Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02-83, mixed mode)

Host environment: Mac OS X Leopard
From: Stefan Groschupf (JIRA) <jira@os...> - 2009-10-04 04:00:38
|
[ http://oss.101tec.com/jira/browse/KATTA-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Groschupf updated KATTA-69:
----------------------------------

Fix Version/s: 0.6
Assignee: Peter Voss

Hi Peter, I think with the zkclient refactoring this kind of issue should be solved.
Do you think we can close this issue?
From: Ted Dunning (JIRA) <jira@os...> - 2009-10-04 18:31:40
|
[ http://oss.101tec.com/jira/browse/KATTA-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=10264#action_10264 ]

Ted Dunning commented on KATTA-69:
----------------------------------

There is a related issue in that if a client can't broadcast to all nodes because the search configuration has not yet updated, then the entire search is lost rather than just recording the exception.

This is also related to the bug for restructuring of the ZK data so that ZK can delete all state when a node is lost rather than depending on the master to do that (KATTA-43 and KATTA-58).
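To make "just recording the exception" concrete, here is a small sketch of a broadcast that keeps the results from the nodes that answered and records a failure per node that did not. It is not Katta's client code: PartialBroadcastSketch, Outcome, NodeCall, and broadcast are hypothetical names, and the calls run sequentially only to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PartialBroadcastSketch {
    /** Results from reachable nodes plus the exception recorded for each failed node. */
    static final class Outcome<R> {
        final List<R> results = new ArrayList<>();
        final Map<String, Exception> failures = new LinkedHashMap<>();

        boolean isComplete() {
            return failures.isEmpty();
        }
    }

    /** A per-node request; may throw if the node is unreachable or unknown. */
    interface NodeCall<R> {
        R call(String node) throws Exception;
    }

    static <R> Outcome<R> broadcast(List<String> nodes, NodeCall<R> call) {
        Outcome<R> outcome = new Outcome<>();
        for (String node : nodes) {
            try {
                outcome.results.add(call.call(node));
            } catch (Exception e) {
                if (e instanceof InterruptedException) {
                    // Preserve the interrupt status for the caller.
                    Thread.currentThread().interrupt();
                }
                // Record the failure and keep going instead of aborting the whole search.
                outcome.failures.put(node, e);
            }
        }
        return outcome;
    }
}
```

The caller can then decide whether a partial answer is acceptable by checking isComplete() and inspecting failures, rather than losing every result because one node failed.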
From: Stefan Groschupf (JIRA) <jira@os...> - 2009-10-14 21:45:41
|
[ http://oss.101tec.com/jira/browse/KATTA-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Groschupf resolved KATTA-69.
-----------------------------------

Resolution: Duplicate

I am merging this issue into KATTA-43 as well, and will solve the problem generally there.