From: Mahadev K. <ma...@ya...> - 2008-08-07 18:45:08
|
The only thing I see different is that you seem to be using an older version of zookeeper than the one on apache. Can you download trunk from

http://svn.apache.org/repos/asf/hadoop/zookeeper/trunk

and try it again? The print statement "Trying to connect" is not in the code any more. Can you try trunk, so that the log output matches the current code and is easier to debug?

mahadev

On 8/7/08 11:23 AM, "Satish Bhatti" <cth...@gm...> wrote:

> Hey Mahadev,
>
> oops, I didn't have the server started when I ran the echo command! However,
> I do still have problems running the tutorial example. I sent you the client
> and server output. Did you notice anything strange in there? I think the
>
> WARN - [SendThread:ClientCnxn$SendThread@719] - Closing:
> java.io.IOException: TIMED OUT
>     at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:666)
> null: 0--1
>
> in the client log are suspicious.
>
> scbmac:~/zookeeper $ echo "stat" | nc localhost 2181
> Zookeeper version: 2.2.0-173, built on 08/06/2008 17:35 GMT
> Clients:
>  /127.0.0.1:54574[1](queued=0,recved=0,sent=0)
>
> Latency min/avg/max: 0/0/0
> Received: 0
> Sent: 2
> Outstanding: 0
> Zxid: 0x8
> Mode: standalone
> Node count: 2
> scbmac:~/zookeeper $
> scbmac:~/zookeeper $ echo "stat" | nc `hostname` 2181
> Zookeeper version: 2.2.0-173, built on 08/06/2008 17:35 GMT
> Clients:
>  /10.74.0.159:54575[1](queued=0,recved=0,sent=0)
>
> Latency min/avg/max: 0/0/0
> Received: 0
> Sent: 3
> Outstanding: 0
> Zxid: 0x8
> Mode: standalone
> Node count: 2
> scbmac:~/zookeeper $
|
From: Patrick H. <ph...@gm...> - 2008-08-07 18:43:01
|
An update on the move to Apache:

Issue tracking, source control, and mailing lists have been moved to Apache hosting infrastructure.

Our Apache homepage can be found at:
http://hadoop.apache.org/zookeeper/

Mailing lists are at:
http://hadoop.apache.org/zookeeper/mailing_lists.html

Please move communication to these lists (the project maintainers are still monitoring conversation on the sourceforge mailing lists).

Many of the pages on the sourceforge twiki are being converted to a new html/pdf based documentation format. This is in progress. Additionally, we will continue to have a wiki hosted on Apache:
http://wiki.apache.org/hadoop/ZooKeeper

We have not posted an Apache release at this point. Please continue to download the latest release (2.2.1) from sourceforge:
http://sourceforge.net/project/showfiles.php?group_id=209147

Patrick
|
From: Mahadev K. <ma...@ya...> - 2008-08-07 18:12:58
|
Hi Satish,
I just tried it on my mac and the client connects fine. I used your config, and

echo "stat" | nc localhost 2181

shows

Zookeeper version: 3.0.0-681910, built on 08/07/2008 17:32 GMT
Clients:
 /0:0:0:0:0:0:0:1%0:65305[1](queued=0,recved=0,sent=0)
 /0:0:0:0:0:0:0:1%0:65280[1](queued=0,recved=102,sent=102)

Latency min/avg/max: 3/20/38
Received: 102
Sent: 102
Outstanding: 0
Zxid: 0x2
Mode: standalone

Also, which zookeeper version are you trying out? My test was on trunk on apache.

http://svn.apache.org/repos/asf/hadoop/zookeeper/trunk

Mahadev

On 8/7/08 11:01 AM, "Satish Bhatti" <cth...@gm...> wrote:

> scbmac:~/zookeeper $ echo "stat" | nc localhost 2181
> scbmac:~/zookeeper $ echo "stat" | nc `hostname` 2181
> scbmac:~/zookeeper $
>
> Was there supposed to be output?
>
> I ran the simple tutorial example again. Here's what I got:
>
> CLIENT
> -----------
>
> scbmac:~/zookeeper $ java -cp zookeeper-dev.jar:java/lib/log4j-1.2.15.jar:conf com.yahoo.zookeeper.ZooKeeper 127.0.0.1:2181
> WARN - [SendThread:ClientCnxn$SendThread@639] - Trying to connect to /127.0.0.1:2181
> WARN - [SendThread:ClientCnxn$SendThread@568] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:54372 remote=/127.0.0.1:2181]
> WARN - [SendThread:ClientCnxn$SendThread@719] - Closing:
> java.io.IOException: TIMED OUT
>     at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:666)
> null: 0--1
> WARN - [SendThread:ClientCnxn$SendThread@639] - Trying to connect to /127.0.0.1:2181
> WARN - [SendThread:ClientCnxn$SendThread@568] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:54373 remote=/127.0.0.1:2181]
> WARN - [SendThread:ClientCnxn$SendThread@719] - Closing:
> java.io.IOException: TIMED OUT
>     at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:666)
> null: 0--1
> WARN - [SendThread:ClientCnxn$SendThread@639] - Trying to connect to /127.0.0.1:2181
> WARN - [SendThread:ClientCnxn$SendThread@568] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:54374 remote=/127.0.0.1:2181]
> WARN - [SendThread:ClientCnxn$SendThread@719] - Closing:
> java.io.IOException: TIMED OUT
>     at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:666)
> null: 0--1
> WARN - [SendThread:ClientCnxn$SendThread@639] - Trying to connect to /127.0.0.1:2181
> WARN - [SendThread:ClientCnxn$SendThread@568] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:54375 remote=/127.0.0.1:2181]
> scbmac:~/zookeeper $
>
>
> SERVER
> -------------
>
> scbmac:~/zookeeper $ java -cp zookeeper-dev.jar:java/lib/log4j-1.2.15.jar:conf com.yahoo.zookeeper.server.quorum.QuorumPeer conf/zoo.cfg
> WARN - [NIOServerCxn.Factory:NIOServerCnxn@471] - Connected to /127.0.0.1:54372 lastZxid 0
> WARN - [NIOServerCxn.Factory:NIOServerCnxn@500] - Creating new session 11b9e52f33a0000
> WARN - [SyncThread:NIOServerCnxn@774] - Finished init of 11b9e52f33a0000: true
> WARN - [NIOServerCxn.Factory:NIOServerCnxn@471] - Connected to /127.0.0.1:54373 lastZxid 0
> WARN - [NIOServerCxn.Factory:NIOServerCnxn@500] - Creating new session 11b9e52f33a0001
> WARN - [SyncThread:NIOServerCnxn@774] - Finished init of 11b9e52f33a0001: true
> WARN - [SessionTracker:SessionTrackerImpl@133] - Expiring 11b9e52f33a0000
> WARN - [ProcessThread:PrepRequestProcessor@341] - Processed session termination request for id: 11b9e52f33a0000
> WARN - [NIOServerCxn.Factory:NIOServerCnxn@471] - Connected to /127.0.0.1:54374 lastZxid 0
> WARN - [NIOServerCxn.Factory:NIOServerCnxn@500] - Creating new session 11b9e52f33a0002
> WARN - [SyncThread:NIOServerCnxn@774] - Finished init of 11b9e52f33a0002: true
> WARN - [SessionTracker:SessionTrackerImpl@133] - Expiring 11b9e52f33a0001
> WARN - [ProcessThread:PrepRequestProcessor@341] - Processed session termination request for id: 11b9e52f33a0001
> WARN - [NIOServerCxn.Factory:NIOServerCnxn@471] - Connected to /127.0.0.1:54375 lastZxid 0
> WARN - [NIOServerCxn.Factory:NIOServerCnxn@500] - Creating new session 11b9e52f33a0003
> WARN - [SyncThread:NIOServerCnxn@774] - Finished init of 11b9e52f33a0003: true
> WARN - [SessionTracker:SessionTrackerImpl@133] - Expiring 11b9e52f33a0002
> WARN - [ProcessThread:PrepRequestProcessor@341] - Processed session termination request for id: 11b9e52f33a0002
> scbmac:~/zookeeper $
|
From: Mahadev K. <ma...@ya...> - 2008-08-07 17:53:51
|
For your example, do you see anything unusual in the server logs? Also, did you try the following?

echo "stat" | nc localhost 2181
echo "stat" | nc hostname 2181

mahadev

On 8/7/08 10:11 AM, "Satish Bhatti" <cth...@gm...> wrote:

> Yeah, it works great on Ubuntu. Did you try it out on your Mac yet?
>
> Satish
>
> On Wed, Aug 6, 2008 at 6:08 PM, Mahadev Konar <ma...@ya...> wrote:
>> In that case it looks like some problem with Mac os. I have a leopard os as
>> well. I can try it out. Does your example work on your ubuntu machine?
|
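For anyone who wants to script the check above instead of typing it, here is a minimal Python sketch of what `echo "stat" | nc localhost 2181` does. The helper names `four_letter_word` and `parse_stat` are made up for this illustration and are not part of ZooKeeper. It assumes the server answers the four-character command and then closes the connection, which is what the nc output in this thread suggests.

```python
import socket


def four_letter_word(host, port, command="stat", timeout=5.0):
    """Send a four-letter command to a ZooKeeper server and return the raw reply.

    Hypothetical helper: it assumes the server closes the connection
    once it has written its answer.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # an empty read means the server closed the socket
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_stat(text):
    """Collect the 'Key: value' lines of a stat reply into a dict.

    Lines starting with '/' are per-client entries and go into a list.
    """
    info = {"clients": []}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("/"):
            info["clients"].append(line)
        elif ": " in line:
            key, value = line.split(": ", 1)
            info[key] = value
    return info
```

With a server running, `parse_stat(four_letter_word("localhost", 2181))["Mode"]` would give the server mode. An empty reply, like the one Satish first saw, would parse to a dict with no keys other than an empty `clients` list.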
From: Mahadev K. <ma...@ya...> - 2008-08-07 17:44:05
|
Hi Satish,
I just ran the tests on my mac and saw that two of the tests, quorumtest and asynctest, are failing. It looks like there is a bind exception. Mac OS does not seem to be cleaning up the ports fast enough for the tests to reuse them, so the test failures mostly look like bind problems with our test servers.

mahadev

On 8/7/08 10:11 AM, "Satish Bhatti" <cth...@gm...> wrote:

> Yeah, it works great on Ubuntu. Did you try it out on your Mac yet?
>
> Satish
>
> On Wed, Aug 6, 2008 at 6:08 PM, Mahadev Konar <ma...@ya...> wrote:
>> In that case it looks like some problem with Mac os. I have a leopard os as
>> well. I can try it out. Does your example work on your ubuntu machine?
>>
>>
>> mahadev
|
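To illustrate the kind of bind problem described above, here is a small Python sketch. It is not how the ZooKeeper tests are written; it only shows, under that caveat, how to check whether a port can be bound right now, and one common way to avoid hard-coded ports by letting the OS pick a free one. The function names `port_is_free` and `pick_free_port` are hypothetical.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:  # typically "address already in use"
            return False
        return True


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]
```

A test harness could call `port_is_free(2181)` before starting its own server and report a clear error instead of a late bind exception. Note that a port returned by `pick_free_port()` can still be taken by another process before it is used, so this narrows the window rather than closing it.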
From: Satish B. <cth...@gm...> - 2008-08-07 00:54:36
|
oops, I meant to send this to the list!

---------- Forwarded message ----------
From: Satish Bhatti <cth...@gm...>
Date: Wed, Aug 6, 2008 at 5:50 PM
Subject: Re: [Zookeeper-user] FW: Client Connect Exceptions + Zookeeper Unit Tests fail
To: Mahadev Konar <ma...@ya...>

Hi Mahadev,

I am running on Mac OSX Leopard.

scbmac:~/agent/mserver $ java -version
java version "1.6.0_03-p3"
Java(TM) SE Runtime Environment (build 1.6.0_03-p3-landonf_03_feb_2008_01_32-b00)
Java HotSpot(TM) 64-Bit Server VM (build 1.6.0_03-p3-landonf_03_feb_2008_01_32-b00, mixed mode)

This is the 64 bit soylatte JVM. Some more datapoints:

(1) I did not have a server running for the unit tests, because from reading the unit test code, it looks like the tests start up their own server on port 2181.
(2) The unit tests all completed without error on my Ubuntu Linux box.

This kinda looks like some platform/JDK issue?

Satish

On Wed, Aug 6, 2008 at 4:26 PM, Mahadev Konar <ma...@ya...> wrote:

> Forwarding it to all
>
> ------ Forwarded Message
> From: Mahadev Konar <ma...@ya...>
> Date: Wed, 06 Aug 2008 11:54:49 -0700
> To: Satish Bhatti <cth...@gm...>
> Conversation: [Zookeeper-user] Client Connect Exceptions + Zookeeper Unit Tests fail
> Subject: Re: [Zookeeper-user] Client Connect Exceptions + Zookeeper Unit Tests fail
>
> Hi Satish,
> What os are you running it on?
> Also, did you have the server running while you were running the tests? We
> use hard coded ports in the tests; 2181 is one of them. So the tests would
> fail since they would not be able to bind to the port.
>
> Mahadev
|
From: Patrick H. <ph...@gm...> - 2008-08-06 23:37:57
|
Hi Satish, your config file looks like it's missing some bits. Can you try basing it off the zoo_sample.cfg that's included in the zookeeper java bin release? (Change dataDir, of course.) It looks like this:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/export/crawlspace/mahadev/zookeeper/server1/data
# the port at which the clients will connect
clientPort=2181

Patrick

Satish Bhatti wrote:
> I am using the 2.2.1 version of ZooKeeper. My zoo.cfg file is:
>
> # The number of milliseconds of each tick
> tickTime=2000
>
> # the directory where the snapshot is stored.
> dataDir=/Users/satish/zookeeper/server1/data/
>
> # the port at which the clients will connect
> clientPort=2181
|
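As a quick way to see which settings differ between the two configs above, here is a small Python sketch. It relies only on what the sample file's comments say: tickTime is in milliseconds, and initLimit and syncLimit are counted in ticks. The helper names `parse_zoo_cfg` and `tick_limits_ms` are invented for this example and are not ZooKeeper tools.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines of a zoo.cfg-style file, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def tick_limits_ms(settings):
    """Convert initLimit and syncLimit (counted in ticks) to milliseconds.

    Settings that are absent from the file are simply left out of the result.
    """
    tick = int(settings["tickTime"])
    return {
        name: int(settings[name]) * tick
        for name in ("initLimit", "syncLimit")
        if name in settings
    }
```

Fed the sample config, `tick_limits_ms` gives 20000 ms for initLimit (10 ticks of 2000 ms) and 10000 ms for syncLimit (5 ticks of 2000 ms). Fed Satish's three-line config it returns an empty dict, which makes the missing entries easy to spot.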
From: Mahadev K. <ma...@ya...> - 2008-08-06 23:27:01
|
Forwarding it to all

------ Forwarded Message
From: Mahadev Konar <ma...@ya...>
Date: Wed, 06 Aug 2008 11:54:49 -0700
To: Satish Bhatti <cth...@gm...>
Conversation: [Zookeeper-user] Client Connect Exceptions + Zookeeper Unit Tests fail
Subject: Re: [Zookeeper-user] Client Connect Exceptions + Zookeeper Unit Tests fail

Hi Satish,
What os are you running it on?
Also, did you have the server running while you were running the tests? We use hard coded ports in the tests; 2181 is one of them. So the tests would fail since they would not be able to bind to the port.

Mahadev

On 8/6/08 11:49 AM, "Satish Bhatti" <cth...@gm...> wrote:

> I am using the 2.2.1 version of ZooKeeper. My zoo.cfg file is:
>
> # The number of milliseconds of each tick
> tickTime=2000
>
> # the directory where the snapshot is stored.
> dataDir=/Users/satish/zookeeper/server1/data/
>
> # the port at which the clients will connect
> clientPort=2181
>
> The command line I am using for the server is:
>
> java -cp zookeeper-dev.jar:java/lib/log4j-1.2.15.jar:conf com.yahoo.zookeeper.server.quorum.QuorumPeer conf/zoo.cfg
>
> and for the client:
>
> java -cp zookeeper-dev.jar:java/lib/log4j-1.2.15.jar:conf com.yahoo.zookeeper.ZooKeeper 127.0.0.1:2181
>
> Here is what I get on the client end:
>
> WARN - [SendThread:ClientCnxn$SendThread@639] - Trying to connect to /127.0.0.1:2181
> WARN - [SendThread:ClientCnxn$SendThread@568] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:63815 remote=/127.0.0.1:2181]
> null: 3--1
>
>
> WARN - [SendThread:ClientCnxn$SendThread@719] - Closing:
> java.io.IOException: TIMED OUT
>     at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:666)
> null: 0--1
> WARN - [SendThread:ClientCnxn$SendThread@639] - Trying to connect to /127.0.0.1:2181
> WARN - [SendThread:ClientCnxn$SendThread@568] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:63816 remote=/127.0.0.1:2181]
> null: 3--1
>
> create /test hello
>
> Processing create
> WARN - [SendThread:ClientCnxn$SendThread@719] - Closing:
> java.io.IOException: TIMED OUT
>     at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:666)
> Exception in thread "main" com.yahoo.zookeeper.KeeperException: KeeperErrorCode = ConnectionLoss for /test
>     at com.yahoo.zookeeper.ZooKeeper.create(ZooKeeper.java:244)
>     at com.yahoo.zookeeper.ZooKeeper.processCmd(ZooKeeper.java:788)
>     at com.yahoo.zookeeper.ZooKeeper.main(ZooKeeper.java:745)
>
> At this point the client exits.
>
> Note the continuous stream of "null: 3--1" type errors. Why is it constantly
> having to reconnect to the server? Finally, when I type in a simple create
> command the client goes belly up and exits. Even though I am following the
> exact instructions in the SourceForge wiki, I figured maybe I was doing
> something wrong, so I decided to run the unit tests. Sure enough, they also
> failed!
>
>
> junit.run:
> Warning: Reference test.classpath has not been set at runtime, but was found during
> build file parsing, attempting to resolve. Future versions of Ant may support
> referencing ids defined in non-executed targets.
> [junit] Running com.yahoo.zookeeper.server.DeserializationPerfTest
> [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 9.549 sec
> [junit] Running com.yahoo.zookeeper.server.SerializationPerfTest
> [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 6.78 sec
> [junit] Running com.yahoo.zookeeper.server.ZooKeeperServerTest
> [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.139 sec
> [junit] Running com.yahoo.zookeeper.test.AsyncTest
> [junit] Tests run: 2, Failures: 2, Errors: 0, Time elapsed: 103.648 sec
> [junit] Test com.yahoo.zookeeper.test.AsyncTest FAILED
> [junit] Running com.yahoo.zookeeper.test.ClientTest
> [junit] Tests run: 4, Failures: 4, Errors: 0, Time elapsed: 140.248 sec
> [junit] Test com.yahoo.zookeeper.test.ClientTest FAILED
> [junit] Running com.yahoo.zookeeper.test.DataTreeTest
> [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.087 sec
> [junit] Running com.yahoo.zookeeper.test.OOMTest
> [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.014 sec
> [junit] Running com.yahoo.zookeeper.test.QuorumTest
> [junit] Tests run: 4, Failures: 3, Errors: 1, Time elapsed: 169.124 sec
> [junit] Test com.yahoo.zookeeper.test.QuorumTest FAILED
> [junit] Running com.yahoo.zookeeper.test.RecoveryTest
> [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 14.635 sec
> [junit] Test com.yahoo.zookeeper.test.RecoveryTest FAILED
> [junit] Running com.yahoo.zookeeper.test.SessionTest
> [junit] Tests run: 2, Failures: 0, Errors: 1, Time elapsed: 32.542 sec
> [junit] Test com.yahoo.zookeeper.test.SessionTest FAILED
>
> BUILD FAILED
> /Users/satish/zookeeper/build.xml:336: The following error occurred while executing this line:
> /Users/satish/zookeeper/build.xml:314: Tests failed!
>
> Total time: 9 minutes 4 seconds
>
> Any ideas what's going on?
> > Satish > > ------ End of Forwarded Message |
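Since the reply above points at a port clash on 2181 as a likely cause of the test failures, one way to check for it before running the tests is to try binding the port yourself. The following is only a minimal sketch in C: `port_is_free` is a hypothetical helper written for this illustration, not something shipped with ZooKeeper, and it assumes the server (if running) holds the port on an address that conflicts with the loopback interface.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of ZooKeeper): returns 1 if a TCP socket can
 * be bound to the given port on the loopback interface, 0 if the bind fails
 * (for example because another process already holds the port), and -1 if no
 * socket could be created at all. */
int port_is_free(unsigned short port)
{
    struct sockaddr_in addr;
    int fd, ok;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    /* SO_REUSEADDR is deliberately left off so that a port already held by
     * another socket makes bind() fail. */
    ok = bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0;
    close(fd);
    return ok;
}
```

Calling `port_is_free(2181)` while a standalone server is running would be expected to return 0; in that case stopping the server before running the unit tests is the thing to try.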
From: Satish B. <cth...@gm...> - 2008-08-06 18:49:48
|
I am using the 2.2.1 version of ZooKeeper. My zoo.cfg file is:

# The number of milliseconds of each tick
tickTime=2000

# the directory where the snapshot is stored.
dataDir=/Users/satish/zookeeper/server1/data/

# the port at which the clients will connect
clientPort=2181

The command line I am using for the server is:

java -cp zookeeper-dev.jar:java/lib/log4j-1.2.15.jar:conf com.yahoo.zookeeper.server.quorum.QuorumPeer conf/zoo.cfg

and for the client:

java -cp zookeeper-dev.jar:java/lib/log4j-1.2.15.jar:conf com.yahoo.zookeeper.ZooKeeper 127.0.0.1:2181

Here is what I get on the client end:

WARN - [SendThread:ClientCnxn$SendThread@639] - Trying to connect to /127.0.0.1:2181
WARN - [SendThread:ClientCnxn$SendThread@568] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:63815 remote=/127.0.0.1:2181]
null: 3--1
WARN - [SendThread:ClientCnxn$SendThread@719] - Closing:
java.io.IOException: TIMED OUT
    at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:666)
null: 0--1
WARN - [SendThread:ClientCnxn$SendThread@639] - Trying to connect to /127.0.0.1:2181
WARN - [SendThread:ClientCnxn$SendThread@568] - Priming connection to java.nio.channels.SocketChannel[connected local=/127.0.0.1:63816 remote=/127.0.0.1:2181]
null: 3--1

create /test hello

Processing create
WARN - [SendThread:ClientCnxn$SendThread@719] - Closing:
java.io.IOException: TIMED OUT
    at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:666)
Exception in thread "main" com.yahoo.zookeeper.KeeperException: KeeperErrorCode = ConnectionLoss for /test
    at com.yahoo.zookeeper.ZooKeeper.create(ZooKeeper.java:244)
    at com.yahoo.zookeeper.ZooKeeper.processCmd(ZooKeeper.java:788)
    at com.yahoo.zookeeper.ZooKeeper.main(ZooKeeper.java:745)

At this point the client exits.

Note the continuous stream of "null: 3--1" type errors. Why is it constantly having to reconnect to the server? Finally, when I type in a simple create command the client goes belly up and exits. Even though I am following the exact instructions in the SourceForge wiki, I figured maybe I was doing something wrong, so I decided to run the unit tests. Sure enough, they also failed!

junit.run:
Warning: Reference test.classpath has not been set at runtime, but was found during build file parsing, attempting to resolve. Future versions of Ant may support referencing ids defined in non-executed targets.
[junit] Running com.yahoo.zookeeper.server.DeserializationPerfTest
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 9.549 sec
[junit] Running com.yahoo.zookeeper.server.SerializationPerfTest
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 6.78 sec
[junit] Running com.yahoo.zookeeper.server.ZooKeeperServerTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.139 sec
[junit] Running com.yahoo.zookeeper.test.AsyncTest
[junit] Tests run: 2, Failures: 2, Errors: 0, Time elapsed: 103.648 sec
[junit] Test com.yahoo.zookeeper.test.AsyncTest FAILED
[junit] Running com.yahoo.zookeeper.test.ClientTest
[junit] Tests run: 4, Failures: 4, Errors: 0, Time elapsed: 140.248 sec
[junit] Test com.yahoo.zookeeper.test.ClientTest FAILED
[junit] Running com.yahoo.zookeeper.test.DataTreeTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.087 sec
[junit] Running com.yahoo.zookeeper.test.OOMTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.014 sec
[junit] Running com.yahoo.zookeeper.test.QuorumTest
[junit] Tests run: 4, Failures: 3, Errors: 1, Time elapsed: 169.124 sec
[junit] Test com.yahoo.zookeeper.test.QuorumTest FAILED
[junit] Running com.yahoo.zookeeper.test.RecoveryTest
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 14.635 sec
[junit] Test com.yahoo.zookeeper.test.RecoveryTest FAILED
[junit] Running com.yahoo.zookeeper.test.SessionTest
[junit] Tests run: 2, Failures: 0, Errors: 1, Time elapsed: 32.542 sec
[junit] Test com.yahoo.zookeeper.test.SessionTest FAILED

BUILD FAILED
/Users/satish/zookeeper/build.xml:336: The following error occurred while executing this line:
/Users/satish/zookeeper/build.xml:314: Tests failed!

Total time: 9 minutes 4 seconds

Any ideas what's going on?

Satish
|
From: Andrew K. <ak...@ya...> - 2008-07-30 17:34:56
|
Austin, Thanks a lot for the patch! It was a known issue and we even had a bug open for this in our internal bug tracking system. As Ben has suggested, could you please open a bug in the zookeeper project's Jira here https://issues.apache.org/jira/browse/ZOOKEEPER ? Also, this patch will break a few unit tests (specifically, the ones that test zookeeper client initialization). Would you mind updating them so that they work with your patch? Thanks, Andrew > -----Original Message----- > From: zoo...@li... [mailto:zookeeper-user- > bo...@li...] On Behalf Of Benjamin Reed > Sent: Wednesday, July 30, 2008 8:40 AM > To: zoo...@li... > Subject: Re: [Zookeeper-user] Data races in ZooKeeper C client > > Sorry about that and thanx for the patch! > > We have moved the development process to Apache. Would you mind opening a > Jira > and submitting the patch through the Jira? > > thank you > ben > > On Wednesday 30 July 2008 02:23:11 Austin Shoemaker wrote: > > We were debugging a crash when using the C client on multiple threads > > and isolated the crash to gethostbyname, which is not reentrant. > > Replacing it with the newer getaddrinfo resolved the problem. The same > > function was also using strtok instead of the thread-safe strtok_r. > > The patch below incorporates both fixes. > > > > There may be other non-threadsafe calls that we haven't discovered- > > let us know if you find any. 
> > > > Austin > > > > > > Patch for zookeeper-c-client-2.2.1/src/zookeeper.c (2008-06-09 on > > SF.net) > > > > 241c241 > > < struct hostent *he; > > --- > > > > > struct addrinfo hints, *res, *res0; > > > > 243,245d242 > > < struct sockaddr_in *addr4; > > < struct sockaddr_in6 *addr6; > > < char **ptr; > > 247a245 > > > > > char *strtok_last; > > > > 263c261 > > < host=strtok(hosts, ","); > > --- > > > > > host=strtok_r(hosts, ",", &strtok_last); > > > > 283,294c281,297 > > < he = gethostbyname(host); > > < if (!he) { > > < LOG_ERROR(("could not resolve %s", host)); > > < errno=EINVAL; > > < rc=ZBADARGUMENTS; > > < goto fail; > > < } > > < > > < /* Setup the address array */ > > < for(ptr = he->h_addr_list;*ptr != 0; ptr++) { > > < if (zh->addrs_count == alen) { > > < void *tmpaddr; > > --- > > > > > memset(&hints, 0, sizeof(hints)); > > > hints.ai_flags = AI_ADDRCONFIG; > > > hints.ai_family = AF_UNSPEC; > > > hints.ai_socktype = SOCK_STREAM; > > > hints.ai_protocol = IPPROTO_TCP; > > > > > > if (getaddrinfo(host, port_spec, &hints, &res0) != 0) { > > > LOG_ERROR(("getaddrinfo: %s\n", strerror(errno))); > > > rc=ZSYSTEMERROR; > > > goto fail; > > > } > > > > > > for (res = res0; res; res = res->ai_next) { > > > // Expand address list if needed > > > if (zh->addrs_count == alen) { > > > void *tmpaddr; > > > > 304,313c307,312 > > < } > > < addr = &zh->addrs[zh->addrs_count]; > > < addr4 = (struct sockaddr_in*)addr; > > < addr6 = (struct sockaddr_in6*)addr; > > < addr->sa_family = he->h_addrtype; > > < if (addr->sa_family == AF_INET) { > > < addr4->sin_port = htons(port); > > < memset(&addr4->sin_zero, 0, sizeof(addr4->sin_zero)); > > < memcpy(&addr4->sin_addr, *ptr, he->h_length); > > < zh->addrs_count++; > > --- > > > > > } > > > > > > // Copy addrinfo into address list > > > addr = &zh->addrs[zh->addrs_count]; > > > switch (res->ai_family) { > > > case AF_INET: > > > > 315,320c314 > > < } else if (addr->sa_family == AF_INET6) { > > < addr6->sin6_port = 
htons(port); > > < addr6->sin6_scope_id = 0; > > < addr6->sin6_flowinfo = 0; > > < memcpy(&addr6->sin6_addr, *ptr, he->h_length); > > < zh->addrs_count++; > > --- > > > > > case AF_INET6: > > > > 322,327c316,328 > > < } else { > > < LOG_WARN(("skipping unknown address family %x for %s", > > < addr->sa_family, zh->hostname)); > > < } > > < } > > < host = strtok(0, ","); > > --- > > > > > memcpy(addr, res->ai_addr, res->ai_addrlen); > > > ++zh->addrs_count; > > > break; > > > default: > > > LOG_WARN(("skipping unknown address family %x for > %s", > > > res->ai_family, zh->hostname)); > > > break; > > > } > > > } > > > > > > freeaddrinfo(res0); > > > > > > host = strtok_r(0, ",", &strtok_last); > > > > 329a331 > > > |
From: Benjamin R. <br...@ya...> - 2008-07-30 15:40:17
|
Sorry about that and thanx for the patch! We have moved the development process to Apache. Would you mind opening a Jira and submitting the patch through the Jira? thank you ben On Wednesday 30 July 2008 02:23:11 Austin Shoemaker wrote: > We were debugging a crash when using the C client on multiple threads > and isolated the crash to gethostbyname, which is not reentrant. > Replacing it with the newer getaddrinfo resolved the problem. The same > function was also using strtok instead of the thread-safe strtok_r. > The patch below incorporates both fixes. > > There may be other non-threadsafe calls that we haven't discovered- > let us know if you find any. > > Austin > > > Patch for zookeeper-c-client-2.2.1/src/zookeeper.c (2008-06-09 on > SF.net) > > 241c241 > < struct hostent *he; > --- > > > struct addrinfo hints, *res, *res0; > > 243,245d242 > < struct sockaddr_in *addr4; > < struct sockaddr_in6 *addr6; > < char **ptr; > 247a245 > > > char *strtok_last; > > 263c261 > < host=strtok(hosts, ","); > --- > > > host=strtok_r(hosts, ",", &strtok_last); > > 283,294c281,297 > < he = gethostbyname(host); > < if (!he) { > < LOG_ERROR(("could not resolve %s", host)); > < errno=EINVAL; > < rc=ZBADARGUMENTS; > < goto fail; > < } > < > < /* Setup the address array */ > < for(ptr = he->h_addr_list;*ptr != 0; ptr++) { > < if (zh->addrs_count == alen) { > < void *tmpaddr; > --- > > > memset(&hints, 0, sizeof(hints)); > > hints.ai_flags = AI_ADDRCONFIG; > > hints.ai_family = AF_UNSPEC; > > hints.ai_socktype = SOCK_STREAM; > > hints.ai_protocol = IPPROTO_TCP; > > > > if (getaddrinfo(host, port_spec, &hints, &res0) != 0) { > > LOG_ERROR(("getaddrinfo: %s\n", strerror(errno))); > > rc=ZSYSTEMERROR; > > goto fail; > > } > > > > for (res = res0; res; res = res->ai_next) { > > // Expand address list if needed > > if (zh->addrs_count == alen) { > > void *tmpaddr; > > 304,313c307,312 > < } > < addr = &zh->addrs[zh->addrs_count]; > < addr4 = (struct sockaddr_in*)addr; > < addr6 = (struct 
sockaddr_in6*)addr; > < addr->sa_family = he->h_addrtype; > < if (addr->sa_family == AF_INET) { > < addr4->sin_port = htons(port); > < memset(&addr4->sin_zero, 0, sizeof(addr4->sin_zero)); > < memcpy(&addr4->sin_addr, *ptr, he->h_length); > < zh->addrs_count++; > --- > > > } > > > > // Copy addrinfo into address list > > addr = &zh->addrs[zh->addrs_count]; > > switch (res->ai_family) { > > case AF_INET: > > 315,320c314 > < } else if (addr->sa_family == AF_INET6) { > < addr6->sin6_port = htons(port); > < addr6->sin6_scope_id = 0; > < addr6->sin6_flowinfo = 0; > < memcpy(&addr6->sin6_addr, *ptr, he->h_length); > < zh->addrs_count++; > --- > > > case AF_INET6: > > 322,327c316,328 > < } else { > < LOG_WARN(("skipping unknown address family %x for %s", > < addr->sa_family, zh->hostname)); > < } > < } > < host = strtok(0, ","); > --- > > > memcpy(addr, res->ai_addr, res->ai_addrlen); > > ++zh->addrs_count; > > break; > > default: > > LOG_WARN(("skipping unknown address family %x for %s", > > res->ai_family, zh->hostname)); > > break; > > } > > } > > > > freeaddrinfo(res0); > > > > host = strtok_r(0, ",", &strtok_last); > > 329a331 > > |
From: Austin S. <au...@co...> - 2008-07-30 09:23:06
|
We were debugging a crash when using the C client on multiple threads and isolated the crash to gethostbyname, which is not reentrant. Replacing it with the newer getaddrinfo resolved the problem. The same function was also using strtok instead of the thread-safe strtok_r. The patch below incorporates both fixes.

There may be other non-threadsafe calls that we haven't discovered - let us know if you find any.

Austin

Patch for zookeeper-c-client-2.2.1/src/zookeeper.c (2008-06-09 on SF.net)

241c241
< struct hostent *he;
---
> struct addrinfo hints, *res, *res0;
243,245d242
< struct sockaddr_in *addr4;
< struct sockaddr_in6 *addr6;
< char **ptr;
247a245
> char *strtok_last;
263c261
< host=strtok(hosts, ",");
---
> host=strtok_r(hosts, ",", &strtok_last);
283,294c281,297
< he = gethostbyname(host);
< if (!he) {
< LOG_ERROR(("could not resolve %s", host));
< errno=EINVAL;
< rc=ZBADARGUMENTS;
< goto fail;
< }
<
< /* Setup the address array */
< for(ptr = he->h_addr_list;*ptr != 0; ptr++) {
< if (zh->addrs_count == alen) {
< void *tmpaddr;
---
>
> memset(&hints, 0, sizeof(hints));
> hints.ai_flags = AI_ADDRCONFIG;
> hints.ai_family = AF_UNSPEC;
> hints.ai_socktype = SOCK_STREAM;
> hints.ai_protocol = IPPROTO_TCP;
>
> if (getaddrinfo(host, port_spec, &hints, &res0) != 0) {
> LOG_ERROR(("getaddrinfo: %s\n", strerror(errno)));
> rc=ZSYSTEMERROR;
> goto fail;
> }
>
> for (res = res0; res; res = res->ai_next) {
> // Expand address list if needed
> if (zh->addrs_count == alen) {
> void *tmpaddr;
304,313c307,312
< }
< addr = &zh->addrs[zh->addrs_count];
< addr4 = (struct sockaddr_in*)addr;
< addr6 = (struct sockaddr_in6*)addr;
< addr->sa_family = he->h_addrtype;
< if (addr->sa_family == AF_INET) {
< addr4->sin_port = htons(port);
< memset(&addr4->sin_zero, 0, sizeof(addr4->sin_zero));
< memcpy(&addr4->sin_addr, *ptr, he->h_length);
< zh->addrs_count++;
---
> }
>
> // Copy addrinfo into address list
> addr = &zh->addrs[zh->addrs_count];
> switch (res->ai_family) {
> case AF_INET:
315,320c314
< } else if (addr->sa_family == AF_INET6) {
< addr6->sin6_port = htons(port);
< addr6->sin6_scope_id = 0;
< addr6->sin6_flowinfo = 0;
< memcpy(&addr6->sin6_addr, *ptr, he->h_length);
< zh->addrs_count++;
---
> case AF_INET6:
322,327c316,328
< } else {
< LOG_WARN(("skipping unknown address family %x for %s",
< addr->sa_family, zh->hostname));
< }
< }
< host = strtok(0, ",");
---
> memcpy(addr, res->ai_addr, res->ai_addrlen);
> ++zh->addrs_count;
> break;
> default:
> LOG_WARN(("skipping unknown address family %x for %s",
> res->ai_family, zh->hostname));
> break;
> }
> }
>
> freeaddrinfo(res0);
>
> host = strtok_r(0, ",", &strtok_last);
329a331
>
|
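For readers who want to try the two thread-safe calls from the patch outside the ZooKeeper source tree, here is a small standalone sketch. It is not the client's actual code: `count_addresses` is a hypothetical helper, the fixed 256-byte buffer is an arbitrary choice for the example, and the "host:port" list format is assumed from how the patch splits on commas. One small difference from the patch: getaddrinfo reports failures through its return value rather than errno, so the sketch formats the error with gai_strerror.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for strtok_r and getaddrinfo */
#endif

#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of the ZooKeeper C client): resolves every
 * "host:port" entry of a comma-separated list and returns how many socket
 * addresses were found, or -1 if an entry is malformed or cannot be resolved. */
int count_addresses(const char *host_list)
{
    char buf[256];
    char *save = NULL;   /* strtok_r keeps its position here, not in a global */
    char *entry;
    int total = 0;

    if (strlen(host_list) >= sizeof(buf))
        return -1;
    strcpy(buf, host_list);   /* strtok_r modifies its input, so use a copy */

    for (entry = strtok_r(buf, ",", &save); entry != NULL;
         entry = strtok_r(NULL, ",", &save)) {
        struct addrinfo hints, *res0, *res;
        char *colon = strrchr(entry, ':');
        int err;

        if (colon == NULL)
            return -1;
        *colon = '\0';   /* split "host:port" into two strings */

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;      /* accept IPv4 and IPv6 results */
        hints.ai_socktype = SOCK_STREAM;

        err = getaddrinfo(entry, colon + 1, &hints, &res0);
        if (err != 0) {
            fprintf(stderr, "getaddrinfo(%s): %s\n", entry, gai_strerror(err));
            return -1;
        }
        for (res = res0; res != NULL; res = res->ai_next)
            total++;
        freeaddrinfo(res0);   /* the result list is owned by the caller */
    }
    return total;
}
```

Because all state lives in local variables (`save`, `res0`), several threads can call this at the same time, which is the property the patch is after.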
From: Martin S. <ms...@10...> - 2008-07-25 06:49:55
|
On 30.06.2008 at 16:32, Benjamin Reed wrote: >> The session is expired and I try now to catch this state with >> event.getState() == Watcher.Event.KeeperStateExpired. Is there an >> example how to renew the session or do I have to create a new >> ZooKeeper object? > > When a session expires the ZooKeeper object becomes invalid. You > must create a > new ZooKeeper object. I have done this. Now I get the following: the zookeeper clients all get a session expiration event, and I create a new zookeeper object. The zookeeper server gets the event -1,0,'null followed by -1,3,'null. That's all. All other clients show a lot of -1,0,'null events (about 1 per second) after they have created a new zookeeper object. Every read or write in the client leads to a ConnectionLoss exception. Any help would be appreciated. Martin |
From: Martin S. <ms...@10...> - 2008-07-24 16:24:34
|
On 24.07.2008 at 17:56, Benjamin Reed wrote: > What do you mean by "In this the events were thrown endless."? I mean that in this situation the events were thrown endlessly, about 1 per second I think, and they filled up my log files. It seems that the reconnection has not worked? > The event is a notification to the application of the disconnect or > reconnect. The ZooKeeper client library takes care of reestablishing > the > connection. The main reason for this event is for the application to > know that the watches have gone away. Does that mean that I have to reinstall my listeners? |
From: Benjamin R. <br...@ya...> - 2008-07-24 15:57:11
|
What do you mean by "In this the events were thrown endless."? The event is a notification to the application of the disconnect or reconnect. The ZooKeeper client library takes care of reestablishing the connection. The main reason for this event is for the application to know that the watches have gone away. ben Martin Schaaf wrote: > Hi, > > to fix my problems with the timeouts and connection loss. I print out > every unhandled event. As my 9 clients gets disconnected from the > zookeeper server they print out the following event -1,0,'null (from > event.toString()). In this the events were thrown endless. > > I found the following documentation for this event. > > * When a client drops current connection and re-connects to a > server, all the > * existing watches are considered as being triggered but the > undelivered events > * are lost. To emulate this, the client will generate a special > event to tell > * the event handler a connection has been dropped. This special > event has type > * EventNone and state sKeeperStateDisconnected. > > What I need to know is: Do we have to reconnect or does it happen by > its own? What should the handler do if this event gets thrown? > > Thank you. > > Martin |
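The advice in this thread can be summarised as a small decision table: on a disconnect the client library reconnects by itself, after a reconnect the application should remember that its watches have gone away, and after a session expiry the old ZooKeeper object is invalid and a new one must be created. The sketch below only models that decision in C. Every name in it is hypothetical and chosen for this illustration; these are not the identifiers of the ZooKeeper client libraries, and setting watches again after a reconnect is an assumption the thread implies rather than states.

```c
/* Hypothetical names for illustration only; these are NOT the identifiers
 * used by the ZooKeeper client libraries. */
enum session_state {
    SESSION_DISCONNECTED,   /* connection dropped, session may still be alive */
    SESSION_CONNECTED,      /* connection (re)established */
    SESSION_EXPIRED         /* the server has given up on the session */
};

enum app_action {
    ACTION_WAIT,               /* the library reconnects on its own */
    ACTION_RESET_WATCHES,      /* assumed: set watches again, they have gone away */
    ACTION_CREATE_NEW_HANDLE   /* the old object is invalid, create a new one */
};

/* Maps a session state notification to what the application should do next. */
enum app_action action_for_state(enum session_state state)
{
    switch (state) {
    case SESSION_DISCONNECTED:
        return ACTION_WAIT;
    case SESSION_CONNECTED:
        return ACTION_RESET_WATCHES;
    case SESSION_EXPIRED:
        return ACTION_CREATE_NEW_HANDLE;
    }
    return ACTION_WAIT;   /* unknown state: do nothing destructive */
}
```

In a real watcher callback the three branches would call into the client library; keeping the decision in one function like this makes it easy to log which branch an endless stream of events is actually taking.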
From: Martin S. <ms...@10...> - 2008-07-24 07:31:56
|
Hi, to fix my problems with the timeouts and connection loss, I print out every unhandled event. When my 9 clients get disconnected from the zookeeper server they print out the following event: -1,0,'null (from event.toString()). In this the events were thrown endless. I found the following documentation for this event. * When a client drops current connection and re-connects to a server, all the * existing watches are considered as being triggered but the undelivered events * are lost. To emulate this, the client will generate a special event to tell * the event handler a connection has been dropped. This special event has type * EventNone and state sKeeperStateDisconnected. What I need to know is: Do we have to reconnect, or does it happen on its own? What should the handler do if this event gets thrown? Thank you. Martin |
From: Avinash L. <avi...@gm...> - 2008-07-24 07:09:01
|
Here is the exception I am seeing at the server on one of the machines. But this happened in the morning at around 9/10 AM. 2008-07-09 17:33:34,607 - WARN [FollowerHandler-/10.18.38.191:39598 :FollowerHandler@346] - ******* GOODBYE /10.18.38.191:39598 ******** 2008-07-09 21:41:30,380 - ERROR [FollowerHandler-/10.18.38.191:21568 :FollowerHandler@341] - FIXMSG java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:358) at com.yahoo.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:64) at com.yahoo.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:65) at com.yahoo.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:109) at com.yahoo.zookeeper.server.quorum.FollowerHandler.run(FollowerHandler.java:277) I can start the client and see if this continues to occur. But the exception still continues to surface on the client. Avinash On Wed, Jul 23, 2008 at 11:01 PM, Benjamin Reed <br...@ya...> wrote: > Are you seeing any errors in the ZooKeeper server logs? Looking more > closely that error is happening in a very strange location. I would think > that something should show up in the error log of one of the servers. Are > other clients working? I assume that the client that is getting the > exception is not working right? Does this keep happening even after you > restart the client? > > thanx > ben > > > > Avinash Lakshman wrote: > > How can I fix this issue? I have a cluster of 5 nodes running Zookeeper. > Now I have a bunch of other nodes that use Zookeeper to elect a leader > amongst themselves. No other explicit communication with Zookeeper. I keep > seeing these annoying exceptions. What do I do to get rid of them? I can > bounce the Zookeeper instance but would that mean I will have to bounce the > other nodes too? Please advice. > > Avinash > > On Wed, Jul 23, 2008 at 12:06 PM, Mahadev Konar <ma...@ya...>wrote: > >> This looks a little weird. 
This mostly would happen if the server goes >> down or closes the connection, which should not happen often. Can you give >> us some background on what you are trying to do and whats your setup? >> >> mahadev >> >> >> On 7/23/08 11:56 AM, "Avinash Lakshman" <avi...@gm...> >> wrote: >> >> Could someone please tell me what this exception could be? I get a >> bunch of these: >> >> system.log:java.io.EOFException >> system.log- at >> java.io.DataInputStream.readInt(DataInputStream.java:392) >> system.log- at >> com.yahoo.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:64) >> system.log- at >> com.yahoo.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:98) >> system.log- at >> com.yahoo.zookeeper.proto.ConnectResponse.deserialize(ConnectResponse.java:59) >> system.log- at >> com.yahoo.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:400) >> system.log- at >> com.yahoo.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:492) >> system.log- at >> com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:705) >> >> Avinash |
From: Benjamin R. <br...@ya...> - 2008-07-24 06:03:32
|
Are you seeing any errors in the ZooKeeper server logs? Looking more closely that error is happening in a very strange location. I would think that something should show up in the error log of one of the servers. Are other clients working? I assume that the client that is getting the exception is not working right? Does this keep happening even after you restart the client? thanx ben Avinash Lakshman wrote: > How can I fix this issue? I have a cluster of 5 nodes running > Zookeeper. Now I have a bunch of other nodes that use Zookeeper to > elect a leader amongst themselves. No other explicit communication > with Zookeeper. I keep seeing these annoying exceptions. What do I do > to get rid of them? I can bounce the Zookeeper instance but would that > mean I will have to bounce the other nodes too? Please advice. > > Avinash > > On Wed, Jul 23, 2008 at 12:06 PM, Mahadev Konar <ma...@ya... > <mailto:ma...@ya...>> wrote: > > This looks a little weird. This mostly would happen if the server > goes down or closes the connection — which should not happen > often. Can you give us some background on what you are trying to > do and whats your setup? > > mahadev > > > > On 7/23/08 11:56 AM, "Avinash Lakshman" > <avi...@gm... <http://avi...@gm...>> > wrote: > > Could someone please tell me what this exception could be? I > get a bunch of these: > > system.log:java.io <http://java.io> <http://java.io> . 
> > EOFException > system.log- at > java.io.DataInputStream.readInt(DataInputStream.java:392) > system.log- at > com.yahoo.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:64) > system.log- at > com.yahoo.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:98) > system.log- at > com.yahoo.zookeeper.proto.ConnectResponse.deserialize(ConnectResponse.java:59) > system.log- at > com.yahoo.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:400) > system.log- at > com.yahoo.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:492) > system.log- at > com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:705) > > Avinash |
From: Avinash L. <avi...@gm...> - 2008-07-24 05:04:56
|
How can I fix this issue? I have a cluster of 5 nodes running Zookeeper. Now I have a bunch of other nodes that use Zookeeper to elect a leader amongst themselves. They have no other explicit communication with Zookeeper. I keep seeing these annoying exceptions. What do I do to get rid of them? I can bounce the Zookeeper instance, but would that mean I will have to bounce the other nodes too? Please advise. Avinash On Wed, Jul 23, 2008 at 12:06 PM, Mahadev Konar <ma...@ya...> wrote: > This looks a little weird. This mostly would happen if the server goes > down or closes the connection, which should not happen often. Can you give > us some background on what you are trying to do and whats your setup? > > mahadev > > > On 7/23/08 11:56 AM, "Avinash Lakshman" <avi...@gm...> > wrote: > > Could someone please tell me what this exception could be? I get a bunch of > these: > > system.log:java.io.EOFException > system.log- at > java.io.DataInputStream.readInt(DataInputStream.java:392) > system.log- at > com.yahoo.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:64) > system.log- at > com.yahoo.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:98) > system.log- at > com.yahoo.zookeeper.proto.ConnectResponse.deserialize(ConnectResponse.java:59) > system.log- at > com.yahoo.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:400) > system.log- at > com.yahoo.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:492) > system.log- at > com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:705) > > Avinash |
From: Avinash L. <avi...@gm...> - 2008-07-23 21:41:33
|
---------- Forwarded message ----------
From: Avinash Lakshman <avi...@gm...>
Date: Wed, Jul 23, 2008 at 2:41 PM
Subject: Re: [Zookeeper-user] Weird exception
To: Mahadev Konar <ma...@ya...>

It was running fine for a week. I have a bunch of machines that talk to the Zookeeper cluster. I use Zookeeper for leader election.

Avinash |
From: Mahadev K. <ma...@ya...> - 2008-07-23 19:07:14
|
This looks a little weird. It would mostly happen if the server goes down or closes the connection, which should not happen often. Can you give us some background on what you are trying to do and what your setup is?

mahadev |
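Since the exception points at a server going down or closing the connection, one way to check each server from a client machine is the four-letter `stat` command (the same command as `echo "stat" | nc localhost 2181`, used elsewhere in this archive). The following is a minimal Python sketch, not part of the ZooKeeper distribution. The helper names `four_letter`, `parse_stat` and `check_servers` are made up for illustration, and the parser assumes the reply consists of `Key: value` summary lines like the 2.2.0 output posted on this list.

```python
import socket


def four_letter(host, port=2181, cmd="stat", timeout=5.0):
    """Send a four-letter command and return the server's whole reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_stat(text):
    """Collect the 'Key: value' summary lines of a stat reply into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        # Skip headings such as "Clients:" and the indented per-client lines.
        if sep and not line.startswith((" ", "/")):
            fields[key] = value.strip()
    return fields


def check_servers(hosts, port=2181):
    """Return {host: mode, or an error string if the server cannot be reached}."""
    report = {}
    for host in hosts:
        try:
            reply = four_letter(host, port)
            report[host] = parse_stat(reply).get("Mode", "unknown")
        except OSError as exc:  # connection refused, timed out, unreachable, ...
            report[host] = "unreachable: %s" % exc
    return report
```

Calling `check_servers(["zk1", "zk2", "zk3"])` (hypothetical host names) from one of the client machines would show which servers answer at all, which is a quick first check before looking at the client logs.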
From: Avinash L. <avi...@gm...> - 2008-07-23 18:56:50
|
Could someone please tell me what this exception could be? I get a bunch of these:

system.log:java.io.EOFException
system.log-    at java.io.DataInputStream.readInt(DataInputStream.java:392)
system.log-    at com.yahoo.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:64)
system.log-    at com.yahoo.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:98)
system.log-    at com.yahoo.zookeeper.proto.ConnectResponse.deserialize(ConnectResponse.java:59)
system.log-    at com.yahoo.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:400)
system.log-    at com.yahoo.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:492)
system.log-    at com.yahoo.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:705)

Avinash |
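Read bottom-up, the trace says the client's send thread was parsing the connect response when `DataInputStream.readInt` ran out of input before it had its four bytes. The Python sketch below imitates that situation with a length-prefixed buffer. It is only an analogy showing why a closed or truncated connection surfaces as an end-of-file error; it is not the jute or ZooKeeper code, and `read_int` and `read_buffer` are illustrative names.

```python
import io
import struct


def read_int(stream):
    """Read a 4-byte big-endian signed int, as Java's DataInputStream.readInt does."""
    raw = stream.read(4)
    if len(raw) < 4:
        raise EOFError("stream ended after %d of 4 bytes" % len(raw))
    return struct.unpack(">i", raw)[0]


def read_buffer(stream):
    """Read a length-prefixed buffer: a 4-byte length, then that many bytes."""
    length = read_int(stream)
    if length < 0:
        raise ValueError("negative length %d" % length)
    data = stream.read(length)
    if len(data) < length:
        raise EOFError("stream ended after %d of %d bytes" % (len(data), length))
    return data


# A complete reply parses normally.
complete = io.BytesIO(struct.pack(">i", 5) + b"hello")
print(read_buffer(complete))

# If the peer closed the connection before sending anything, the very first
# read_int fails, which is the Python counterpart of the EOFException above.
closed_early = io.BytesIO(b"")
try:
    read_buffer(closed_early)
except EOFError as exc:
    print("EOF:", exc)
```

The useful takeaway is that the error is raised by the reader but caused by the other end: the reply stopped arriving, so the place to look is the server side of the connection.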
From: Benjamin R. <br...@ya...> - 2008-07-23 16:46:02
|
There is currently no such way, and implementing it would probably be harder than implementing the addition and removal of servers in a ZooKeeper cluster. It does seem like a useful feature to be able to gracefully decommission a running ZooKeeper leader.

ben

On Tuesday 22 July 2008 17:16:05 Patrick Hunt wrote:
> Is there a way to ask the zk cluster to switch leaders? Switch in such a
> way as it doesn't cause a "virtual bounce"? (I mean can we code
> something that would enable this)
>
> Patrick |
From: Patrick H. <ph...@gm...> - 2008-07-23 00:15:55
|
Is there a way to ask the zk cluster to switch leaders, in such a way that it doesn't cause a "virtual bounce"? (I mean, can we code something that would enable this?)

Patrick |
From: Benjamin R. <br...@ya...> - 2008-07-22 23:59:01
|
Creative idea. This should work. It's kind of a pain. I don't think there is a Jira open for this, but you should open one. It's really a matter of implementation. You should be able to add (or remove) a machine to ZooKeeper and just have it get integrated without having to do any restarting. The real problem with the restarting is that eventually you will get to the leader, which will cause a virtual bounce of everyone when a new leader gets elected, so you really don't save much by doing it gradually.

ben

Austin Shoemaker wrote:
> We are using Zookeeper to implement consistent hashing, and need to be
> able to add or remove hosts from the Zookeeper service without
> interrupting its functionality.
>
> According to http://zookeeper.wiki.sourceforge.net/ZooKeeperConfiguration:
> "Every machine that is part of the ZooKeeper service needs to know
> about every other machine. So, we need to have a list of machines in
> the configuration file."
>
> Can we add a single Zookeeper server to the service with an expanded
> host list without interrupting the proper functioning of the service?
> The old servers will each have host list s_1, ..., s_k, and the new
> server s_k+1 will have host list s_1, ..., s_k, s_k+1. Now we need to
> restart the rest of the servers one by one from s_1 to s_k with the
> new host list s_1, ..., s_k, s_k+1. Is this a violation of the service
> specification?
>
> Thanks,
>
> Austin |
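To make the point about the leader concrete, here is a small Python sketch of planning the one-by-one restart. It assumes majority quorums and that each server's role is already known (for example from the `Mode:` line of the `stat` command). The host names and the helpers `majority`, `tolerates_one_down` and `restart_order` are hypothetical. Putting the leader last limits the procedure to a single leader election, but it does not avoid the virtual bounce when the leader is finally restarted.

```python
def majority(ensemble_size):
    """Smallest number of servers that forms a majority quorum."""
    return ensemble_size // 2 + 1


def tolerates_one_down(ensemble_size):
    """True if a quorum still exists while one server is being restarted."""
    return ensemble_size - 1 >= majority(ensemble_size)


def restart_order(modes):
    """Order servers for a rolling restart: followers first, leader last.

    modes maps a host name to "leader" or "follower".
    """
    leaders = [host for host, mode in modes.items() if mode == "leader"]
    if len(leaders) != 1:
        raise ValueError("expected exactly one leader, found %d" % len(leaders))
    followers = sorted(host for host, mode in modes.items() if mode != "leader")
    return followers + leaders


# Hypothetical five-server ensemble in which s2 is currently the leader.
modes = {"s1": "follower", "s2": "leader", "s3": "follower",
         "s4": "follower", "s5": "follower"}
print(restart_order(modes))      # followers s1, s3, s4, s5, then the leader s2
print(majority(5), majority(6))  # quorum size before and after adding a server
```

With five servers a majority is three, and with six it is four, so the quorum size itself changes when s_k+1 is added. That is plain arithmetic rather than a statement about how a particular ZooKeeper release counts votes during the transition.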