This list is closed; nobody may subscribe to it.
Archived messages per month:

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 | | 19 | 8 | 25 | 16 | 77 | 131 | 76 | 30 | 7 | 3 | |
| 2011 | | | | | 2 | 2 | 16 | 3 | 1 | | 7 | 7 |
| 2012 | 10 | 1 | 8 | 6 | 1 | 3 | 1 | | 1 | | 8 | 2 |
| 2013 | 5 | 12 | 2 | 1 | 1 | 1 | 22 | 50 | 31 | 64 | 83 | 28 |
| 2014 | 31 | 18 | 27 | 39 | 45 | 15 | 6 | 27 | 6 | 67 | 70 | 1 |
| 2015 | 3 | 18 | 22 | 121 | 42 | 17 | 8 | 11 | 26 | 15 | 66 | 38 |
| 2016 | 14 | 59 | 28 | 44 | 21 | 12 | 9 | 11 | 4 | 2 | 1 | |
| 2017 | 20 | 7 | 4 | 18 | 7 | 3 | 13 | 2 | 4 | 9 | 2 | 5 |
| 2018 | | | | 2 | | | | | | | | |
| 2019 | | | 1 | | | | | | | | | |
From: husdon <no...@no...> - 2010-06-24 19:38:10

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-24 18:38:03

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-24 17:31:23

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-24 16:41:34

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-24 15:46:58

See <http://localhost/job/BigData/changes>

From: Bryan T. <br...@sy...> - 2010-06-24 15:27:08

Brian,

No, I was confused. I did not see zookeeper and had assumed that it was not running. Hence the rest of my questions.

I do think that we should schedule a call to talk about how services will be started and restarted, because this all interacts with the HA quorum logic. For example, hot spare recruitment, the target replication factor, and the actual replication factor for a highly available service all interact. The logic for starting those services therefore has to coordinate with the HA logic.

The quorums depend on having a simple majority. This is built around a service replication factor, k, which is an odd positive integer. k := 1 is not highly available. k := 3 is highly available, and there must be a minimum of (k+1)/2 = 2 services running for the quorum to meet. If we start more than k services, then this can break the quorum logic. Right now I have presumed a dependency on zookeeper and the existing services manager service to provide a distributed guarantee that we start exactly k services.

Planned down time and hot spare recruitment are both tricky issues for HA. We have to actually annotate the service when it is brought down, e.g., for a rolling code base update, to prevent it being treated as a failure and having a hot spare automatically recruited. Likewise, we have to pay careful attention when a hot spare is recruited to how it joins the write replication pipeline and when it joins the quorum.

If we follow a path where service start is not linked to the configuration information in zookeeper and the service management services, then this is all stuff that we need to work through together. I think that we should do this soon -- ideally before I proceed with the zookeeper quorum integration based on the existing design. I'd be happy to talk through the quorum design on the call as well.

Maybe we can do this in three pieces: one on the quorum work that I have been doing, one on the deploy/config work that you have been doing, and then an open discussion on how these things could be used to provide the flexibility and high availability, and how they interact with hot spare recruitment.

Thanks,
Bryan

________________________________
From: Brian Murphy [mailto:btm...@gm...]
Sent: Thursday, June 24, 2010 11:07 AM
To: big...@li...
Subject: Re: [Bigdata-developers] Alternate install/deploy mechanism

On Wed, Jun 23, 2010 at 8:23 PM, Bryan Thompson <br...@sy...> wrote:

> Right now, bigdata depends on leader election semantics from zookeeper to start the appropriate mixture of services. I did not see zookeeper running so I presume that you are handling that differently in this example.

No, zookeeper was running. If you run the disco-tool (or a jini browser), you should see a service of type com.bigdata.service.QuorumPeerService, which is zookeeper wrapped in a Jini service. Wrapping zookeeper in a Jini service not only provides a means to more easily start and stop zookeeper, but also provides a means to dynamically discover zookeeper in the federation. Furthermore, the QuorumPeerService interface provides a mechanism to customize how the services interact with zookeeper if desired.

> I would like to understand how we would handle the distributed decision making necessary to start an appropriate mixture of services with this proposal, and also how we would handle the distributed decision making required to support the HA quorums. I've attached an updated version of my draft for the HA quorum design and the proposed zookeeper integration.

Rather than using zookeeper to decide what gets started, this mechanism allows one to configure what individual services get started where, including the appropriate number of zookeeper instances. Zookeeper would then be viewed as a discoverable resource that can be used by the other services to determine who is the leader and whether or not a quorum exists before those services are used.

> I realize that some jini implementations do provide capabilities similar to what zookeeper provides.

I'm not sure what jini implementations you're talking about. Something not in the Jini starter kit?

> Are you suggesting that, or did you simply leave zookeeper and its roles in configuration management, leader elections, etc. out of the demo?

As I said above, zookeeper was not left out. But I also said in my original posting that this work is not anywhere near complete, and was posted to give folks an idea of what could be done with install and deployment if the services are re-implemented to a smart proxy model and moved to a shared-nothing architecture; all of which I believe will be a significant amount of work. Perhaps in the future I should hold off on posting until the work is more complete. Sorry if I caused confusion.

BrianM
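The quorum arithmetic in the message above is simple but easy to get wrong: for an odd replication factor k, a quorum meets only when at least (k+1)/2 services are joined, and launching more than k services breaks the model. The following minimal sketch illustrates just that rule; the class and method names are hypothetical and are not bigdata's quorum API.

```java
/**
 * Illustrative sketch of the simple-majority rule described in the thread.
 * Hypothetical names; this is NOT bigdata's quorum implementation.
 */
public final class QuorumMath {

    /** Minimum number of joined services for a quorum to meet, given an odd k. */
    public static int minimumToMeet(final int k) {
        if (k < 1 || k % 2 == 0) {
            throw new IllegalArgumentException("k must be an odd positive integer: " + k);
        }
        return (k + 1) / 2; // simple majority: k=1 -> 1, k=3 -> 2, k=5 -> 3
    }

    /** True iff the quorum can meet with the given number of running services. */
    public static boolean canMeet(final int k, final int runningServices) {
        // Starting more than k services violates the assumption that exactly k
        // services participate, which is why the start logic must guarantee
        // that exactly k services are launched.
        return runningServices >= minimumToMeet(k) && runningServices <= k;
    }

    public static void main(String[] args) {
        System.out.println(minimumToMeet(3)); // 2
        System.out.println(canMeet(3, 2));    // true
        System.out.println(canMeet(3, 1));    // false
        System.out.println(canMeet(3, 4));    // false (more than k started)
    }
}
```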
From: Brian M. <btm...@gm...> - 2010-06-24 15:07:03

On Wed, Jun 23, 2010 at 8:23 PM, Bryan Thompson <br...@sy...> wrote:

> Right now, bigdata depends on leader election semantics from zookeeper to start the appropriate mixture of services. I did not see zookeeper running so I presume that you are handling that differently in this example.

No, zookeeper was running. If you run the disco-tool (or a jini browser), you should see a service of type com.bigdata.service.QuorumPeerService, which is zookeeper wrapped in a Jini service. Wrapping zookeeper in a Jini service not only provides a means to more easily start and stop zookeeper, but also provides a means to dynamically discover zookeeper in the federation. Furthermore, the QuorumPeerService interface provides a mechanism to customize how the services interact with zookeeper if desired.

> I would like to understand how we would handle the distributed decision making necessary to start an appropriate mixture of services with this proposal, and also how we would handle the distributed decision making required to support the HA quorums. I've attached an updated version of my draft for the HA quorum design and the proposed zookeeper integration.

Rather than using zookeeper to decide what gets started, this mechanism allows one to configure what individual services get started where, including the appropriate number of zookeeper instances. Zookeeper would then be viewed as a discoverable resource that can be used by the other services to determine who is the leader and whether or not a quorum exists before those services are used.

> I realize that some jini implementations do provide capabilities similar to what zookeeper provides.

I'm not sure what jini implementations you're talking about. Something not in the Jini starter kit?

> Are you suggesting that, or did you simply leave zookeeper and its roles in configuration management, leader elections, etc. out of the demo?

As I said above, zookeeper was not left out. But I also said in my original posting that this work is not anywhere near complete, and was posted to give folks an idea of what could be done with install and deployment if the services are re-implemented to a smart proxy model and moved to a shared-nothing architecture; all of which I believe will be a significant amount of work. Perhaps in the future I should hold off on posting until the work is more complete. Sorry if I caused confusion.

BrianM
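Brian's point above is that zookeeper, once wrapped as a com.bigdata.service.QuorumPeerService, can be discovered like any other service in the federation. As a rough, hedged illustration only: the sketch below uses plain Jini lookup (net.jini.*), not bigdata's disco-tool; the group name is taken from the deploy example later in the thread, and the exact use of the QuorumPeerService interface, security/codebase configuration, and error handling are all assumptions.

```java
import net.jini.core.lookup.ServiceItem;
import net.jini.core.lookup.ServiceTemplate;
import net.jini.discovery.LookupDiscoveryManager;
import net.jini.lease.LeaseRenewalManager;
import net.jini.lookup.ServiceDiscoveryManager;

/**
 * Rough sketch of discovering a service by type in a Jini federation.
 * The QuorumPeerService interface name comes from the thread; the group
 * name and everything else here is assumed for illustration.
 */
public class DiscoverQuorumPeer {

    public static void main(String[] args) throws Exception {
        // Discover lookup services advertising the federation's group.
        LookupDiscoveryManager ldm = new LookupDiscoveryManager(
                new String[] { "com.bigdata.group.0" }, // group from the deploy example
                null,   // no unicast locators
                null);  // no discovery listener

        ServiceDiscoveryManager sdm =
                new ServiceDiscoveryManager(ldm, new LeaseRenewalManager());

        // Match any service implementing the QuorumPeerService interface.
        ServiceTemplate tmpl = new ServiceTemplate(
                null,
                new Class[] { com.bigdata.service.QuorumPeerService.class },
                null);

        // Block up to 30 seconds waiting for a matching service to appear.
        ServiceItem item = sdm.lookup(tmpl, null, 30000L);

        System.out.println(item == null
                ? "No QuorumPeerService discovered"
                : "Discovered: " + item.service);

        sdm.terminate();
        ldm.terminate();
    }
}
```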
From: husdon <no...@no...> - 2010-06-24 14:57:28

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-24 14:07:19

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-24 13:16:37

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-24 11:46:15

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-24 02:12:51

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-24 01:24:27

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-23 21:21:39

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-23 20:34:16

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-23 16:13:55

See <http://localhost/job/BigData/changes>

From: Bryan T. <br...@sy...> - 2010-06-23 15:52:01

Brian,

Do you have any insight on how to ensure that only and all processes started by the services manager are taken down by a 'bigdata stop' without using "killall -9 java"? E.g., do you know if child processes will be destroyed when the parent terminates across platforms?

When running this, I had the following errors related to the bigdata init.d script.

[root@dutl-57 ~]# cp /opt/bigdata/etc/bigdata.initd /etc/init.d/bigdata
[root@dutl-57 ~]# /etc/init.d/bigdata start
/etc/init.d/bigdata: line 15: /lib/lsb/init-functions: No such file or directory
/opt/bigdata/bin/initd-processes.sh: line 7: log_begin_msg: command not found
/opt/bigdata/bin/initd-processes.sh: line 14: sudo: command not found
/opt/bigdata/bin/initd-processes.sh: line 15: sudo: command not found
/opt/bigdata/bin/initd-processes.sh: line 33: sudo: command not found

I took the following steps to resolve those dependencies.

yum install sudo
yum install redhat-lsb

However, that gives me only:

[root@dutl-57 ~]# ls -l /etc/redhat-lsb
total 32
-rwxr-xr-x 1 root root 70 Nov 10 2007 lsb_killproc
-rwxr-xr-x 1 root root 243 Nov 10 2007 lsb_log_message
-rwxr-xr-x 1 root root 59 Nov 10 2007 lsb_pidofproc
-rwxr-xr-x 1 root root 254 Nov 10 2007 lsb_start_daemon

I do not have log_begin_msg. I worked around this using "echo". However, things are still not starting. Maybe you can take a look at this host and see what is wrong with the configuration?

Thanks,
Bryan

________________________________
From: Brian Murphy [mailto:btm...@gm...]
Sent: Monday, June 21, 2010 11:42 AM
To: big...@li...
Subject: [Bigdata-developers] Alternate install/deploy mechanism

Just an FYI to those who might be interested. Over the last few weeks I've been looking into a deployment mechanism that might be used as an alternative to 'ant install'. The investigation has currently taken the form of some code that I've recently checked in to a personal branch (dev-btm). If anyone is interested in taking a look at this work and seeing how it might be used, one can follow the steps outlined below.

BrianM

---------------------------------------------------------------------------
> cd <baseDir>
> svn checkout https://bigdata.svn.sourceforge.net/svnroot/bigdata/branches/dev-btm <baseDir>/bigdata/branches
> ant release-dist
> ls -al <baseDir>/bigdata/branches/dev-btm
  REL.bigdata-<version>-<date>.tgz  (ex. REL.bigdata-0.82.0-210610.tgz)

Open 3 command windows, WinA, WinB, and WinC (use sudo or login as root).

---------------------------------------------------------------------------
-- WinA -- [install and deploy]

> su
# tar xzvof <baseDir>/bigdata/branches/dev-btm/REL.bigdata-<version>-<date>.tgz -C /opt
# cp /opt/bigdata/var/config/deploy/example-deploy.properties /opt/bigdata/var/config/deploy/deploy.properties
# vi /opt/bigdata/var/config/deploy/deploy.properties

Un-comment the following items in deploy.properties:

  #federation.name=com.bigdata.group.0
  #node.type=standalone
  #node.layout=1-of-1
  #node.role=bigdata

Uncomment and set the node.serviceNetwork item to the name of the node's network interface card (NIC), which can be found by typing 'ifconfig' on linux or 'ipconfig /all' on Windows.

  #node.serviceNetwork=eth0

Next,

# cp /opt/bigdata/etc/bigdata.initd /etc/init.d/bigdata
# /etc/init.d/bigdata start
# /etc/init.d/bigdata status

---------------------------------------------------------------------------
-- WinB -- [for non-graphical command line discovery tool]

> su
# /opt/bigdata/bin/disco-tool -v -g com.bigdata.group.0

---------------------------------------------------------------------------
-- WinC -- [for testing restart capability]

> su
# ps -elf | grep java

Pick one of the pids from the output and kill that process. For example, suppose the java process associated with the "shardlocator" process is 24539 (that is, '# ps -elf | grep java | grep shardlocator' ==> 24539).

# kill -9 24539

Observe the removed-then-added events displayed by the discovery tool in WinB.

# /etc/init.d/bigdata status

Observe that the output indicates that the shardlocator process is in the RUNNING state.

# ps -elf | grep java | grep shardlocator

Note that the service with process tag "shardlocator" appears, but its pid is no longer 24539, because the process was restarted upon the death of process 24539.

# /etc/init.d/bigdata stop
# /etc/init.d/bigdata status

Observe that all processes are in the STOPPED state.

---------------------------------------------------------------------------
-- START ON BOOT --

To achieve start/restart on boot/reboot, for X=0-5, one can create the appropriate soft links from /etc/rcX.d/KXXbigdata and /etc/rcX.d/SXXbigdata to /etc/init.d/bigdata. For example, on Ubuntu, one would do the following:

# update-rc.d bigdata defaults

[to remove the soft links, type '# update-rc.d -f bigdata remove']

---------------------------------------------------------------------------
-- NOTES & CAVEATS --

- The mechanism above is intended to support the installation and deployment of a system that may include other components as well as bigdata, or a system that includes only bigdata. Thus, although the files named default-deploy.properties and example-deploy.properties reference only a role value of "bigdata", other roles can be easily added.

- The file example-deploy.properties is intended to be a template for the deploy.properties file that is used to communicate the top-level configuration to the mechanism. A single deploy.properties file cannot be used, since the contents of that file will generally be different on different nodes in the system; although the goal of the deploy.properties file is to minimize the number of items that do differ from node to node. The deploy.properties file can be created by copying example-deploy.properties and then modifying the resulting file for the desired configuration (as shown above), or it can be auto-generated by some tool (ex. scripts, awk/sed, puppet, etc.).

- To avoid breaking existing code, the beginnings of smart proxy based counterparts to the bigdata services were created. Currently, those smart proxy based implementations include only the smart proxy pattern, the required public service interfaces, a common service attribute, and the necessary infrastructure for starting and stopping each service. None of these service implementations currently provide any bigdata-specific functionality. The smart proxy based service implementations are intended to share nothing but convenient helper utilities and the top-level deploy.properties configuration file (when different service implementations run on the same node). In addition to sharing a common Jini configuration file, the current, purely remote, service implementations share up to eight layers of common ancestry in the form of abstract and concrete super classes, which makes it unclear how much work it will take to either add the necessary functionality (and tests) to the smart proxy based implementations, or convert the current layered implementations to a smart proxy model. Thus, much more investigation and work needs to be done.
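On Bryan's opening question in the message above -- stopping only the processes the services manager started, without "killall -9 java" -- child JVMs are generally not destroyed automatically when the parent exits, and the behavior differs across platforms. One common pattern is for the launcher itself to keep handles to every process it starts and destroy exactly those on stop. The sketch below is an editorial illustration of that pattern only; it is not bigdata's services manager, and all names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch of a launcher that remembers exactly which child JVMs it
 * started so that "stop" terminates those -- and only those -- processes.
 * Hypothetical example; not bigdata's services manager.
 */
public class ChildProcessTracker {

    private final List<Process> children = new ArrayList<Process>();

    public ChildProcessTracker() {
        // Best effort: also clean up if this manager JVM is itself shut down,
        // since child processes are NOT automatically killed with the parent.
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                stopAll();
            }
        });
    }

    /** Launch a java child process and remember its handle. */
    public synchronized Process start(final String mainClass, final String... args)
            throws IOException {
        final List<String> cmd = new ArrayList<String>();
        cmd.add(new File(System.getProperty("java.home"), "bin/java").getPath());
        cmd.add(mainClass);
        for (String a : args) {
            cmd.add(a);
        }
        final Process p = new ProcessBuilder(cmd).inheritIO().start();
        children.add(p);
        return p;
    }

    /** Stop only the processes this tracker started (no killall -9 java). */
    public synchronized void stopAll() {
        for (Process p : children) {
            p.destroy(); // normal termination (SIGTERM on POSIX)
        }
        children.clear();
    }
}
```

A pid file per launched service would serve the same purpose for a shell-based init.d script: record each pid at start time and kill only the recorded pids on stop.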
From: husdon <no...@no...> - 2010-06-23 15:25:09

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-23 14:34:51

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-23 13:06:38

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-23 11:28:52

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-23 00:55:15

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-23 00:08:36

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-22 14:58:36

See <http://localhost/job/BigData/changes>

From: husdon <no...@no...> - 2010-06-22 14:12:44

See <http://localhost/job/BigData/changes>