From: Ben C. <ben...@gm...> - 2011-01-17 23:40:38
Hi All,

I've created a How-To from my experiences with creating a read-only
instance of Netdisco using Slony-I. Apologies for the length, but I
don't think attachments are going to work here.

regards,
B

==============================================
How To Build A Read-Only Netdisco with Slony-I
==============================================
ben...@gm...

------------
Introduction
------------

This document describes how to create a public-access 'slave' instance
of Netdisco that receives database updates from a private or 'master'
Netdisco server. This is useful in cases where security requirements
dictate that Netdisco users cannot access the same subnet that you
SNMP-manage your network from. It also means community strings do not
need to be stored on the slave server. As a bonus, the read-only
server also serves as an online database backup.

We'll use the Slony-I replication system for Postgres to do a partial
replication of the database that still allows user and session
information to be stored on the public server, while receiving all
information about the network from the private server. Replication
schemes that do not allow limited replication sets (such as Postgres
9.0 streaming replication) will not work here, as they replicate ALL
information in a database cluster.

-----------
Assumptions
-----------

- A working Netdisco 1.0 installation on Red Hat Enterprise Linux 5
- A second Netdisco 1.0 installation with all cron jobs disabled
- Slony 2.0.6
- Postgres 8.4+ (8.1?)
- Some knowledge of Postgres administration, and which user certain
  commands need to be run as

-----
Notes
-----

- This guide is based on a Red Hat installation (RHEL5); concepts will
  be similar for other distributions
- This guide does NOT use the Red Hat-provided packages; instead we
  use the standard RPMs available from postgresql.org
- Postgres does not need to be the same version on both servers, but
  matching versions will probably simplify things
- The second installation does not need to be a fresh Netdisco
  install - unsynced data in the tables is fine
- We don't use the Slony 'altperl' scripts for building a cluster;
  instead we create our own shell scripts. This seems to be just as
  fast.
- The Postgres graphical tool 'pgadmin' is useful for monitoring Slony
  and generally exploring the changes it makes to your database to
  enable replication

--------------
Prep for Slony
--------------

Slony requires the Postgres source files for the current version to
build against. Do this on both servers.

1) Download the source RPM

   E.g.
   wget ftp://ftp.nz.postgresql.org/postgresql/binary/v9.0.2/linux/srpms/redhat/rhel-5-x86_64/postgresql90-9.0.2-1PGDG.rhel5.src.rpm
   rpm -i postgresql90-9.0.2-1PGDG.rhel5.src.rpm

2) Make sure the Postgres devel package is installed

   E.g.
   wget ftp://ftp.nz.postgresql.org/postgresql/binary/v9.0.2/linux/rpms/redhat/rhel-5-x86_64/postgresql90-devel-9.0.2-1PGDG.rhel5.x86_64.rpm
   rpm -i postgresql90-devel-9.0.2-1PGDG.rhel5.x86_64.rpm
   which pg_config

3) Download the Slony source files

   E.g.
   wget http://www.slony.info/downloads/2.0/source/slony1-2.0.6.tar.bz2
   tar xjf slony1-2.0.6.tar.bz2

4) Note the directory containing the Postgres binaries

   E.g.
   which psql
   /usr/pgsql-9.0/bin

-----------
Build Slony
-----------

Do this on both servers.

1) configure

   E.g.
   cd slony1-2.0.6
   ./configure --with-pgconfigdir=/usr/pgsql-9.0/bin

2) Check that the directories in the makefile make sense for your
   install

   E.g.
   less Makefile.global

3) Build and install

   make
   make install

------------------------------
Configure Slony for Production
------------------------------

I recommend creating a Postgres superuser specifically for replication
tasks. In addition, creating a local account of the same name for the
slon daemon to run under is a good idea. Do these steps on both
servers.

1) Create the slony local account

   useradd slony

2) Create the slony pg account

   psql postgres
   CREATE USER slony WITH SUPERUSER PASSWORD 'whatever';

We need to allow the slony user to access the netdisco and postgres
databases from and to either machine.

3) Edit pg_hba.conf

   # example permissions that require a password to connect to a
   # foreign host but not locally
   local   all  slony               ident
   hostssl all  slony  10.0.0.1/24  md5    # on slave
   hostssl all  slony  10.0.0.2/24  md5    # on master

   Reload postgres after making these changes.

Now all Slony nodes (~hosts) must be able to connect to all other
nodes.

4) Test db connectivity

   E.g. as the 'slony' local account

   # from master
   psql netdisco
   psql postgres
   psql -h <host2> netdisco
   psql -h <host2> postgres

   # from slave
   psql netdisco
   psql postgres
   psql -h <host1> netdisco
   psql -h <host1> postgres

You can configure logging to taste. In this guide we will use stdout
logging redirected to a central location.

5) Create the slony log dir

   E.g.
   mkdir /var/log/slony1
   chown slony.slony /var/log/slony1

We will need to run a daemon, 'slon', on both machines. For robustness
we will configure this as a service under Red Hat.

6) Prep the slon files

   cp /root/slony1-2.0.6/share/slon.conf-sample /etc/slon.conf; chmod 644 /etc/slon.conf
   cp /root/slony1-2.0.6/redhat/slony1.init /etc/init.d/slony1; chmod 755 /etc/init.d/slony1

7) Edit the slon configuration

   E.g.
   vi /etc/slon.conf

   cluster_name='netdisco_cluster'
   conn_info='dbname=netdisco user=slony'
   log_level=4    # full debugging initially

The supplied init script is a good example.
I made a few small changes, as follows:

8) Patch /etc/init.d/slony1

   SLONDIR=/usr/pgsql-9.0/bin
   SLONDAEMON=$SLONDIR/slon
   SLONLOG=/var/log/slony1/slon.log
   SLONUSER=slony

   $SU -l $SLONUSER -c "$SLONDAEMON -f $SLONCONF &" >> "$SLONLOG" 2>&1 < /dev/null

9) Configure the service

   chkconfig --add slony1
   chkconfig slony1 on
   chkconfig --list slony1

--------------------------
Create replication cluster
--------------------------

Next we define the Slony cluster. We use a shell script that calls the
slonik binary; slonik is the administration tool for Slony. This
script can be run from either machine where you installed Slony (or
indeed another machine entirely, provided it has access to both
databases). The complete script is provided below:

#!/bin/sh
# Defines a slony cluster and replication set for netdisco
#
# Netdisco tables not replicated:
#   admin
#   device_port_log
#   device_port_ssid
#   device_port_wireless
#   log
#   process
#   sessions
#   user_log

CLUSTER=netdisco_cluster
DBNAME=netdisco
HOST1=<master>
HOST2=<slave>
SLONY_USER=slony
SLONY_PASS=<whatever>

slonik <<_EOF_
cluster name = $CLUSTER;
node 1 admin conninfo = 'dbname=$DBNAME host=$HOST1 user=$SLONY_USER password=$SLONY_PASS';
node 2 admin conninfo = 'dbname=$DBNAME host=$HOST2 user=$SLONY_USER password=$SLONY_PASS';
init cluster ( id = 1, comment = 'Node 1' );
create set ( id = 1, origin = 1, comment = 'Netdisco Read-Only Replication Set');
set add table ( set id = 1, origin = 1, id = 1, fully qualified name = 'public.device', comment = 'Table device');
set add table ( set id = 1, origin = 1, id = 2, fully qualified name = 'public.device_ip', comment = 'Table device_ip');
set add table ( set id = 1, origin = 1, id = 3, fully qualified name = 'public.device_module', comment = 'Table device_module');
set add table ( set id = 1, origin = 1, id = 4, fully qualified name = 'public.device_port', comment = 'Table device_port');
set add table ( set id = 1, origin = 1, id = 5, fully qualified name = 'public.device_port_power', comment =
'Table device_port_power');
set add table ( set id = 1, origin = 1, id = 6, fully qualified name = 'public.device_port_vlan', comment = 'Table device_port_vlan');
set add table ( set id = 1, origin = 1, id = 7, fully qualified name = 'public.device_power', comment = 'Table device_power');
set add table ( set id = 1, origin = 1, id = 8, fully qualified name = 'public.device_vlan', comment = 'Table device_vlan');
set add table ( set id = 1, origin = 1, id = 9, fully qualified name = 'public.node', comment = 'Table node');
set add table ( set id = 1, origin = 1, id = 10, fully qualified name = 'public.node_ip', comment = 'Table node_ip');
set add table ( set id = 1, origin = 1, id = 11, fully qualified name = 'public.node_monitor', comment = 'Table node_monitor');
set add table ( set id = 1, origin = 1, id = 12, fully qualified name = 'public.node_nbt', comment = 'Table node_nbt');
set add table ( set id = 1, origin = 1, id = 13, fully qualified name = 'public.oui', comment = 'Table oui');
set add table ( set id = 1, origin = 1, id = 14, fully qualified name = 'public.subnets', comment = 'Table subnets');
set add table ( set id = 1, origin = 1, id = 15, fully qualified name = 'public.users', comment = 'Table users');
store node ( id = 2, comment = 'Node 2', event node = 1);
store path ( server = 1, client = 2, conninfo = 'dbname=$DBNAME host=$HOST1 user=$SLONY_USER password=$SLONY_PASS');
store path ( server = 2, client = 1, conninfo = 'dbname=$DBNAME host=$HOST2 user=$SLONY_USER password=$SLONY_PASS');
store listen ( origin = 1, provider = 1, receiver = 2);
store listen ( origin = 2, provider = 2, receiver = 1);
_EOF_

WARNING: If you are using the wireless features of Netdisco you will
need to modify this script. The commands above work for tables with
primary keys and without sequences. Slony seems to require a table
without a primary key to have a column (or columns) that can still act
like a primary key. Sequences also need to be added to the replication
set with a separate command.
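As an illustration of those two cases, additions to the slonik block
might look like the following sketch. The 'key' option names a unique,
not-null index to act as a candidate primary key; the index and
sequence names below are hypothetical, so check your actual schema
(e.g. with \d in psql) for the real names:

```
# Hypothetical additions to the setup script's slonik block
set add table ( set id = 1, origin = 1, id = 16,
    fully qualified name = 'public.device_port_ssid',
    key = 'device_port_ssid_idx',    # hypothetical unique index name
    comment = 'Table device_port_ssid');
set add sequence ( set id = 1, origin = 1, id = 1,
    fully qualified name = 'public.some_seq',    # hypothetical sequence
    comment = 'Sequence some_seq');
```

Sequences get their own numbering (the sequence 'id' values are
independent of the table 'id' values).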
1) Copy the script above into a file called
   '/home/slony/slony_netdisco_setup.sh'

   Make sure the file is chmod +x

2) Edit the script for your installation, specifically the shell
   variables for hosts and passwords

3) Run the script

   ./slony_netdisco_setup.sh

If all goes well, your database has now been updated with the
framework for replication. But replication has not actually commenced.

----------
Start slon
----------

First we need to start the daemons that will actually perform
replication. Do these steps on both servers.

1) Test that slon will start

   # Start slon manually on each server with:
   slon netdisco_cluster "dbname=netdisco user=slony"

   If slon starts ok, kill the process with Ctrl-C and proceed

2) Start the service

   service slony1 start

If slon has started, it should have read your slon.conf and be logging
to the correct location.

3) Check the logs

   E.g.
   tail -f /var/log/slony1/slon.log

   You can leave this command running on each server in a separate
   terminal while we proceed to the next section.

-----------------
Start Replication
-----------------

To start replication we need to 'subscribe' the slave to the master,
and that means another shell script. You will only need to do this
once; if either server restarts or goes offline, the subscription
remains active.

Here is the subscribe script, 'slony_netdisco_subscribe.sh':

#!/bin/sh
# Subscribes the slave to the master for replication events,
# effectively starting replication

CLUSTER=netdisco_cluster
DBNAME=netdisco
HOST1=<master>
HOST2=<slave>
SLONY_USER=slony
SLONY_PASS=<whatever>

slonik <<_EOF_
cluster name = $CLUSTER;
node 1 admin conninfo = 'dbname=$DBNAME host=$HOST1 user=$SLONY_USER password=$SLONY_PASS';
node 2 admin conninfo = 'dbname=$DBNAME host=$HOST2 user=$SLONY_USER password=$SLONY_PASS';
subscribe set ( id = 1, provider = 1, receiver = 2, forward = no);
_EOF_

Now, before we do anything with this, it may be wise to run a full
backup of each database.

1) Back up each database cluster

   E.g.
   su postgres
   pg_dumpall > somefile.sql

2) Copy the script above into a file called
   '/home/slony/slony_netdisco_subscribe.sh'

   Make sure the file is chmod +x

3) Edit the script for your installation, specifically the shell
   variables for hosts and passwords

Now, the moment of truth. You should see a flurry of logs as
replication starts. Note that any existing data in your tables on the
slave will first be TRUNCATED before a COPY is initiated. Slony works
by installing triggers, so thereafter every INSERT or UPDATE in the
tables in your replication set will cause the same action to be
performed on the slave.

4) Run the script

   ./slony_netdisco_subscribe.sh

NOTE: This script will likely return very quickly. However, that does
not mean that the subscription is complete. Depending on the size of
your tables this could take some time. Monitor the logs in your other
terminals to determine when the databases are finally in sync.

----------------------
Monitoring Replication
----------------------

Apart from looking at the data in your tables or doing manual inserts,
there are a couple of ways to check on the status of Slony.

1) Postgres logs

   Keep an eye on these. Slony will log attempts to write to tables
   that are now read-only, e.g.

   2011-01-13 15:44:36 EST ERROR: Slony-I: Table device_port is
   replicated and cannot be modified on a subscriber node - role=0

2) Run the supplied perl script

   E.g.
   cp /root/slony1-2.0.6/tools/test_slony_state-dbi.pl /usr/pgsql-9.0/bin
   chmod +x /usr/pgsql-9.0/bin/test_slony_state-dbi.pl
   # as the slony user
   test_slony_state-dbi.pl -d netdisco -c netdisco_cluster

3) Pgadmin

   You will see the replication set under your DB in pgadmin. Click on
   the various sub-elements and the status tab to get status
   information.

-------------
Netdisco Jobs
-------------

The only job I chose to run on the 'slave' server is graph creation
(netdisco -g). The others are all likely to cause errors, or are only
required on the master.
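For reference, the slave's cron might be cut down to just the graph
job, something like the sketch below. The path and schedule here are
hypothetical; keep whatever invocation and timing your master's
crontab already uses:

```
# Hypothetical slave crontab entry: graph generation only.
# All other netdisco cron jobs stay disabled on the slave.
0 * * * *  netdisco  /usr/local/netdisco/netdisco -g
```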
-------------
Starting Over
-------------

If something goes wrong, a very quick way to start over is to
right-click the replication set in pgadmin and select delete.
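If you prefer the command line to pgadmin, slonik can also do the
teardown. The sketch below reuses the connection variables from the
earlier scripts; 'uninstall node' removes the Slony schema and
triggers from a node, after which the setup and subscribe scripts can
be run again from scratch:

```
#!/bin/sh
# Sketch: remove Slony from both nodes so replication can be rebuilt
CLUSTER=netdisco_cluster
DBNAME=netdisco
HOST1=<master>
HOST2=<slave>
SLONY_USER=slony
SLONY_PASS=<whatever>

slonik <<_EOF_
cluster name = $CLUSTER;
node 1 admin conninfo = 'dbname=$DBNAME host=$HOST1 user=$SLONY_USER password=$SLONY_PASS';
node 2 admin conninfo = 'dbname=$DBNAME host=$HOST2 user=$SLONY_USER password=$SLONY_PASS';
uninstall node ( id = 2 );
uninstall node ( id = 1 );
_EOF_
```

As with the other scripts, run this as a user that can reach both
databases, and take a backup first.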