> For one, how do I start off a follower?
Typically, one starts a follower from an existing repository snapshot.
A repository snapshot consists of the FOXML and datastream files,
database dumps, and a copy of the RI from a quiescent Fedora instance.
Once the files have been placed (rsynced, etc.) onto the follower machine
and databases have been populated from the dumps, the follower can be
started. It will have exactly the same data (and therefore repository
hash) as the machine from which the snapshot was taken.
> If I bring up a master and a
> slave, they'll each have 4 objects. But the repository hash is a
> combination of the number of objects and the filemod date of the
> newest object. Since it's unlikely I installed both installations at
> just the right moment, won't these two values always differ among
> separate instances? Are remote followers generally setup to ignore
> repository hash errors? Or do I fudge the filemod date?
To be consistent, you would start only one Fedora instance, let it
ingest the bootstrap objects, then stop it and take a snapshot.
Populate the follower machine from this data and start it up. They will
now have identical states.
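To illustrate why a follower populated from a snapshot agrees with its source, here is a minimal Python sketch. It takes the description in the question at face value (a count of objects plus the modification date of the newest one); the real hash calculation is internal to Fedora and may well differ. The function names, directory names, and file names are all made up for the example.

```python
import os
import shutil
import tempfile


def repository_fingerprint(root):
    """Return (file_count, newest_mtime) for every file under root.

    Illustrative stand-in only; not Fedora's actual hash code.
    """
    count = 0
    newest = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            count += 1
            # Whole seconds are enough for the comparison here.
            mtime = int(os.stat(os.path.join(dirpath, name)).st_mtime)
            newest = max(newest, mtime)
    return count, newest


def demo():
    """Build a fake 4-object store, copy it, and fingerprint both."""
    with tempfile.TemporaryDirectory() as work:
        leader = os.path.join(work, "leader")      # hypothetical layout
        follower = os.path.join(work, "follower")  # hypothetical layout
        os.makedirs(leader)
        for i in range(4):
            path = os.path.join(leader, "object%d.xml" % i)
            with open(path, "w") as handle:
                handle.write("<placeholder/>")
            # Give each file a fixed, known modification time.
            stamp = 1_000_000_000 + i
            os.utime(path, (stamp, stamp))
        # copytree copies with shutil.copy2 by default, which keeps
        # modification times, much as an archive-mode rsync would.
        shutil.copytree(leader, follower)
        return repository_fingerprint(leader), repository_fingerprint(follower)


if __name__ == "__main__":
    print(demo())
```

Both fingerprints match because the copy preserves modification times. Two instances that each ingested the bootstrap objects on their own would have different timestamps, which is the mismatch described in the question.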
> Second, I'd like to know if there's a regular process for spinning up
> a new follower for an installation that has been underway for a while.
> Do I down the master, copy its objects and datastreams over to the new
> follower, rebuild indexes, bring the follower up, and then the master?
> Or am I likely to run into more repository hash issues?
In a production situation, you might want to minimize the downtime of
the leader. So, to bring a new follower into the fray, one could stop
an existing follower, take a snapshot, and restore the new follower from
that snapshot.
At the NSDL, we maintain three machines in production: a leader, and two
followers. Every week, a cron job takes snapshot dumps from each
follower. Journal files are archived for at least a week. That way, if
we ever need to spawn a new follower (to replace a failed
machine), we can restore from an existing snapshot and play back up to
a week of journal files to catch up to the current state. This has
allowed us to do upgrades and maintenance with minimal downtime.
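The catch-up step amounts to picking the archived journal files that are newer than the snapshot and replaying them in order. A rough Python sketch of that selection follows. The (name, timestamp) pairs and the file names are hypothetical, and it assumes each journal file falls entirely before or entirely after the snapshot, which may not hold for every setup.

```python
from datetime import datetime


def journals_to_replay(journals, snapshot_time):
    """Return the names of journals created at or after snapshot_time,
    oldest first.

    journals is an iterable of (name, created_at) pairs; this shape is
    an assumption made for the example, not a Fedora data structure.
    """
    pending = [(created, name) for name, created in journals
               if created >= snapshot_time]
    return [name for _created, name in sorted(pending)]


if __name__ == "__main__":
    snapshot = datetime(2009, 1, 8)
    archived = [
        ("journal-c", datetime(2009, 1, 10)),
        ("journal-a", datetime(2009, 1, 5)),
        ("journal-b", datetime(2009, 1, 9)),
    ]
    # journal-a predates the snapshot, so only b and c need replaying.
    print(journals_to_replay(archived, snapshot))
```

With weekly snapshots and at least a week of archived journals, there should always be enough journal history to bridge the gap between the latest snapshot and the current state.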
The biggest "gotcha" is that we use RMI transports to ship journal files
to the followers. Since the connection is established from the leader
to the follower, we need to briefly re-start the leader Fedora instance
so that it can establish a connection to any newly introduced receivers.