Thank you for that pointer, Ted. I did some work on this today and got it working between two dev instances on the same server. I decided to use optimization as the trigger, since the import script doesn't optimize. So I can import everything and then optimize it all, at which point my live site will pick up the changes. This means I can leave polling on for the slave. I will probably set it to be pretty infrequent, maybe every hour.
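For anyone following along, here is a minimal sketch of what this setup might look like in solrconfig.xml, assuming the stock Solr ReplicationHandler. The host name, port and core path in masterUrl are placeholders rather than values from this thread, and the one-hour pollInterval simply reflects the "maybe every hour" idea above.

```xml
<!-- On the master (staging) index: solrconfig.xml.
     Make a new index version available to slaves only after an optimize,
     so the per-file commits from the import script should not trigger replication. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">optimize</str>
  </lst>
</requestHandler>

<!-- On the slave (production) index: solrconfig.xml.
     masterUrl is a placeholder (hypothetical host, port and core path). -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://staging.example.org:8080/solr/biblio/replication</str>
    <!-- Poll the master once an hour; the format is HH:mm:ss. -->
    <str name="pollInterval">01:00:00</str>
  </lst>
</requestHandler>
```

The two fragments belong in two different files, one per Solr instance; they are shown together only for comparison.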
Your initial thoughts sound reasonable to me, and Ted's suggestion of manually triggering replication sounds good as well.
If you end up going with the XInclude configuration option, let me know how it works out -- maybe we should consider putting a standard XInclude statement into the default VuFind Solr configuration, so that local customizations can be more easily separated from the core defaults. That would simplify the upgrade issues you mention. If XInclude works in the schema file, that would be a valuable improvement too!
From: anna headley [firstname.lastname@example.org]
Sent: Monday, August 22, 2011 2:40 PM
Subject: [VuFind-Tech] solr index replication
I'm ready to set up our VuFind instance to use two indices, and I'm looking for tips / info. The goal is to be able to reindex with zero downtime.
My plan is to run two servers with complete and working VuFind installations: one for production and one for staging / emergency fallback. I was planning to link the two Solr instances in a master/slave relationship.
I believe the staging index would be the master and the production index would be the slave. That way the staging (master) index could be deleted and reloaded, while the production (slave) index would remain online. Am I understanding this right?
The Solr documentation <http://wiki.apache.org/solr/SolrReplication#Slave> states that if a slave gets out of sync with the master, it doesn't do anything until the master sends a commit. It looks like import-marc.sh sends a commit when it finishes processing a file. But our full MARC record export spans several files. If I don't want the slave to replicate until the master is totally finished loading all the files, do I have a problem here? I guess I could just mush them all into one big file first.
Finally, for anyone who has set this up, do you just make these configurations in vufind/solr/biblio/conf/solrconfig.xml? Or do you include an external config file <http://wiki.apache.org/solr/SolrConfigXml#XInclude>? Either way it seems this could be an annoyance during VuFind upgrades.
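A rough sketch of the external-file approach, assuming the Solr version in use resolves XInclude in solrconfig.xml. The file name replication-local.xml is made up for illustration; it would hold a single root element, such as the replication requestHandler. How a relative href is resolved may depend on the Solr version, so treat the path as an assumption to verify.

```xml
<!-- In solrconfig.xml, inside the existing <config> element. -->
<xi:include href="replication-local.xml"
            xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:fallback>
    <!-- Used when replication-local.xml is missing: no replication handler is defined. -->
  </xi:fallback>
</xi:include>
```

With something like this, the local replication settings would live outside the file that ships with VuFind, leaving only the include line to re-apply after an upgrade.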
(The staging server will probably also house other dev installs, possibly with their own test indices or possibly linked to the staging index. Either way the staging instance would not be used for development, more as a "last stop" for code on the way to the live server.)
Appreciate any suggestions or pointers to resources / documentation.