Thank you for that pointer, Ted.  I did some work on this today and got it working between two dev instances on the same server.  I decided to use optimization as the trigger, since the import script doesn't optimize: I can import everything and then optimize it all, at which point my live site will pick up the changes.  This means I can leave polling on for the slave, though I will probably set it to poll pretty infrequently, maybe every hour.
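For anyone following along, the relevant bits of solrconfig.xml look roughly like this (the master URL and the hourly interval are just examples from my setup):

```xml
<!-- Master (staging): only publish the index after an explicit optimize,
     so the per-file commits from the import script are ignored -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">optimize</str>
  </lst>
</requestHandler>

<!-- Slave (production): poll infrequently; nothing changes until the
     master optimizes, so hourly is plenty -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://staging.example.edu:8080/solr/biblio/replication</str>
    <str name="pollInterval">01:00:00</str>
  </lst>
</requestHandler>
```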

I did make limited use of the XInclude stuff, but I don't think it is good enough in Solr 1.4 to justify putting anything in trunk.  It works differently between Solr 1.4 and Solr 3.x, and the 1.4 way is a pain.  In 1.4 the include path is either absolute or relative to the cwd of the Solr server (which you can see in the Solr admin interface at the top left).  The way I have things set up with daemons, the cwd of the Solr server is not a base for Solr home, so I have to use an absolute path, which just isn't flexible across multiple instances of VuFind.  In Solr 3.x, however, relative paths are apparently resolved relative to the config file itself.  So I think once VuFind is on Solr 3.x it would be great to go ahead and put an include to a local config file.  This should be tested, but I think you may also have to ship an empty / dummy local config file in the distro to avoid an XML parsing error.
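To illustrate, the kind of include I have in mind looks like this (the file name localconfig.xml is just an example):

```xml
<!-- Top of solrconfig.xml: declare the XInclude namespace -->
<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- ...core defaults... -->

  <!-- In Solr 3.x this href resolves relative to this file; in 1.4 it
       resolves relative to the server's cwd, hence the pain above.
       Shipping an empty localconfig.xml in the distro avoids a parse
       error when there are no local customizations. -->
  <xi:include href="localconfig.xml"/>
</config>
```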

Best,
Anna



On Tue, Aug 23, 2011 at 08:26, Demian Katz <demian.katz@villanova.edu> wrote:
Your initial thoughts sound reasonable to me, and Ted's suggestion of manually triggering replication sounds good as well.

If you end up going with the XInclude configuration option, let me know how it works out -- maybe we should consider putting a standard XInclude statement into the default VuFind Solr configuration so that local customizations can be more easily separated from the core defaults, simplifying the upgrade issues you mention.  If XInclude works in the schema file, that would be a valuable improvement too!

- Demian
________________________________________
From: anna headley [anna3lc@gmail.com]
Sent: Monday, August 22, 2011 2:40 PM
To: vufind-tech@lists.sourceforge.net
Subject: [VuFind-Tech] solr index replication

Hi all,

I'm ready to set up our VuFind instance to use two indices, and I'm looking for tips / info.  The goal is to be able to reindex with zero downtime.

My plan is to run two servers with complete, working VuFind installations: one for production and one for staging / emergency fallback.  I was planning to link the two Solr instances in a master/slave relationship.

I believe the staging index would be the master and the production index would be the slave.  That way the staging (master) index could be deleted and reloaded, while the production (slave) index would remain online.  Am I understanding this right?

The Solr documentation <http://wiki.apache.org/solr/SolrReplication#Slave> states that if a slave gets out of sync with the master, it doesn't do anything until the master sends a commit.  It looks like import-marc.sh sends a commit when it finishes processing a file, but our full MARC record export spans several files.  If I don't want the slave to replicate until the master has finished loading all the files, do I have a problem here?  I guess I could just mush them all into one big file first.
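To make the "one big file" idea concrete, something like the following (file names are made up; the stand-in files here take the place of our real export pieces):

```shell
# Stand-ins for the pieces of the full MARC export
printf 'records-part-1' > export-1.mrc
printf 'records-part-2' > export-2.mrc

# Concatenating first means import-marc.sh processes a single file and
# therefore issues a single commit, giving the slave one replication point
cat export-1.mrc export-2.mrc > full-export.mrc
# ./import-marc.sh full-export.mrc   # one import, one commit at the end
```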

Finally, for anyone who has set this up, do you just make these configurations in vufind/solr/biblio/conf/solrconfig.xml, or do you include an external config file? <http://wiki.apache.org/solr/SolrConfigXml#XInclude>  Either way it seems this could be an annoyance during VuFind upgrades.

(The staging server will probably also house other dev installs, possibly with their own test indices or possibly linked to the staging index.  Either way, the staging instance would not be used for development; it would be more of a "last stop" for code on the way to the live server.)

Appreciate any suggestions or pointers to resources / documentation.

Thank you!
Anna
at Trico