> How are sites monitoring Solr for responsiveness?


Our monitoring is a little convoluted, but it works well (and it actually saved me last Friday when one of our primary Solr indices tanked).


To start, a very practical way to see if a Solr server is up is to check whether it responds to queries (which is really the end-point service the Solr index provides to VuFind). If you just have a single Solr server, you could set up a trigger in your monitoring system (Cacti/Thold, Zabbix, manual scripts, etc.) on the index's response to a query (e.g., http://your.solr.server:port/solr/biblio/select?q=*%3A*). If it returns an error code or times out, then something is wrong.
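
For the "manual scripts" route, a check like this could be a starting point. It is only a minimal sketch, not what we actually run: the function name solr_is_up, the 5-second timeout, and the opener parameter (there so the check can be exercised without a live server) are all assumptions for illustration. It treats anything other than an HTTP 200 within the timeout as "down".

```python
import http.client
import urllib.error
import urllib.request


def solr_is_up(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the query URL answers with HTTP 200 within the timeout.

    `url` is the full Solr select URL you want to probe.
    `opener` is a hypothetical hook for testing; it defaults to urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers HTTP error codes (HTTPError is a URLError), refused
        # connections, timeouts, broken responses, and malformed URLs.
        return False


def main(argv):
    """Exit-code style wrapper: 0 if Solr answered, 1 otherwise."""
    return 0 if solr_is_up(argv[1]) else 1
```

A cron job or monitoring agent could call sys.exit(main(sys.argv)) with the select URL as the only argument and alert on a non-zero exit code. Adding &rows=0 to the query should keep the response small, since the check only cares about whether Solr answers.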


We have multiple Solr servers running behind HAProxy (for load-balancing), and the above query is how HAProxy tells whether an individual server is up. We then have a script that returns HAProxy information to our monitoring server via SNMP. So if the Solr query fails, HAProxy reports fewer available Solr servers, and our monitoring system sends an alert if that condition persists for 5 minutes.
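
The "alert only if it lasts 5 minutes" part is handled by our monitoring system, but the logic is simple enough to sketch if you are rolling your own. This is an illustration under assumptions, not our actual setup: the class name PersistentAlert, the expected_servers count, and feeding in samples by hand are all hypothetical, and in practice the available-server count would come from wherever you collect HAProxy status.

```python
class PersistentAlert:
    """Fire only when a server shortfall has lasted hold_seconds or longer."""

    def __init__(self, expected_servers, hold_seconds=300):
        self.expected_servers = expected_servers
        self.hold_seconds = hold_seconds  # 300 s = the 5-minute window
        self._degraded_since = None       # timestamp of first bad sample

    def update(self, available_servers, now):
        """Record one sample taken at time `now` (seconds).

        Returns True if fewer servers than expected have been available
        continuously for at least hold_seconds.
        """
        if available_servers >= self.expected_servers:
            # Recovered: forget any shortfall in progress.
            self._degraded_since = None
            return False
        if self._degraded_since is None:
            self._degraded_since = now
        return now - self._degraded_since >= self.hold_seconds
```

Each poll calls update() with the current count and a timestamp (time.monotonic() would do). A brief blip that recovers before the window elapses resets the timer and never alerts, which is the point of waiting rather than paging on the first failed check.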


Benjamin Mosior