From: Osullivan L. <L.O...@sw...> - 2012-09-27 14:19:38
Hi Demian,

The Server itself has 6GB of total memory and I've asked for an extra 2GB to be added. The Jetty logs are humongous - can you suggest anything which might help to analyse them?

Thanks,
Luke

________________________________
From: Demian Katz [dem...@vi...]
Sent: 27 September 2012 13:48
To: Osullivan L.; sol...@go...; vuf...@li...
Subject: RE: Garbage Collection

Have you tried analyzing the Jetty logs to see if there is a usage pattern in there that tells you anything? (This may not be very helpful since when you POST to Solr, the logs don’t contain much data… but you might at least be able to establish load patterns).

And yes, 4GB should be plenty of memory. Have you considered the alternate possibility that it’s too much memory and that some other element of the system is getting starved?

- Demian

From: Osullivan L. [mailto:L.O...@sw...]
Sent: Thursday, September 27, 2012 5:15 AM
To: Demian Katz; sol...@go...; vuf...@li...
Subject: RE: Garbage Collection

Hi Demian,

Thanks for your e-mail. The custom call number code isn't running on our live server. We do have a custom analyzer but that uses standard filters.

I'm going to hunt through our custom code and see if there's anything which is making lots of Solr calls.

We have 4GB allocated to VuFind - I would have thought that would have been plenty regardless of what we were running.

Thanks,
Luke

________________________________
From: Demian Katz [dem...@vi...]
Sent: 26 September 2012 16:05
To: sol...@go...; vuf...@li...
Subject: Re: [VuFind-Tech] Garbage Collection

I tried to do the jmap dump, but I got:

Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding

Adding the -F option just caused a usage notice to print in place of a dump. Not sure what's going on there -- I've never used jmap before.

Regarding your custom functions, these all look like they are related to SolrMarc -- those shouldn't have any effect on your Solr performance (they are executed during SolrMarc indexing but have no subsequent bearing on anything). If you're worried about Solr performance related to custom code, you should instead be looking at any custom Solr logic you have installed (i.e. your custom call number filter... though that seems unlikely to be the culprit).

- Demian

________________________________
From: sol...@go... [sol...@go...] on behalf of Osullivan L. [L.O...@sw...]
Sent: Wednesday, September 26, 2012 9:35 AM
To: vuf...@li...; sol...@go...
Subject: [solrmarc-tech] RE: Garbage Collection

Further to my previous e-mail, here are some of the custom functions we have running in case anyone can spot a problem. I've also attached our heavily modified format script.

/**
 * Extract the call number label from a record
 * @param record
 * @return Call number label
 */
public String getFullCallNumberStrip(Record record, String fieldSpec) {
    String val = indexer.getFirstFieldVal(record, fieldSpec);
    if (val != null) {
        return val.toUpperCase().replaceAll(" ", "").replaceAll("\\.", "").replaceAll(">", "");
    } else {
        return val;
    }
}

/**
 * Extract the first valid callnumber letter / value
 *
 * Can return null
 *
 * @param record
 * @return Call number label
 */
public String getCallNumberFirst(Record record) {
    String fieldSpec = "852";

    List fields = record.getVariableFields(fieldSpec);
    System.out.println(fields.size());
    Iterator fieldsIter = fields.iterator();
    if (fields != null) {
        DataField field;
        while(fieldsIter.hasNext()) {
            field = (DataField) fieldsIter.next();
            Subfield k = field.getSubfield('k');
            if (k != null && k != false) {
                if (k.getData().equals("W/")) {
                    return "W/";
                }
            }
        }
    }

    List fields = record.getVariableFields(fieldSpec);
    System.out.println(fields.size());
    Iterator fieldsIter = fields.iterator();
    if (fields != null) {
        DataField field;
        while(fieldsIter.hasNext()) {
            field = (DataField) fieldsIter.next();
            Subfield h = field.getSubfield('h');
            if (h != null && h != false) {
                String hData = h.getData();
                String [] callNumberSubject = hData.toUpperCase().split("[^A-Z]+");
                if (callNumberSubject.length > 0 && callNumberSubject[0].length() < 3) {
                    return callNumberSubject[0].substring(0, 1);
                }
            }
        }
    }
    return(null);
}

import org.marc4j.marc.Record;
import org.solrmarc.tools.CallNumUtils;

org.solrmarc.index.SolrIndexer indexer = null;

/**
 * Normalize a single LCCN
 * @param lccn
 * @return Normalized LCCN
 */
public String getFullNormalizedLCCN(Record record, String lccn) {
    if (lccn != null) {
        lccn = indexer.getFirstFieldVal(record, lccn);
        String recordID = "1"; // Need to assign real value from record?
        if (lccn != null) {
            String normal = CallNumUtils.getLCShelfkey(lccn, recordID);
            if (normal != null) {
                // Send back normalized LCCN:
                return normal;
            }
        }
    }
    // If we got this far, we couldn't find a valid value:
    return null;
}

public Set getLibrary(Record record) {
    Set result = new LinkedHashSet();
    Set fields = indexer.getFieldList(record, "852b");
    Iterator fieldsIter = fields.iterator();
    if (fields != null) {
        String location;
        while(fieldsIter.hasNext()) {
            location = fieldsIter.next();
            if (location != null) {
                result.add(location.replaceAll(" ", ""));
            } else {
                result.add("Unknown");
            }
        }
    }
    return result;
}

import bsh.BshMethod;
import bsh.EvalError;
import bsh.Interpreter;
import bsh.NameSpace;
import bsh.Primitive;
import bsh.UtilEvalError;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.log4j.Logger;
import org.marc4j.ErrorHandler;
import org.marc4j.MarcStreamWriter;
import org.marc4j.MarcWriter;
import org.marc4j.MarcXmlWriter;
import org.marc4j.marc.ControlField;
import org.marc4j.marc.DataField;
import org.marc4j.marc.Record;
import org.marc4j.marc.Subfield;
import org.marc4j.marc.VariableField;
import org.solrmarc.marc.MarcImporter;
import org.solrmarc.tools.SolrMarcIndexerException;
import org.solrmarc.tools.Utils;

org.solrmarc.index.SolrIndexer indexer = null;

public static Set getTopics(Record record, String fieldSpec, String separator) {
    Pattern subfieldPattern;
    Set result = new LinkedHashSet();
    String[] fldTags = fieldSpec.split(":");
    for (int i = 0; i < fldTags.length; ++i) {
        if (fldTags[i].length() < 3) {
            System.err.println("Invalid tag specified: " + fldTags[i]);
        } else {
            String fldTag = fldTags[i].substring(0, 3);
            String subfldTags = fldTags[i].substring(3);
            List marcFieldList = record.getVariableFields(fldTag);
            if (marcFieldList.isEmpty()) continue;
            subfieldPattern = Pattern.compile((subfldTags.length() == 0) ? "." : subfldTags);
            for (VariableField vf : marcFieldList) {
                DataField marcField = (DataField)vf;
                char ind1 = marcField.getIndicator1();
                char ind2 = marcField.getIndicator2();
                if (ind2 != '5' && ind2 != '6' && ind2 != '7') {
                    StringBuffer buffer = new StringBuffer("");
                    List subfields = marcField.getSubfields();
                    for (Subfield subfield : subfields) {
                        Matcher matcher = subfieldPattern.matcher("" + subfield.getCode());
                        if (matcher.matches()) {
                            if (buffer.length() > 0)
                                buffer.append((separator != null) ? separator : " ");
                            buffer.append(subfield.getData().trim());
                        }
                    }
                    if (buffer.length() > 0) {
                        result.add(Utils.cleanData(buffer.toString()));
                    }
                }
            }
        }
    }
    return result;
}

________________________________
From: Osullivan L. [L.O...@sw...]
Sent: 26 September 2012 14:23
To: vuf...@li...; sol...@go...
Subject: [VuFind-Tech] Garbage Collection

Hi Folks,

It's the start of term here and, as there are induction courses, our VuFind instance takes a bigger hit. Sometimes for example, classes of 30+ will perform the same search and request results at the same time.

Today, we had reports of slow / unresponsive service and Garbage Collection occurred just 2.5 hours apart in the middle of the day. Using

jmap -histo:live <vufindPID> > dump.out

I was able to get the results below and attached in full.

Comparisons with other instances will be difficult but I was at least interested to see if similar rankings occur elsewhere. If you get the chance, can you run the command and let me know your top 30?

We do have some custom index routines running so it's possible the problem lies there.

Kind Regards,
Luke

 num     #instances         #bytes  class name
----------------------------------------------
   1:       2314196      259880112  [C
   2:         63709      172426280  [I
   3:       2263308       72425856  java.lang.String
   4:       1066470       59722320  org.apache.lucene.document.Field
   5:         15766       21338728  [B
   6:         63609        7836776  [Ljava.lang.Object;
   7:          5656        6131224  [Ljava.util.HashMap$Entry;
   8:         30718        5056512  <constMethodKlass>
   9:        177092        4250208  org.apache.lucene.index.Term
  10:           374        4216712  [Ljava.lang.String;
  11:         30718        3693008  <methodKlass>
  12:           214        3402232  [J
  13:          2903        3336176  <constantPoolKlass>
  14:          2064        2583128  [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
  15:         53337        2572600  <symbolKlass>
  16:          2903        2194752  <instanceKlassKlass>
  17:         49680        1987200  java.util.LinkedHashMap$Entry
  18:          2445        1894824  <constantPoolCacheKlass>
  19:         74262        1782288  org.apache.lucene.search.BooleanClause
  20:         36563        1755024  org.apache.lucene.index.TermInfosReader$TermInfoAndOrd
  21:         67431        1618344  org.apache.lucene.search.TermQuery
  22:          3213        1494128  <methodDataKlass>
  23:         61846        1484304  java.util.ArrayList
  24:         44895        1436640  java.util.concurrent.ConcurrentHashMap$HashEntry
  25:         28160        1126400  org.tartarus.snowball.Among
  26:         43243         691888  org.apache.lucene.index.TermInfosReader$CloneableTerm
  27:         28302         679248  com.ibm.icu.text.BreakCTDictionary$CompactTrieHorizontalNode
  28:           877         561280  [Lorg.apache.lucene.analysis.icu.segmentation.BreakIteratorWrapper;
  29:         16679         533728  org.apache.lucene.search.BooleanQuery
  30:          7210         519120  org.apache.lucene.store.MMapDirectory$MMapIndexInput
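
For anyone wanting to compare histograms from several instances, a rough sketch of how the `jmap -histo` output could be ranked outside the JVM is below. It assumes rows have the shape shown in the listing above (rank, instance count, byte count, class name); the class `JmapHistoTop` and its methods are hypothetical names made up for this illustration and are not part of VuFind, Solr or the JDK.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for ranking rows of a `jmap -histo` dump. */
public class JmapHistoTop {

    /** One histogram row: instance count, total bytes, class name. */
    public static final class Row {
        public final long instances;
        public final long bytes;
        public final String className;

        Row(long instances, long bytes, String className) {
            this.instances = instances;
            this.bytes = bytes;
            this.className = className;
        }
    }

    // Assumed row shape: "   3:  2263308  72425856  java.lang.String".
    // Anything after the class name is ignored.
    private static final Pattern ROW =
        Pattern.compile("^\\s*\\d+:\\s+(\\d+)\\s+(\\d+)\\s+(\\S+)");

    /** Parses histogram lines, skipping the header, the separator and any other non-row line. */
    public static List<Row> parse(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ROW.matcher(line);
            if (m.lookingAt()) {
                rows.add(new Row(Long.parseLong(m.group(1)),
                                 Long.parseLong(m.group(2)),
                                 m.group(3)));
            }
        }
        return rows;
    }

    /** Returns the n rows with the largest byte totals, largest first. */
    public static List<Row> topByBytes(List<Row> rows, int n) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingLong((Row r) -> r.bytes).reversed());
        return sorted.subList(0, Math.min(n, sorted.size()));
    }

    /** Share of the listed bytes held by one row, as a percentage. */
    public static double percentOfTotal(Row row, List<Row> rows) {
        long total = 0;
        for (Row r : rows) {
            total += r.bytes;
        }
        return total == 0 ? 0.0 : 100.0 * row.bytes / total;
    }
}
```

The lines could come from something like `Files.readAllLines(Paths.get("dump.out"))`. Comparing percentages rather than raw byte counts may make dumps from differently sized heaps easier to set side by side, though that is only a suggestion.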