From: Dalia M. <da...@sa...> - 2011-10-31 16:00:57
Thanks to Robert and Tulie, the problem was resolved as suggested, by correcting the configuration in marc_local.properties:

author2 = 110ab:111ab:700abcd:710ab:711ab:929a:912a

Dalia Mendelsson
IT Projects Manager
The Library Authority
The Hebrew University of Jerusalem

From: Tulie Amichal [mailto:tul...@gm...]
Sent: Thursday, October 27, 2011 2:39 PM
To: sol...@go...
Cc: vuf...@li...
Subject: Re: [VuFind-Tech] [solrmarc-tech] Problem loading some Russian and Arabic records into VuFind

Hi Robert,

I think you're correct. I did some analysis by creating MARC files using MarcEdit, each with one row removed, and loading these. I found that the problematic lines were fields such as 929 and 912, which are mapped as in the following example:

author2 = 110ab:111ab:700abcd:710ab:711ab:929:912

So I assume that having these fields without subfield specifications, in a field that is comprised of other tags with subfields, will fail?

Thanks for confirming the bug.
Tulie

On Wed, Oct 26, 2011 at 5:46 PM, Robert Haschart <rh...@vi...> wrote:

Tulie,

Have you modified the marc.properties file or the marc_local.properties file? The code you highlighted takes one code path if you provide a field specification like

physical = 300abcefg:530abcd

where multiple subfields are to be extracted from a field and concatenated (when subfldsStr.length() > 1), and another when the length is not greater than 1:

publisher = 260b

However, it seems there is a bug in SolrMarc such that if you accidentally forget to specify which subfield you are interested in, subfldsStr.length() will be zero. It will therefore take the second code path (because the length is not greater than 1) and then try to get the first character from a zero-length string, which throws the exception you are getting.
So look for a field specification in marc.properties or marc_local.properties where there is a field tag specified, but no subfield tags specified, like the following:

publisher = 260
topic_facet = 600x:610x:611x:630:648x:650a:650x:651x:655x
                             ^

This is clearly a bug in SolrMarc. Rather than crashing with a stack dump, it should instead flag the error with a helpful error message and continue along its way.

-Robert Haschart

Tulie Amichal wrote:

Hi All,

We're working with VuFind 1.0.1 and having problems loading certain records into VuFind. The records are mostly in Arabic and Russian. I'm attaching a sample record in Russian. I ran this record through MarcEdit to correct issues we found with empty indicators, and once that was corrected the following error is now stopping us from loading:

ERROR [main] (MarcImporter.java:310) - Error indexing record: HUJ000043590 -- String index out of range: 0
java.lang.StringIndexOutOfBoundsException: String index out of range: 0
    at java.lang.String.charAt(String.java:694)
    at org.solrmarc.index.SolrIndexer.getSubfieldDataAsSet(SolrIndexer.java:1612)
    at org.solrmarc.index.SolrIndexer.getFieldList(SolrIndexer.java:1103)
    at org.solrmarc.index.SolrIndexer.map(SolrIndexer.java:536)
    at org.solrmarc.marc.MarcImporter.addToIndex(MarcImporter.java:329)
    at org.solrmarc.marc.MarcImporter.importRecords(MarcImporter.java:262)
    at org.solrmarc.marc.MarcImporter.handleAll(MarcImporter.java:506)
    at org.solrmarc.marc.MarcImporter.main(MarcImporter.java:785)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at com.simontuffs.onejar.Boot.run(Boot.java:334)
    at com.simontuffs.onejar.Boot.main(Boot.java:170)
INFO [main] (MarcImporter.java:516) - Adding 0 of 1 documents to index

Any idea how I can see which field is causing the
problem? Looking at the code got me to a dead end. Unless I'm not reading this correctly, the parameter subfldsStr is not null, is not longer than 1 char, and has no character at position 0 (SolrIndexer.java, roughly around line 1612):

    if (!isControlField(fldTag) && subfldsStr != null) {
        // DataField
        DataField dfield = (DataField) vf;
        if (subfldsStr.length() > 1 || separator != null) {
            ...[removed code
        } else {
            // get all instances of the single subfield
            List<Subfield> subFlds = dfield.getSubfields(subfldsStr.charAt(0));
            for (Subfield sf : subFlds) {
                resultSet.add(sf.getData().trim());
            }
        }
    }

Any ideas on why this is happening, or steps to continue debugging this? We have about 100,000 records like these.

Thanks,
Tulie

--
טולי עמיכל
052-8700781
tul...@gm...
http://about.me/tulie/

--
You received this message because you are subscribed to the Google Groups "solrmarc-tech" group.
To post to this group, send email to sol...@go....
To unsubscribe from this group, send email to sol...@go....
For more options, visit this group at http://groups.google.com/group/solrmarc-tech?hl=en.
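
The fix discussed in this thread (giving every data-field tag in a field specification at least one subfield code) can be checked mechanically before running an import. The following is only a minimal sketch, not part of SolrMarc: the class FieldSpecCheck and the method tagsMissingSubfields are hypothetical names, and the sketch assumes a specification is a colon-separated list of entries such as 700abcd, optionally followed by a comma and extra arguments. It treats tags 001-009 as control fields, which carry no subfields in MARC, and does not attempt to handle custom or script-based specifications.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of SolrMarc: flags entries in a field
// specification that name a data-field tag without any subfield codes.
class FieldSpecCheck {

    /**
     * Returns the entries of a colon-separated field specification that are
     * a bare three-digit tag (e.g. "630" or "929"). Anything after the first
     * comma is ignored (assumed here to be extra arguments, not tags).
     * Tags 001-009 are skipped because control fields have no subfields.
     */
    public static List<String> tagsMissingSubfields(String spec) {
        List<String> bad = new ArrayList<>();
        String tagList = spec.split(",", 2)[0];
        for (String entry : tagList.split(":")) {
            String e = entry.trim();
            if (e.matches("\\d{3}") && !e.startsWith("00")) {
                bad.add(e);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Sample lines taken from the messages in this thread.
        String[] lines = {
            "author2 = 110ab:111ab:700abcd:710ab:711ab:929:912",
            "author2 = 110ab:111ab:700abcd:710ab:711ab:929a:912a",
            "topic_facet = 600x:610x:611x:630:648x:650a:650x:651x:655x",
            "publisher = 260"
        };
        for (String line : lines) {
            int eq = line.indexOf('=');
            String name = line.substring(0, eq).trim();
            String spec = line.substring(eq + 1).trim();
            List<String> bad = tagsMissingSubfields(spec);
            if (!bad.isEmpty()) {
                System.out.println(name + ": no subfields given for " + bad);
            }
        }
    }
}
```

Run against the lines quoted above, this reports the original author2 mapping (929 and 912), the topic_facet example (630), and publisher = 260, while the corrected author2 mapping passes. A real marc.properties file may contain other kinds of values that this sketch does not understand, so its output is a hint about where to look rather than a definitive validation.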