Thanks to Robert and Tulie,

the problem was resolved, as suggested, by correcting the configuration in marc_local.properties so that 929 and 912 each specify a subfield:

author2 = 110ab:111ab:700abcd:710ab:711ab:929a:912a

 

 

Dalia Mendelsson         

IT Projects Manager

 

The Library Authority

The Hebrew University of Jerusalem

From: Tulie Amichal [mailto:tulie.amichal@gmail.com]
Sent: Thursday, October 27, 2011 2:39 PM
To: solrmarc-tech@googlegroups.com
Cc: vufind-tech@lists.sourceforge.net
Subject: Re: [VuFind-Tech] [solrmarc-tech] Problem loading some Russian and Arabic records into VuFind

 

Hi Robert,

I think you're correct. I did some analysis by creating MARC files in MarcEdit, each with one line removed, and loading them one at a time. The problematic lines turned out to be fields such as 929 and 912, which are mapped as in the following example:

author2 = 110ab:111ab:700abcd:710ab:711ab:929:912

So I assume that a tag listed without a subfield specification will fail when it appears in a mapping alongside other tags that do have subfields?
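One way to spot such entries without bisecting the MARC file is to scan the mapping itself. The following is a minimal Java sketch, not part of SolrMarc: the class and method names (BareTagFinder, findBareTags) are hypothetical, and it assumes a plain colon-separated spec of three-character tags followed by subfield codes. Control fields (tags starting with "00") are skipped on the assumption that they legitimately carry no subfields.

```java
import java.util.ArrayList;
import java.util.List;

class BareTagFinder {
    // Hypothetical helper: returns the tags in a colon-separated field spec
    // that have no subfield codes after the three-character tag.
    static List<String> findBareTags(String spec) {
        List<String> bare = new ArrayList<>();
        for (String part : spec.split(":")) {
            String entry = part.trim();
            // Assumption: control fields (001-009) have no subfields,
            // so a bare tag is acceptable there.
            boolean controlField = entry.startsWith("00");
            if (entry.length() == 3 && !controlField) {
                bare.add(entry);
            }
        }
        return bare;
    }

    public static void main(String[] args) {
        String spec = "110ab:111ab:700abcd:710ab:711ab:929:912";
        System.out.println(findBareTags(spec)); // prints [929, 912]
    }
}
```

Specs that use other syntax (custom methods, bracketed ranges, and so on) would need more careful parsing than this sketch does.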

Thanks for confirming the bug
Tulie

On Wed, Oct 26, 2011 at 5:46 PM, Robert Haschart <rh9ec@virginia.edu> wrote:

Tulie,

Have you modified the marc.properties file or the marc_local.properties file? The code you highlighted takes one code path if you provide a field specification like

physical = 300abcefg:530abcd

where multiple subfields are to be extracted from a field and concatenated (when subfldsStr.length() > 1), and another path when the length is not greater than 1:

publisher = 260b

However, it seems there is a bug in SolrMarc: if you accidentally forget to specify which subfield you are interested in, subfldsStr.length() will be zero. It will therefore take the second code path (because the length is not greater than 1) and then try to get the first character from a zero-length string, which throws the exception you are getting.
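To illustrate the two paths and the failure, here is a minimal Java sketch. It is not the SolrMarc source: the class and method names (SubfieldPathDemo, choosePath) are made up for this example, and the separator check from the real condition is left out for brevity.

```java
class SubfieldPathDemo {
    // Hypothetical stand-in for the branch described above; not SolrMarc code.
    static String choosePath(String subfldsStr) {
        if (subfldsStr.length() > 1) {
            return "concatenate";                 // e.g. 300abcefg
        }
        // Length is 0 or 1 here; charAt(0) throws if the string is empty.
        return "single:" + subfldsStr.charAt(0);  // e.g. 260b
    }

    public static void main(String[] args) {
        System.out.println(choosePath("abcefg"));
        System.out.println(choosePath("b"));
        try {
            choosePath("");                       // a spec such as "260" leaves no subfield codes
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("empty spec: " + e.getClass().getSimpleName());
        }
    }
}
```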

So look for a field specification in marc.properties or marc_local.properties where a field tag is specified but no subfield codes are, like the following:

publisher = 260
topic_facet = 600x:610x:611x:630:648x:650a:650x:651x:655x
                              ^

This is clearly a bug in SolrMarc. Rather than crashing with a stack dump, it should flag the error with a helpful error message and continue along its way.
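A guard along these lines might look like the following minimal Java sketch. It is only an illustration of the idea, not a patch to SolrMarc: the class GuardedSubfieldLookup, the method singleSubfieldCode, and the warnings list are all hypothetical names, and a real fix would presumably use SolrMarc's own logging.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedSubfieldLookup {
    // Hypothetical helper: returns the single subfield code, or null after
    // recording a warning when the spec names a tag with no subfield codes.
    static Character singleSubfieldCode(String fldTag, String subfldsStr, List<String> warnings) {
        if (subfldsStr == null || subfldsStr.isEmpty()) {
            warnings.add("Field spec '" + fldTag + "' has no subfield codes; skipping it");
            return null;
        }
        return subfldsStr.charAt(0);
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        System.out.println(singleSubfieldCode("260", "b", warnings)); // prints b
        System.out.println(singleSubfieldCode("630", "", warnings));  // prints null
        System.out.println(warnings);
    }
}
```

The caller would skip the field when null comes back, so one bad mapping entry costs a warning instead of the whole record.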

-Robert Haschart



Tulie Amichal wrote:

Hi All

We're working with VuFind 1.0.1 and are having problems loading certain records into VuFind. The records are mostly in Arabic and Russian. I'm attaching a sample record in Russian. I ran this record through MarcEdit to correct issues we found with empty indicators; once those were corrected, the following error now stops us from loading it:

ERROR [main] (MarcImporter.java:310) - Error indexing record: HUJ000043590 -- String index out of range: 0
java.lang.StringIndexOutOfBoundsException: String index out of range: 0
        at java.lang.String.charAt(String.java:694)
        at org.solrmarc.index.SolrIndexer.getSubfieldDataAsSet(SolrIndexer.java:1612)
        at org.solrmarc.index.SolrIndexer.getFieldList(SolrIndexer.java:1103)
        at org.solrmarc.index.SolrIndexer.map(SolrIndexer.java:536)
        at org.solrmarc.marc.MarcImporter.addToIndex(MarcImporter.java:329)
        at org.solrmarc.marc.MarcImporter.importRecords(MarcImporter.java:262)
        at org.solrmarc.marc.MarcImporter.handleAll(MarcImporter.java:506)
        at org.solrmarc.marc.MarcImporter.main(MarcImporter.java:785)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at com.simontuffs.onejar.Boot.run(Boot.java:334)
        at com.simontuffs.onejar.Boot.main(Boot.java:170)
 INFO [main] (MarcImporter.java:516) -  Adding 0 of 1 documents to index

Any idea how I can see which field is causing the problem? Looking at the code got me to a dead end. Unless I'm not reading this correctly, the parameter subfldsStr is not null, is not longer than 1 character, and has no character at position 0.

(SolrIndexer.java roughly around line 1612)
 
            if (!isControlField(fldTag) && subfldsStr != null)
            {
                // DataField
                DataField dfield = (DataField) vf;
 
                if (subfldsStr.length() > 1 || separator != null)
                {
                // ...[removed code]
                }
                else
                {
                    // get all instances of the single subfield
                    List<Subfield> subFlds = dfield.getSubfields(subfldsStr.charAt(0));
                    for (Subfield sf : subFlds)
                    {
                        resultSet.add(sf.getData().trim());
                    }
                }
            }


Any ideas on why this is happening, or on steps to continue debugging it? We have about 100,000 records like these.

Thanks
Tulie


--

טולי עמיכל
052-8700781
tulie.amichal@gmail.com
http://about.me/tulie/

 

--
You received this message because you are subscribed to the Google Groups "solrmarc-tech" group.
To post to this group, send email to solrmarc-tech@googlegroups.com.
To unsubscribe from this group, send email to solrmarc-tech+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/solrmarc-tech?hl=en.

 




