The newer version of SolrMarc in the current trunk is a bit more error tolerant -- instead of dying outright, it corrects some errors and reports them to a "marc_error" index field.  My index has quite a few of these messages in it:

 

Typo         : Erroneous character found at end of leader [ 45   ]; changing them to the standard "4500"

 

It definitely sounds like your problem is related to this.  As noted in my last message, it's probably worth trying a test import using a new install from the trunk.  If that solves a lot of issues and you need a fix before RC2 comes out, I think you can probably patch the newer SolrMarc into RC1 without too much difficulty just by adding the marc_error field to solr/biblio/conf/schema.xml, loading the newer import-marc.sh and import/bin directories, and possibly making some minor adjustments to import/*.properties.
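
To see how many records are affected before re-importing, the leaders can be checked directly in the exported file. The following is only a rough sketch, not part of SolrMarc: the function names are made up for illustration, and it assumes the export is a binary MARC 21 file in which the byte 0x1D appears only as the record terminator. In MARC 21, the leader is the first 24 bytes of each record and positions 20-23 should read "4500".

```python
RECORD_TERMINATOR = b"\x1d"  # MARC 21 end-of-record byte
LEADER_LENGTH = 24
ENTRY_MAP = b"4500"  # expected value of leader positions 20-23


def repair_leader(leader: bytes) -> bytes:
    """Return the leader with positions 20-23 forced to the standard "4500".

    Hypothetical helper: short leaders are space-padded to 24 bytes first.
    """
    padded = leader[:LEADER_LENGTH].ljust(LEADER_LENGTH, b" ")
    return padded[:20] + ENTRY_MAP


def find_bad_leaders(data: bytes):
    """Yield (record_number, leader) for records whose entry map is not "4500".

    Assumes 0x1D never occurs inside a record, so a plain split is enough.
    """
    records = [r for r in data.split(RECORD_TERMINATOR) if r.strip()]
    for number, record in enumerate(records, start=1):
        leader = record[:LEADER_LENGTH]
        if leader[20:24] != ENTRY_MAP:
            yield number, leader
```

Reading the export with open(path, "rb").read() and passing the bytes to find_bad_leaders() lists the records to look at; repair_leader() shows the same kind of fix that the marc_error message above describes.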

 

- Demian

 

From: Ya'aqov Ziso [mailto:ziso@rowan.edu]
Sent: Monday, October 26, 2009 5:18 PM
To: Philip Shafer; Demian Katz; vufind-general@lists.sourceforge.net
Subject: Re: [VuFind-General] Solrmarc Import issues

 

Could it be that the parsing rule expects a certain length for this field, checks it, and finds the length is incorrect (the 45 at the end needs to be 4500)?
Ya’aqov






On 10/26/09 4:53 PM, "Philip Shafer" <shafer@rowan.edu> wrote:

This is another error that I get while running the import script.  I
actually receive 220+ errors like this.  Any suggestions would be helpful.

Again, we are running VuFind RC1 and Voyager.

2009-10-26 16:39:43,172 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 02244cam a2200493La 45
  2009-10-26 16:39:43,172 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01698cam a2200397La 45
  2009-10-26 16:39:43,172 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01493cam a2200349La 45
  2009-10-26 16:39:43,172 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01759cam a2200421La 45
  2009-10-26 16:39:43,172 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01617cam a2200373La 45
  2009-10-26 16:39:43,172 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01689cam a2200385La 45
  2009-10-26 16:39:43,172 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01755cam a2200385La 45
  2009-10-26 16:39:43,172 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01716cam a2200373La 45
  2009-10-26 16:39:43,173 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 02018cam a2200373La 45
  2009-10-26 16:39:43,173 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01492cam a2200349La 45
  2009-10-26 16:39:43,173 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01885cam a2200433La 45
  2009-10-26 16:39:43,173 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01528cam a2200349La 45
  2009-10-26 16:39:43,173 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01876cam a2200433La 45
  2009-10-26 16:39:43,173 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01501cam a2200337La 45
  2009-10-26 16:39:43,173 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01522cam a2200349La 45
  2009-10-26 16:39:43,174 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01403cam a2200325La 45
  2009-10-26 16:39:43,174 [main] ERROR main org.solrmarc.marc.MarcImporter -
Error reading record: error parsing leader with data: 01501cam a2200337La 45
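
With this many near-identical errors, it can help to tally them rather than read them one by one. A minimal sketch, assuming the log text has been saved to a string and that the message wording matches the excerpt above; the function name is hypothetical. It groups the errors by whatever follows leader position 19, so a run of truncated "45" endings shows up as a single count.

```python
import re
from collections import Counter

# Message text as it appears in the log excerpt; the captured group is the
# leader data that follows the colon, up to the end of that line.
LEADER_ERROR = re.compile(r"error parsing leader with data:\s*(.*)")


def tally_leader_errors(log_text: str) -> Counter:
    """Count leader-parse errors, keyed by the text from position 20 onward."""
    endings = Counter()
    for match in LEADER_ERROR.finditer(log_text):
        leader = match.group(1).rstrip("\r\n")
        endings[leader[20:]] += 1
    return endings
```

If every key comes back as "45", the failures all share the same truncated leader ending, which points to a single cause in the export rather than many unrelated problems.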
------------------------------

Philip Shafer
Library System Services
Rowan University Library
201 Mullica Hill Rd
Glassboro, NJ  08028
856-256-4418
856-256-4924 Fax



> From: Philip Shafer <shafer@rowan.edu>
> Date: Mon, 26 Oct 2009 16:08:32 -0400
> To: Demian Katz <demian.katz@villanova.edu>,
> "vufind-general@lists.sourceforge.net" <vufind-general@lists.sourceforge.net>
> Conversation: Solrmarc Import issues
> Subject: Re: Solrmarc Import issues
>
> So to see if there would be any significant change in records imported, I took
> Demian's advice and went with option 3.)
>
>> 3.) Tell SolrMarc to ignore all but the first value.  You can change the line
>> in import/marc.properties like this:
>>
>> lccn = 010a, first
>
>
> This fixed a significant number of issues:
>
> We exported: 391522 MARC records
>
> We had: 386367 records indexed in VuFind
>
> Now we have: 389726 records indexed in VuFind
>
> An improvement of 3,359 records.
>
> However, this still leaves us with 1,796 records not being indexed.
>
> Unfortunately, it is difficult to see why the records aren't being indexed,
> since it seems that solrmarc.log only holds the last 1000 or so log entries
> from the import.
>
> Is there any way that I can dump all the errors out to a permanent error log?
>
> Thanks,
>
> Phil
> ------------------------------
>
> Philip Shafer
> Library System Services
> Rowan University Library
> 201 Mullica Hill Rd
> Glassboro, NJ  08028
> 856-256-4418
> 856-256-4924 Fax
>
>
>
>> From: Demian Katz <demian.katz@villanova.edu>
>> Date: Mon, 26 Oct 2009 14:33:08 -0400
>> To: Philip Shafer <shafer@rowan.edu>, "vufind-general@lists.sourceforge.net"
>> <vufind-general@lists.sourceforge.net>
>> Subject: RE: Solrmarc Import issues
>>
>> Some fields in the Solr index are only able to accept a single value.  If a
>> MARC field repeats unexpectedly, you'll see this error.  There are a few
>> possible solutions:
>>
>> 1.) Fix the MARC records -- in the case of LCCN, I believe it's abnormal for
>> there to be multiple values.  If it's practical, you may want to try to fix
>> the issue from the cataloging side.
>>
>> 2.) Accept multiple values.  You can change the appropriate line in
>> solr/biblio/conf/schema.xml like this:
>>
>> <field name="lccn" type="string" indexed="true" stored="true"
>> multiValued="true"/>
>>
>> Note that making some fields multi-valued may require other code changes --
>> if the PHP code and Smarty templates assume that a field is always
>> single-valued, you may end up seeing the word "Array" in inappropriate
>> places when multi-valued results are encountered.
>>
>> 3.) Tell SolrMarc to ignore all but the first value.  You can change the line
>> in import/marc.properties like this:
>>
>> lccn = 010a, first
>>
>> This way, only the first 010a value will go into your index, and anything
>> else in the MARC record will be ignored.
>>
>> Obviously, for the example of LCCN, the decision isn't that important since
>> the value isn't used for much in VuFind.  If you're seeing similar problems
>> for other index fields, you may have to weigh your options more carefully.
>>
>> I hope this is a helpful start -- please let me know if you have any further
>> questions.
>>
>> - Demian
>>
>>> -----Original Message-----
>>> From: Philip Shafer [mailto:shafer@rowan.edu]
>>> Sent: Monday, October 26, 2009 2:24 PM
>>> To: vufind-general@lists.sourceforge.net
>>> Subject: [VuFind-General] Solrmarc Import issues
>>>
>>> I have a few records (actually I’m suspecting quite a few) that cannot
>>> be imported, so I’m trying to pare down the errors on import.
>>>
>>> As I find records, I’m exporting individual MARC records (from Voyager)
>>> and trying to import them to see what the errors are.  I’m hoping someone
>>> on this mailing list can tell me what they mean.
>>>
>>> We are running VuFind RC1.
>>>
>>> 2009-10-26 14:10:04,229 [main] ERROR main
>>> org.solrmarc.marc.MarcImporter -
>>> Control Number 447171
>>>
>>> org.apache.solr.common.SolrException: ERROR: multiple values
>>> encountered for
>>> non multiValued field lccn: first='80000702' second='2002213653'
>>>
>>> 2009-10-26 14:21:03,069 [main] ERROR main
>>> org.solrmarc.marc.MarcImporter -
>>> Control Number 455980
>>>
>>> org.apache.solr.common.SolrException: ERROR: multiple values
>>> encountered for
>>> non multiValued field lccn: first='89029082' second='sn 89029082'
>>>
>>>
>>> Any explanation would be very helpful.
>>>
>>> Thanks,
>>>
>>> Phil
>>> ------------------------------
>>>
>>> Philip Shafer
>>> Library System Services
>>> Rowan University Library
>>> 201 Mullica Hill Rd
>>> Glassboro, NJ  08028
>>> 856-256-4418
>>> 856-256-4924 Fax
>>>
>>>
>>>
>>> -----------------------------------------------------------------------
>>> -------
>>> Come build with us! The BlackBerry(R) Developer Conference in SF, CA
>>> is the only developer event you need to attend this year. Jumpstart
>>> your
>>> developing skills, take BlackBerry mobile applications to market and
>>> stay
>>> ahead of the curve. Join us from November 9 - 12, 2009. Register now!
>>> http://p.sf.net/sfu/devconference
>>> _______________________________________________
>>> VuFind-General mailing list
>>> VuFind-General@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/vufind-general

