From: Barnett, J. <jef...@ya...> - 2008-09-12 13:37:22
Thanks Dan. I'll make your suggested change locally, and hope it eventually makes it into the trunk as well. I also mentioned the "clean MARC" suggestion to our catalogers.

-----Original Message-----
From: Dan Scott [mailto:de...@gm...]
Sent: Thursday, September 11, 2008 11:30 PM
To: Barnett, Jeffrey
Cc: vuf...@li...; Andrew Nagy
Subject: Re: [VuFind-Tech] Undefined offset: 1 in /usr/local/yulprog/yufind-r792/web/File/MARC.php on line 273

2008/9/11 Barnett, Jeffrey <jef...@ya...>:
<snip>
> Here is the code (unchanged in trunk since r754 (by asnagy))
>
> 262 private function _decode($text)
> 263 {
> 264     $matches = array();
> 265     //if (!preg_match("/^(\d{5})/", $text, $matches)) {
> 266     //    $errorMessage = File_MARC_Exception::formatError(File_MARC_Exception::$messages[File_MARC_Exception::ERROR_NONNUMERIC_LENGTH], array("record_length" => substr($text, 0, 5)));
> 267     //    throw new File_MARC_Exception($errorMessage, File_MARC_Exception::ERROR_NONNUMERIC_LENGTH);
> 268     //}
> 269
> 270     $marc = new File_MARC_Record();
> 271
> 272     // Store record length
> 273     $record_length = $matches[1];
>
> Three questions:
>
> 1) Why is it failing, not once but dozens of times for a single record?
> 2) Why was the preg_match test removed?
> 3) How is the value of $text transferred to $matches (or otherwise processed)?

As one of the developers of the original File_MARC code over at http://pear.php.net, I think the answer to 2) is that it was an attempt to relax File_MARC's relatively strict format checking. Presumably Andrew has run into records that don't have a 5-digit length at the start of the leader, which would normally cause File_MARC to throw an exception, and he tried to disable that exception by commenting out those lines. In your case, the Details view appears to show that the first five characters of the leader are "a0211" - that's a corrupted MARC record.
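To make the leader check concrete, here is a minimal sketch of what the commented-out test was doing. The helper name leader_record_length() and the sample leader strings are made up for illustration and are not part of File_MARC; the only real call is PHP's preg_match(), which fills its third argument by reference when the pattern matches.

```php
<?php
// Hypothetical helper, not part of File_MARC: returns the 5-digit record
// length found at the start of a raw MARC record, or null when the record
// does not begin with five digits.
function leader_record_length($text)
{
    $matches = array();
    // preg_match() populates $matches by reference; index 0 is the whole
    // match and index 1 is the first capture group. On a failed match the
    // array stays empty, so $matches[1] would be an undefined offset.
    if (preg_match('/^(\d{5})/', $text, $matches) === 1) {
        return (int) $matches[1];
    }
    return null;
}

// Sample leaders are invented for this sketch.
var_dump(leader_record_length('00211nam a2200097 a 4500')); // starts with five digits
var_dump(leader_record_length('a0211nam a2200097 a 4500')); // starts with "a0211"
```

A leader beginning with "a0211" fails the pattern, so the helper returns null instead of reading an offset that was never set.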
As you've noticed, though, the result of that approach is that the $matches array can never contain an element; the answer to 3) is that it isn't: with the preg_match call commented out, nothing ever copies anything from $text into $matches.

In some other places in File_MARC where it's possible to relax the parser without fatal results, I've replaced the exception by adding the error message to a warnings array in the record. In this case, a tolerable hack to avoid the warnings you're getting, and to have a meaningful value for $record_length, while not throwing exceptions for naughty MARC records that don't have a valid length in their leader, might be something like replacing line 273 with the following code:

    // Run the match again so that $matches is actually populated
    preg_match("/^(\d{5})/", $text, $matches);
    $record_length = 0;
    if (count($matches) > 1) {
        // Leader starts with five digits: use them as the record length
        $record_length = $matches[1];
    } else {
        // Invalid leader: fall back to the length of the raw record
        $record_length = strlen($text);
    }

Ideally, of course, one would have standards-compliant MARC records for input so that these kinds of hacks weren't necessary; trying to successfully process non-compliant MARC records through two different toolchains (SolrMARC/MARC4J and File_MARC) would almost demand clean MARC for input.

Heh... "clean MARC".

-- 
Dan Scott
Laurentian University