From: Lovins, D. <dan...@ya...> - 2010-01-21 14:34:04
Robert and Demian,

Thanks for clarifying the role of the marc_local.properties file and BeanShell script, and for suggesting how to debug the callnumber-subject field. By the way, Jeff Barnett just pointed out to me that the problem is also occurring in the live demo (http://vufind.org/demo/)

Daniel

From: Robert Haschart [mailto:rh...@vi...]
Sent: Tuesday, January 19, 2010 4:07 PM
To: Demian Katz
Cc: Lovins, Daniel; vuf...@li...
Subject: Re: [VuFind-General] indexing and mapping "callnumber-subject"

Demian is right. The BeanShell versions of the call number routines primarily are examples of how you can create and reference a custom indexing script. They were written to produce exactly the same results as the compiled versions.

In fact marc_local.properties is only present as an example of how the import.properties file can reference two (or more) marc.properties files and have the second (and subsequent) file either add to the index specification of the first file, or override something that was defined in the first file.

As far as your problem with the callnumber-subject fields not getting populated, inside the import directory there should be a bin directory, that contains a number of bash scripts and batch files that are useful utilities for handling Marc records and investigating problems in indexing them. To use them you first need to add the following line to your import.properties file:

    solrmarc.site.path = ..

then open a command window to the import directory and type the following command:

    bin/indextest import.properties somemarcfile.mrc call*

This command runs Solrmarc using your indexing specification and then rather than sending the record to Solr, it merely prints the results to the screen.
(The last parameter is a pattern match, saying to print only the index fields that start with the letters "call".)

When run on a single-record MARC file, the following is output:

    u55 : callnumber = PR4231.A431984
    u55 : callnumber-a = PR4231
    u55 : callnumber-first = P - Language and Literature
    u55 : callnumber-first-code = P
    u55 : callnumber-label = PR4231
    u55 : callnumber-subject = PR - English Literature
    u55 : callnumber-subject-code = PR

Lastly, if you want to see the contents of the record(s) you are trying to index, the printrecord command in the same directory will work.

    bin/printrecord import.properties u55.mrc

produces this output:

    LEADER 03904pam a2200601 a 4500
    001 u55
    003 SIRSI
    008 840322m19849999ksua b 00110ceng
    010 $a 84005287
    020 $a091145909X (v. 1)
    035 $a(Sirsi) o10696561
    035 $a(OCoLC)10696561
    035 $bMX786425=@V1C1;DX786426=@V2C1;AX907389=UV1C2;CX907388=UV2C2
    039 0 $a2$b3$c3$d3$e3
    040 $aDLC$cDLC$dVA@
    043 $ae-uk-en
    049 $aVA@@[Also in][Clemons]$aVA@U
    050 0 $aPR4231$b.A4 1984
    082 0 $a821/.8$aB$219
    090 $aPR4231$b.A43 1984$mVA@U$qCLEMONS
    090 $aPR4231$b.A43 1984$mVA@@$qALDERMAN
    100 1 $aBrowning, Robert,$d1812-1889.
    245 14$aThe Brownings' correspondence /$cedited by Philip Kelley & Ronald Hudson.
    260 $aWinfield, KS :$bWedgestone Press,$cc1984-
    300 $av. :$bill. (some col.) ;$c24 cm.
    500 $aCorrespondence written by and to Robert and Elizabeth Barrett Browning.
    500 $aVols. 9, 11 edited by Philip Kelley & Scott Lewis.
    504 $aIncludes bibliographical references and indexes.
    505 1 $av. 1. September 1809-December 1826, letters 1-244 -- v. 2. January 1827-December 1831, letters 245-434 -- v. 11. July 1845-January 1846, letters 1982-2177 -- v. 12. January 1846-May 1846, letters 2178-2383 -- v. 13. May 1846-September 1846, letters 2384-2615.
    596 $a2 3
    600 10$aBrowning, Robert,$d1812-1889$vCorrespondence.
    600 10$aBrowning, Elizabeth Barrett,$d1806-1861$vCorrespondence.
    650 0$aPoets, English$y19th century$vBiography.
    700 1 $aBrowning, Elizabeth Barrett,$d1806-1861.
    700 1 $aKelley, Philip.
    700 1 $aHudson, Ronald.
    700 1 $aLewis, Scott,$d1957-

Demian Katz wrote:

First of all, to clear up any confusion about the properties files -- the callnumber-subject/callnumber-subject-code lines in marc.properties use the Java-based call number functions that are built into SolrMarc. The lines in marc_local.properties do exactly the same thing, but they use external BeanShell scripts instead. The BeanShell versions exist to make customization easier -- you can directly edit the .bsh files to achieve new effects without having to recompile the whole SolrMarc Java application. There is no reason to uncomment those lines and use the BeanShell versions unless you plan on changing the code, since the Java methods do the exact same thing, only slightly faster.

As far as your problem with the callnumber-subject fields not getting populated, for what it's worth, I'm not seeing the same issue on my end. I just picked a record at random and looked at it through the Solr admin tool, and it includes the appropriate facets.

What happens if you do a search for "callnumber-subject:[* TO *]"? Do you get anything at all? This might help narrow down whether you have the field in some records but not others, or whether even referencing the field causes some kind of error.

It might also be worth using the BeanShell scripts as a debugging tool -- you can put some "print" statements in there, reindex some records, and at least confirm that SolrMarc is finding values in your records. That might turn up another clue or two.

Please let me know if this is of any help; if not, feel free to pass along more clues, and I'll see if I can come up with more ideas.

- Demian

From: Lovins, Daniel [mailto:dan...@ya...]
Sent: Tuesday, January 19, 2010 2:27 PM
To: vuf...@li...
Subject: [VuFind-General] indexing and mapping "callnumber-subject"

In the RC2 code we seem to have lost the facet for the two-letter classification codes, i.e., the "callnumber-subject" (or what we've locally been calling "sub-class" facets). When I check the marc.properties file I find the correct settings:

    callnumber-subject = custom, getCallNumberSubject, callnumber_subject_map.properties
    callnumber-subject-code = custom, getCallNumberSubject

And when I check import/translation_maps I find the necessary mapping file, i.e., "callnumber_subject_map.properties". And yet, despite this, our subclass data aren't getting into the Solr index.

Is there something that needs to be done with callnumber.bsh [*]? Alternatively, do those same two lines of code need to be un-commented from marc_local.properties? Seems unlikely, but thought I'd ask since I found them there, commented out.

Thanks!

Daniel

[*] "callnumber.bsh" [snip]

    public String getCallNumberSubject(Record record) {
        String val = indexer.getFirstFieldVal(record, "090a:050a");
        if (val != null) {
            String [] callNumberSubject = val.toUpperCase().split("[^A-Z]+");
            if (callNumberSubject.length > 0) {
                return callNumberSubject[0];
            }
        }
        return(null);
    }
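
For anyone following the thread, here is a minimal sketch in plain Java of the string handling in the getCallNumberSubject excerpt quoted in the original message. It is not SolrMarc code: the class name CallNumberSubjectSketch and the static method callNumberSubject are made up for illustration, and the method takes the subfield value directly as a String instead of calling indexer.getFirstFieldVal on a Record. It only tries to show what the toUpperCase().split("[^A-Z]+") step would return for a value such as the 090 $a PR4231 in the sample record.

```java
// Illustrative sketch only; not part of SolrMarc or VuFind.
// CallNumberSubjectSketch and callNumberSubject are hypothetical names.
class CallNumberSubjectSketch {

    // Mirrors the string logic of the quoted callnumber.bsh excerpt:
    // uppercase the value, split on runs of non-letters, keep the first piece.
    // Assumption: the caller has already pulled the 090 $a / 050 $a value.
    static String callNumberSubject(String val) {
        if (val != null) {
            String[] parts = val.toUpperCase().split("[^A-Z]+");
            if (parts.length > 0) {
                return parts[0];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(callNumberSubject("PR4231")); // prints PR
        System.out.println(callNumberSubject("pr4231")); // prints PR
        System.out.println(callNumberSubject(null));     // prints null
    }
}
```

With the sample record's value this gives PR, which lines up with the callnumber-subject-code shown in the indextest output. One thing the sketch makes visible: a value that begins with a non-letter comes back as an empty string rather than null, because split keeps the empty piece in front of a leading match.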