From: Tod O. <to...@uc...> - 2013-10-29 21:00:50
vufind-tech,

We're starting to poke into the call number browse in more detail, and that leads to a question about normalizing the call number for browse searching.

What have sites been using to normalize their LC call numbers for browsing? I'm looking at possibly wrapping org.solrmarc.tools.CallNumUtils.getLCShelfkey() in a Bean Shell script. But I'm also concerned that the search terms be normalized the same way, and I don't really see how that would work. I mean, it's important that PS35 come before PS258 in the index, and that the user be dropped in the right place.

Any advice appreciated,

-Tod

Tod Olson <to...@uc...>
Systems Librarian
University of Chicago Library
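To illustrate the ordering concern in the message above, here is a minimal Java sketch of why raw call numbers sort incorrectly as plain strings and how zero-padding the class number fixes it. The class `CallNumberSortSketch` and its helpers are hypothetical names made up for this illustration; this is not the SolrMarc implementation, and the padding width of 4 is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CallNumberSortSketch {
    /**
     * Hypothetical helper (not part of SolrMarc): left-pad the text that
     * follows the leading class letters with zeros up to the given width.
     */
    static String padClassNumber(String callNumber, int width) {
        int split = 0;
        while (split < callNumber.length()
                && Character.isLetter(callNumber.charAt(split))) {
            split++;
        }
        StringBuilder digits = new StringBuilder(callNumber.substring(split));
        while (digits.length() < width) {
            digits.insert(0, '0');
        }
        return callNumber.substring(0, split) + digits;
    }

    /** Return a copy of the list sorted by plain string comparison. */
    static List<String> sortedCopy(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("PS35", "PS258");
        // Plain string order compares '2' with '3', so PS258 sorts first.
        System.out.println(sortedCopy(raw));    // [PS258, PS35]

        List<String> padded = new ArrayList<>();
        for (String callNumber : raw) {
            padded.add(padClassNumber(callNumber, 4));
        }
        // With fixed-width numbers, string order matches numeric order.
        System.out.println(sortedCopy(padded)); // [PS0035, PS0258]
    }
}
```

The same padding has to be applied to the user's search term, otherwise a seed such as PS35 would be compared against padded keys and land in the wrong part of the index.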
From: Al R. <ala...@mn...> - 2013-10-30 03:47:32
Hello Tod,

Version 1.3 answer

I've customized the bsh script a little, but I use the getLCShelfkey() function. The customization has to do with testing for null values and catching exceptions. I also wrote code to normalize the search term.

al

/**
 * Given an alphabrowse seed for a call number
 * Normalize it so that it seeds into the browse index correctly
 * This needs to correspond with the java routine that builds the index
 * CallNumUtils::getLCShelfkey
 *
 * @param string $callNumberSeed The input call number string
 *
 * @return string
 * @access private
 */
private function _normalize_call_number($callNumberSeed)
{
    // upper case the string
    $callNum = strtoupper(trim($callNumberSeed));
    // get the LC start letters
    $callNumArray = preg_split("/(^[A-Z]+)/", $callNum, 2, PREG_SPLIT_DELIM_CAPTURE);
    // If we matched we will have an array of size 3
    // [0] is empty, [1] is our letters, [2] is empty or has the rest of the string
    if (count($callNumArray) != 3) {
        return $callNum;
    }
    $finalCallNum = str_pad($callNumArray[1], 4, " ", STR_PAD_RIGHT);
    // Now let's look at the rest
    if (trim($callNumArray[2]) == '') {
        // There is nothing else, return the original
        return $callNum;
    }
    // We should have numbers to start with
    $callNumArray = preg_split("/(^[\d|.]*\d+)/", trim($callNumArray[2]), 2, PREG_SPLIT_DELIM_CAPTURE);
    // I don't think this can happen, but
    if (count($callNumArray) != 3) {
        return $callNum;
    } else {
        // We should have a number in $callNumArray[1]
        $callNumberArray = explode('.', trim($callNumArray[1]), 2);
        // Should be 2 pieces
        // It could be only 1 piece
        if (count($callNumberArray) > 2) {
            return $callNum;
        }
        $finalCallNum .= str_pad($callNumberArray[0], 4, '0', STR_PAD_LEFT) . '.';
        if (isset($callNumberArray[1])) {
            $finalCallNum .= str_pad($callNumberArray[1], 6, '0', STR_PAD_RIGHT);
        } else {
            $finalCallNum .= '000000';
        }
    }
    // Is there more? if not return what we have
    // Find letters and numbers, ignore .s
    while (isset($callNumArray[2])
        && (trim($callNumArray[2]) != '')
        && (preg_match("/^.?[A-Z]/", trim($callNumArray[2])))
    ) {
        $callNumArray = preg_split("/^\.?([A-Z]+[\d]+)/", trim($callNumArray[2]), 2, PREG_SPLIT_DELIM_CAPTURE);
        // This should be a letter followed by a number
        if (isset($callNumArray[1])) {
            $finalCallNum .= " " . substr($callNumArray[1], 0, 1) . "0." . str_pad(substr($callNumArray[1], 1), 6, '0', STR_PAD_RIGHT);
        } else {
            $finalCallNum .= " " . $callNumArray[0];
        }
    }
    if (isset($callNumArray[2]) && (trim($callNumArray[2]) != '')) {
        $callNumArray = preg_split("/^(\d+)/", trim($callNumArray[2]), 2, PREG_SPLIT_DELIM_CAPTURE);
        $finalCallNum .= " " . str_pad($callNumArray[1], 6, '0', STR_PAD_LEFT) . $callNumArray[2];
    }
    return $finalCallNum;
}

> _______________________________________________
> Vufind-tech mailing list
> Vuf...@li...
> https://lists.sourceforge.net/lists/listinfo/vufind-tech
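As a rough illustration of the first steps of the PHP routine above, the following Java sketch pads the class letters to four characters, the whole part of the class number to four digits, and the decimal part to six digits. It is a simplified, partial port: it ignores cutters and anything else after the class number, and it makes no claim to match the output of getLCShelfkey(). The class `SeedNormalizerSketch`, its method names, and the `browseStart` lookup are hypothetical, and using a binary search to find the starting position is an assumption about how a sorted browse list could be seeded, not a description of VuFind's code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SeedNormalizerSketch {
    // Class letters, optional whitespace, whole number, optional decimal part.
    private static final Pattern CLASS_PART =
            Pattern.compile("^([A-Z]+)\\s*(\\d+)(?:\\.(\\d+))?");

    /** Pad s to the given width, on the left or on the right. */
    static String pad(String s, int width, char fill, boolean left) {
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < width) {
            if (left) {
                sb.insert(0, fill);
            } else {
                sb.append(fill);
            }
        }
        return sb.toString();
    }

    /** Normalize only the class letters and class number of a seed. */
    static String normalizeClassPart(String seed) {
        String callNum = seed.trim().toUpperCase(Locale.ROOT);
        Matcher m = CLASS_PART.matcher(callNum);
        if (!m.find()) {
            // Same fallback as the PHP: return the upper-cased input.
            return callNum;
        }
        String letters = pad(m.group(1), 4, ' ', false);
        String whole = pad(m.group(2), 4, '0', true);
        String fraction = pad(m.group(3) == null ? "" : m.group(3), 6, '0', false);
        return letters + whole + "." + fraction;
    }

    /** Index of the first key that is >= the normalized seed. */
    static int browseStart(List<String> sortedKeys, String normalizedSeed) {
        int pos = Collections.binarySearch(sortedKeys, normalizedSeed);
        return pos >= 0 ? pos : -(pos + 1);
    }

    public static void main(String[] args) {
        System.out.println(normalizeClassPart("ps35"));     // PS  0035.000000
        System.out.println(normalizeClassPart("PS3537.5")); // PS  3537.500000

        List<String> keys = Arrays.asList(
                normalizeClassPart("PS35"),
                normalizeClassPart("PS258"),
                normalizeClassPart("PS3537.5"));
        // A seed of PS100 falls between PS35 and PS258.
        System.out.println(browseStart(keys, normalizeClassPart("ps100"))); // 1
    }
}
```

Because the seed and the index keys pass through the same padding, the lookup lands between PS35 and PS258 for a seed of PS100, which is the behavior the original question asks for.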
From: Demian K. <dem...@vi...> - 2013-10-30 13:41:12
Have you looked at https://vufind.org/jira/browse/VUFIND-598 and https://vufind.org/jira/browse/VUFIND-657 yet? I haven't had time to delve into these myself, but if you want to take a look and make recommendations, that would be a great help!

- Demian

> -----Original Message-----
> From: Tod Olson [mailto:to...@uc...]
> Sent: Tuesday, October 29, 2013 5:01 PM
> To: vuf...@li... Tech
> Subject: [VuFind-Tech] LC call number browse