From: Demian K. <dem...@vi...> - 2014-06-23 17:45:57
Glad to help!

If memory serves, we don't actually have a mechanism for sorting facets by value rather than by count. It gets discussed a lot, but the inherent problem is that facet lists have an arbitrary cut-off, so the user interface tends to look rather odd when you sort alphabetically: the idea of "top 30 results" makes sense, but the idea of "first 30 alphabetical results" just tends to look incomplete. If you really want to do this anyway, I can look into the easiest way to make it happen... but so far people have usually decided against it.

- Demian

From: Shepard, Thomas - 1150 - MITLL [mailto:tsh...@ll...]
Sent: Monday, June 23, 2014 1:43 PM
To: Demian Katz; Robert Haschart
Cc: sol...@go...; Tod Olson; vuf...@li...; vuf...@li...
Subject: RE: [solrmarc-tech] Re: [VuFind-General] vufind import options and mapping files

Yes, this second one worked! You've come through again for us, Demian! The first regex only returned the character string values, and I wanted numbers (83, 84, etc.). Anyway, it did exactly what I was hoping for, so thanks!

I will be looking into sorting this facet by values, not by counts, but I assume that's done in facets.ini, right?

Thom

From: Demian Katz [mailto:dem...@vi...]
Sent: Monday, June 23, 2014 1:04 PM
To: Shepard, Thomas - 1150 - MITLL; Robert Haschart
Cc: sol...@go...; Tod Olson; vuf...@li...; vuf...@li...
Subject: RE: [solrmarc-tech] Re: [VuFind-General] vufind import options and mapping files

I think you need to account for both the "with parentheses" and "without parentheses" cases. You may be able to do this by putting a ? after each escaped parenthesis in the regex to make them optional. However, there may be some other problems there... I think perhaps your \) should be a \\), and I'm not sure what the \\ u(.*) is about. Are you trying to capture the text that is inside the parentheses, or the text that comes later? Right now, it looks like $1 is going to be text that comes later.
Is it possible you actually want something more like this:

pattern_map.llgrp.pattern_0 = \\(?([^)]*)\\)?\s+.*=>$1

(Ohh, if only regular expressions were actually readable. So useful, but so hard to talk about!)

Also, if you're just looking to remove parentheses entirely, and assuming you'll never have multiple or out-of-order parentheses, this one might work:

pattern_map.llgrp.pattern_0 = ([^(]*)\\(?([^)]*)\\)?(.*)=>$1$2$3

- Demian

From: Shepard, Thomas - 1150 - MITLL [mailto:tsh...@ll...]
Sent: Monday, June 23, 2014 12:05 PM
To: Robert Haschart
Cc: sol...@go...; Tod Olson; Demian Katz; vuf...@li...; vuf...@li...
Subject: RE: [solrmarc-tech] Re: [VuFind-General] vufind import options and mapping files

The pattern_map solution for formats has worked out great! I am now hitting a wall with a similar but simpler problem.

Our documents collection stores an organizational or group id in the 100 u and 700 u fields. I created a field in schema.xml and a group facet, and through our marc_local.properties I am able to import those values. BUT some of these group id values are enclosed in parentheses. I've used regex in Perl, but I don't understand how vufind uses it in its import configuration. Here is my latest failed attempt:

llgrp = 100u:700u,(pattern_map.llgrp)
pattern_map.llgrp.pattern_0 = \\([^)]*\)\\ u(.*)=>$1

Can someone point me in the right direction? I simply want to remove the parentheses from both our 100u and 700u fields.

Thanks,
Thom Shepard

From: Robert Haschart [mailto:rh...@vi...]
Sent: Wednesday, June 18, 2014 2:14 PM
To: Shepard, Thomas - 1150 - MITLL
Cc: sol...@go...; Tod Olson; Demian Katz; vuf...@li...; vuf...@li...
Subject: Re: [solrmarc-tech] Re: [VuFind-General] vufind import options and mapping files

The index specification below, with the map name surrounded by parentheses, indicates that the map is to be found in the same file as the index specification line, marc_local.properties:

medium_display = 245h, (pattern_map.medium), first

It can also be written as:

medium_display = 245h, medium_map.properties, first

which indicates that the translation map can be found in the file named medium_map.properties.

A third form can be used:

medium_display = 245h, translation_maps.properties(pattern_map.medium), first

which indicates that the translation map can be found in the file named translation_maps.properties, but only the entries there that start with the prefix "pattern_map.medium" are part of the translation map.

-Bob

On 6/18/2014 1:37 PM, Shepard, Thomas - 1150 - MITLL wrote:

Wow, I didn't realize I could do that! This could be exactly what I need for this specific problem. Forgive this dumb question, but would this pattern_map also live in marc_local.properties, or would it be maintained separately?

I am in the middle of working on Tod's suggestion, but I am sure I'll use both suggestions for the various fields that I have to tinker with.

Thanks!
Thom

From: Robert Haschart [mailto:rh...@vi...]
Sent: Wednesday, June 18, 2014 1:26 PM
To: sol...@go...
Cc: Tod Olson; Shepard, Thomas - 1150 - MITLL; Demian Katz; vuf...@li...; vuf...@li...
Subject: Re: [solrmarc-tech] Re: [VuFind-General] vufind import options and mapping files

Thomas Shepard:

For question 1, you could use a pattern-based translation map, which allows regular-expression based pattern matching.
The following example is one that we use for mapping the contents of the 245h field to a more normalized form.

medium_display = 245h, (pattern_map.medium), first
pattern_map.medium.pattern_0 = [Ss]ound[ ]+recording=>sound recording
pattern_map.medium.pattern_1 = [Vv]ideo[-]?recording=>videorecording
pattern_map.medium.pattern_2 = [Ee]lectronic book=>electronic book
pattern_map.medium.pattern_3 = [Ee]lectronic [a-z]*=>electronic resource
pattern_map.medium.pattern_4 = [Mm]icro(form|film|fiche)=>microform
pattern_map.medium.pattern_5 = [Mm]icrofiche=>microform
pattern_map.medium.pattern_6 = [Ss]lide=>slide
pattern_map.medium.pattern_7 = CD=>sound recording
pattern_map.medium.pattern_8 = DVD=>videorecording
pattern_map.medium.pattern_9 = [Cc]omputer[ ]*file=>computer file
pattern_map.medium.pattern_10 = [Mm]anuscript=>manuscript
pattern_map.medium.pattern_11 = [Pp]icture=>picture
pattern_map.medium.pattern_12 = \b[Gg]raphic\b=>graphic
pattern_map.medium.pattern_13 = [Mm]ap=>cartographic material
pattern_map.medium.pattern_14 = [Cc]artographic material=>cartographic material
pattern_map.medium.pattern_15 = [Ss]eries record=>series record
pattern_map.medium.pattern_16 = [Mm]otion picture=>motion picture
pattern_map.medium.pattern_17 = [Aa]rt reproduction=>art reproduction
pattern_map.medium.pattern_18 = [Aa]rt original=>art original
pattern_map.medium.pattern_19 = [Mm]otion picture=>motion picture
pattern_map.medium.pattern_20 = ^([Cc]hart|[Kk]it|[Bb]raille|[Rr]ealia|[Gg]ame|[Ee]quipment|[Ff]ilmstrip|[Ww]ebsite|[Tt]ransparency|[Mm]odel)$=>$1

For question 2, if you change the specification to look like this, I think it will work:

format2 = 521a:300e, format2_map.properties, first

-Bob Haschart

On 6/18/2014 11:21 AM, Tod Olson wrote:

Yes, either focus on the BeanShell (.bsh) scripts for what you want to do, or if you're comfortable with Java you can do a mixin.
If you want to go the BeanShell (.bsh) script route, pick apart this line from import/marc_local.properties:

dewey-hundreds = script(dewey.bsh), getDeweyNumber(082a:083a, 100), ddc22_map.properties(hundreds)

That names the script file to load, the function in the script file to invoke, and the translation map. getDeweyNumber() should show you how to process the tag string to pick up the fields you want.

Best,

-Tod

On Jun 18, 2014, at 8:39 AM, Shepard, Thomas - 1150 - MITLL <tsh...@ll...> wrote:

I have a couple of follow-up questions to my March 2014 post (which, I should add, has been resolved by escaping those pesky periods).

1. I want to create a new translation_map, but it appears that the original values (left column) must be single strings/words without spaces. I would like vufind to be able to take a character string with spaces and punctuation and change it to a uniform lookup value. I suppose the real solution is to copy and edit one of those BeanShell (.bsh) scripts found in index_scripts, but I was hoping to take this problem down an easier road. I tried using quotes, but those didn't work.

2. In my marc_local.properties, I have attempted in various ways to set up two marc fields to share the same translation_map. To be specific, I want 521a and 300e to share format2_map.properties.

format2 = 521a, format2_map.properties, first

works just fine, but neither of the following variations does:

format2 = 521a, format2_map.properties, first:300e, format2_map.properties, first
format2 = 521a, format2_map.properties, first:300e, format2_map.properties

This also works:

format2 = 521a:300e

It gives me the raw values from those fields, but of course does nothing to normalize those values.

So... should I focus on editing one of those BeanShell (.bsh) scripts to get what I want?

Thanks,
Thom Shepard

From: Demian Katz [mailto:dem...@vi...]
Sent: Monday, March 24, 2014 2:48 PM
To: Shepard, Thomas - 1150 - MITLL; vuf...@li...; vuf...@li...
Cc: sol...@go...
Subject: RE: vufind import options and mapping files

I'm copying the solrmarc-tech list in case anyone there has more specific insights.

I'm guessing that the problem here is that periods have a special meaning in the property map, because you can use them to create named maps. You might want to try escaping them with backslashes to see if that helps.

If you remove all of the values from your map that contain dots, do you at least see the other values coming through? I would expect so - I see no other obvious problems with your map or properties configuration. If even a stripped-down map is failing, perhaps there is some other problematic detail that I am failing to spot right now.

- Demian

From: Shepard, Thomas - 1150 - MITLL [mailto:tsh...@ll...]
Sent: Monday, March 24, 2014 1:53 PM
To: vuf...@li...; vuf...@li...
Subject: [VuFind-Tech] vufind import options and mapping files

I have created a new facet in vufind that uses our marc 521 field. The facet successfully displayed all values, but that's now the problem. We see that some of the older marc records have inconsistent values. So I created a format2_map.properties file that contains the following:

eBook. = eBook
eBooks. = eBook
eBook = eBook
eBooks = eBook
eBooks = eBook
Book = Book
Audiovisual. = Audiovisual
Audiovisual = Audiovisual

My expectation was that the values on the left (with periods) would map to the values on the right. I also thought that the values not listed in this file would vanish from the facet. Instead my format2 facet disappeared entirely.
Here is my configuration for this field, with our mapping lookup, in our marc_local.properties:

format2 = 521a, format2_map.properties, first

(I tried other variations as well.)

Again, the above results in no format2 facet getting displayed. And, no surprise, the following results in ALL values in 521 being displayed:

format2 = 521a

So my question is: must I remove periods from my 521 field before I import my records, or is there a simple way to tell vufind to do so? I realize I could do this in a bsh script, but I am hoping I don't need to.

Thanks in advance,
Thom Shepard

Thom Shepard
Information Services Dept.
MIT Lincoln Lab, SM-730A
244 Wood St.
Lexington, 02420-9176
tsh...@ll...
781 981 0370
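
The parenthesis-stripping pattern that Thom reports working earlier in this thread can be tried outside the indexer. The following is a minimal Python sketch, not SolrMarc code. It assumes that the properties loader turns each \\( into the regex \(, that the pattern has to match the whole field value, and that Python's re module behaves like Java's regex engine for these simple constructs. The sample values and the REQUIRED variant are hypothetical, included only to show why the ? after each escaped parenthesis matters.

```python
import re

# Second pattern from the thread, written as the regex is ASSUMED to look
# once the properties file's doubled backslashes have been unescaped.
OPTIONAL = re.compile(r"([^(]*)\(?([^)]*)\)?(.*)")

# Hypothetical contrast (not from the thread): same pattern, but with
# mandatory parentheses, i.e. without the "?" after each escaped parenthesis.
REQUIRED = re.compile(r"([^(]*)\(([^)]*)\)(.*)")


def strip_parens(pattern, value):
    """Return groups 1-3 joined (the $1$2$3 replacement), or None if no match."""
    match = pattern.fullmatch(value)  # assumption: whole-value match
    if match is None:
        return None
    return "".join(match.groups())


if __name__ == "__main__":
    # Hypothetical group ids, with and without parentheses.
    for sample in ["(83)", "84", "Group (85)"]:
        print(
            repr(sample),
            "-> optional:", repr(strip_parens(OPTIONAL, sample)),
            "| required:", repr(strip_parens(REQUIRED, sample)),
        )
```

With optional parentheses, both "(83)" and "84" come through as bare numbers, while the mandatory-parentheses variant finds no match for "84". That is the "with parentheses" and "without parentheses" distinction Demian describes. How SolrMarc itself applies a pattern (whole-value match versus partial find) is not stated in the thread, so a real map should be checked against a test import.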