Glad to help!

 

If memory serves, we don’t actually have a mechanism for sorting facets by values rather than counts. It gets discussed a lot, but the inherent problem is that because facet lists have an arbitrary cut-off, the user interface tends to be rather odd when you sort alphabetically – the idea of “top 30 results” makes sense, but the idea of “first 30 alphabetical results” just tends to look incomplete. If you really want to do this anyway, I can look into the easiest way to make it happen… but so far people usually decide against it.

 

- Demian

 

From: Shepard, Thomas - 1150 - MITLL [mailto:tshepard@ll.mit.edu]
Sent: Monday, June 23, 2014 1:43 PM
To: Demian Katz; Robert Haschart
Cc: solrmarc-tech@googlegroups.com; Tod Olson; vufind-tech@lists.sourceforge.net; vufind-general@lists.sourceforge.net
Subject: RE: [solrmarc-tech] Re: [VuFind-General] vufind import options and mapping files

 

Yes, this second one worked! You’ve come through again for us, Demian!

The first regex only returned the character string values, and I wanted the numbers (83, 84, etc.).

Anyway, it did exactly as I was hoping, so thanks!

I will be looking into sorting this facet by values, not by counts, but I assume that’s done in facets.ini, right?

Thom

 

From: Demian Katz [mailto:demian.katz@villanova.edu]
Sent: Monday, June 23, 2014 1:04 PM
To: Shepard, Thomas - 1150 - MITLL; Robert Haschart
Cc: solrmarc-tech@googlegroups.com; Tod Olson; vufind-tech@lists.sourceforge.net; vufind-general@lists.sourceforge.net
Subject: RE: [solrmarc-tech] Re: [VuFind-General] vufind import options and mapping files

 

I think you need to account for both the “with parentheses” and “without parentheses” cases. You may be able to do this by putting a ? after each escaped parenthesis in the regex to make them optional. However, there may be some other problems there… I think perhaps your \) should be a \\), and I’m not sure what the \\ u(.*) is about. Are you trying to capture the text that is inside the parentheses, or the text that comes later? Right now, it looks like $1 is going to be text that comes later. Is it possible you actually want something more like this:

 

pattern_map.llgrp.pattern_0 = \\(?([^)]*)\\)?\s+.*=>$1

 

(Ohh, if only regular expressions were actually readable. So useful, but so hard to talk about!)

 

Also, if you’re just looking to remove parentheses entirely, and assuming you’ll never have multiple or out-of-order parentheses, this one might work:

 

pattern_map.llgrp.pattern_0 = ([^(]*)\\(?([^)]*)\\)?(.*)=>$1$2$3
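
A quick way to sanity-check this second pattern outside of an import run is to try it against a few sample values. The sketch below uses Python's re module as a stand-in for the Java regex engine that SolrMarc uses; for this particular pattern the two should behave the same. It assumes that the doubled backslashes in the properties file reach the regex engine as single backslashes, and the sample values and the strip_parens helper are made up for illustration.

```python
import re

# Assumption: "\\(?" in the properties file arrives at the regex engine as
# "\(?" (an optional literal opening parenthesis), and likewise for "\\)?".
PATTERN = r"([^(]*)\(?([^)]*)\)?(.*)"


def strip_parens(value):
    """Return value with a single pair of parentheses removed."""
    match = re.fullmatch(PATTERN, value)
    if match is None:
        return value
    # Python's \1\2\3 plays the role of $1$2$3 in the pattern map.
    return match.expand(r"\1\2\3")


# Hypothetical group id values, with and without parentheses.
for sample in ["(83)", "84", "Group (85) Annex"]:
    print(repr(sample), "->", repr(strip_parens(sample)))
```

As the original message notes, this only handles one pair of parentheses in the expected order; values with several pairs would need a different approach.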

 

- Demian

 

 

From: Shepard, Thomas - 1150 - MITLL [mailto:tshepard@ll.mit.edu]
Sent: Monday, June 23, 2014 12:05 PM
To: Robert Haschart
Cc: solrmarc-tech@googlegroups.com; Tod Olson; Demian Katz; vufind-tech@lists.sourceforge.net; vufind-general@lists.sourceforge.net
Subject: RE: [solrmarc-tech] Re: [VuFind-General] vufind import options and mapping files

 

The pattern_map solution for formats has worked out great!

 

I am now hitting a wall with a similar but simpler problem.

 

Our documents collection stores an organizational or group id in the 100 u and 700 u fields.

I created a field in schema.xml and a group facet and through our marc_local.properties, I am able to import those values.

 

BUT some of these group id values are enclosed in parentheses.

I’ve used regex in Perl but I don’t understand how vufind uses it in its import configuration.

 

Here is my latest failed attempt:

 

llgrp = 100u:700u,(pattern_map.llgrp)

pattern_map.llgrp.pattern_0 = \\([^)]*\)\\ u(.*)=>$1

 

Can someone point me in the right direction?

I simply want to remove the parentheses from both our 100u and 700u fields.

Thanks,

Thom Shepard

 

From: Robert Haschart [mailto:rh9ec@virginia.edu]
Sent: Wednesday, June 18, 2014 2:14 PM
To: Shepard, Thomas - 1150 - MITLL
Cc: solrmarc-tech@googlegroups.com; Tod Olson; Demian Katz; vufind-tech@lists.sourceforge.net; vufind-general@lists.sourceforge.net
Subject: Re: [solrmarc-tech] Re: [VuFind-General] vufind import options and mapping files

 

The index specification below, with the map name surrounded by parentheses, indicates that the map is to be found in the same file as the index specification line, marc_local.properties:

medium_display = 245h, (pattern_map.medium), first

It can also be written as:

medium_display = 245h, medium_map.properties, first

which indicates that the translation map can be found in the file named medium_map.properties

A third form can be used:

medium_display = 245h, translation_maps.properties(pattern_map.medium), first

which indicates that the translation map can be found in the file named  translation_maps.properties, but only the entries there that start with the prefix "pattern_map.medium" are a part of the translation map.
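
The three forms above can be summarized in a small sketch. This is not SolrMarc's own code, just an illustration in Python of the rules as described in this message; the function names parse_map_reference and entries_for_prefix are hypothetical, and the sample entries are invented.

```python
import re


def parse_map_reference(ref, spec_file="marc_local.properties"):
    """Split a map reference into (file name, key prefix or None)."""
    ref = ref.strip()
    match = re.fullmatch(r"([^()]*)\(([^()]+)\)", ref)
    if match is None:
        # Form 2: a bare file name, so the whole file is the map.
        return ref, None
    # Form 1 has nothing before the parentheses, so the map lives in the
    # same file as the index specification; form 3 names a separate file.
    file_name = match.group(1).strip() or spec_file
    return file_name, match.group(2).strip()


def entries_for_prefix(entries, prefix):
    """Keep only the entries whose key starts with the given prefix."""
    return {key: value for key, value in entries.items() if key.startswith(prefix)}


print(parse_map_reference("(pattern_map.medium)"))
print(parse_map_reference("medium_map.properties"))
print(parse_map_reference("translation_maps.properties(pattern_map.medium)"))

# Invented file contents, to show the prefix filter from the third form.
sample_entries = {
    "pattern_map.medium.pattern_0": "CD=>sound recording",
    "pattern_map.other.pattern_0": "x=>y",
}
print(entries_for_prefix(sample_entries, "pattern_map.medium"))
```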


-Bob


On 6/18/2014 1:37 PM, Shepard, Thomas - 1150 - MITLL wrote:

Wow, I didn’t realize I could do that!

This could be exactly what I need for this specific problem.

Forgive this dumb question, but would this pattern_map also live in marc_local.properties, or would it be maintained separately?

 

I am in the middle of working on Tod’s suggestion, but I am sure I’ll use both suggestions for the various fields that I have to tinker with.

 

Thanks!

Thom

 

From: Robert Haschart [mailto:rh9ec@virginia.edu]
Sent: Wednesday, June 18, 2014 1:26 PM
To: solrmarc-tech@googlegroups.com
Cc: Tod Olson; Shepard, Thomas - 1150 - MITLL; Demian Katz; vufind-tech@lists.sourceforge.net; vufind-general@lists.sourceforge.net
Subject: Re: [solrmarc-tech] Re: [VuFind-General] vufind import options and mapping files

 

Thomas Shepard:

 For question 1, you could use a pattern-based translation map, which allows regular-expression based pattern matching.  The following example is one that we use for mapping the contents of the 245h field to a more normalized form.

medium_display = 245h, (pattern_map.medium), first

pattern_map.medium.pattern_0 = [Ss]ound[ ]+recording=>sound recording
pattern_map.medium.pattern_1 = [Vv]ideo[-]?recording=>videorecording
pattern_map.medium.pattern_2 = [Ee]lectronic book=>electronic book
pattern_map.medium.pattern_3 = [Ee]lectronic [a-z]*=>electronic resource
pattern_map.medium.pattern_4 = [Mm]icro(form|film|fiche)=>microform
pattern_map.medium.pattern_5 = [Mm]icrofiche=>microform
pattern_map.medium.pattern_6 = [Ss]lide=>slide
pattern_map.medium.pattern_7 = CD=>sound recording
pattern_map.medium.pattern_8 = DVD=>videorecording
pattern_map.medium.pattern_9 = [Cc]omputer[ ]*file=>computer file
pattern_map.medium.pattern_10 = [Mm]anuscript=>manuscript
pattern_map.medium.pattern_11 = [Pp]icture=>picture
pattern_map.medium.pattern_12 = \b[Gg]raphic\b=>graphic
pattern_map.medium.pattern_13 = [Mm]ap=>cartographic material
pattern_map.medium.pattern_14 = [Cc]artographic material=>cartographic material
pattern_map.medium.pattern_15 = [Ss]eries record=>series record
pattern_map.medium.pattern_16 = [Mm]otion picture=>motion picture
pattern_map.medium.pattern_17 = [Aa]rt reproduction=>art reproduction
pattern_map.medium.pattern_18 = [Aa]rt original=>art original
pattern_map.medium.pattern_19 = ^([Cc]hart|[Kk]it|[Bb]raille|[Rr]ealia|[Gg]ame|[Ee]quipment|[Ff]ilmstrip|[Ww]ebsite|[Tt]ransparency|[Mm]odel)$=>$1
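
To see why the order of the numbered patterns matters, here is a rough sketch in Python. It assumes the patterns are tried in numeric order, that a pattern may match anywhere in the field value, and that the first hit wins (which is the effect the trailing "first" in the specification suggests); SolrMarc's actual matching rules may differ in detail. Python's re module stands in for Java regex, only a few entries from the map are copied, and the sample 245h values are invented.

```python
import re

# A few entries from the map above, kept in their numeric order.
PATTERNS = [
    (r"[Ss]ound[ ]+recording", "sound recording"),
    (r"[Vv]ideo[-]?recording", "videorecording"),
    (r"[Ee]lectronic book", "electronic book"),
    (r"[Ee]lectronic [a-z]*", "electronic resource"),
]


def normalize_medium(value):
    """Return the replacement for the first pattern found in value, else None."""
    for pattern, replacement in PATTERNS:
        if re.search(pattern, value):
            return replacement
    return None


# Hypothetical 245h values.
for sample in ["[Sound  recording]", "[electronic book]", "[Electronic journal]", "[text]"]:
    print(repr(sample), "->", normalize_medium(sample))
```

Under these assumptions, the more specific "electronic book" pattern has to come before the general "electronic [a-z]*" pattern, otherwise every e-book would be reported as an electronic resource.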



For question 2, if you change the specification to look like this, I think it will work:

format2 = 521a:300e, format2_map.properties, first


-Bob Haschart


On 6/18/2014 11:21 AM, Tod Olson wrote:

Yes, either focus on the BeanShell scripts for what you want to do, or if you’re comfortable with Java you can do a mixin.

 

If you want to go the BeanShell script route, pick apart this line from import/marc_local.properties:

 

dewey-hundreds = script(dewey.bsh), getDeweyNumber(082a:083a, 100), ddc22_map.properties(hundreds)

 

That names the script file to load, the function in the script file to invoke, and the translation map. getDeweyNumber() should show you how to process the tag string to pick up the fields you want.
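
As a rough illustration of those three parts, the sketch below splits the line on the commas that sit outside parentheses. It is written in Python purely for demonstration; split_spec is a hypothetical helper and not part of SolrMarc or VuFind.

```python
def split_spec(spec):
    """Split a specification on commas that are not inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in spec:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


line = ("dewey-hundreds = script(dewey.bsh), "
        "getDeweyNumber(082a:083a, 100), ddc22_map.properties(hundreds)")
field, _, spec = line.partition("=")
script, function_call, translation_map = split_spec(spec)

print("index field:     ", field.strip())
print("script file:     ", script)
print("function to call:", function_call)
print("translation map: ", translation_map)
```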

 

Best,

 

-Tod

 

On Jun 18, 2014, at 8:39 AM, Shepard, Thomas - 1150 - MITLL <tshepard@ll.mit.edu> wrote:

 

I have a couple of follow-up questions to my March 2014 post (which I should add has been resolved by escaping those pesky periods).

 

1.       I want to create a new translation_map, but it appears that the original values (left column) must be single strings/words without spaces. I would like vufind to be able to take a character string with spaces and punctuation and change it to a uniform lookup value. I suppose the real solution is to copy and edit one of those BeanShell scripts found in index_scripts, but I was hoping to take this problem down an easier road. I tried using quotes but those didn’t work.

2.       In my marc_local.properties, I have attempted in various ways to set up two marc fields to share the same translation_map.

To be specific, I want 521a and 300e to share format2_map.properties.

format2 = 521a, format2_map.properties, first works just fine but none of the following variations do:

format2 = 521a, format2_map.properties, first:300e, format2_map.properties, first

format2 = 521a, format2_map.properties, first:300e, format2_map.properties

 

This also works:

format2 = 521a:300e

to give me the raw values from those fields, but of course does nothing to normalize those values.

 

So… should I focus on editing one of those BeanShell scripts to get what I want?

 

Thanks,

Thom Shepard

 

 

 

 

From: Demian Katz [mailto:demian.katz@villanova.edu] 
Sent: Monday, March 24, 2014 2:48 PM
To: Shepard, Thomas - 1150 - MITLL; vufind-tech@lists.sourceforge.net; vufind-general@lists.sourceforge.net
Cc: solrmarc-tech@googlegroups.com
Subject: RE: vufind import options and mapping files

 

I’m copying the solrmarc-tech list in case anyone there has more specific insights.

 

I’m guessing that the problem here is that periods have a special meaning in the property map, because you can use them to create named maps. You might want to try escaping them with backslashes to see if that helps.

 

If you remove all of the values from your map that contain dots, do you at least see the other values coming through? I would expect so – I see no other obvious problems with your map or properties configuration. If even a stripped-down map is failing, perhaps there is some other problematic detail that I am failing to spot right now.

 

- Demian

 

From: Shepard, Thomas - 1150 - MITLL [mailto:tshepard@ll.mit.edu] 
Sent: Monday, March 24, 2014 1:53 PM
To: vufind-tech@lists.sourceforge.net; vufind-general@lists.sourceforge.net
Subject: [VuFind-Tech] vufind import options and mapping files

 

I have created a new facet in vufind that uses our marc  521 field.

The facet successfully displayed all values, but that’s now the problem.

We see that some of the older marc records have inconsistent values.

So I created a format2_map.properties file that contains the following:

 

eBook. = eBook

eBooks. = eBook

eBook = eBook

eBooks = eBook

eBooks = eBook

Book = Book

Audiovisual. = Audiovisual

Audiovisual = Audiovisual

 

My expectation was that the values on the left (with periods) would map to the values on the right.

I also thought that the values not listed in this file would vanish from the facet.

 

Instead my format2 facet disappeared entirely.

 

Here is my configuration for this field with our mapping lookup in our marc_local.properties  

format2 = 521a, format2_map.properties, first

(I tried other variations as well)

 

Again, the above results in no format2 facet getting displayed

 

And, no surprise, the following results in ALL values in 521 being displayed.

format2 = 521a

 

So my question is: must I remove the periods from my 521 field before I import my records, or is there a simple way to tell vufind to do so?

I realize I could do this in a bsh script but I am hoping I don’t need to.

 

Thanks in advance,

 

Thom Shepard

 

 

Thom Shepard

Information Services Dept.

MIT Lincoln Lab, SM-730A

244 Wood St. Lexington, 02420-9176

781 981 0370

 
