Could do either, or both. Internally they will be variables anyhow, and easy to take from the environment. But probably making switches available is more accessible for many people, even though getopt is annoying. May as well do both. Have a thought on whether the switch or the environment should take precedence?
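
For what it's worth, here is a rough sketch of how "switch beats environment" could look, not a finished patch. The option letters (-b, -l, -n), the NO_MOVE variable, the parse_settings function and the default path are all made up for illustration; only BASEPATH and LOG_FILE come from this thread.

```shell
#!/bin/sh
# Sketch only: -b, -l, -n, NO_MOVE and the default path are hypothetical.

parse_settings() {
  # 1. The environment (or a placeholder default) supplies the starting values.
  basepath="${BASEPATH:-/path/to/vufind/harvest}"
  log_file="${LOG_FILE:-}"
  no_move="${NO_MOVE:-0}"

  # 2. Switches are parsed afterwards, so they win over the environment.
  OPTIND=1
  while getopts "b:l:n" opt; do
    case "$opt" in
      b) basepath="$OPTARG" ;;
      l) log_file="$OPTARG" ;;
      n) no_move=1 ;;
      *) echo "usage: [-b basepath] [-l logfile] [-n] file..." >&2; return 1 ;;
    esac
  done
  shift $((OPTIND - 1))
  file_count=$#   # whatever remains is the list of input files
}

# Example: the environment says one thing, the switch says another.
BASEPATH=/env/harvest
parse_settings -b /switch/harvest -n a.mrc b.mrc
echo "basepath=$basepath no_move=$no_move files=$file_count"
```

With this ordering the -b value is used even though BASEPATH is set. Reversing the precedence would just mean applying the environment after the getopts loop instead of before it.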


On Dec 14, 2012, at 10:29 AM, Demian Katz wrote:

That makes sense to me.  Do you plan to add additional switches for these things or just handle them through environment variables?

- Demian

From: Tod Olson []
Sent: Friday, December 14, 2012 11:25 AM
To: Demian Katz
Cc: Tod Olson; Tech Mailinglist
Subject: Re: VF2.0 import scripts taking more that one file

Yes, I'm coming to the conclusion that the -p is a waste of time, but the logging, basepath, and no-move options are a good use of time.


On Dec 13, 2012, at 9:55 AM, Demian Katz <demian.katz@VILLANOVA.EDU> wrote:

Even if you simply revert your changes, the scripts are not really inconsistent in the sense that they do not use the same syntax to do different things. supports a switch that does not, and supports a second parameter that does not.
If you want consistent interfaces between and, then the thing to do would be:
1.)    Add -p support to so that users can override the default PROPERTIES_FILE value
2.)    Add a second parameter to so that users can provide additional mappings to be appended onto the MAPPINGS_FILE list
It’s really a question of whether this offers any value for anyone.  It certainly wouldn’t hurt to add these things, but if nobody uses them, it’s a waste of your time.
Does that make sense?
- Demian
From: Tod Olson [] 
Sent: Thursday, December 13, 2012 10:48 AM
To: Demian Katz
Cc: Tod Olson; Tech Mailinglist
Subject: Re: VF2.0 import scripts taking more that one file
Re. JIRA: good, will do.
Re. parameter handling: aha! Thanks for seeing that. In this area, I can easily change the -p to be correct, or I could just revert. Do you have a sense of which makes more sense?
For me, while I like consistency in the command-line scripts, the worst outcome would be altering the scripts in a way that would not be rolled back into the trunk.
On Dec 13, 2012, at 8:01 AM, Demian Katz wrote:

Thanks for sharing this.  A couple of comments:
- It probably wouldn’t hurt to open a JIRA ticket for this; I don’t want to commit anything until I have time to port changes to the Windows batch versions for consistency, and it may be a while before I have time for that… so having a ticket will prevent it from getting lost and forgotten.
- The parameter handling in should probably be reverted or changed.  There are two different properties files used by SolrMarc: the “import properties” (which is general settings for the application) and the “marc properties” (which is the mappings for importing).  The -p parameter to is used to set “import properties,” but you have changed so that it instead sets “marc properties.”  If you want to implement -p in, it should actually affect the PROPERTIES_FILE variable, not the MAPPINGS_FILE variable.  The optional mapping overrides should probably remain an optional second parameter for backward compatibility.
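
To make the distinction concrete, here is a minimal sketch of the parameter handling described above, under some loud assumptions: the default file names are placeholders, the comma used to append extra mappings is a guess at the list format, and configure_import is a hypothetical helper rather than anything in the existing scripts.

```shell
#!/bin/sh
# Sketch only: default file names and the comma separator are assumptions.

configure_import() {
  PROPERTIES_FILE="import.properties"   # "import properties": application settings
  MAPPINGS_FILE="marc.properties"       # "marc properties": import mappings

  OPTIND=1
  while getopts "p:" opt; do
    case "$opt" in
      p) PROPERTIES_FILE="$OPTARG" ;;   # -p changes PROPERTIES_FILE only
      *) return 1 ;;
    esac
  done
  shift $((OPTIND - 1))

  MARC_FILE="${1:-}"
  # Optional second parameter: extra mappings appended to the list,
  # which keeps the older calling convention working.
  if [ -n "${2:-}" ]; then
    MAPPINGS_FILE="$MAPPINGS_FILE,$2"
  fi
}

configure_import -p custom_import.properties records.mrc local_overrides.properties
echo "properties=$PROPERTIES_FILE mappings=$MAPPINGS_FILE input=$MARC_FILE"
```

The point of the sketch is only that -p never touches MAPPINGS_FILE, and that the mapping overrides stay positional.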
From: Tod Olson [] 
Sent: Wednesday, December 12, 2012 3:18 PM
To: Demian Katz
Cc: Tod Olson; Tech Mailinglist
Subject: Re: VF2.0 import scripts taking more that one file

Here's a patch. Let me know if you'd prefer this as a JIRA ticket.

The patch does the following:

- now takes a -p option to specify the properties file, matching
- batch-import-marc*.sh captures stderr to log file, stdout is not captured
- echoes the command to stderr, so it gets logged with the solrmarc messages
- per-input-file logs by default, setting LOG_FILE sends entire run to one log file
- output to log files now appends, so above is possible

The handling of LOG_FILE is a bit inconsistent, with per-file logs vs. one big log, but it allows LOG_FILE=/dev/null to send everything to the bit bucket. The switch to appending may, however, create log maintenance issues for some sites. I'm quite open to revising this, and I could also create a command-line switch for it.
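
In case it helps to see the logging behavior in one place, here is a stripped-down sketch of the idea, not the patch itself. run_import is a stand-in for the real SolrMarc call, and LOG_DIR and import_all are names invented for the example; LOG_FILE behaves as described above.

```shell
#!/bin/sh
# Sketch only: run_import, import_all and LOG_DIR are hypothetical names.

run_import() {
  echo "Now Importing $1 ..."            # informative message stays on stdout
  echo "command: run_import $1" >&2      # echo the command so it lands in the log
  echo "record 1 indexed from $1" >&2    # stand-in for SolrMarc's stderr output
}

import_all() {
  for file in "$@"; do
    name=$(basename "$file")
    # One shared log if LOG_FILE is set, otherwise one log per input file.
    log="${LOG_FILE:-$LOG_DIR/$name.log}"
    # >> appends, so several input files can share one log; stdout is not captured.
    run_import "$file" 2>>"$log"
  done
}

LOG_DIR=$(mktemp -d)

unset LOG_FILE
import_all a.mrc b.mrc          # per-file logs: a.mrc.log and b.mrc.log

LOG_FILE="$LOG_DIR/all.log"
import_all a.mrc b.mrc          # entire run appended to one log

LOG_FILE=/dev/null
import_all a.mrc                # log output discarded
unset LOG_FILE

ls "$LOG_DIR"
```

Because every write appends, rerunning the import keeps growing the same files, which is the log maintenance concern mentioned above.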

You mentioned that the harvest directory is hard-coded, and allowing an override would be nice. The obvious way to do that would be to allow BASEPATH in the environment to take precedence.


On Dec 12, 2012, at 8:58 AM, Demian Katz <demian.katz@VILLANOVA.EDU> wrote:

> I don't have a strong preference, except that I think it would be wise to avoid merging the stdout/stderr streams when generating logs -- it's probably useful to keep that granularity in the form of multiple logs if nothing else.  I do think you're right that there may be some value in leaving the "Now importing" stuff as the stdout stream and capturing the rest to logs...
> - Demian
>> -----Original Message-----
>> From: Tod Olson []
>> Sent: Tuesday, December 11, 2012 5:08 PM
>> To: Demian Katz
>> Cc: Tod Olson; Tech Mailinglist
>> Subject: Re: VF2.0 import scripts taking more that one file
>> Returning to capturing output from the harvest scripts, I'd like some input on
>> a minor point.
>> Currently stdout gets informative messages like so:
>> Now Importing /data/magma/vufind2/local/harvest/auth/auth_full_marc_utf-8_00_121206230000.mrc ...
>> /usr/local/bin/java -Xms512m -Xmx512m -
>> /magma/vufind2/import/ -jar
>> /data/magma/vufind2/import/SolrMarc.jar
>> /data/magma/vufind2/local/import/
>> /data/magma/vufind2/local/harvest/auth/auth_full_marc_utf-8_00_121206230000.mrc
>> and all of the solrmarc messages (record number, stack traces on failure,
>> etc.) go to stderr.
>> I kind of think that the options are:
>> (a) everything goes to a log file,
>> (b) stdout can go to the terminal and stderr should go to the log file, or
>> (c) maybe that chatty "Now importing..." goes to stdout/terminal and all else
>> goes to stderr/the log.
>> Are there any strong feelings about which is the right way? Personally, I'm
>> kind of inclined towards (c), but maybe sites who are in production have a
>> different view.
>> -Tod
>> On Oct 31, 2012, at 12:32 PM, Demian Katz <demian.katz@VILLANOVA.EDU> wrote:
>>> I don't have a problem with changing the batch MARC import scripts to capture stderr; I believe that when they were originally written, all SolrMarc output was written to stdout -- it began using stderr more appropriately in relatively recent updates.
>>> The only other refactoring you might need to do is to allow a way of specifying a full directory path -- right now, the scripts assume that all files live under VuFind's harvest directory, but in a situation not linked to the OAI harvester, the files might be somewhere else.  You might also want to add a switch to disable the "move to processed directory" functionality and/or a switch to control logging (i.e. optionally disable by sending to null).
>>> - Demian
>>>> -----Original Message-----
>>>> From: Tod Olson []
>>>> Sent: Wednesday, October 31, 2012 1:14 PM
>>>> To: Demian Katz
>>>> Cc: Tod Olson; Tech Mailinglist
>>>> Subject: Re: VF2.0 import scripts taking more that one file
>>>> Yes, looking at the harvest/ scripts for MARC records, I see that stdout is directed to an output file, but stderr is not written to disk. Since stderr has all of the error info, I'm inclined to capture it. I can also see where people would not want the error logs taking up disk space, since there's a message for every record. But sending all output to a file is a little more cron-friendly.
>>>> I may be willing to refactor a couple of those batch scripts (no commitment
>>>> yet), but I'd like a little input on what sort of requirements other sites
>>>> would have.
>>>> -Tod
>>>> On Oct 31, 2012, at 11:22 AM, Tod Olson wrote:
>>>>> Aha, I'd dismissed harvest as exclusively the province of OAI. Thanks for correcting that.
>>>>> I'll pop a patch into JIRA when I can.
>>>>> -Tod
>>>>> On Oct 30, 2012, at 10:52 PM, Demian Katz wrote:
>>>>>> There are batch import scripts in the harvest directory -- you might be able to use those.  If not, perhaps some refactoring can make all the existing tools more flexible.  Also, if you add -p support to the auth script, please submit a patch and I'll be happy to merge that into master.
>>>>>> thanks,
>>>>>> Demian
>>>>>> ________________________________________
>>>>>> From: Tod Olson []
>>>>>> Sent: Tuesday, October 30, 2012 8:06 PM
>>>>>> To: Tech Mailinglist
>>>>>> Subject: [VuFind-Tech] VF2.0 import scripts taking more that one file
>>>>>> I find that it would be useful for my site if the and
>>>>>> I could easily hack those two shell scripts to take some arbitrary number of files as arguments and loop over them, and submit a patch. Would that be of use to other sites?
>>>>>> Otherwise, I'll just write wrappers around them for local use.
>>>>>> The one interface change that I'd want to implement: it would be easier if I changed to take a profile file with a -p argument like
>>>>>> -Tod
>>>>>> Tod Olson <>
>>>>>> Systems Librarian
>>>>>> University of Chicago Library
>>>>>> _______________________________________________
>>>>>> Vufind-tech mailing list