That makes sense to me.  Do you plan to add additional switches for these things or just handle them through environment variables?

- Demian

From: Tod Olson [tod@uchicago.edu]
Sent: Friday, December 14, 2012 11:25 AM
To: Demian Katz
Cc: Tod Olson; vufind-tech@lists.sourceforge.net Tech Mailinglist
Subject: Re: VF2.0 import scripts taking more that one file

Yes, I'm coming to the conclusion that the -p is a waste of time, but the logging, basepath, and no-move options are a good use of time.

-Tod

On Dec 13, 2012, at 9:55 AM, Demian Katz <demian.katz@VILLANOVA.EDU> wrote:

Even if you simply revert your changes, the scripts are not really inconsistent in the sense that they do not use the same syntax to do different things.  import-marc.sh supports a switch that import-marc-auth.sh does not, and import-marc-auth.sh supports a second parameter that import-marc.sh does not.
 
If you want consistent interfaces between import-marc.sh and import-marc-auth.sh, then the thing to do would be:
 
1.)    Add -p support to import-marc-auth.sh so that users can override the default PROPERTIES_FILE value
2.)    Add a second parameter to import-marc.sh so that users can provide additional mappings to be appended onto the MAPPINGS_FILE list
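A minimal sketch of what the two changes above could look like in either script, assuming POSIX `getopts`. The function name `parse_import_args`, the `MARC_FILE` variable, and the relative default paths are hypothetical and used only for illustration; `PROPERTIES_FILE` and `MAPPINGS_FILE` are the variables discussed in this thread.

```shell
#!/bin/sh
# Hypothetical defaults for illustration; the real scripts build these
# paths from the VuFind installation directory.
DEFAULT_PROPERTIES_FILE="import/import_auth.properties"
DEFAULT_MAPPINGS_FILE="import/marc_auth.properties"

# parse_import_args [-p properties_file] marc_file [extra_mappings]
# Sets PROPERTIES_FILE, MAPPINGS_FILE and MARC_FILE for the caller.
parse_import_args() {
  PROPERTIES_FILE="$DEFAULT_PROPERTIES_FILE"
  MAPPINGS_FILE="$DEFAULT_MAPPINGS_FILE"
  OPTIND=1   # reset so the function can be called more than once
  while getopts "p:" opt; do
    case "$opt" in
      p) PROPERTIES_FILE="$OPTARG" ;;   # -p overrides the import properties
      *) echo "Usage: [-p properties_file] marc_file [extra_mappings]" >&2
         return 1 ;;
    esac
  done
  shift $((OPTIND - 1))

  if [ -z "${1:-}" ]; then
    echo "No MARC file specified." >&2
    return 1
  fi
  MARC_FILE="$1"

  # The optional second parameter is appended to the mappings list,
  # comma-separated, rather than replacing the default.
  if [ -n "${2:-}" ]; then
    MAPPINGS_FILE="$MAPPINGS_FILE,$2"
  fi
}
```

Called as `parse_import_args -p custom.properties records.mrc extra.properties`, this leaves the default mappings in place and appends `extra.properties` after a comma, so existing two-argument invocations would keep working.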
 
It’s really a question of whether this offers any value for anyone.  It certainly wouldn’t hurt to add these things, but if nobody uses them, it’s a waste of your time.
 
Does that make sense?
 
- Demian
 
From: Tod Olson [mailto:tod@uchicago.edu] 
Sent: Thursday, December 13, 2012 10:48 AM
To: Demian Katz
Cc: Tod Olson; vufind-tech@lists.sourceforge.net Tech Mailinglist
Subject: Re: VF2.0 import scripts taking more that one file
 
Re. JIRA: good, will do.
 
Re. parameter handling: aha! Thanks for seeing that. In this area, I can easily change the -p to be correct, or I could just revert. Do you have a sense of which makes more sense? 
 
For me, while I like consistency in the command-line scripts, the worst outcome would be altering the scripts in a way that would not be rolled back into the trunk.
 
-Tod
 
On Dec 13, 2012, at 8:01 AM, Demian Katz <demian.katz@villanova.edu>
 wrote:


Thanks for sharing this.  A couple of comments:
 
- It probably wouldn’t hurt to open a JIRA ticket for this; I don’t want to commit anything until I have time to port changes to the Windows batch versions for consistency, and it may be a while before I have time for that… so having a ticket will prevent it from getting lost and forgotten.
 
- The parameter handling in import-marc-auth.sh should probably be reverted or changed.  There are two different properties files used by SolrMarc: the “import properties” (which is general settings for the application) and the “marc properties” (which is the mappings for importing).  The -p parameter to import-marc.sh is used to set “import properties,” but you have changed import-marc-auth.sh so that it instead sets “marc properties.”  If you want to implement -p in import-marc-auth.sh, it should actually affect the PROPERTIES_FILE variable, not the MAPPINGS_FILE variable.  The optional mapping overrides should probably remain an optional second parameter for backward compatibility.
 
thanks,
Demian
 
 
From: Tod Olson [mailto:tod@uchicago.edu] 
Sent: Wednesday, December 12, 2012 3:18 PM
To: Demian Katz
Cc: Tod Olson; vufind-tech@lists.sourceforge.net Tech Mailinglist
Subject: Re: VF2.0 import scripts taking more that one file
 

Here's a patch. Let me know if you'd prefer this as a JIRA ticket.

The patch does the following:

- import-marc-auth.sh now takes a -p option to specify the properties file, matching import-marc.sh
- batch-import-marc*.sh captures stderr to log file, stdout is not captured
- import-marc.sh echoes the command to stderr, so it gets logged with the solrmarc messages
- per-input-file logs by default, setting LOG_FILE sends entire run to one log file
- output to log files now appends, so above is possible

The handling of LOG_FILE is a bit schizophrenic, with per-file logs vs. one big log, but it allows LOG_FILE=/dev/null to send all to the bit bucket. But the switch to appending may create log maintenance issues for some sites. I'm quite open to revising this. I could also create a command-line switch for this.
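The logging behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: the function names `log_target` and `run_logged` and the `LOG_DIR` variable are hypothetical; only `LOG_FILE` comes from this message.

```shell
#!/bin/sh
# log_target input_file
# Prints the log path for one input file: LOG_FILE if the caller set it,
# otherwise a per-file log under LOG_DIR (hypothetical, defaults to ./log).
log_target() {
  if [ -n "${LOG_FILE:-}" ]; then
    printf '%s\n' "$LOG_FILE"
  else
    printf '%s/%s.log\n' "${LOG_DIR:-log}" "$(basename "$1")"
  fi
}

# run_logged input_file command [args...]
# Announces the file on stdout, echoes the command line to stderr, and
# appends stderr to the chosen log. The command's stdout is not captured.
run_logged() {
  input_file="$1"
  shift
  # Only the per-file case needs a log directory to be created.
  [ -n "${LOG_FILE:-}" ] || mkdir -p "${LOG_DIR:-log}"
  log="$(log_target "$input_file")"
  echo "Now Importing $input_file ..."
  {
    echo "$*" >&2   # the command itself lands in the log
    "$@"
  } 2>>"$log"       # ">>" appends, so one LOG_FILE can collect a whole run
}
```

With `LOG_FILE` unset, each input file gets its own log; with `LOG_FILE=/dev/null` in the environment, all stderr output is discarded.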

You mentioned that the harvest directory is hard-coded, and allowing an override would be nice. The obvious way to do that would be to allow BASEPATH in the environment to take precedence.
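One way the BASEPATH override might look, as a sketch only. The function name `resolve_basepath` is hypothetical, and the fallback location (a `harvest` directory under `VUFIND_HOME`, itself defaulting to `/usr/local/vufind`) is an assumption rather than what the scripts necessarily use.

```shell
#!/bin/sh
# resolve_basepath
# Prints BASEPATH from the environment when it is set and non-empty;
# otherwise falls back to an assumed default harvest directory.
resolve_basepath() {
  # Assumed default location, for illustration only.
  default_harvest_dir="${VUFIND_HOME:-/usr/local/vufind}/harvest"
  printf '%s\n' "${BASEPATH:-$default_harvest_dir}"
}
```

A site with files outside the harvest tree could then run the batch script with something like `BASEPATH=/srv/marc` in the environment, while everyone else would see no change in behavior.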

-Tod


On Dec 12, 2012, at 8:58 AM, Demian Katz <demian.katz@VILLANOVA.EDU> wrote:

> I don't have a strong preference, except that I think it would be wise to avoid merging the stdout/stderr streams when generating logs -- it's probably useful to keep that granularity in the form of multiple logs if nothing else.  I do think you're right that there may be some value in leaving the "Now importing" stuff as the stdout stream and capturing the rest to logs...
> 
> - Demian
> 
>> -----Original Message-----
>> From: Tod Olson [mailto:tod@uchicago.edu]
>> Sent: Tuesday, December 11, 2012 5:08 PM
>> To: Demian Katz
>> Cc: Tod Olson; vufind-tech@lists.sourceforge.net Tech Mailinglist
>> Subject: Re: VF2.0 import scripts taking more that one file
>> 
>> Returning to capturing output from the harvest scripts, I'd like some input on
>> a minor point.
>> 
>> Currently stdout gets informative messages like so:
>> 
>> Now Importing /data/magma/vufind2/local/harvest/auth/auth_full_marc_utf-8_00_121206230000.mrc ...
>> /usr/local/bin/java -Xms512m -Xmx512m -Dsolr.core.name=authority -Dsolr.indexer.properties=/data/magma/vufind2/import/marc_auth.properties,/data/magma/vufind2/import/marc_auth.properties -jar /data/magma/vufind2/import/SolrMarc.jar /data/magma/vufind2/local/import/import_auth.properties /data/magma/vufind2/local/harvest/auth/auth_full_marc_utf-8_00_121206230000.mrc
>> 
>> and all of the solrmarc messages (record number, stack traces on failure,
>> etc.) go to stderr.
>> 
>> I kind of think that the options are:
>> (a) everything goes to a log file,
>> (b) stdout can go to the terminal and stderr should go to the log file, or
>> (c) maybe that chatty "Now importing..." goes to stdout/terminal and all else
>> goes to stderr/the log.
>> 
>> Are there any strong feelings about which is the right way? Personally, I'm
>> kind of inclined towards (c), but maybe sites who are in production have a
>> different view.
>> 
>> -Tod
>> 
>> On Oct 31, 2012, at 12:32 PM, Demian Katz <demian.katz@VILLANOVA.EDU> wrote:
>> 
>>> I don't have a problem with changing the batch MARC import scripts to
>>> capture stderr; I believe that when they were originally written, all
>>> SolrMarc output was written to stdout -- it began using stderr more
>>> appropriately in relatively recent updates.
>>> 
>>> The only other refactoring you might need to do is to allow a way of
>>> specifying a full directory path -- right now, the scripts assume that all
>>> files live under VuFind's harvest directory, but in a situation not linked
>>> to the OAI harvester, the files might be somewhere else.  You might also
>>> want to add a switch to disable the "move to processed directory"
>>> functionality and/or a switch to control logging (i.e. optionally disable
>>> by sending to null).
>>> 
>>> - Demian
>>> 
>>>> -----Original Message-----
>>>> From: Tod Olson [mailto:tod@uchicago.edu]
>>>> Sent: Wednesday, October 31, 2012 1:14 PM
>>>> To: Demian Katz
>>>> Cc: Tod Olson; vufind-tech@lists.sourceforge.net Tech Mailinglist
>>>> Subject: Re: VF2.0 import scripts taking more that one file
>>>> 
>>>> Yes, looking at the harvest/ scripts for marc records, I see that stdout
>>>> is directed to an output file, but stderr is not written to disk. Since
>>>> stderr has all of the error info, I'm inclined to capture it. I can also
>>>> see where people would not want the error logs taking up disc space,
>>>> since there's a message for every record. But sending all output to a
>>>> file is a little more cron-friendly.
>>>> 
>>>> I may be willing to refactor a couple of those batch scripts (no commitment
>>>> yet), but I'd like a little input on what sort of requirements other sites
>>>> would have.
>>>> 
>>>> -Tod
>>>> 
>>>> On Oct 31, 2012, at 11:22 AM, Tod Olson <tod@uchicago.edu>
>>>> wrote:
>>>> 
>>>>> Aha, I'd dismissed harvest as exclusively the province of OAI. Thanks
>>>>> for correcting that.
>>>>> 
>>>>> I'll pop a patch into JIRA when I can.
>>>>> 
>>>>> -Tod
>>>>> 
>>>>> On Oct 30, 2012, at 10:52 PM, Demian Katz <demian.katz@villanova.edu>
>>>>> wrote:
>>>>> 
>>>>>> There are batch import scripts in the harvest directory -- you might be
>>>>>> able to use those.  If not, perhaps some refactoring can make all the
>>>>>> existing tools more flexible.  Also, if you add -p support to the auth
>>>>>> script, please submit a patch and I'll be happy to merge that into
>>>>>> master.
>>>>>> 
>>>>>> thanks,
>>>>>> Demian
>>>>>> ________________________________________
>>>>>> From: Tod Olson [tod@uchicago.edu]
>>>>>> Sent: Tuesday, October 30, 2012 8:06 PM
>>>>>> To: vufind-tech@lists.sourceforge.net Tech Mailinglist
>>>>>> Subject: [VuFind-Tech] VF2.0 import scripts taking more that one file
>>>>>> 
>>>>>> I find that it would be useful for my site if import-marc.sh and
>>>>>> import-marc-auth.sh could take more than one file. I could easily hack
>>>>>> those two shell scripts to take some arbitrary number of files as
>>>>>> arguments and loop over them, and submit a patch. Would that be of use
>>>>>> to other sites?
>>>>>> 
>>>>>> Otherwise, I'll just write wrappers around them for local use.
>>>>>> 
>>>>>> The one interface change that I'd want to implement: it would be easier
>>>>>> if I changed import-marc-auth.sh to take a properties file with a -p
>>>>>> argument like import-marc.sh.
>>>>>> 
>>>>>> -Tod
>>>>>> 
>>>>>> Tod Olson <tod@uchicago.edu>
>>>>>> Systems Librarian
>>>>>> University of Chicago Library
>>>>>> 
>>>>> 
>>> 
>