Hi Matt,
Thanks for this (apologies for the delay in replying!). I'm still not
having any joy, but am finding that by putting in the print_r statement
you suggest, I'm now getting:

DOMNodeList Object ( )

when I run the batch import (of an eprints export_xml file).
It seems that the import isn't picking up any nodes, as you suggested.
The eprints data we have is in the same format as yours
(eprintsdata/record).
I've tried both ways of configuring the URL for the eprints instance, in
the config file (as suggested in the wiki).
Is there anything else I can look at, to see if I can get them to pick
up?
Thanks
Caroline
----------------
Caroline Ayers
RUBRIC Technical Officer
ayersc@...
----------------
-----Original Message-----
From: Matthew Smith [mailto:m.smith@...
Sent: Thursday, 25 May 2006 2:06 PM
To: Caroline Ayers
Subject: Re: [Fwd: FW: [Fez-users] Error with Fez 1.2 Batch Import and
Export]
Hi Caroline,
Christiaan is on holidays / conferences in the US but forwarded this to
me as I am looking after Fez (in a limited capacity) while he's away.
The file in question is class.batchimport.php, see line 90, function
handleEntireEprintsImport($pid, $collection_pid, $xmlObj)
You can piece together what it's looking for by looking at the xpath
queries, e.g.
$recordNodes = $xpath->query('//eprintsdata/record');
means it is getting all the XML record elements inside the eprintsdata
element:
<eprintsdata>
<record>
...
</record>
<record>
...
</record>
etc..
Then the next queries obtain various elements from each record. An
important value is the type field:
<field name="type">confpaper</field>
If the value here is not one we know about, then the record will be
skipped (see the switch statement starting on line 195). The default
is to print 'Unrecognised record type'.
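To see which type values an export actually contains, something like the
following can be run outside Fez. It is only a sketch: listRecordTypes is
a made-up name, the sample XML is made up too, and the list of types Fez
recognises lives in the switch statement mentioned above, not here.

```php
<?php
// Hypothetical helper, not part of Fez: tally the "type" field across all records.
function listRecordTypes($xml)
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return array(); // the XML could not be parsed at all
    }
    $xpath = new DOMXPath($doc);
    $counts = array();
    // Same eprintsdata/record layout as above, narrowed to the type field.
    foreach ($xpath->query('//eprintsdata/record/field[@name="type"]') as $field) {
        $type = trim($field->nodeValue);
        $counts[$type] = isset($counts[$type]) ? $counts[$type] + 1 : 1;
    }
    return $counts;
}

// Made-up three-record sample for illustration.
$sample = '<eprintsdata>'
        . '<record><field name="type">confpaper</field></record>'
        . '<record><field name="type">journale</field></record>'
        . '<record><field name="type">confpaper</field></record>'
        . '</eprintsdata>';

print_r(listRecordTypes($sample));
```

For a real export, pass file_get_contents() of the export file instead of
$sample and compare the keys against the cases in the switch statement.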
In the list of background processes on the My Fez page, there is a black
monitor symbol on the right; if you click it, it should show any error
messages seen during the background process.
I'm guessing that the initial xpath query is not obtaining any eprints
records. Try doing a print_r($recordNodes) on line 113.
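One caveat with print_r: as far as I know, older PHP 5 versions print a
DOMNodeList as an empty object whether or not it holds any nodes, so the
length property is a more reliable check. Below is a minimal standalone
sketch, outside Fez. The helper name countRecords and both samples are
made up, and the second query rests on an assumption: that an export might
declare a default namespace on its root element, in which case the plain
path would match nothing.

```php
<?php
// Hypothetical helper, not part of Fez: how many nodes does an XPath query match?
function countRecords($xml, $query)
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return -1; // the XML could not be parsed at all
    }
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($query);
    return ($nodes === false) ? -1 : $nodes->length;
}

// Made-up samples. The namespace URI is a placeholder, not a real EPrints value.
$plain = '<eprintsdata><record/><record/></eprintsdata>';
$namespaced = '<eprintsdata xmlns="http://example.org/ns"><record/><record/></eprintsdata>';

$fezQuery = '//eprintsdata/record'; // the query quoted above
$anyNsQuery = "//*[local-name()='eprintsdata']/*[local-name()='record']";

foreach (array('plain' => $plain, 'namespaced' => $namespaced) as $label => $xml) {
    echo $label, ': ', countRecords($xml, $fezQuery), ' with the Fez query, ',
         countRecords($xml, $anyNsQuery), " ignoring namespaces\n";
}
```

To try it on a real export, pass file_get_contents() of the export file
in place of the samples. If the first count is 0 and the second is not,
a namespace on the root element is the likely culprit.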
Here's an example of one record from our repository:
<eprintsdata>
<record >
<field name="eprintid">2</field>
<field name="userid">2</field>
<field name="dir">disk0/00/00/00/02</field>
<field name="datestamp">2004-02-03</field>
<field name="type">confpaper</field>
<field name="succeeds" />
<field name="commentary" />
<field name="replacedby" />
<field name="abstract">The paper explains how librarians can adapt
existing library skills to find good information online. To do so,
librarians should view the Internet as a vast repository of sources,
complementary to those that exist in print. Using the organising
structures provided by Internet domains, librarians can identify the
main search tools in each domain, thus simplifying the search for
information. </field>
<field name="altloc">http://www.alia.org.au/conferences/alia2000/proceedings/belinda.weaver.html</field>
<field name="authors" id="Weaver, Belinda"><part name="given">Belinda</part><part name="family">Weaver</part></field>
<field name="chapter" />
<field name="commref" />
<field name="confdates">24-26 October, 2000</field>
<field name="conference">ALIA 2000 Capitalising On Knowledge - The
Information Profession In The 21st Century</field>
<field name="confloc">Canberra, ACT, Australia</field>
<field name="department" />
<field name="institution" />
<field name="ispublished">pub</field>
<field name="keywords">internet searching; information strategy;
reference librarianship; electronic information sources; information
literacy</field>
<field name="month" />
<field name="note" />
<field name="number" />
<field name="pages" />
<field name="pubdom">TRUE</field>
<field name="publication" />
<field name="publisher">Australian Library and Information Association (ALIA)</field>
<field name="refereed">FALSE</field>
<field name="referencetext" />
<field name="reportno" />
<field name="series" />
<field name="subjects">400200</field>
<field name="thesistype" />
<field name="title">Making The Most Of The Web</field>
<field name="volume" />
<field name="year">2000</field>
<field name="suggestions" />
</record>
<record >
<field name="eprintid">6</field>
<field name="userid">3</field>
<field name="dir">disk0/00/00/00/06</field>
<field name="datestamp">2004-02-16</field>
<field name="type">journale</field>
<field name="succeeds" />
<field name="commentary" />
<field name="replacedby" />
<field name="abstract">The line between comment and opinion is
increasingly becoming blurred in newspapers. Some critics are concerned
that the public cannot distinguish between comment and reporting.
Comment may even be pushing out space traditionally reserved for news,
for example, in reporting on Parliament. The correction of errors of
fact in opinion pieces needs addressing. Measures are needed to improve
the standard of comment and to disentangle it from factual reporting.
More diversity of opinion and the use of outside authorities and experts
to provide it is one suggestion. A greater emphasis on fact-checking
would help. Improved public literacy about the media and education via
the school system to teach people how to read and interpret the media
are other suggestions.</field>
<field name="altloc">http://www.ejournalism.au.com/ejournalist/facts.pdf</field>
<field name="authors" id="Weaver, Belinda"><part name="given">Belinda</part><part name="family">Weaver</part></field>
<field name="chapter" />
<field name="commref" />
<field name="confdates" />
<field name="conference" />
<field name="confloc" />
<field name="department" />
<field name="editors" id=""><part name="given">Alan</part><part name="family">Knight</part></field>
<field name="institution" />
<field name="ispublished">pub</field>
<field name="keywords">Opinion pieces; Editorials;Commentary, Journalistic comment</field>
<field name="month" />
<field name="note">This is the first draft of a piece originally written
for the Australian Press Council's essay competition, 2001.</field>
<field name="number">1</field>
<field name="pages" />
<field name="pubdom">FALSE</field>
<field name="publication">Ejournalist</field>
<field name="publisher">Ejournalism, Central Queensland University</field>
<field name="refereed">TRUE</field>
<field name="referencetext">ABC Radio National, The Media Report (2001) 'Culture of Journalism'. <http://www.abc.net.au/rn/talks/8.30/mediarpt/stories/s293877.htm>. 10 May. [Accessed 20 May 2001].
Australian Press Council (2001a) Statement of Principles. <http://www.presscouncil.org.au/pcsite/complaints/sop.html> [Accessed 20 May 2001].
Australian Press Council (2001b) A Matter of Opinion: Case Study 5 (April). <http://www.presscouncil.org.au/pcsite/studs/case5.html> [Accessed 20 May 2001].
Brown, H. (2001) <hughie@...; 'Comment and fact'. E-mail message to Belinda Weaver <b.weaver@...; 17 May.
Cotton, P. (2001) 'House loses the argument', The Australian (Media supplement), 10 May: 3.
Craven, P. (2001) 'Arguing for the long haul', The Australian: 4 April: 13.
Dunn, J. (2001) 'Tough calls in new ballgame', The Weekend Australian, 28 April: 34.
Flint, D. (2001) 'How News Is Made In Australia', presentation to ABA Conference, Radio Television and the New Media, Canberra, 3-4 May.
Flint, D. (1999) 'Sorry seems to be the hardest word - the responsibility of the media', speech to the Sydney Institute Seminar, 25 May.
Fuller, J. (1996) News Values. Chicago: University of Chicago Press.
Glover, R. (1992) 'Just checking the facts', Australian Book Review, (143) August: 27-33.
Henningham, J. (2001) <johnhenningham@...; 'Comment and fact'. E-mail to Belinda Weaver <b.weaver@...; 30 April.
Keating, P. (2000) 'Pressing problems', The Australian (Media supplement), 15 June: 1.
Kelly, P. (1993) 'Media Ownership, Journalistic Ethics and Community Responsibility', Address to Fulbright Symposium, Sydney, October 27.
Kelly, P. (2001) Telephone interview, 14 May.
Knight, A. (2001) <a.knight@...; 'Comment and fact'. E-mail to Belinda Weaver <b.weaver@...; 30 April.
Koch, T. (1991) Journalism in the 21st Century. Twickenham, UK: Adamantine Press.
Linnell, G. (2001) 'Silent partner'. Sydney Morning Herald (Good Weekend supplement), 2 June: 18-24.
Nash, C. (2001) <c.nash@...; 'Comment and fact'. E-mail to Belinda Weaver <b.weaver@...; 11 May.
Pearson, M. (2001) <Mark_Pearson@...; 'Comment and fact'. E-mail to Belinda Weaver <b.weaver@...; 29 May.
Pearson, M.(1997) The journalist's guide to media law. St. Leonards, N.S.W. : Allen & Unwin.
Simons, M. (2001a) 'GST, Now and Then', The Australian (Media supplement), 26 April: 15.
Simons, M. (2001b) 'A Matter of Opinion', The Australian (Media supplement), 10 May: 6.
(2001) Sources of News and Current Affairs: a research report conducted by Bond University for the Australian Broadcasting Authority. <http://www.aba.gov.au/what/research/attitude.htm> [Accessed 19 May 2001].
Steketee, M. (2000) 'Newspapers said yes more often in republic debate', The Australian, 13 November: 6.
Stretton, R. (1990) 'Rating the media watchdogs', Bulletin, 31 July: 110-112.
Warby, M. (2000) 'Print's elite puts virtue above veracity', The Australian (Media supplement), 22 June: 14.
Ward, I. (2001) Telephone interview. 28 May.
Waterford, J. (2001) <jack.waterford@... 'Comment and fact'. E-mail to Belinda Weaver <b.weaver@...; 13 April.
White, S. (1996) Reporting in Australia. 2nd ed. Melbourne: Macmillan.</field>
<field name="reportno" />
<field name="series" />
<field name="subjects">400101</field>
<field name="thesistype" />
<field name="title">The Fewer the Facts, the Stronger the Opinion</field>
<field name="volume">1</field>
<field name="year">2001</field>
<field name="suggestions" />
</record>
Christiaan Kortekaas wrote:
>
>
>
------------------------------------------------------------------------
>
> Subject:
> FW: [Fez-users] Error with Fez 1.2 Batch Import and Export
> From:
> Caroline Ayers <ayersc@...>
> Date:
> Wed, 24 May 2006 16:39:59 +1000
> To:
> Christiaan Kortekaas <c.kortekaas@...>
>
>
>
> Hi Christiaan,
>
> Just wanted to follow this up, off the list. Once we've got it sorted,
> I'll put the solution up, so that people on a newer version of ePrints
> can use it.
>
> Still having no joy - I have configured the config file for both the
> directory, and the URL of our eprints, and then created the import
> directory for the eprints xml file to sit in (the output of the
> export_xml command). I go through the steps of setting up the batch
> import, but when I go to My Fez, it says that it's finished, and that
> it's imported 0 records.
>
> I'm now wondering if (not knowing how the batch importer is working) Fez
> is having problems with the format of the metadata, based on the
> different versions. I recall from an earlier version of ePrints that
> there was a change in how authors were defined, but not sure if that was
> the only major change.
>
> The perl oai URL that I have is ok I think - I can test it in a browser,
> and when given an eprint ID, it brings a result.
>
> So two questions - are there any logs of this that I can check to see
> what's throwing the fit, and also - do you have a copy of a metadata
> sample that Fez is looking for - so I can either run an xsl transform
> over ours, or hack Fez to get it to work.
>
> Any other suggestions appreciated too.
>
> Thanks
> Caroline
>
> ----------------
> Caroline Ayers
> RUBRIC Technical Officer
> ayersc@...
> ----------------
>
> -----Original Message-----
> From: fez-users-admin@...
> [mailto:fez-users-admin@...] On Behalf Of Christiaan
> Kortekaas
> Sent: Friday, 19 May 2006 5:25 PM
> To: fez-users@...
> Subject: Re: [Fez-users] Error with Fez 1.2 Batch Import and Export
>
> Hi Caroline,
>
> ePrints has an export command line tool "export_xml" perl script which
> will generate an XML file for an ePrints "archive". This XML file
> contains metadata for each record in ePrints but does not contain the
> file attachments or URLs to the files (eg PDFs). However these file
> links are available in the ePrints OAI service provider.
>
> When Fez finds an ePrints XML export file during batch import of a
> selected directory it will create a Fez Fedora object for each ePrints
> record and do an OAI getRecord lookup to the ePrints server to get the
> URL links for each file attachment, download them and add them to the
> new Fez object. Fez will also read the document type of each ePrints
> record and match those against Fez document types automatically. If no
> document type match is found it is created as the "Generic Document"
> type.
>
> You also need to configure the Fez config.inc.php variable eg ours is:
>
> @define("EPRINTS_OAI",
> "http://eprint.uq.edu.au/perl/oai2?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai%3Aeprint.uq.edu.au%3A");
>
> It needs to point to your eprints repository OAI getrecord verb with
> everything except the identifying eprint ID.
>
> We are using a slightly older version of ePrints 2 as I know when we
> configured this at QUT and did some batch importing with their Eprints
> it needed a slightly different syntax - I think talking about archive
> name instead of identifier. I think we are on 2.1 and they are
> on 2.2 or something like that.
>
> Cheers,
> Christiaan
>
>
> Caroline Ayers wrote:
>
>> Hi Christiaan,
>>
>> I've done some more testing, and can now get this to work for some types
>> of data. However, I can't get ePrints data to work and was wondering
>> whether you could give me some finer details of what the ePrints data
>> accepted by Fez is?
>>
>> For example, will Fez take an entire output file (when the eprints
>> export script is run, all the metadata goes into one big file), or does
>> this file need to be split up?
>>
>> Does Fez take data from a certain version of ePrints only? I realize
>> there were some differences between v1 and v2 of ePrints.
>>
>> Also, is Fez able to pull the pdfs/images/etc associated with a record
>> from the ePrints server, or do these need to be held somewhere locally
>> (with the metadata?)
>>
>> What currently happens when I try to import is as follows: I am able to
>> select the directory (having configured it already), and set the Batch
>> Import to run. I go to My Fez, and it lists as a Running Process, but
>> then once it's done, it says that it Imported 0 Records.
>>
>> I've tried importing eprints data in various shapes and forms, to no
>> avail.
>>
>> Thanks in advance,
>> Caroline
>>
>>
>>
>> ----------------
>> Caroline Ayers
>> RUBRIC Technical Officer
>> ayersc@...
>> ----------------
>>
>> -----Original Message-----
>> From: fez-users-admin@...
>> [mailto:fez-users-admin@...] On Behalf Of Christiaan
>> Kortekaas
>> Sent: Thursday, 11 May 2006 3:16 PM
>> To: fez-users@...
>> Subject: Re: [Fez-users] Error with Fez 1.2 Batch Import and Export
>>
>> Hi Caroline,
>>
>> For a quick response, yes Fez does need the PHP configuration setting
>> 'allow_call_time_pass_reference' to be on so in your php.ini file you
>> must have:
>>
>> allow_call_time_pass_reference = On
>>
>> This is covered in the Fez installation instructions.
>>
>> So far there is little doco on the batch import system however I can
>> give you a quick rundown.
>>
>> The APP_SAN_IMPORT_DIR Fez config.inc.php variable is where Fez will
>> look for subdirectories to import.
>>
>> EG for a linux server: @define("APP_SAN_IMPORT_DIR",
>> "/espace/incoming");
>>
>> So when you do a batch import the subdirectories of that directory will
>> show as places Fez can batch import from.
>> So if there was a directory /espace/incoming/images, then images would
>> show in the combo box for the batch import form.
>>
>> If you choose that directory and continue, the batch import process will
>> scan that directory for files and create a Fez
>> record for each one. Automatic workflows for mimetypes will run against
>> those files, so for images Fez will create thumbnails,
>> preview and web copies of each image.
>>
>> These records will show up in your 'My Fez' area as unpublished records
>> for you to look at, add metadata and publish.
>>
>> Batch import will also scan for ePrints xml export files and mets
>> objects and create a Fez record for each, with the dublin core details
>> from ePrints and METS going into the FOXML DC datastreams.
>>
>> For the export, at the moment there is no 'unpreselected object' export
>> workflow defined, so the export button on My Fez does not work. This
>> must have been missed in the Fez 1.2 schema.sql.
>>
>> Instead you can click the '...' extra workflows link on lists of
>> communities, collections of objects and choose 'Export to csv' there.
>> If it is a community/collection you are exporting it will get all the
>> child objects as well.
>>
>> Cheers,
>> Christiaan
>>
>>
>> Caroline Ayers wrote:
>>> Hi All,
>>>
>>> I'm keen to get the Batch Import to work in Fez 1.2, but am having some
>>> hassles. I've set up the directory from which the metadata can be
>>> imported, and through the GUI, things seem to go fine - I can select a
>>> Location, and click on Batch Import, then I get a message saying it's
>>> started and that I can see it from My Fez. From there though, I go to My
>>> Fez, but the process never ends (I know it's a background process, but
>>> it doesn't finish). I get the following error in my apache error log:
>>>
>>> PHP Warning: Call-time pass-by-reference has been deprecated - argument
>>> passed by value; If you would like to pass it by reference, modify the
>>> declaration of [runtime function name](). If you would like to enable
>>> call-time pass-by-reference, you can set allow_call_time_pass_reference
>>> to true in your INI file. However, future versions may not support this
>>> any longer. in /usr/local/apache/htdocs/fez/include/class.misc.php on
>>> line 419
>>>
>>> PHP Parse error: parse error, unexpected '&', expecting T_VARIABLE or
>>> '$' in /usr/local/apache/htdocs/fez/include/class.misc.php on line 1108
>>>
>>> In the My Fez section, the import shows up as a Running Process, with
>>> nothing under the 'Progress', 'Message', and 'Last Heartbeat' columns,
>>> and with 'undefined' under the 'Status' column.
>>>
>>> The other question I would have too - is there any documentation on the
>>> Batch Import? What I've found so far has been scattered through several
>>> sources, so I'm not even sure if I've set it up correctly.
>>>
>>> Last question is to do with the Export - if I click on Export, I get an
>>> error message about "No workflows defined for Export". Is there
>>> documentation somewhere that can tell me what I need to do to define the
>>> workflow for this?
>>>
>>> Thanks
>>> Caroline
>>>
>>> ----------------
>>> Caroline Ayers
>>> RUBRIC Technical Officer
>>> DeC
>>> USQ
>>> Toowoomba, QLD
>>> AUSTRALIA
>>> ayersc@... <mailto:ayersc@...>
>>> Ph: (0)7 4631 5338
>>> ----------------