#17 Import from Filesystem / Bulk import

Status: open

Owner: nobody

Labels: Core (12)

Priority: 5

Updated: 2005-10-12

Created: 2005-10-12

Creator: André Pohl

Private: No

Bulk import of documents via the filesystem.

Discussion

Konrad Kieling - 2005-10-17

Logged In: YES
user_id=449742

What exactly do you mean?

André Pohl - 2005-10-17

Logged In: YES
user_id=583614

For me there are two ways to use a document archiving
system:

1. Adding a single document with handmade comments. This is
very interesting for a handmade knowledge base, where the
volume of input per user is very low.

2. Importing a document (or, normally, a batch of documents)
via the filesystem or a batch upload. Here the
information and comments should be generated automatically by
the indexing engine (normally a full-text index). This is very
useful if you are trying to archive a batch of old records (for
example, scanning old records to store them in an electronic
archive).
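
The second workflow can be sketched roughly as follows. This is only an illustration, not part of the project: `bulk_import`, `auto_keywords`, and the stop-word list are hypothetical names, the sketch assumes the scanned records are already available as plain-text files, and simple word counting stands in for a real full-text index.

```python
import os
import re
from collections import Counter

# Hypothetical stop-word list; a real index would use a much fuller one.
STOP_WORDS = {"the", "and", "of", "a", "an", "in", "to", "is", "for", "on"}


def auto_keywords(text, limit=5):
    """Return the most frequent non-stop-words as stand-in metadata."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(limit)]


def bulk_import(root):
    """Walk `root` and build one record per plain-text file found."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(".txt"):
                continue  # assumption: only .txt files are handled here
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                text = handle.read()
            records.append({"path": path, "keywords": auto_keywords(text)})
    return records
```

Calling `bulk_import("/path/to/records")` would return a list of dictionaries that an archive could store in place of handmade comments.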

At the moment no document archive system provides such a
solution (at least none that I know of).

Greetings, André

Konrad Kieling - 2005-10-17

Logged In: YES
user_id=449742

A batch upload is possible with the command line interface,
provided you have a .bib file with the information.

Using the indexing engine itself is a little tricky. I've
been working on content extraction, but the differing
structure of the documents makes this problem very
hard. It is possible to write such a script for each
journal separately, but I do not know a general way to
extract even the author and title of a document.
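
To illustrate what "a script for each journal" might look like, here is a minimal sketch. The journal names, the first-page layouts, and the function `extract_metadata` are all hypothetical; the point is only that every journal needs its own hand-written pattern, which is why this does not generalize.

```python
import re

# Hypothetical first-page layouts; each pattern is written by hand per journal.
JOURNAL_PATTERNS = {
    # Assumed layout: title on the first line, author on the second.
    "example-journal-a": re.compile(r"^(?P<title>.+)\n(?P<author>.+)\n"),
    # Assumed layout: labelled "Author:" and "Title:" lines somewhere on the page.
    "example-journal-b": re.compile(
        r"^Author:\s*(?P<author>.+)\nTitle:\s*(?P<title>.+)$", re.MULTILINE
    ),
}


def extract_metadata(journal, first_page_text):
    """Return author and title, or None if the journal or layout is unknown."""
    pattern = JOURNAL_PATTERNS.get(journal)
    if pattern is None:
        return None
    match = pattern.search(first_page_text)
    if match is None:
        return None
    return {
        "author": match.group("author").strip(),
        "title": match.group("title").strip(),
    }
```

A document from a journal without a registered pattern simply yields `None`, so it would still need handmade metadata.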

So if you have many articles from the same journal, a solution
can be built. They must not be too old, since my tests
with the available OCR software did not give reliable results.

Any additional information (other databases, specific
filename conventions, .bib files) would help.
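
As an example of how a filename convention could help, the sketch below turns filenames into BibTeX entries. The convention `<Author>_<Year>_<Title-with-hyphens>.pdf` and the function `bib_entry` are assumptions made up for illustration, and which fields the command line interface actually expects in the .bib file is not stated here.

```python
import os
import re

# Hypothetical convention: "<Author>_<Year>_<Title-with-hyphens>.pdf"
FILENAME_RE = re.compile(
    r"^(?P<author>[^_]+)_(?P<year>\d{4})_(?P<title>.+)\.pdf$", re.IGNORECASE
)


def bib_entry(filename):
    """Build a BibTeX @article entry from a filename, or None if it does not match."""
    match = FILENAME_RE.match(os.path.basename(filename))
    if match is None:
        return None
    author = match.group("author")
    year = match.group("year")
    title = match.group("title").replace("-", " ")
    key = f"{author.lower()}{year}"  # citation key, e.g. "pohl2005"
    return (
        f"@article{{{key},\n"
        f"  author = {{{author}}},\n"
        f"  title = {{{title}}},\n"
        f"  year = {{{year}}}\n"
        f"}}\n"
    )
```

Entries for all matching files in a directory could then be concatenated into a single .bib file and passed to the batch upload.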
