Excel CSV to MODS XML, then to refbase: possible?

  • vidhi

    vidhi - 2011-11-29

    Hello. I am new to refbase and have recently been trying to import some of my library records, which are in Excel CSV format, into refbase.

    I have already noticed that there is no CSV import option (correct me if I'm wrong), so I am trying to convert the CSV file to MODS XML using the example from here: www.refbase.net/index.php/Import_Example:_MODS_XML

    But I am failing to import records from my CSV sheet. I get an error stating that the maps relationship with an element cannot be preserved. Could anyone give me an XML map of the elements required for import into refbase?

    Or is there any other workaround for this? I would be grateful for any kind of help.


  • Matthias Steffens

    Hi Vids,

    it's true that the public refbase source code doesn't contain any support for generic import of CSV files. However, I once wrote a 'tabdelimToRis()' function which converts records from a custom tab-delimited text format to RIS format, which can then be imported via the function 'risToRefbase()'. The 'tabdelimToRis()' function only worked with a single custom tab-delimited text format, so it was never published (a generic CSV import option would have required far more work).

    The custom tab-delimited data format only supported these fields/columns:

    authors, year, title, source, species, keywords, notes, ID, callnumber, read

    where 'source' only supported books and journal articles, and the 'species' & 'keywords' columns were merged into refbase keywords. The 'ID' and 'callnumber' columns were merged and mapped to the RIS 'ID' field (which gets written to the refbase 'call_number' field). The 'read' column was added to refbase's user-specific 'user_notes' field.

    Example tab-delimited data supported by this function:

    authors year    title   source  species keywords    notes   ID  callnumber  read
    Aagaard A, Warman CG, Depledge MH   1995    Tidal and seasonal changes in the temporal and spatial distribution of foraging Carcinus maenas in the weakly tidal littoral zone of Kerteminde Fjord, Denmark  Mar. Ecol. Prog. Ser. 122: 165-172   Carcinus maenas            1       N
    Aarup T 2002    Transparency of the North Sea and Baltic Sea - a Secchi depth data mining study Oceanologia 44: 323-337     water transparency      2   K   J
    Abad R  1998    Acoustic estimation of abundance and distribution of sardine in the northwestern Mediterranean  Fish. Res. 34: 239-245              3       N
    Abelló P, Oro D 1998    Offshore distribution of seabirds in the northwestern Mediterranean in June 1995    Colon. Waterbirds 21: 422-426   Calonectris diomedea, Puffinus yelkouan, Larus audouinii, Larus cachinnans  distribution, abundance, colony     4   O   J
    Abrams RW   1985    Energy and Food Requirements of Pelagic Aerial Seabirds in Different Regions of the African Sector of the Southern Ocean    In: Siegfried WR, Condy PR, Laws RM (eds): Antarctic Nutrient Cycles and Food Webs. pp. 466-472.- Springer, Berlin.     energy, food requirement, carbon flux       14  K   J
    Ainley DG   1977    Feeding methods of seabirds: a comparison of polar and tropical communities In: Llano, G.A. (ed.): Adaptions within Antarctic Ecosystems. pp. 669-685.- Smithsonian Inst., Washington               55      N
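
    The column mapping described above can be sketched in Python. This is not the unpublished 'tabdelimToRis()' function, only an illustration under stated assumptions: the names 'row_to_ris' and 'tabdelim_to_ris' are hypothetical, the pattern used to split journal sources is guessed from the sample rows, 'ID' and 'callnumber' are joined with a space, and book sources are kept verbatim as a note rather than parsed. The 'read' column is left out because the post does not say how it was carried through RIS into 'user_notes'.

```python
import csv
import io
import re

# Assumption: journal sources look like "Fish. Res. 34: 239-245".
JOURNAL_RE = re.compile(
    r"^(?P<journal>.+?)\s+(?P<volume>\d+):\s*(?P<start>\d+)-(?P<end>\d+)$"
)


def row_to_ris(row):
    """Convert one record (a dict keyed by column name) to an RIS record string."""
    fields = []
    source = row["source"].strip()
    match = JOURNAL_RE.match(source)
    # Anything that does not look like a journal citation is treated as a book part.
    fields.append(("TY", "JOUR" if match else "CHAP"))
    for author in row["authors"].split(","):
        if author.strip():
            fields.append(("AU", author.strip()))
    fields.append(("PY", row["year"].strip()))
    fields.append(("TI", row["title"].strip()))
    if match:
        fields.append(("JO", match["journal"]))
        fields.append(("VL", match["volume"]))
        fields.append(("SP", match["start"]))
        fields.append(("EP", match["end"]))
    elif source:
        # Simplification: book sources are not parsed, only kept as a note.
        fields.append(("N1", source))
    # 'species' and 'keywords' are merged into keywords.
    for column in ("species", "keywords"):
        for keyword in row[column].split(","):
            if keyword.strip():
                fields.append(("KW", keyword.strip()))
    if row["notes"].strip():
        fields.append(("N1", row["notes"].strip()))
    # Assumption: 'ID' and 'callnumber' are merged by joining them with a space.
    parts = (row["ID"].strip(), row["callnumber"].strip())
    merged_id = " ".join(part for part in parts if part)
    if merged_id:
        fields.append(("ID", merged_id))
    fields.append(("ER", ""))
    return "\n".join(f"{tag}  - {value}" for tag, value in fields)


def tabdelim_to_ris(text):
    """Convert tab-delimited text (first line = column names) to RIS text."""
    reader = csv.DictReader(
        io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE, restval=""
    )
    return "\n".join(row_to_ris(row) for row in reader) + "\n"
```

    Calling 'tabdelim_to_ris()' on the contents of a real tab-separated file returns one RIS record per data row. Author names are passed through unchanged, so they would still need reformatting if the importer expects "Lastname, Initials".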

    If you're interested, I'd be happy to send you the source code of this function.

    If you've got complex CSV data, then conversion into MODS XML might be a necessary step. However, if your data can also be mapped to RIS, I'd recommend converting to RIS instead. Depending on your data, this may be far easier. To give you better advice, I'd need to see a representative example of your CSV data file.

    W.r.t. your import error ("maps relationship with an element cannot be preserved"): I've never seen such an error. Maybe your MODS (or XML) data is malformed? What does your MODS output look like?
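
    One quick way to rule out malformed XML is to run the generated file through a parser before importing it. The following is a minimal sketch using Python's standard library; the function name 'check_well_formed' is hypothetical. It only tests whether the XML is well-formed, not whether it is valid MODS, and it may not explain the error quoted above.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_data):
    """Return None if xml_data (str or bytes) parses, else a description of the first error."""
    try:
        ET.fromstring(xml_data)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple.
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None
```

    For a file on disk, the contents can be read in binary mode and passed in as bytes, so that the parser honours the encoding named in the XML declaration.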

    Thanks, Matthias

  • vidhi

    vidhi - 2011-11-30

    Hello Matthias,

    Thank you very much for your prompt reply. Well, I believe my CSV file is a simple one. Please have a look at my file: http://www.sendspace.com/file/tzb6ks. This is a sample file (the final one will contain around 30,000 records).

    Please let me know whether I could use your function. If so, I would be grateful if you could share it and also explain how to use it.


  • Matthias Steffens

    Hi Vids,

    thanks for the sample file. This looks doable. I could help you with this, i.e. I could try to rework my (previously mentioned) 'tabdelimToRis()' function to work with your data. You could then further tweak that base function if needed.

    However, the quality of the import function can only be as good as the sample data you provide. To give it a try, I'd need a much bigger sample file which ideally covers every kind of variation present in the data columns.

    Some questions:

    - Does the real CSV file contain any other columns than: ID,CALL NUMBER,Type,Language,cardauthor,cardtitle,publication year?

    - Is there a list of type values that can be present in the "Type" column?

    - Similarly, is there a list of language values that can be present in the "Language" column?

    - What possible author formats can be present in the "cardauthor" column? I.e., is the string formatting always "lastname, initials or full firstname"? Are multiple author initials always separated with a dot? And how are multiple authors separated?

    Also, the data would need to get converted to tab-delimited format (and without column values being enclosed in quotation marks), but that should be easy I guess.
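
    The conversion to unquoted tab-delimited text can be sketched with Python's 'csv' module. This is an illustration only: the function name 'csv_to_tabdelim' is hypothetical, and it assumes that whitespace inside a value (including stray tabs or line breaks) may be collapsed to single spaces, since a tab or newline within a value would otherwise break the output format.

```python
import csv
import io


def csv_to_tabdelim(csv_text):
    """Convert comma-separated text to tab-delimited text without quotation marks."""
    # newline="" lets the csv module handle line breaks inside quoted values.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO()
    for row in reader:
        # Collapse every run of whitespace (tabs, line breaks, repeated spaces)
        # inside a value to one space so it cannot be mistaken for a delimiter.
        cleaned = [" ".join(value.split()) for value in row]
        out.write("\t".join(cleaned) + "\n")
    return out.getvalue()
```

    The csv reader strips the enclosing quotation marks itself, so commas inside quoted values (for example in author names) survive as part of the value.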


  • vidhi

    vidhi - 2011-12-13

    Hi Matthias,

    Greetings. Here is the sample file containing the fields I want to display in my library.

    I understand that it has to be converted to tab-delimited format; here it is in that format.

    Could you please give this file a try with your function? I would be really grateful.



