
preg_replace() failing

Help
2013-02-25
2013-02-26
  • Simon Levermann

    Simon Levermann - 2013-02-25

    I'm currently migrating a large database, and the insertion of the data stops with this error:
    http://pastebin.com/BdJBWGxZ

    Looking at the php bug tracker, this was fixed AGES ago. I'm running the most recent php version on Ubuntu I can get via apt (PHP 5.4.6-1ubuntu1.1). Any ideas on this one?

     
  • Mark Grimshaw

    Mark Grimshaw - 2013-02-25

    Hi Lee,

    I have posted elsewhere on these forums/tickets about this (but can't find it). The best I can say is that it has not been fixed in PHP as I'm running a version in which it should have been fixed but I still get it. It's a warning and, as far as I can tell, does no harm.

    I notice you have a truncation of data in field resourcemiscField6. That field is of type INT and you're trying to insert a non-int character. What format is your source data in?

    Mark

     
  • Simon Levermann

    Simon Levermann - 2013-02-25

    Alright, tried it on a debian server rather than my local testing machine, and the warnings are gone. However, one error still prevails and stops me while inserting my data.

    The first parse of the results (wikindx checking the bibtex file and displaying non-standard fields) takes about 156 seconds, so I raised the max execution time to allow for this.

    I hit submit in order to start the insertion process. Now, the script loads for quite a while (large file, lots of things to do), and at some point it stops with:

    Unable to write to database.

    INSERT INTO resource_misc (resourcemiscField6, resourcemiscCollection, resourcemiscPublisher, resourcemiscTag, resourcemiscAddUserIdResource, resourcemiscId) VALUES('387-439', '46', '139', '2', '1', '281')
    --> (Data truncated for column 'resourcemiscField6' at row 1)

    About 100 of the entries are already added, and accessible in the database. The rest (about 5k) are not inserted.

    EDIT:
    Posted this while you were writing your comment. It's a .bib file I exported from biborb, about 5M in size.

     

    Last edit: Simon Levermann 2013-02-25
  • Stephan Matthiesen

    About the preg_replace error, I get that too but it doesn't seem to affect the actual data. I think you can ignore it.

    I believe that inserting large datasets should work (the programme splits it into smaller chunks if there is danger of timeout).

    However, when I imported my data (with about 2400 records at that time) I found that you always get a few "bad apples" (datasets with characters in numeral fields, missing required fields, brackets in the wrong place etc.) where the export from the previous programme messed up or something else unexpected happened. I found it easier therefore to import my data in smaller chunks, and then try to identify the corrupt records by hand.
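
A minimal sketch of the "import in smaller chunks" approach described above, so that a failing chunk narrows down where the corrupt records are. It is not part of wikindx: the function names `split_bibtex_entries` and `chunk_entries` are hypothetical, and it assumes every entry starts with `@type{` at the beginning of a line, which may not hold for every export.

```python
import re


def split_bibtex_entries(text):
    """Split BibTeX source into a list of entry strings.

    Assumption: each entry begins with '@type{' at the start of a line.
    """
    # Zero-width split just before each line-initial '@type{' (Python 3.7+).
    parts = re.split(r"(?m)^(?=[ \t]*@\w+[ \t]*\{)", text)
    return [p.strip() for p in parts if p.strip().startswith("@")]


def chunk_entries(entries, size):
    """Group entries into consecutive lists of at most `size` entries."""
    return [entries[i:i + size] for i in range(0, len(entries), size)]


# Hypothetical usage: write each chunk to its own file for a separate import.
# with open("export.bib", encoding="utf-8") as fh:
#     entries = split_bibtex_entries(fh.read())
# for n, chunk in enumerate(chunk_entries(entries, 500), start=1):
#     with open(f"export_part{n:03d}.bib", "w", encoding="utf-8") as out:
#         out.write("\n\n".join(chunk) + "\n")
```

When a chunk fails, it can be split again with a smaller size until the offending record is isolated.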

    Note that the bibtex import will skip duplicates, so you can try to import the same file (minus corrupt data) several times.

    Unfortunately I think a certain degree of manual cleanup is unavoidable.

    Hope this helps
    Stephan

     

    Last edit: Stephan Matthiesen 2013-02-25
  • Simon Levermann

    Simon Levermann - 2013-02-25

    Update: This seems to be a pages error:

    @incollection{reynolds1966,
    author = {W. C. Reynolds and H. M.
          Satterlee},
    editor = {H. N. Abramson},
    year = {1966},
    title = {Liquid Propellant Behavior at Low and Zero
          Gravity},
    booktitle = {The Dynamics Behavior of Liquids in Moving
          Containers},
    number = {NASA SP-106},
    pages = {387-439},
    note = {},
    publisher = {NASA},
    address = {Washington, D.C.},
    key = {},
    keywords = {},
    lastname = {Reynolds},
    dateadded = {2005-04-19},
    lastdatemodified = {2005-04-19},
    originator = {MD}
    }
    

    Almost all publications use this format for pages, can there only be a single page in wikindx?
    (The pages section matches the failed input data exactly, so it seems to be the culprit)
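
Matching the value from a failed INSERT back to its source record, as done above by hand, could be sketched roughly like this. The helper `find_entries_with_value` is hypothetical and not part of wikindx. It assumes field values are written as `field = {value}` on a single line and that entries start with `@type{key,` at the beginning of a line.

```python
import re


def find_entries_with_value(bibtex_text, value):
    """Return (entry_type, citation_key, field_name) for every field whose
    braced value equals `value` exactly."""
    # Split just before each line-initial '@type{' (Python 3.7+).
    entries = re.split(r"(?m)^(?=[ \t]*@\w+[ \t]*\{)", bibtex_text)
    field_re = re.compile(r"(\w+)\s*=\s*\{" + re.escape(value) + r"\}")
    hits = []
    for entry in entries:
        head = re.match(r"\s*@(\w+)\s*\{\s*([^,\s]+)", entry)
        if not head:
            continue  # preamble text or an empty split fragment
        for field in field_re.finditer(entry):
            hits.append((head.group(1).lower(), head.group(2), field.group(1)))
    return hits
```

Called with the value from the error message (here `'387-439'`), it lists the entry type, citation key, and field that carry it, which also shows whether the failures cluster in one entry type.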

     

    Last edit: Simon Levermann 2013-02-25
  • Stephan Matthiesen

    I tried importing that record into my db and it also fails.

     
  • Stephan Matthiesen

    Is the hyphen in the pages field a real hyphen or some UTF character? When I delete it and put in a normal hyphen, then I can import it.
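
One way to check for a non-ASCII dash without relying on the eye is to scan the pages fields and print the Unicode name of anything outside plain ASCII. This is only a sketch: `suspicious_page_chars` is a hypothetical helper, and it assumes pages values are written as `pages = {…}` with no nested braces.

```python
import re
import unicodedata

# \b keeps fields such as 'numpages' from matching.
PAGES_RE = re.compile(r"\bpages\s*=\s*\{([^}]*)\}", re.IGNORECASE)


def suspicious_page_chars(bibtex_text):
    """Return (pages_value, [(char, unicode_name), ...]) for every pages
    field that contains at least one non-ASCII character."""
    findings = []
    for match in PAGES_RE.finditer(bibtex_text):
        value = match.group(1)
        odd = [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in value if ord(ch) > 127]
        if odd:
            findings.append((value, odd))
    return findings
```

An empty result means every pages value uses only ASCII characters, so the hyphen can be ruled out as the cause.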

     
  • Stephan Matthiesen

    Sorry, ignore my last comment...

     
  • Stephan Matthiesen

    OK: In my case, I get the error, but the record is actually imported, just without the booktitle field.

     
  • Simon Levermann

    Simon Levermann - 2013-02-25

    If I leave out the pages, it imports nicely, including the booktitle. With pages in, it does not import at all. This only seems to happen with incollection entries; all other entry types seem to be fine with page numbers of the form 123-456.

    Maybe this is a bug?

     
  • Mark Grimshaw

    Mark Grimshaw - 2013-02-25

    I copied and pasted the bibtex entry into Paste Bibtex (resources menu) and ignored the warnings about unrecognized fields. I get no errors (just the above PHP warnings), and it produces the formatted resource:

    Reynolds, W. C., & Satterlee, H. M. (1966). Liquid propellant behavior at low and zero gravity. In H. N. Abramson (Ed.), The Dynamics Behavior of Liquids in Moving Containers (pp. 387–439). Washington, D.C. NASA.

    Page numbers, book title etc. imported just fine.

    I then saved it as a file and used Import bibtex (Admin menu) -- again same result. Perfect import. Results should be the same as both paste and import use the same parsing code.

    Perhaps the answer is in what Stephan suggests above re weird characters -- characters that are not picked up copying 'n' pasting from Firefox?

    Mark

     
  • Mark Grimshaw

    Mark Grimshaw - 2013-02-25

    BTW -- which version of wikindx are you using?

    Mark

     
  • Mark Grimshaw

    Mark Grimshaw - 2013-02-25

    Versions prior to 4.1.6 had a bug in the pages field for some resources, particularly relating to the resourcemiscField6 field.

    Mark

     
  • Simon Levermann

    Simon Levermann - 2013-02-25

    4.1.9 Release downloaded from Sourceforge.

    I just retyped the entry manually, so I could avoid odd errors with unicode characters, however I still get the same result as before.

     
  • Mark Grimshaw

    Mark Grimshaw - 2013-02-25

    Can you email the bibtex file to me please Lee? I can create a new database for you and dump it and email it to you. Tell me what category you want it placing in.

    Regards,

    Mark

    P.S: Can't do this tonight but should be able to do it tomorrow for you.

     
  • Simon Levermann

    Simon Levermann - 2013-02-25

    Sent the file.
    Thanks for your quick help!

     
  • Mark Grimshaw

    Mark Grimshaw - 2013-02-26

    Re preg_replace error, I have found a fault and have fixed it in SVN.
    NB: https://sourceforge.net/p/wikindx/discussion/326884/thread/f2855d73/

    Lee: I've emailed a dump for v4 to you this morning. I had no issues importing it.

    Regards,

    Mark

     
