From: Demian K. <dem...@vi...> - 2011-12-16 15:56:29
Did you commit and optimize at the end of your process? Sometimes records are missing until the final commit is performed. (Not the most likely explanation, but it's worth making sure.)

Is it possible there is some kind of edge-case paging bug in your code that is causing the last file to duplicate 100 records from the next-to-last file? It might be worth testing some edges: if you load just the first two files, do you get 20,000 records in your index? If you load just the last two files, do you get 14,844 records?

- Demian

________________________________________
From: Oliver Goldschmidt [o.g...@tu...]
Sent: Friday, December 16, 2011 10:44 AM
To: vuf...@li...
Subject: [VuFind-Tech] Dumping Solr

Hi everybody,

we are trying to get a complete dump of all MARC records from Solr. To do that, I have written a script that loops through the index and gets the fullrecord field from each indexed record. For memory reasons I build several files, each containing 10000 records (except for the last file).

This works fine, but now I get a strange problem when reimporting these records: after the import finishes, 100 records are missing. I have no idea where they could have gone. The importer log tells me that all records have been imported successfully. I have 54 files with 10000 records and the last file with 4844 records, so by my calculation I should have 544844 records in my index. If I look into the log file, I find 54 times "Adding 10000 of 10000 records" and one time "Adding 4844 of 4844 records". There are no errors. But if I look into the index now, it shows only 544744 records.

Does anyone have any idea how to explain this difference? Or does anyone have another approach to dumping MARC from the VuFind index?

Any hints are appreciated. Have a nice fourth Advent!

Oliver

--
Oliver Goldschmidt
TU Hamburg-Harburg / Universitätsbibliothek / Digitale Dienste
Denickestr. 22
21071 Hamburg - Harburg
Tel. +49 (0)40 / 428 78 - 32 91
eMail o.g...@tu...
--
GPG/PGP key: http://www.tub.tu-harburg.de/keys/Oliver_Marahrens_pub.asc
--
Project DISCUS http://discus.tu-harburg.de
Project TUBdok http://doku.b.tu-harburg.de
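
The paging check suggested in the reply can be sketched roughly as follows. This is only an illustration, not the script from the original message: `page_windows` and `duplicate_ids` are hypothetical helper names, and the record IDs are made-up toy data. It assumes the index replaces a document when another one with the same unique ID is added, which would explain an importer log that reports every record as added while the final count comes up short.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def page_windows(total: int, page_size: int) -> List[Tuple[int, int]]:
    """Return (start, rows) pairs that cover `total` records exactly once."""
    return [(start, min(page_size, total - start))
            for start in range(0, total, page_size)]


def duplicate_ids(batches: Iterable[Iterable[str]]) -> Dict[str, int]:
    """Map each ID that occurs more than once across all batches to its count."""
    counts = Counter(record_id for batch in batches for record_id in batch)
    return {record_id: n for record_id, n in counts.items() if n > 1}


# Figures from the thread: 54 full files plus one file of 4844 records.
expected_total = 54 * 10000 + 4844
windows = page_windows(expected_total, 10000)

# Toy data (hypothetical IDs): the second batch starts one record too early
# and therefore repeats "id-2" from the first batch.
toy_batches = [["id-0", "id-1", "id-2"], ["id-2", "id-3"]]
toy_duplicates = duplicate_ids(toy_batches)
```

Running `duplicate_ids` over the IDs extracted from the real dump files would show whether about 100 IDs appear twice; comparing the dump script's actual offsets against `page_windows` would show whether the last window overlaps the previous one.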