From: Julia B. <jul...@gm...> - 2013-12-13 15:37:07
Alas for your theory, we're on OpenJDK version 1.6.0_27.

Julia

On Fri, Dec 13, 2013 at 6:08 AM, Tod Olson <to...@uc...> wrote:

> I'm curious about this, as we routinely process files with about a
> million records but have not seen this error. Digging around, I find a
> couple JAXP documents that suggest this limit is one of the "Secure
> Processing" features, and is all about protecting from untrusted XML that
> might be constructed to consume all of the heap. If I read correctly, the
> Secure Processing features are on by default as of JAXP 1.4.3 and comes
> with the Oracle JDK as of 7u45.[1][2] In particular, it is a feature of the
> SAX parser and can be turned off in the SAXParserFactory, but apparently
> not just by a Java parameter.[1]
>
> What Java version are you two running?
>
> Right now we're at Java 1.6.0_32 on the FreeBSD machines, and 1.7.0_25
> on the RHEL VMs. so it makes sense that we don't see the problem, but I'll
> be watching out to see if this starts to affect us whenever those java
> versions are upgraded. Meanwhile, I'm really curious if there is a kind of
> entity expansion that "counts", but maybe I'll hold off groveling the JAXP
> sources for the moment.
>
> I suspect that one of two things is the case, either:
> (a) your input data has a lot more entities to expand, or maybe some funny
> entities to expand, somewhere in the solrmarc processing stream, or
> (b) this will become more common as more sites, especially the ones with
> larger numbers of MARC records, upgrade their Java implementations.
>
> My money's on (b).
>
> Assuming the workaround is effective, it may well be good enough. But it
> may also be worth trying to understand the details better in case there is
> something to do at the Solrmarc level, perhaps an option to turn off the
> Secure Processing.
>
> -Tod
>
> [1] JAXP Compatibility: https://jaxp.java.net/1.4/JAXP-Compatibility.html
> [2] "Lesson: Processing Limits" section in the JAXP tutorial:
> http://docs.oracle.com/javase/tutorial/jaxp/limits/index.html
>
> On Dec 12, 2013, at 3:27 PM, Osullivan L. <L.O...@sw...> wrote:
>
> Hi Julia,
>
> I haven't experimented yet as I had the error on my local pc and haven't
> had the chance to test the solution.
>
> If you try it, will you let me know how you get on?
>
> Thanks,
>
> Luke
>
> Sent from my HTC
>
> ----- Reply message -----
> From: "Julia Bauder" <jul...@gm...>
> To: "Osullivan L." <L.OSULLIVAN@SWANSEA.AC.UK>
> Cc: "vuf...@li..." <vuf...@li...>
> Subject: [VuFind-Tech] Console Indexing Error
> Date: Thu, Dec 12, 2013 20:56
>
> YES! I just hit the same error this morning. We got more than 64000
> records added before it hit that error, though -- about 110,000 records in
> one batch and 80,000 in another -- suggesting to me that "entity
> expansions" (whatever they are) aren't one-to-one with documents.
>
> Did you try Demian's proposed workaround? Any luck?
>
> Julia
>
> On Thu, Dec 5, 2013 at 8:13 AM, Osullivan L. <L.O...@sw...> wrote:
>
>> Hi Folks,
>>
>> I've indexed some records on my local instance and come across an
>> oddity.
>>
>> I started vufind in one terminal window and then, in another, I called
>> the import-marc.sh script.
>>
>> The import was successful and there were no problems in the import
>> terminal. The terminal where I started the index however showed lots of
>> java errors (see below) and suggested that only 64000 records were
>> added.
>>
>> Has anyone else ever come across this?
>>
>> Thanks,
>>
>> Luke
>>
>> Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.
>> ERROR [main] (MarcImporter.java:383) - ******** Halting indexing! ********
>> INFO [main] (MarcImporter.java:617) - Adding 64000 of 580033 documents to index
>> INFO [main] (MarcImporter.java:618) - Deleting 0 documents from index
>> INFO [main] (MarcImporter.java:491) - Calling commit (with optimize set to false)
>>
>> _______________________________________________
>> Vufind-tech mailing list
>> Vuf...@li...
>> https://lists.sourceforge.net/lists/listinfo/vufind-tech
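
For anyone who wants to experiment with Tod's point that the limit "can be turned off in the SAXParserFactory", here is a minimal sketch using only the standard JAXP API. The class and method names (EntityLimitSketch, relaxedFactory, textOf) are made up for illustration and are not part of SolrMarc or VuFind. Whether switching off Secure Processing actually lifts the 64000 limit depends on the JDK and parser implementation in use, so treat this as something to try rather than a confirmed fix, and only on XML you trust.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class EntityLimitSketch {

    // Builds a SAX factory with the JAXP Secure Processing feature switched off.
    // Assumption: no Java security manager is installed; some parser
    // implementations refuse to disable the feature when one is present.
    static SAXParserFactory relaxedFactory() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, false);
            return factory;
        } catch (Exception e) {
            throw new IllegalStateException("could not configure SAX factory", e);
        }
    }

    // Parses the given XML and returns its text content with entities expanded.
    static String textOf(String xml) {
        final StringBuilder text = new StringBuilder();
        try {
            relaxedFactory().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void characters(char[] ch, int start, int length) {
                        // May be called several times per element, so append.
                        text.append(ch, start, length);
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Three references to the internal entity "e" are three entity expansions.
        String xml = "<!DOCTYPE r [<!ENTITY e \"x\">]><r>&e;&e;&e;</r>";
        System.out.println(textOf(xml));
    }
}
```

The tiny document in main also illustrates Julia's observation: the parser counts entity references, not records, so the number of records indexed before hitting the limit will vary with the data.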