From: Tod O. <to...@uc...> - 2014-03-04 16:35:49
|
For completeness, here’s a summary of the security issue that the limit is there to address:

http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/2.0/jaxp/JAXP-Compatibility_150.html#JAXP_security

Again, this does not seem like a problem in our particular use.

-Tod

On Mar 4, 2014, at 10:27 AM, Tod Olson <to...@uc...> wrote:

[dropping solrmarc-tech from this]

Anna, Julia, and anyone else who ran into this console indexing problem:

The entityExpansionLimit setting came up on the dev call this morning. From my own reading, the setting seems to have been put in place to guard against some kind of attack that can happen when pulling DTDs and schemas across the net. I don’t think our use of JAXP in solrmarc does that, so it seems like a scenario we don’t have to worry about. So turning off the limit seems reasonable.

According to the following document, setting entityExpansionLimit=0 will turn off the limit:

http://docs.oracle.com/javase/tutorial/jaxp/limits/limits.html

Would one of you be willing to try setting entityExpansionLimit=0 and see if this setting solves the problem? If so, we may wish to set entityExpansionLimit=0 by default in the distribution.

Please let the list know if you’re willing to try this, and what your results are.

Best,

-Tod

Tod Olson <to...@uc...>
Systems Librarian
University of Chicago Library

On Feb 10, 2014, at 12:28 PM, Demian Katz <demian.katz@VILLANOVA.EDU> wrote:

I suspect that this is probably safe to add to master, but perhaps we should discuss it on the next dev call to see if anyone has any concerns. Also copying this to solrmarc-tech in case anyone on that list has opinions about the wisdom of setting a high default entityExpansionLimit to avoid Java errors.

- Demian

From: anna headley [mailto:an...@gm...]
Sent: Thursday, February 06, 2014 3:47 PM
To: vuf...@li...
Subject: Re: [VuFind-Tech] Console Indexing Error

I hit this this morning and the -DentityExpansionLimit=1000000 workaround worked great. Any reason not to add this to master? Would it mess with other versions or distros of Java? We're on:

$ java -version
java version "1.6.0_28"
OpenJDK Runtime Environment (IcedTea6 1.13.0pre) (rhel-1.66.1.13.0.el6-x86_64)
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)

Not loading XML. And I got the error on my 64001st record.

Thanks for hitting this before me!

Anna

On Thu, Jan 16, 2014 at 3:33 PM, Andrew Preater <And...@lo...> wrote:

Hi Luke

Sorry, I've been away. Yes, import-marc.sh is correct.

Andrew

--
Andrew Preater
Associate Director, Information Systems and Services
Senate House Libraries, University of London
Tel: 020 7862 8452
Twitter: @preater

The University of London is an exempt charity in England and Wales and a charity registered in Scotland (reg. no. SC041194)

On 07/01/2014 13:26, "Demian Katz" <dem...@vi...> wrote:

>I think you need to make the change to import-marc.sh -- if I understand
>correctly, it's the SolrMarc Java app that is complaining, not Solr
>itself.
>
>- Demian
>
>> -----Original Message-----
>> From: Osullivan L. [mailto:L.O...@sw...]
>> Sent: Tuesday, January 07, 2014 7:18 AM
>> To: Andrew Preater
>> Cc: vuf...@li...
>> Subject: Re: [VuFind-Tech] Console Indexing Error
>>
>> Hi Andrew,
>>
>> Thanks for sharing this.
>>
>> Did you make the change to the vufind.sh script? Do you need to add
>> anything to import-marc.sh? I've just tried it in vufind.sh and it
>> doesn't seem to work for me.
>>
>> Thanks,
>>
>> Luke
>>
>> On Wed, 2013-12-18 at 18:33 +0000, Andrew Preater wrote:
>> > Hi all,
>> >
>> > We just ran into this same problem following a Java upgrade on Ubuntu
>> > 12.04 LTS.
>> >
>> > I'm pleased to report the workaround adding -DentityExpansionLimit=1000000
>> > to Java options works for us. Able to run in multiple files of 150,000
>> > records without issues; was previously failing at 64,000.
>> >
>> > We're using:
>> >
>> > $ java -version
>> > java version "1.6.0_27"
>> > OpenJDK Runtime Environment (IcedTea6 1.12.6) (6b27-1.12.6-1ubuntu0.12.04.4)
>> > OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
>> >
>> > Thanks,
>> >
>> > Andrew
>> >
>> > --
>> > Andrew Preater
>> > Associate Director, Information Systems and Services
>> > Senate House Libraries, University of London
>> > Tel: 020 7862 8452
>> > Twitter: @preater
>> >
>> > The University of London is an exempt charity in England and Wales and a
>> > charity registered in Scotland (reg. no. SC041194)
>> >
>> > On 16/12/2013 10:07, "Osullivan L." <L.O...@sw...> wrote:
>> >
>> > >Hi Tod,
>> > >
>> > >I'm running:
>> > >
>> > >java version "1.7.0_45"
>> > >OpenJDK Runtime Environment (fedora-2.4.3.0.fc19-x86_64 u45-b15)
>> > >OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
>> > >
>> > >Cheers,
>> > >
>> > >Luke
>> > >
>> > >On Fri, 2013-12-13 at 12:08 +0000, Tod Olson wrote:
>> > >> I'm curious about this, as we routinely process files with about a
>> > >> million records but have not seen this error. Digging around, I find a
>> > >> couple JAXP documents that suggest this limit is one of the "Secure
>> > >> Processing" features, and is all about protecting from untrusted XML
>> > >> that might be constructed to consume all of the heap.
>> > >> If I read correctly, the Secure Processing features are on by default
>> > >> as of JAXP 1.4.3 and comes with the Oracle JDK as of 7u45.[1][2] In
>> > >> particular, it is a feature of the SAX parser and can be turned off in
>> > >> the SAXParserFactory, but apparently not just by a Java parameter.[1]
>> > >>
>> > >> What Java version are you two running?
>> > >>
>> > >> Right now we're at Java 1.6.0_32 on the FreeBSD machines, and 1.7.0_25
>> > >> on the RHEL VMs, so it makes sense that we don't see the problem, but
>> > >> I'll be watching out to see if this starts to affect us whenever those
>> > >> Java versions are upgraded. Meanwhile, I'm really curious if there is
>> > >> a kind of entity expansion that "counts", but maybe I'll hold off
>> > >> groveling the JAXP sources for the moment.
>> > >>
>> > >> I suspect that one of two things is the case, either:
>> > >> (a) your input data has a lot more entities to expand, or maybe some
>> > >> funny entities to expand, somewhere in the solrmarc processing stream,
>> > >> or
>> > >> (b) this will become more common as more sites, especially the ones
>> > >> with larger numbers of MARC records, upgrade their Java
>> > >> implementations.
>> > >>
>> > >> My money's on (b).
>> > >>
>> > >> Assuming the workaround is effective, it may well be good enough. But
>> > >> it may also be worth trying to understand the details better in case
>> > >> there is something to do at the Solrmarc level, perhaps an option to
>> > >> turn off the Secure Processing.
>> > >>
>> > >> -Tod
>> > >>
>> > >> [1] JAXP Compatibility: https://jaxp.java.net/1.4/JAXP-Compatibility.html
>> > >> [2] "Lesson: Processing Limits" section in the JAXP tutorial:
>> > >> http://docs.oracle.com/javase/tutorial/jaxp/limits/index.html
>> > >>
>> > >> On Dec 12, 2013, at 3:27 PM, Osullivan L.
>> > >> <L.O...@sw...> wrote:
>> > >>
>> > >> > Hi Julia,
>> > >> >
>> > >> > I haven't experimented yet as I had the error on my local pc and
>> > >> > haven't had the chance to test the solution.
>> > >> >
>> > >> > If you try it, will you let me know how you get on?
>> > >> >
>> > >> > Thanks,
>> > >> >
>> > >> > Luke
>> > >> >
>> > >> > Sent from my HTC
>> > >> >
>> > >> > ----- Reply message -----
>> > >> > From: "Julia Bauder" <jul...@gm...>
>> > >> > To: "Osullivan L." <L.OSULLIVAN@SWANSEA.AC.UK>
>> > >> > Cc: "vuf...@li..." <vuf...@li...>
>> > >> > Subject: [VuFind-Tech] Console Indexing Error
>> > >> > Date: Thu, Dec 12, 2013 20:56
>> > >> >
>> > >> > YES! I just hit the same error this morning. We got more than 64000
>> > >> > records added before it hit that error, though -- about 110,000
>> > >> > records in one batch and 80,000 in another -- suggesting to me that
>> > >> > "entity expansions" (whatever they are) aren't one-to-one with
>> > >> > documents.
>> > >> >
>> > >> > Did you try Demian's proposed workaround? Any luck?
>> > >> >
>> > >> > Julia
>> > >> >
>> > >> > On Thu, Dec 5, 2013 at 8:13 AM, Osullivan L.
>> > >> > <L.O...@sw...> wrote:
>> > >> > Hi Folks,
>> > >> >
>> > >> > I've indexed some records on my local instance and come across an
>> > >> > oddity.
>> > >> >
>> > >> > I started vufind in one terminal window and then, in another, I called
>> > >> > the import-marc.sh script.
>> > >> >
>> > >> > The import was successful and there were no problems in the import
>> > >> > terminal. The terminal where I started the index however showed lots of
>> > >> > java errors (see below) and suggested that only 64000 records were
>> > >> > added.
>> > >> >
>> > >> > Has anyone else ever come across this?
>> > >> >
>> > >> > Thanks,
>> > >> >
>> > >> > Luke
>> > >> >
>> > >> > Message: JAXP00010001: The parser has encountered more than "64000"
>> > >> > entity expansions in this document; this is the limit imposed by the
>> > >> > JDK.
>> > >> > ERROR [main] (MarcImporter.java:383) - ******** Halting indexing! ********
>> > >> > INFO [main] (MarcImporter.java:617) - Adding 64000 of 580033 documents to index
>> > >> > INFO [main] (MarcImporter.java:618) - Deleting 0 documents from index
>> > >> > INFO [main] (MarcImporter.java:491) - Calling commit (with optimize set to false)
>> > >> >
>> > >> > _______________________________________________
>> > >> > Vufind-tech mailing list
>> > >> > Vuf...@li...
>> > >> > https://lists.sourceforge.net/lists/listinfo/vufind-tech
--
You received this message because you are subscribed to the Google Groups "solrmarc-tech" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sol...@go....
To post to this group, send email to sol...@go....
Visit this group at http://groups.google.com/group/solrmarc-tech.
For more options, visit https://groups.google.com/groups/opt_out.
|
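Illustrative sketch, not part of the original messages: the thread discusses raising or disabling the JAXP entity expansion limit with a Java system property. The small program below tries to reproduce that behaviour in isolation, assuming a JDK whose built-in SAX parser honours the `jdk.xml.entityExpansionLimit` system property (the name used in the Oracle "Processing Limits" tutorial; `entityExpansionLimit`, as passed with `-D` in the thread, is the older spelling of the same setting). The class name `EntityLimitSketch` and its helper methods are hypothetical and have nothing to do with the SolrMarc or VuFind code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; nothing here is taken from SolrMarc or VuFind.
class EntityLimitSketch {
    // Assumption: the JDK's bundled parser reads this property when a parser is created.
    static final String LIMIT_PROPERTY = "jdk.xml.entityExpansionLimit";

    // Builds a tiny document that references one internal entity `references` times.
    static String buildXml(int references) {
        StringBuilder xml = new StringBuilder("<!DOCTYPE r [<!ENTITY e \"x\">]><r>");
        for (int i = 0; i < references; i++) {
            xml.append("&e;");
        }
        return xml.append("</r>").toString();
    }

    // Returns true if the document parses under the given limit, false if the
    // parser rejects it. The previous property value is restored afterwards.
    static boolean parses(int references, String limit) {
        String previous = System.getProperty(LIMIT_PROPERTY);
        System.setProperty(LIMIT_PROPERTY, limit);
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            byte[] bytes = buildXml(references).getBytes(StandardCharsets.UTF_8);
            try {
                // DefaultHandler rethrows fatal errors, so a limit violation
                // surfaces here as a SAXException.
                parser.parse(new ByteArrayInputStream(bytes), new DefaultHandler());
                return true;
            } catch (SAXException rejected) {
                return false;
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("unexpected parser setup or I/O failure", e);
        } finally {
            if (previous == null) {
                System.clearProperty(LIMIT_PROPERTY);
            } else {
                System.setProperty(LIMIT_PROPERTY, previous);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("limit=10, 100 references parsed: " + parses(100, "10"));
        System.out.println("limit=0,  100 references parsed: " + parses(100, "0"));
    }
}
```

On the command line the same setting would be passed as a JVM option, as in the `-DentityExpansionLimit=1000000` workaround reported above. Where exactly that option belongs in import-marc.sh is not spelled out in the thread, so that part is left to the reader.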