Hello CCGers,
I'm relatively new to OpenCCG (and NLP, and computers in general...). I hope you can help me out with an error I received while using ccg-grammardoc on the grammar.xml file of Trevor Benjamin and Geert-Jan Kruijff's Moloko grammar. It successfully generates index.html and lexicon.html, but gets hung up on morph.html for a few seconds, whereupon it crashes with the following stack trace:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2219)
at java.util.Vector.grow(Vector.java:262)
at java.util.Vector.ensureCapacityHelper(Vector.java:242)
at java.util.Vector.addElement(Vector.java:616)
at com.sun.org.apache.xml.internal.dtm.ref.sax2dtm.SAX2DTM2.startElement(SAX2DTM2.java:2172)
at com.sun.org.apache.xalan.internal.xsltc.dom.SAXImpl.startElement(SAXImpl.java:921)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:509)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:378)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2770)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager.getDTM(XSLTCDTMManager.java:458)
at com.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager.getDTM(XSLTCDTMManager.java:252)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.getDOM(TransformerImpl.java:565)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.retrieveDocument(TransformerImpl.java:1319)
at com.sun.org.apache.xalan.internal.xsltc.dom.LoadDocument.document(LoadDocument.java:199)
at com.sun.org.apache.xalan.internal.xsltc.dom.LoadDocument.document(LoadDocument.java:157)
at com.sun.org.apache.xalan.internal.xsltc.dom.LoadDocument.documentF(LoadDocument.java:140)
at GregorSamsa.http$colon$$slash$$slash$www$dot$w3$dot$org$slash$1999$slash$xhtml$colon$template$dot$23()
at GregorSamsa.applyTemplates()
at GregorSamsa.http$colon$$slash$$slash$www$dot$w3$dot$org$slash$1999$slash$xhtml$colon$template$dot$22()
at GregorSamsa.applyTemplates1()
at GregorSamsa.http$colon$$slash$$slash$www$dot$w3$dot$org$slash$1999$slash$xhtml$colon$template$dot$4()
at GregorSamsa.applyTemplates()
at GregorSamsa.applyTemplates()
at GregorSamsa.transform()
at com.sun.org.apache.xalan.internal.xsltc.runtime.AbstractTranslet.transform(AbstractTranslet.java:617)
I did a Runtime.getRuntime().totalMemory(), which returned 87.5 MB. Any ideas as to what might be causing this?
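In case it's useful for diagnosis: the trace suggests the transform is building the whole document in memory, so here's a rough way to gauge how many elements are in one of the grammar's XML files without loading the document whole. This is just a generic sketch, not OpenCCG code; the inline XML is a stand-in for the real file:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class ElementCounter {
    // Stream the XML with SAX and count start tags; memory use stays flat
    // no matter how large the file is.
    public static long countElements(InputStream in) throws Exception {
        final long[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(in, new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                count[0]++;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real grammar file, e.g. new FileInputStream(...)
        String xml = "<morph><entry word=\"a\"/><entry word=\"b\"/></morph>";
        System.out.println(countElements(new ByteArrayInputStream(xml.getBytes("UTF-8")))); // prints 3
    }
}
```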
Thanks,
Michael
Last edit: Michael Lane 2014-06-09
Thanks for the quick reply! I did what you said, first changing the line to JAVA_MEM="-Xmx4g", then to JAVA_MEM="-Xmx2g", in case 4g was "too large". Neither worked. In fact, the stack trace has now become:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.sun.org.apache.xerces.internal.xni.XMLString.toString(XMLString.java:188)
at com.sun.org.apache.xerces.internal.util.XMLAttributesImpl.getValue(XMLAttributesImpl.java:541)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser$AttributesProxy.getValue(AbstractSAXParser.java:2321)
at com.sun.org.apache.xml.internal.dtm.ref.sax2dtm.SAX2DTM2.startElement(SAX2DTM2.java:2143)
at com.sun.org.apache.xalan.internal.xsltc.dom.SAXImpl.startElement(SAXImpl.java:921)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:509)
at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(AbstractXMLDocumentParser.java:182)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:355)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2770)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager.getDTM(XSLTCDTMManager.java:458)
at com.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager.getDTM(XSLTCDTMManager.java:252)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.getDOM(TransformerImpl.java:565)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.retrieveDocument(TransformerImpl.java:1319)
at com.sun.org.apache.xalan.internal.xsltc.dom.LoadDocument.document(LoadDocument.java:199)
at com.sun.org.apache.xalan.internal.xsltc.dom.LoadDocument.document(LoadDocument.java:157)
at com.sun.org.apache.xalan.internal.xsltc.dom.LoadDocument.documentF(LoadDocument.java:140)
at GregorSamsa.http$colon$$slash$$slash$www$dot$w3$dot$org$slash$1999$slash$xhtml$colon$template$dot$23()
at GregorSamsa.applyTemplates()
at GregorSamsa.http$colon$$slash$$slash$www$dot$w3$dot$org$slash$1999$slash$xhtml$colon$template$dot$22()
at GregorSamsa.applyTemplates1()
at GregorSamsa.http$colon$$slash$$slash$www$dot$w3$dot$org$slash$1999$slash$xhtml$colon$template$dot$4()
at GregorSamsa.applyTemplates()
at GregorSamsa.applyTemplates()
at GregorSamsa.transform()
at com.sun.org.apache.xalan.internal.xsltc.runtime.AbstractTranslet.transform(AbstractTranslet.java:617)
Also, I just did a Runtime.getRuntime().maxMemory(), which returns the maximum memory allowance for the JRE, rather than .totalMemory(), which returns the amount of heap currently allocated. The former is 1.27 GB, even after both memory alterations in ccg-env.bat.
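For reference, the check I ran looks roughly like this (a minimal sketch, not from the OpenCCG sources):

```java
public class MemCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();     // ceiling the heap may grow to (-Xmx, or a platform default)
        long total = rt.totalMemory(); // heap currently reserved from the OS
        long free = rt.freeMemory();   // unused part of the reserved heap
        System.out.printf("max=%.1f MB, total=%.1f MB, used=%.1f MB%n",
                max / 1048576.0, total / 1048576.0, (total - free) / 1048576.0);
    }
}
```

If -Xmx is taking effect, maxMemory() should track it; totalMemory() only shows what the heap has grown to so far.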
Thanks a bunch,
Michael
Sorry, I misspoke. I'm running Linux, not Windows. Earlier, I meant ccg-env, not ccg-env.bat.
I did what those guys at stack overflow suggested. In particular, I changed the line in ccg-env to
JAVA_MEM="-Xmx2g -XX:-UseGCOverheadLimit"
and then to
JAVA_MEM="-Xmx2g -XX:-UseConcMarkSweepGC",
because there was some discussion that options with the -XX prefix are "highly unstable" and "highly prone to change without notice", and those were the only two options I found. The first option disables the GC overhead limit, which throws an OutOfMemoryError when the JVM spends the vast majority of its time on garbage collection; the second switches to a different garbage collector that reportedly avoids the same error. In the first case, the only effect was that I got the first stack trace (the "Java heap space" one) when I tried to use ccg-grammardoc. In the second case, there was no effect: I got the second stack trace again.
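In case it helps anyone reproduce this: a quick way to confirm which options actually reach the JVM (rather than getting lost somewhere in the scripts) is to print the arguments the running JVM received. This is a generic sketch using the standard management API, nothing OpenCCG-specific:

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmArgs {
    public static void main(String[] args) {
        // Arguments handed to the JVM itself (e.g. -Xmx2g, -XX:-UseGCOverheadLimit);
        // program arguments are not included.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        for (String a : jvmArgs) {
            System.out.println(a);
        }
    }
}
```

If -Xmx2g doesn't show up in that list when launched through the ccg scripts, the setting isn't being applied.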
Unrelatedly, do I need to run ccg-build on $OPENCCG_HOME every time I change something in $OPENCCG_HOME/bin? Because nothing changes in the echoed output when I change the values of those variables.
Finally, I don't know if this is interesting to you, but I found this. As a highly impressionable undergrad I find it intriguing. Perhaps there's something wrong with ccg-grammardoc?
Thanks,
Michael
Last edit: Michael Lane 2014-06-09
Looks like you need to increase the memory limit. See the "Increasing Java memory limit" section of the README.
Hi Michael,
Hmm, I've never seen that particular error. Are you running on Windows? ccg-env.bat is for Windows; ccg-env is for Unix.
There's some discussion of related JVM options here: http://stackoverflow.com/questions/5839359/java-lang-outofmemoryerror-gc-overhead-limit-exceeded
Hmm, I'm not sure what might be going on here. I just tried ccg-grammardoc in grammars/routes/ and it worked fine. This one may remain a mystery.
The ccg-env script is invoked by the other scripts in bin, so its environment changes are not intended to persist (and thus there's no need to run ccg-build).
I'll try it on a different machine later tonight and get back to you.
Thanks for your help.