Re: [opennms-discuss] 1.3.3-1 java.lang.OutOfMemoryError: Java heapspace

Paul,

Please see the question in the last paragraph of my original response to
this thread regarding RRD data storage.  Let us know what you find.  
Here's what I wrote before:

> Have you verified that the disks are still keeping up with RRD 
> data storage and have you looked at the RRD queue 
> statistics--see my previous message to this list for details.

David Hustace has also asked in another email for the full exception
that you are seeing in the logs, but I haven't seen a response from you
on that.

To help you diagnose this, we need you to provide the information we are
asking for.  Other than RRD queueing-related out-of-memory errors, we
really don't see OutOfMemory errors regularly, so we need details on
what is happening to try to figure out the cause.

        - djg

On Sun, 24 Jun 2007 05:46:02 -0700, "Paul Mona" <pm...@co...>
said:
> 
> This was happening every 4 hours with linkd enabled.  Since disabling
> linkd, 14 hours have passed without an error.
> 
> |-----Original Message-----
> |From: ope...@li... 
> |[mailto:ope...@li...] On 
> |Behalf Of DJ Gregor
> |Sent: Saturday, June 23, 2007 9:48 AM
> |To: General OpenNMS Discussion
> |Subject: Re: [opennms-discuss] 1.3.3-1 
> |java.lang.OutOfMemoryError: Java heapspace
> |
> |On Sat, 23 Jun 2007 06:46:59 -0700, "Paul Mona" <pm...@co...>
> |said:
> |> While running the 1.3.3-1 release, I've seen a number of
> |> "OutOfMemoryError" exceptions happen.  This was also very prevalent
> |> in previous releases.  To avoid this, we have deployed multiple disks
> |> in our boxes with a raid0 array dedicated solely to writing rrd data.
> |> But still the problem occurs.
> |>
> |> In a past thread, David wrote:
> |> [ additional quoting added ]
> |> > This I suspect to be the problem.  I've recently seen this at
> |> > another site.  We have a poor implementation of a call used in
> |> > Hibernate to load up all a node's data in the collector because we
> |> > currently have no transaction boundary in the collector code (I
> |> > know, lot of mumbo jumbo).  So when a node has a *lot* of
> |> > interfaces like these, then there is the potential for an
> |> > extraordinary amount of memory to be used when, if implemented
> |> > correctly, we wouldn't have this problem.  We're going to have to
> |> > address it before we release 1.3.3.
> |>
> |> Does anyone know if this has been addressed?
> |
> |"PostgreSQL JDBC driver runs out of memory when 
> |NodeDaoHibernate.getHierarchy is called on a node with many interfaces"
> |http://bugzilla.opennms.org//show_bug.cgi?id=1888
> |
> |A patch has been applied and is in 1.3.3 that eliminates the 
> |complex query that returns a very large number of rows.
> |
> |If you were running into bug #1888, you would see that the JVM 
> |would temporarily run out of memory when it was calling 
> |NodeDaoHibernate.getHierarchy on nodes with a large number of 
> |interfaces.  I think we were seeing this in cases where a node
> |has hundreds of interfaces.  The memory would end up getting
> |freed once the NodeDaoHibernate.getHierarchy call failed, and 
> |OpenNMS would generally continue to run okay (I believe).  You 
> |would see things like this in the logs (collectd.log, I think):
> |
> |    org.postgresql.util.PSQLException: Ran out of memory retrieving
> |    query results.
> |    org.opennms.netmgt.dao.hibernate.NodeDaoHibernate.getHierarchy
> |
> |The latter line would be part of an exception stack trace, 
> |usually from the stack trace of the PSQLException shown above.
> |
> |If you aren't getting those errors, then you aren't running
> |into this problem, and since the problem you are seeing
> |appears to be permanent, and not temporary, I would suspect
> |something else.
> |
> |Have you verified that the disks are still keeping up with RRD 
> |data storage and have you looked at the RRD queue 
> |statistics--see my previous message to this list for details.
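
The log check described in the quoted message above could be sketched
roughly as follows.  This is only an illustration, not an OpenNMS tool:
the function names are made up for this example, and it assumes each of
the quoted strings appears on a single line in the log (the
PSQLException text is wrapped across two lines in the email above).

```python
from typing import Dict, Iterable

# Strings quoted in the message above.  Assumption: each one shows up
# on a single line of the log file.
PSQL_OOM = "Ran out of memory retrieving query results"
GET_HIERARCHY = "NodeDaoHibernate.getHierarchy"
JVM_OOM = "java.lang.OutOfMemoryError"


def count_signatures(lines: Iterable[str]) -> Dict[str, int]:
    """Count how many lines contain each of the three signatures."""
    counts = {"psql_oom": 0, "get_hierarchy": 0, "jvm_oom": 0}
    for line in lines:
        if PSQL_OOM in line:
            counts["psql_oom"] += 1
        if GET_HIERARCHY in line:
            counts["get_hierarchy"] += 1
        if JVM_OOM in line:
            counts["jvm_oom"] += 1
    return counts


def looks_like_bug_1888(counts: Dict[str, int]) -> bool:
    """True if both bug #1888 signatures were seen at least once."""
    return counts["psql_oom"] > 0 and counts["get_hierarchy"] > 0


def scan_file(path: str) -> Dict[str, int]:
    """Scan one log file; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_signatures(handle)
```

Running scan_file() on collectd.log (wherever that lives on your
install) and passing the result to looks_like_bug_1888() gives a quick
yes/no on the bug #1888 pattern.  A nonzero "jvm_oom" count without the
other two would point at something else, such as the RRD queue.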
