Potential bug in XmlBehaviorEventPersisterImp

Help
wismax
2009-05-07
2013-05-09
  • wismax

    wismax - 2009-05-07

    Hello,

    I currently evaluating beet for some applications monitoring that we need. I think that there is a bug in the XML persister when using plain XML mode instead of binary mode.

    Looking at the source in the rotateLog method, I can see that the XMLStreamWriter is not initialized the same way for plain or binary:
    if (binary) {
        StAXDocumentSerializer sds = new StAXDocumentSerializer();
        sds.setOutputStream(stream);
        writer = sds;
        writer.writeStartDocument();
        writer.writeStartElement("event-log");
    } else {
        writer = XMLOutputFactory.newInstance().createXMLStreamWriter(stream);
    }

    If not binary, the document is not started and the start element is missing. So when events are added to this writer, the "event" becomes the root element instead of "event-log" and exceptions are thrown as soon as a second event is added.

    This is a trace of the error I had:

    06 mai 2009 15:21:00 ERROR com.mtgi.analytics.XmlBehaviorEventPersisterImpl - Error persisting events; discarding 5 events without saving
    javax.xml.stream.XMLStreamException: Trying to output second root, <event>
        at com.ctc.wstx.sw.BaseStreamWriter.throwOutputError(BaseStreamWriter.java:1473)
        at com.ctc.wstx.sw.BaseStreamWriter.throwOutputError(BaseStreamWriter.java:1480)
        at com.ctc.wstx.sw.BaseStreamWriter.reportNwfStructure(BaseStreamWriter.java:1508)
        at com.ctc.wstx.sw.BaseNsStreamWriter.checkStartElement(BaseNsStreamWriter.java:444)
        at com.ctc.wstx.sw.BaseNsStreamWriter.writeStartElement(BaseNsStreamWriter.java:292)
        at com.mtgi.analytics.BehaviorEventSerializer.serialize(BehaviorEventSerializer.java:31)
        at com.mtgi.analytics.XmlBehaviorEventPersisterImpl.persist(XmlBehaviorEventPersisterImpl.java:180)
        at com.mtgi.analytics.BehaviorTrackingManagerImpl.flush(BehaviorTrackingManagerImpl.java:226)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.springframework.util.MethodInvoker.invoke(MethodInvoker.java:276)
        at org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean$MethodInvokingJob.executeInternal(MethodInvokingJobDetailFactoryBean.java:260)
        at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:595)

    Everything is working fine when I use binary mode. The application is deployed on a WebLogic 10.1 server with a JDK 1.5.

    Do you think that you can fix this for the next beta ?
    Thank you !

    wis

     
    • Jason Trump

      Jason Trump - 2009-05-07

      thanks for the report, you've definitely spotted a bug there.  i've added a defect report here:
      https://sourceforge.net/tracker/?func=detail&aid=2788435&group_id=258926&atid=1127321

      this will definitely be fixed before the next beta.  looks like the XML stream impl on WebLogic is more strict than the one that ships with the sun SDK, so we missed this issue in the unit tests.

      just curious -- is there a particular reason you're going with text over binary?  i'd also be interested to hear about your overall experience with beet on the WebLogic platform.

      -jt

       
    • wismax

      wismax - 2009-05-08

      Hello,

      Good to know that it will be fixed soon !

      As for the reason I use text over binary, it will only be easier to read directly the events logs during this evaluation phase than using beet-utils to convert them. And it will let me to keep my notepad++ open and see the log grows as the events come. Well that not so important...

      I can say that my overall experience with beet and WebLogic is quite good, the information gathered by beet will be very informative for our application. My workstation is not very powerfull so I can't tell if the beet integration creates a big performance hit or not, do you have any numbers on that subject to share ?

      I have a problem with the bt:http-requests element, it does not work for me. However declaring the filter works fine. Maybe it's a classloading problem, our application is deployed as an EAR, containing a web application, some third-party libraries and our own modules as jars. WebLogic provides a way to deploy common libraries without using manifest entries, you simply put them all in APP-INF/lib. That's where I put the beet jar and it's dependencies. The beet config is in its own file in one of our modules, it's loaded by a spring web application context with the rest of our spring config files. I also tried to put the config file directly in the webapp, but it does not work either.

      I also spent some time to figure out how to register the mbeans in the right mbeans server (not the VM one). Finally I had to do the registration myself (see my spring config below). Maybe there a better way to do that, i'm really interested if you have any idea about it. Maybe you could add an attribute to the bt:manager element to provide the reference to the mbeanserver to use for the registration. Anyway I can view and perform operations on the BeetManager and BeetLog and that's good.

      By the way, it would be nice if you can provide an archive with the sources so that I can link it with the class archive to be able to put some breakpoints. I'm behind a proxy and can't configure Git to download the entire source tree.

      I spent some time browsing the code to understand how beet works behind the scene and I think it's quite nicely done. I have one question about the suspended mode, I understand that it will only prevent the events to be sent to the persister, so events are still created and their data are set with parameters values. The seriaIization of this parameters can be a costly operation (if you have of lot of parameters, or when the SQL query are big). Don't you think that, in order to minimize the performance hit of the monitoring when suspended, it would be better to simply return a null event in the createEvent when suspend mode is true ? Next each client of this method should be responsible to handle a null event and discard all the values analysis part.

      Finally the next step of my evaluation is to check if beet can be integrated in another application which is not completely Spring based. It's an old J2EE 1.4 application that is using some components that are Spring based, but the rest of the application is built with custom factories for services and DAO. For these, I think that the integration will be possible with dynamic proxies. When a factory (which will be configured to be monitored) creates a service, it can wrap the service in a dynamic proxy and handle the event creation/start/stop lifecycle. The big problem is that for some services, we use stateless EJBs... With these, the threadlocale pattern used by beet won't work, unless I find a way to propagate the session context over the EJB calls. Did you already think about this kind of integration ?

      Well another "finally" point. It would be great to have a small and really simple Swing application that can read the log and display the events in a hierachic way. Just to give the final users an idea of the data that can be gathered with beet, something more visually appealing than an XML editor. Well maybe this tool already exists, but I didn't find any talk about that on the beet web site. Actually if we decide to go on with beet, we will probably develop a GUI allowing to display usefull information based on beet events imported in an Oracle DB. I let you know if this happens.

      One last word, thanks for this really nice product and keep up the good work !

      wis

      PS: This is my custom config file for the mbean registration:

          <bean id="mbeanServer" class="org.springframework.jndi.JndiObjectFactoryBean">
              <property name="jndiName" value="java:comp/env/jmx/runtime"/>
          </bean>

          <bean id="assembler" class="org.springframework.jmx.export.assembler.MetadataMBeanInfoAssembler">
              <property name="attributeSource">
                <bean class="org.springframework.jmx.export.annotation.AnnotationJmxAttributeSource"/>
              </property>
          </bean>

          <bean id="exporter" class="org.springframework.jmx.export.MBeanExporter">
              <property name="beans">
                  <map>
                      <entry key="tao-is:name=BeetManager" value-ref="defaultTrackingManager"/>
                      <entry key="tao-is:name=BeetLog" value-ref="btXmlPersister"/>
                  </map>
              </property>
              <property name="assembler" ref="assembler"/>
              <property name="server" ref="mbeanServer"/>
              <property name="autodetectModeName" value="AUTODETECT_MBEAN"/>
          </bean>
                 
          <bt:xml-persister id="btXmlPersister" binary="true" compress="true" file="monitoring/tao-is-perf.xml"/>
         
          <bt:manager
              id="defaultTrackingManager"
              application="tao-is"
              flush-schedule="0/30 * * * * ?"
              persister="btXmlPersister"
              track-method-expression="execution(* ch.vd.dfin.tao.is.business.service..*(..))">
          </bt:manager>

       
    • Jason Trump

      Jason Trump - 2009-05-08

      > I can say that my overall experience with beet and WebLogic is quite good, the information gathered by beet will be very informative for our application. My
      > workstation is not very powerfull so I can't tell if the beet integration creates a big performance hit or not, do you have any numbers on that subject to share ?

      The unit tests include some simple timing tests that execute a large number of cases, both instrumented and uninstrumented, and fail if average overhead exceeds 0.1ms per execution.  They also fail if worst-case overhead exceeds 0.5ms.  These tests aren't a great measure, mostly created to raise an alarm if we're doing anything that grossly impacts performance.  They pass by a comfortable margin both on my workstation:

         Windows XP Professional  5.1.2600 Service Pack 3
         Processor x86 Family 6 Model 13 Stepping 8 GenuineIntel ~1861 Mhz
         Physical Memory    2,048.00 MB

      and on our build server, which is

        Ubuntu 4.2.3
        Intel(R) Xeon(TM) CPU 2.66GHz
        Physical Memory    2048 MB

      It seems to me that there are two types of numbers you'd want:  1) what is the impact of beet on application responsiveness; and 2) what is the total system resource impact of beet.  We've examined question #1 a little bit (we need more), but not really question #2.

      > I have a problem with the bt:http-requests element, it does not work for me.

      Yeah, <bt:http-requests> element only works if beet.jar appears in WEB-INF/lib, because the servlet container must find it there to register request listeners.  We need to make this more clear in the documentation.  If there is some way to tell weblogic to search for taglib jars in APP-INF/lib, that would be another workaround.

      I think in a future release we will offer two deployment models: 

      1) the current beet.jar will be called beet-all.jar for simple single-app deployments
      2) we'll split it into beet-core.jar, beet-web.jar, etc, so that you can put beet-core in APP-INF, beet-web in WEB-INF if you like.  i think this approach is especially beneficial as we attempt to support more monitor types out-of-the-box, for example EJBs.

      > I also spent some time to figure out how to register the mbeans in the right mbeans server (not the VM one).

      I think you found the best way to do it.  Beet configuration is basically doing the same thing behind the scenes anyway.  I was wondering if anyone would need to target an MBean server other than platform default... and there you are:).  Beta 3 focuses a lot on JMX integration, so I'll see about adding an "mbean-server" attribute to <bt:manager> to make this kind of configuration easier.

      > By the way, it would be nice if you can provide an archive with the sources so that I can link it with the class archive to be able to put some breakpoints. I'm
      > behind a proxy and can't configure Git to download the entire source tree.

      A source archive is actually required by SourceForge, I've just been too lazy to put it out there for the betas.  There will be one on SF in the final release, but until then you can find it now in our ivy repository here:

        http://beet.sourceforge.net/ivyrep/mtgi/beet/1.4.0_b2-SNAPSHOT/sources/

      > The seriaIization of this parameters can be a costly operation (if you have of lot of parameters, or when the SQL query are big). Don't you think that, in order to
      > minimize the performance hit of the monitoring when suspended, it would be better to simply return a null event in the createEvent when suspend mode is true ? > Next each client of this method should be responsible to handle a null event and discard all the values analysis part.

      Yes, the approach you describe is clearly much better than the one we have now.  The current JMX controls are very limited, and I didn't want to invest too much in the implementation until I had a better idea what was actually needed. 

      The parameter logging is both a blessing and a curse.  I'd like to offer more control over what data is captured.  I've got a feature in development allowing selective disabling of parameter data capture by event-type.  I'm on the fence about whether or not to include it in the current release, as I'm worried about destabilizing the code.

      > in a dynamic proxy and handle the event creation/start/stop lifecycle. The big problem is that for some services, we use stateless EJBs... With these, the
      > threadlocale pattern used by beet won't work, unless I find a way to propagate the session context over the EJB calls. Did you already think about this kind of
      > integration ?

      For EJBs... we are in need of a stronger SessionContext implementation.  com.mtgi.analytics.JAASSessionContext is a start in that direction, but you lose session-id logging.  I had thought that we could just provide an EJB interceptor that interacts with the manager instance, more or less in the same manner as our current AOP class BehaviorTrackingAdvice does. 

      I'm not sure why stateless EJBs would pose a particular problem for our thread-local event tracking... can you elaborate a little?  EJB 3.1 asynchronous calls will obviously present some challenges, but we're not there yet.

      > Well another "finally" point. It would be great to have a small and really simple Swing application that can read the log and display the events in a hierachic way.
      > Just to give the final users an idea of the data that can be gathered with beet, something more visually appealing than an XML editor. Well maybe this tool
      > already exists, but I didn't find any talk about that on the beet web site. Actually if we decide to go on with beet, we will probably develop a GUI allowing to
      > display usefull information based on beet events imported in an Oracle DB. I let you know if this happens.

      that sounds awesome, and do let me know. 

      you will have slightly better view of the data in beta 3; there's an "mbean-persister" implementation that will allow you to track aggregate statistics from a JMX console.  however this doesn't get you the drill-down behavior you describe, which is a definite long-term goal of the project.

      thanks for the detailed response.  i can give you a credit in the changelog for identifying the fix for issue 2788435 if you want to send me your name via email; it's up to you.

       

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:





No, thanks