Cache, Cookies and more...

2008-07-15
2012-10-08
  • Luc Boudreau

    Luc Boudreau - 2008-07-15

    Hello Olap4j Community =)

    I hereby submit a few optimizations to the XML/A implementation, along with a bug fix for a problem that affected compatibility with complex schemas.

    I'm sorry that this bundles many fixes into the same "commit"; rest assured that I'll try to minimize this in the future. I had to do it this time because these optimizations are closely related.


    *** SOAP caching ***


    We talked a while back about caching of SOAP requests and how this could be done. Well, tadam, here it is. It's a flexible mechanism which uses a common interface and I've provided the first useful implementation along with it. The structure is as follows.

    ====== CODE ======
    public interface IXmlaOlap4jCache {

        public String setParameters(final Map<String,String> config, final Properties props);

        public byte[] get(final String id, final URL url, final byte[] request)
            throws InvalidStateException;

        public void put(final String id, final URL url, final byte[] request, final byte[] response)
            throws InvalidStateException;

        public void flushCache();

    }
    ====== CODE ======

    This fixed many performance issues for me. By default, the XML/A implementation doesn't use it; it falls back to the regular execute-all behavior. I know there is already a cache for metadata, but it wasn't sufficient: using Olap4j with a connection pool closed connections, so the metadata cache was lost, triggering a whole boatload of requests each time.

    To use a cache when creating connections, simply add this parameter to the JDBC connection string:

    ====== CODE ======
    Cache=org.olap4j.driver.xmla.cache.XmlaOlap4jNamedMemoryCache
    ====== CODE ======
    This tells which implementation to use. In this case, it is the default one.

    This implementation is thread safe and stores the caches at the static level, thus enabling cache sharing among threads. It makes use of the java.util.concurrent package and synchronizes all access to its internal cache hashes. Careful attention was given to not synchronizing needlessly big blocks of code, but some more optimization work couldn't hurt, especially in the eviction code.

    You can configure the cache implementation by passing more arguments in the structure "Cache.Property=Value". All those properties will be relayed to the cache implementation for custom configuration.
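
    As a rough illustration of the "Cache.Property=Value" relay described above, here is a minimal sketch. The class name, the method name, the ";" separator and the Server URL are assumptions made for this example; this is not the driver's actual parsing code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of olap4j: splits connection-string
// parameters and collects the "Cache.*" ones for the cache implementation.
class CachePropertyDemo {

    // Returns the "Cache.X=Y" parameters as a map of X -> Y.
    static Map<String, String> cacheProperties(String params) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : params.split(";")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // not a key=value pair
            }
            String key = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            // "Cache" itself names the implementation; only "Cache.*" is relayed.
            if (key.startsWith("Cache.")) {
                result.put(key.substring("Cache.".length()), value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String params = "Server=http://localhost:8080/xmla;"
            + "Cache=org.olap4j.driver.xmla.cache.XmlaOlap4jNamedMemoryCache;"
            + "Cache.Name=MyCacheSpaceName;Cache.Size=50";
        System.out.println(cacheProperties(params));
    }
}
```

    The resulting map is the kind of Map<String,String> that could be handed to setParameters().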

    The NamedMemoryCache uses the following properties

    ====== CODE ======
    Cache.Name=MyCacheSpaceName
    ====== CODE ======
    This defines a cache space in memory under which the cached requests will be stored. Using the same name in many connections will share this cache space as long as the same JVM is used, since caches are at the static level. It is thread safe and should not cause too much overhead.
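
    The sharing of a named cache space within one JVM could look roughly like the sketch below. The class and method names are hypothetical, and the real XmlaOlap4jNamedMemoryCache may be organized differently; the point is only that a static map keyed by name gives every connection the same store.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the actual XmlaOlap4jNamedMemoryCache code:
// cache spaces live in a static map, so every connection in the same JVM
// that asks for the same name gets the same underlying store.
class NamedCacheSpaces {

    // cache space name -> (request key -> cached response)
    private static final Map<String, Map<String, byte[]>> SPACES =
        new ConcurrentHashMap<>();

    // Returns the cache space for a name, creating it on first use.
    static Map<String, byte[]> space(String name) {
        return SPACES.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
    }

    public static void main(String[] args) {
        Map<String, byte[]> first = space("MyCacheSpaceName");
        Map<String, byte[]> second = space("MyCacheSpaceName");
        first.put("request-1", new byte[] {1, 2, 3});
        // Both references point at the same shared space.
        System.out.println(first == second);
        System.out.println(second.containsKey("request-1"));
    }
}
```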

    If no cache name is given, a random value will be used. This value is then given back as a return value of the setParameters() method, as described in IXmlaOlap4jCache.

    ====== CODE ======
    Cache.Size=50
    ====== CODE ======
    This is the maximum number of elements to maintain in the cache. If the limit is reached, eviction will be done according to the chosen eviction mode. This value has to be a positive integer.

    ====== CODE ======
    Cache.Timeout=300
    ====== CODE ======
    This is the TTL of cached elements. It is a value given in seconds and the expired elements are cleaned at each cache access to minimize memory usage. This value has to be a positive integer.
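
    A minimal sketch of this timeout behavior, assuming entries are stamped when stored and purged on every access. All names are hypothetical, and the current time is passed in explicitly only to keep the example easy to test.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of timeout handling; not the driver's actual code.
class TimeoutCacheSketch {

    private static final class Entry {
        final byte[] response;
        final long storedAtMillis;

        Entry(byte[] response, long storedAtMillis) {
            this.response = response;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long timeoutMillis;

    TimeoutCacheSketch(int timeoutSeconds) {
        if (timeoutSeconds <= 0) {
            throw new IllegalArgumentException(
                "Timeout must be a positive integer");
        }
        this.timeoutMillis = timeoutSeconds * 1000L;
    }

    // Stores a response, stamping it with the given time.
    synchronized void put(String key, byte[] response, long nowMillis) {
        purgeExpired(nowMillis);
        entries.put(key, new Entry(response, nowMillis));
    }

    // Returns the cached response, or null if absent or expired.
    synchronized byte[] get(String key, long nowMillis) {
        purgeExpired(nowMillis);
        Entry entry = entries.get(key);
        return entry == null ? null : entry.response;
    }

    // Drops every entry at least as old as the timeout; runs on each access.
    private void purgeExpired(long nowMillis) {
        entries.values().removeIf(
            e -> nowMillis - e.storedAtMillis >= timeoutMillis);
    }
}
```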

    ====== CODE ======
    Cache.Mode=LFU
    ====== CODE ======
    This defines the eviction mode to use. By default, the Least Frequently Used mode is used. This means that if elements need to be evicted because the Size parameter has been reached, the element which has been the least used will be evicted.

    The supported eviction modes are :

    LFU : Least frequently used
    MFU : Most frequently used
    LIFO : Last in first out
    FIFO : First in first out
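
    To make the four modes concrete, here is a small sketch of how a victim could be chosen once the Size limit is reached. The Entry fields and the tie-breaking (first match wins) are assumptions for illustration; the actual eviction code may differ.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;

// Hypothetical sketch of victim selection; not the driver's actual code.
class EvictionSketch {

    enum Mode { LFU, MFU, LIFO, FIFO }

    static final class Entry {
        final String key;
        final long insertOrder; // grows with each insertion
        final long hits;        // number of times the entry was read

        Entry(String key, long insertOrder, long hits) {
            this.key = key;
            this.insertOrder = insertOrder;
            this.hits = hits;
        }
    }

    // Picks the entry to evict; entries must not be empty.
    static Entry victim(Collection<Entry> entries, Mode mode) {
        Comparator<Entry> byHits = Comparator.comparingLong(e -> e.hits);
        Comparator<Entry> byInsertOrder =
            Comparator.comparingLong(e -> e.insertOrder);
        switch (mode) {
        case LFU:
            return Collections.min(entries, byHits);        // least used
        case MFU:
            return Collections.max(entries, byHits);        // most used
        case LIFO:
            return Collections.max(entries, byInsertOrder); // newest
        case FIFO:
            return Collections.min(entries, byInsertOrder); // oldest
        default:
            throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        Collection<Entry> entries = Arrays.asList(
            new Entry("a", 1, 5), new Entry("b", 2, 1), new Entry("c", 3, 9));
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " evicts " + victim(entries, mode).key);
        }
    }
}
```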

    I've included test cases, but since it's pretty hard to test an HTTP proxy without creating a custom web service which gets queried and returns arbitrary values, only configuration issues are tested. There is room for better testing here, but since major refactoring was required, I decided to push this to later commits.

    Another thing to note is that I've had to create a subinterface of Proxy, CachedProxy, which adds a simple cache management function. The HttpProxy now extends an abstract CachedProxy class which manages the cache. This will help further development of proxies by giving helper methods to subclasses.


    ** Cookies **


    The XML/A driver now supports HTTP cookies. I've added this because, for each request, the end-point would create a new user session. This put a lot of stress on the service end-point and it was pretty easy to fix. The cookies are stored per proxy instance and are not shared among threads or connections. Each connection has its own cookies, which is the safest behavior.

    The cookies are validated by domain and expiration, so it conforms to the standard cookie mechanism. There is a CookieManager class that does all this.
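
    For readers unfamiliar with these two checks, here is a minimal sketch built on the JDK's java.net.HttpCookie. It is not the CookieManager class from this commit, which may work differently; the class name, method name and sample cookie are made up for the example.

```java
import java.net.HttpCookie;
import java.util.List;

// Hypothetical sketch using java.net.HttpCookie; the driver's own
// CookieManager class may work differently.
class CookieCheckSketch {

    // Decides whether a stored cookie should be sent to the given host.
    static boolean shouldSend(HttpCookie cookie, String host) {
        if (cookie.hasExpired()) {
            return false;
        }
        String domain = cookie.getDomain();
        // A cookie without a Domain attribute is treated here as host-only;
        // the caller is assumed to have stored it for this host.
        return domain == null || HttpCookie.domainMatches(domain, host);
    }

    public static void main(String[] args) {
        List<HttpCookie> cookies = HttpCookie.parse(
            "Set-Cookie: JSESSIONID=abc123; Domain=.example.com; Max-Age=300");
        HttpCookie session = cookies.get(0);
        System.out.println(shouldSend(session, "olap.example.com"));
        System.out.println(shouldSend(session, "olap.other.org"));
    }
}
```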

    I've put it in a special package, org.olap4j.driver.xmla.proxy. In the future, this package could be the namespace for all proxy-related classes. Since this commit has widened the Proxy functions, we will have to move all Proxy classes and interfaces into this package for my own sanity. For the moment, all proxy classes are defined as inner classes of the driver class, which is a no-no at this scale.

    Test cases are included in the commit.


    ** HTTP Headers **


    There was another small fix for better "good citizen" behavior. These changes don't affect anything special, but they make Olap4j requests valid through security systems like ModSecurity. Keeping the Olap4j XML/A driver compliant is important since its sole role is to integrate systems...

    The HttpProxy now sends the User-Agent header as "Olap4j([version])". There's not much to be said about that. =)

    The Accept header is now sent as well. It tells the end-point that only XML messages will be accepted, and the text/xml priority is set to 1.

    The Accept-Charset header is sent and uses the value defined in getEncodingCharsetName with a priority of 1.
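
    As an illustration only, the three headers could be set on a java.net.URLConnection like this. The exact header strings (in particular the ";q=1" quality syntax), the endpoint URL and the version placeholder are assumptions for the example, not the values hard-coded in HttpProxy.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical sketch; header values are illustrative, not the driver's
// exact strings.
class RequestHeaderSketch {

    // Prepares (but does not open) a connection carrying the three headers.
    static URLConnection prepare(String endpoint, String version, String charset) {
        try {
            URLConnection connection =
                URI.create(endpoint).toURL().openConnection();
            connection.setRequestProperty(
                "User-Agent", "Olap4j(" + version + ")");
            // Only XML responses are acceptable, with priority 1.
            connection.setRequestProperty("Accept", "text/xml;q=1");
            connection.setRequestProperty(
                "Accept-Charset", charset + ";q=1");
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "x.y.z" is a placeholder version; no request is actually sent.
        URLConnection c = prepare("http://localhost:8080/xmla", "x.y.z", "UTF-8");
        System.out.println(c.getRequestProperty("User-Agent"));
        System.out.println(c.getRequestProperty("Accept"));
        System.out.println(c.getRequestProperty("Accept-Charset"));
    }
}
```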


    ** Bugfixes **


    There was a not-so-hidden bug in the metadata logic. The old code used the catalog name as the schema name. Creating a schema whose name was different from the catalog name triggered an error on the server side.

    Usually, the schema name would be queried by the DBSCHEMA_CATALOGS metadata query, but for some reason, Microsoft decided not to conform to the XML/A specification and not to return the SCHEMA_NAME column. This is dumb, I agree, but we have to live with it. The workaround is to query for cubes via MDSCHEMA_CUBES, then find the correct catalog and fetch the schema name there. This works seamlessly with Mondrian and whatever XML/A provider. The catalog parameter is mandatory for the XML/A driver, so it doesn't change anything for the end-user. The API stays the same and everyone lives happily ever after.
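
    The lookup step of this workaround can be sketched as follows. The rows stand in for an already parsed MDSCHEMA_CUBES rowset; representing them as maps, and the class and method names, are assumptions made for this example rather than the handler's real code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rows stand in for a parsed MDSCHEMA_CUBES rowset.
class SchemaLookupSketch {

    // Returns the schema name of the first cube row that belongs to the
    // given catalog, or null if no row matches.
    static String schemaNameFor(
        String catalog, List<Map<String, String>> cubeRows)
    {
        for (Map<String, String> row : cubeRows) {
            if (catalog.equals(row.get("CATALOG_NAME"))) {
                return row.get("SCHEMA_NAME");
            }
        }
        return null;
    }

    // Builds one illustrative rowset row.
    static Map<String, String> row(String catalog, String schema, String cube) {
        Map<String, String> row = new HashMap<>();
        row.put("CATALOG_NAME", catalog);
        row.put("SCHEMA_NAME", schema);
        row.put("CUBE_NAME", cube);
        return row;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = Arrays.asList(
            row("CatalogA", "SchemaOne", "Sales"),
            row("CatalogB", "SchemaTwo", "Inventory"));
        System.out.println(schemaNameFor("CatalogB", rows));
    }
}
```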

    I've created a new handler for this metadata request in the Connection object since the old behavior used the catalog handler even if the information we needed was the schema name.

    That's about it. Now, should I create a package to let people evaluate those (numerous) fixes, or do I commit on trunk or on a branch?

     
    • Luc Boudreau

      Luc Boudreau - 2008-07-18

      OK, everything committed as of SVN revision 098. Please note that the MondrianInprocProxy, which is included in the Mondrian project, will have to be modified in order to support the new package.

       
      • Julian Hyde

        Julian Hyde - 2008-07-18

        Thanks for the heads up Luc. I am going to fix mondrian's main line, but note that mondrian-3.0.4, which will be released from a branch in the next week or so, will not include this change. Mondrian-3.0.4 will be based on olap4j-0.9.5.076.

        Julian

         
    • Luc Boudreau

      Luc Boudreau - 2008-07-15
       
    • Julian Hyde

      Julian Hyde - 2008-07-15

      First of all, let me get the usual comments about coding style out of the way. Don't use tabs, line length 80 maximum, braces should be on ends of lines unless you've broken a long parameter list.

      Quite a few javadoc comments start with the superfluous word 'This ...'. 'This fetches...' should read 'Fetches...', and 'This is the standard API for the XMLA driver cache' should read 'XMLA driver cache'.

      Please add overview.html in new packages org/olap4j/driver/xmla/proxy and org/olap4j/driver/xmla/cache. New top-level classes need copyright notice, '@version $Id:$' string & footer comment.

      Now the code comments.

      The cookie fix is much needed. Thanks for this.

      Likewise the fix to deconfuse schemas & catalogs. I'd done most of my development against mondrian, as opposed to SSAS, and it shows.

      The SOAP cache is a great idea - and I like how you have integrated it with the proxy framework.

      If it makes sense, move all of the Proxy interfaces & classes into the Proxy package. Proxy was initially within XmlaOlap4jDriver because I used it only for testing.

      For setParameters, is it possible to use a Map<String,String> instead of a Properties object? Properties is a bit outdated (based on Hashtable<Object,Object>).

      Can you rename IXmlaOlap4jCache to XmlaOlap4jCache? I've used the 'I' prefix to indicate interfaces on other projects, but we don't use it on this one.

      Looks like a good set of tests. I'm hoping that the tests will 'just run' - is that true? Is it worth planning to run the entire olap4j suite with and without caching?

      XmlaOlap4jConnection.randomSeed - better to use java.util.UUID. Random number generators don't guarantee that they won't generate the same number consecutively - although it's very unlikely.

      We may have some problems with java.util.concurrent and UUID on jdk 1.4. But let me worry about that one.

      Great work Luc. Once you've addressed the above concerns, go ahead and check in.

      Julian

       
      • Luc Boudreau

        Luc Boudreau - 2008-07-15

        > If it makes sense, move all of the Proxy interfaces & classes (...)because I used it only for testing.

        Will do.

        > For setParameters, is it possible to use a Map<String,String> (...)

        Sure. Whatever.

        > Can you rename IXmlaOlap4jCache to XmlaOlap4jCache?

        Consider it done.

        > Looks like a good set of tests. I'm hoping that the tests will 'just run' - is that true?

        Err, I don't get it. Of course they run. I MADE THEM ! :P

        > Is it worth planning to run the entire olap4j suite with and without caching?

        Well, since the previous test suite used MondrianInProcProxy instead of HttpProxy, I preferred to keep the current test suite as-is and simply test the HTTP proxy against its own suite of tests. It's easier to test this component alone since its role is simple and well defined and doesn't have anything to do with higher levels of Olap4j. The proxy is used for SOAP messages without any dependency on its message (thus the olap4j part).

        I'm planning to extend the current HttpProxy tests with a URLConnection stub, as I used with the Cookie manager tests. This could be possible and would create more extended tests for the proxy classes.
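
        To give an idea of what such a stub might look like, here is a minimal sketch. It is not the stub used in the Cookie manager tests; the class name, the factory method and the canned cookie value are all invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical test stub; not the stub used in the actual olap4j tests.
class StubUrlConnection extends URLConnection {

    private final byte[] cannedResponse;

    StubUrlConnection(URL url, String cannedResponse) {
        super(url);
        this.cannedResponse = cannedResponse.getBytes(StandardCharsets.UTF_8);
    }

    // Convenience factory that hides the checked URL exception.
    static StubUrlConnection of(String endpoint, String cannedResponse) {
        try {
            return new StubUrlConnection(
                URI.create(endpoint).toURL(), cannedResponse);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    @Override
    public void connect() {
        connected = true; // nothing to open: no network traffic happens
    }

    @Override
    public ByteArrayInputStream getInputStream() {
        // Always serves the canned body, whatever was requested.
        return new ByteArrayInputStream(cannedResponse);
    }

    @Override
    public String getHeaderField(String name) {
        // Pretend the server always sets a session cookie.
        return "Set-Cookie".equalsIgnoreCase(name) ? "JSESSIONID=stub" : null;
    }
}
```

        A proxy under test could then be handed this connection instead of a real one, so the response body and cookies are fully controlled by the test.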

        > XmlaOlap4jConnection.randomSeed - better to use java.util.UUID

        This is a vestige from previous attempts and has been removed this morning. Sorry it stayed there; it isn't anymore.

        > We may have some problems with java.util.concurrent

        You're right. But before tackling this, I'd like someone to peer review it. Concurrent code is not my mojo and I don't feel very confident about this part. For now, the cache is optional, so I don't mind committing it. But still, after proper peer review, we might end up without any classes of this package, thus the problem would be solved.

         
        • Julian Hyde

          Julian Hyde - 2008-07-15

          > > Julian:
          > > Looks like a good set of tests. I'm hoping that the tests
          > > will 'just run' - is that true?

          > Luc:
          > Err, I don't get it. Of course they run. I MADE THEM ! :P

          I was thinking of environmental issues. Depending on what you have in your test.properties file, the olap4j suite will talk to a different server, even use a different driver. And if I haven't put the right things in test.properties it won't run at all.

          I was just checking that these tests are independent of that kind of environmental stuff.

          > > We may have some problems with java.util.concurrent

          > You're right. But before tackling this, I'd like someone to peer
          > review it. Concurrent code is not my mojo and I don't feel very
          > confident about this part. For now, the cache is optional, so I
          > don't mind committing it. But still, after proper peer review, we
          > might end up without any classes of this package, thus the problem
          > would be solved.

          Go ahead and check in. We can review after you have checked in. Your feature is optional, so there's always a workaround if you've broken something.

          Julian

           
          • Luc Boudreau

            Luc Boudreau - 2008-07-15

            > I was just checking that these tests are independent of that kind of environmental stuff.

            They are independent and don't even use the TestContext object.

             
