Adding xpath CONTAINS

Help
2012-03-07
2014-02-19
  • Hi Bigdata devs,

    I was trying out a SPARQL query using the CONTAINS function on my local endpoint. However, it looks like this is not yet implemented in 1.1. So I thought, optimistically, that this would be a nice little fix for a starter to do.

    So when I looked at the stacktrace I saw that the function was not registered in the FunctionRegistry.
    unknown function: http://www.w3.org/2005/xpath-functions#contains
            at com.bigdata.rdf.sparql.ast.FunctionRegistry.toVE(FunctionRegistry.java:932)

    I thought that this would be very similar to the REGEX function so I started from there.
    So I added a ContainsBOp and registered it. And wrote a junit test which was adapted from TestSubstrBOp.
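
To illustrate the failure mode described above, here is a minimal sketch of a URI-keyed function registry that rejects unregistered functions. The class `FunctionRegistrySketch` and its methods are hypothetical and are not the Bigdata `FunctionRegistry` API; only the function URI comes from the stack trace, and plain `String.contains` stands in for the real operator.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical stand-in for a function registry; NOT the Bigdata FunctionRegistry API.
final class FunctionRegistrySketch {
    // Function URI taken from the "unknown function" message in the stack trace.
    static final String CONTAINS = "http://www.w3.org/2005/xpath-functions#contains";

    private final Map<String, BiFunction<String, String, Boolean>> functions = new HashMap<>();

    // Associates a function URI with an implementation.
    void register(String uri, BiFunction<String, String, Boolean> fn) {
        functions.put(uri, fn);
    }

    // Looks up the function by URI and applies it; fails if nothing is registered.
    boolean call(String uri, String arg1, String arg2) {
        BiFunction<String, String, Boolean> fn = functions.get(uri);
        if (fn == null) {
            throw new IllegalArgumentException("unknown function: " + uri);
        }
        return fn.apply(arg1, arg2);
    }

    public static void main(String[] args) {
        FunctionRegistrySketch registry = new FunctionRegistrySketch();
        // Assumption: a simple substring test is an adequate model of CONTAINS here.
        registry.register(CONTAINS, (s, sub) -> s.contains(sub));
        System.out.println(registry.call(CONTAINS, "foobar", "oba"));
    }
}
```

Calling `call` before `register` reproduces the shape of the error: the lookup finds nothing for the URI and throws.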

    However, I am having trouble running the JUnit tests either in Eclipse or via ant. In Eclipse they throw this:
    java.lang.IllegalStateException: testClass: property not defined, could not configure delegate.
    at com.bigdata.rdf.store.ProxyTestCase.getOurDelegate(ProxyTestCase.java:107)
    at com.bigdata.rdf.store.ProxyTestCase.setUp(ProxyTestCase.java:149)
    at junit.framework.TestCase.runBare(TestCase.java:125)

    And in ant everything seems to hang when ZooKeeper is launched,
    most likely due to errors in finding a Jini service.
    stopLookup:
          java -Dapp.home=/scratch/BIGDATA_RELEASE_1_1_0 -Djini.lib=/scratch/BIGDATA_RELEASE_1_1_0/dist/bigdata/lib -Djini.lib.dl=/scratch/BIGDATA_RELEASE_1_1_0/dist/bigdata/lib-dl -Djava.security.policy=/scratch/BIGDATA_RELEASE_1_1_0/dist/bigdata/var/config/policy/policy.all -Dlog4j.configuration=resources/logging/log4j.properties -Djava.net.preferIPv4Stack=true -Dbigdata.fedname=bigdata.test.group-lin-073 -Ddefault.nic=${default.nic} -jar /scratch/BIGDATA_RELEASE_1_1_0/bigdata-test/lib/lookupstarter.jar -stop
        
          WARN : 0      main com.bigdata.service.jini.util.LookupStarter.waitForLookupServiceDiscovery(LookupStarter.java:312): NO lookup services discovered
          FAILED to discover lookup service

    stopHttpd:
          java -jar /scratch/BIGDATA_RELEASE_1_1_0/dist/bigdata/lib/classserver.jar -port 23333 -dir /scratch/BIGDATA_RELEASE_1_1_0/dist/bigdata/lib-dl -stop
          Mar 7, 2012 3:20:47 PM com.sun.jini.tool.ClassServer main
          WARNING: requesting shutdown
          java.net.ConnectException: Connection refused
              at java.net.PlainSocketImpl.socketConnect(Native Method)
              at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
              at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
              at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
              at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
              at java.net.Socket.connect(Socket.java:529)
              at java.net.Socket.connect(Socket.java:478)
              at java.net.Socket.<init>(Socket.java:375)
              at java.net.Socket.<init>(Socket.java:218)
              at com.sun.jini.tool.ClassServer.main(ClassServer.java:840)
    I changed the zookeeper properties in the build.properties as well as the javacc one to point to my installs. But I am not sure what piece of software is missing for the JUnit tests to run.

    I then thought to just compile the code as is and use it in my project. At which point I get a
    java.lang.NoSuchFieldError: SUBSTRING_AFTER
            at com.bigdata.rdf.sparql.ast.FunctionRegistry.<clinit>(FunctionRegistry.java:159)
            at com.bigdata.rdf.sparql.ast.eval.AST2BOpUtility.toVE(AST2BOpUtility.java:3610)
            at com.bigdata.rdf.sparql.ast.optimizers.ASTSetValueExpressionsOptimizer.convert2(ASTSetValueExpressionsOptimizer.java:178)

    Which means something is going haywire in the compilation step. My first thought was that I needed to rerun the ant javacc task but that has not helped either.

    I deliberately did not paste in my code, even though it is a straight derivative of your code with no originality of its own, because of the requirement for a signed contributor license agreement, which I have neither seen nor signed yet. This is just to keep legalities out of it for now.

    Regards,
    Jerven

     
  • Bryan Thompson
    2012-03-07

    Jerven,

    SUBSTRING_AFTER is not implemented yet.  This is one of three functions which were added in the Jan 2012 version of the SPARQL 1.1 last call working draft.  The others are substring-before and replace.  There is an open ticket for this.  However, that does not explain why you are seeing the compilation error.  Perhaps you have updated your code base, but not done an "ant clean"?

    To run the unit tests which use the delegation pattern you need to specify

        -DtestClass=XXX

    where XXX is the name of the delegate class.  You are probably extending QuadsTestCase.  If you look at that class it has the following in the javadoc:

        -DtestClass=com.bigdata.rdf.sail.TestBigdataSailWithQuads

    Let me know if you are extending a different test class.

    In general, the delegate is responsible for setting up the backing database and triple store instance, including specifying any configuration options which will be applied for the test suite.
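
As a rough illustration of the delegation pattern described above, the sketch below reads a `testClass` system property and instantiates the named class reflectively, failing in the same way as the Eclipse error when the property is missing. `DelegateLookupSketch` and `newDelegate` are hypothetical names and this is not the actual `ProxyTestCase` code; `java.util.ArrayList` is used only as a placeholder delegate.

```java
// Hypothetical sketch of a proxy test locating its delegate; NOT the ProxyTestCase source.
final class DelegateLookupSketch {
    // Name of the system property, as passed with -DtestClass=... on the JVM command line.
    static final String PROPERTY = "testClass";

    // Instantiates the class named by the system property via its no-arg constructor.
    static Object newDelegate() throws ReflectiveOperationException {
        String className = System.getProperty(PROPERTY);
        if (className == null) {
            throw new IllegalStateException(
                    PROPERTY + ": property not defined, could not configure delegate.");
        }
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder delegate; a real run would name a test delegate class instead.
        System.setProperty(PROPERTY, "java.util.ArrayList");
        System.out.println(newDelegate().getClass().getName());
    }
}
```

In Eclipse, a `-D` option of this kind would normally go in the VM arguments of the JUnit run configuration rather than in the program arguments.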

    The CLA is based on the Apache CLA.  We do require an explicit CLA for both the individual and their employer.  I am happy to send you a copy if you want to see it.

    Thanks,
    Bryan

    https://sourceforge.net/apps/trac/bigdata/ticket/499

     
  • Hi Bryan,

    I would not mind seeing the CLA. No guarantees I can get it signed though. My previous employer would not sign the Apache one. I think the current one is more reasonable, but still no guarantees.

    However, I am now completely confused: there already exists a class StrcontainsBOp (also in com.bigdata.rdf.internal.constraints). However, I can't find any reference to it in the code.

    Anyway, I based my test code on com.bigdata.rdf.internal.constraints.TestSubstrBOp, so my test case also extends ProxyTestCase. But I am unsure which test case to put in the -DtestClass to just call this one.

    I finally figured out why I was getting the no such field error. The project code was still using Sesame 2.5.0. I had not noticed that the bigdata 1.1.0 branch was using 2.6.3 instead. That fixed that issue, and the code now runs and seems to give the correct result.

    Regards,
    Jerven

     
  • Bryan Thompson
    2012-03-07

    It looks like that class was added very early on by one developer but never integrated into the FunctionRegistry.  Have you tried it to see if it has the right semantics?  If so, I would be happy to add it to the FunctionRegistry.

    Thanks,
    Bryan

     
  • Hi Bryan,

    Yes, the semantics are OK. It's just a string contains call, just like my implementation.
    You will just need to update the generics in the StrcontainsBOp constructor from <IV> to <? extends IV>.
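
For readers wondering why the wildcard matters, here is a minimal sketch using made-up types. `SketchValue`, `SketchLiteral` and `ContainsSketch` are hypothetical and are not the Bigdata `IV` or `StrcontainsBOp` classes; the point is only that Java generics are invariant, so a parameter typed with the exact type argument rejects arguments parameterized with a subtype, while `? extends` accepts them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; NOT the Bigdata IV or BOp classes.
interface SketchValue {
    String label();
}

final class SketchLiteral implements SketchValue {
    private final String label;

    SketchLiteral(String label) {
        this.label = label;
    }

    public String label() {
        return label;
    }
}

final class ContainsSketch {
    private final List<? extends SketchValue> args;

    // If this parameter were List<SketchValue>, passing a List<SketchLiteral>
    // would not compile, because List<SketchLiteral> is not a List<SketchValue>.
    ContainsSketch(List<? extends SketchValue> args) {
        this.args = args;
    }

    // True when the first argument's label contains the second argument's label.
    boolean eval() {
        return args.get(0).label().contains(args.get(1).label());
    }

    public static void main(String[] unused) {
        List<SketchLiteral> literals = new ArrayList<>();
        literals.add(new SketchLiteral("foobar"));
        literals.add(new SketchLiteral("oba"));
        System.out.println(new ContainsSketch(literals).eval());
    }
}
```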

    Regards,
    Jerven

     
  • It's actually quite a bit faster than my implementation. Mine gave quite a few log messages like this:
    2012-03-07 16:12:20 +0000 com.bigdata.relation.accesspath.BlockingBuffer
      WARN: Iterator is not progressing: ntries=267, elapsed=2002ms : BlockingIterator{ open=true, futureIsDone=false, bufferIsOpen=true, nextE=false}
    This was not the case for the one already in the code base. It might be worth looking into implementing the REGEX one using the same logic.

    Regards,
    Jerven

     
  • Bryan Thompson
    2012-03-07

    I've committed the change to FunctionRegistry and a cleanup of StrcontainsBOp.  Please update and try it out.  Let me know if you have any problems.

    I'll mention this thread to Mike.  He handles most of the value expression bops.  Maybe there is some gain to be had in REGEX.

    Thanks,
    Bryan

     
  • Working well. The functional tests on my server are now passing with BigData svnRevision="6079" as its backend.

     
  • Bryan Thompson
    2012-03-07

    Excellent.

    We will be pushing through support for BINDINGS and SPARQL 1.1 Federated Query quite shortly and then doing SPARQL 1.1 UPDATE.  We expect to have another release out near the end of the month with those features.

    Thanks,
    Bryan