Thread: [dotNetRDF-bugs] Scope of auto-assigned blank node IDs

Brought to you by: kal_ahmed, rvesse

From: Tomek P. <to...@pl...> - 2013-01-10 19:15:24

Hi Rob

I have just updated to latest dotNetRDF available on NuGet and I'm
experiencing two issues.

1. In my unit tests I relied on the way the library assigns blank node
identifiers: autos1, autos2 and so on. When I run the tests separately
each one passes, but when I batch them they fail because in subsequent
tests blank nodes are named autos2, autos3, etc. However, they don't
share the same graph or triple store. Have you changed this behavior
deliberately?

2. There is a bad memory leak during SPARQL execution of this:

PREFIX rr: <http://www.w3.org/ns/r2rml#>
DELETE { ?map rr:graph ?value . }
INSERT { ?map rr:graphMap [ rr:constant ?value ] . }
WHERE { ?map rr:graph ?value } ;

DELETE { ?map rr:object ?value . }
INSERT { ?map rr:objectMap [ rr:constant ?value ] . }
WHERE { ?map rr:object ?value } ;

DELETE { ?map rr:predicate ?value . }
INSERT { ?map rr:predicateMap [ rr:constant ?value ] . }
WHERE { ?map rr:predicate ?value } ;

DELETE { ?map rr:subject ?value . }
INSERT { ?map rr:subjectMap [ rr:constant ?value ] . }
WHERE { ?map rr:subject ?value }

The full code is simply:

var dataset = new InMemoryDataset(store, R2RMLMappings.BaseUri);
ISparqlUpdateProcessor processor = new LeviathanUpdateProcessor(dataset);
var updateParser = new SparqlUpdateParser();

processor.ProcessCommandSet(updateParser.ParseFromString(ShortcutSubmapsReplaceSparql));

Is this a known problem that has already been fixed, or should I
investigate more closely?

Thanks,
Tom

Re: [dotNetRDF-bugs] Scope of auto-assigned blank node IDs

From: Rob V. <rv...@do...> - 2013-01-14 11:34:05

Comments inline:

On 1/10/13 7:14 PM, "Tomek Pluskiewicz" <to...@pl...> wrote:

>Hi Rob
>
>I have just updated to latest dotNetRDF available on NuGet and I'm
>experiencing two issues.
>
>1. In my unit tests I relied on the way the library assigns blank node
>identifiers: autos1, autos2 and so on. When I run the tests separately
>each one passes, but when I batch them they fail because in subsequent
>tests blank nodes are named autos2, autos3, etc. However, they don't
>share the same graph or triple store. Have you changed this behavior
>deliberately?

Yes, this behavior changed in the 0.8.x releases. The change was made in
order to resolve a bug in SPARQL 1.1 Update support, and it also uncovered
a bug in the graph isomorphism calculation, which was fixed.

You shouldn't rely on an internal implementation detail like how the
library assigns blank node identifiers.  Blank nodes should always be
identifiable by the triples they appear in so it should be possible to
formulate API calls or SPARQL queries that validate that you have produced
the data you expected.
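
A minimal sketch of such a check, assuming plain string triples stand in
for the library's node and triple types. The Triple record, the "_:" label
convention and HasConstantObjectMap are hypothetical names for
illustration, not dotNetRDF API. The blank node is found through the
triples it appears in, and its label is never compared:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BlankNodeChecks
{
    // Stand-in triple type (assumption); blank nodes are written "_:label".
    public record Triple(string S, string P, string O);

    private const string Rr = "http://www.w3.org/ns/r2rml#";

    private static bool IsBlank(string node) =>
        node.StartsWith("_:", StringComparison.Ordinal);

    // True if 'map' points via rr:objectMap at some blank node that in turn
    // carries rr:constant 'expected'. Works whatever ID the node was given.
    public static bool HasConstantObjectMap(
        IEnumerable<Triple> graph, string map, string expected)
    {
        var triples = graph.ToList();
        return triples
            .Where(t => t.S == map && t.P == Rr + "objectMap" && IsBlank(t.O))
            .Any(link => triples.Any(t =>
                t.S == link.O && t.P == Rr + "constant" && t.O == expected));
    }
}
```

A test written this way gives the same result whether the node happens to
be labelled autos1 or autos7, so it does not depend on test ordering.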

>
>2. There is a bad memory leak during SPARQL execution of this:

Define bad memory leak?

Updates are transactional, so it may be a side effect of the library
maintaining the state necessary to roll back the transaction should it
fail or be aborted.  Also, because you are replacing constant nodes with
blank nodes, a lot of new identifiers are assigned, and those identifiers
have to be tracked to prevent collisions.

>
>PREFIX rr: <http://www.w3.org/ns/r2rml#>
>DELETE { ?map rr:graph ?value . }
>INSERT { ?map rr:graphMap [ rr:constant ?value ] . }
>WHERE { ?map rr:graph ?value } ;
>
>DELETE { ?map rr:object ?value . }
>INSERT { ?map rr:objectMap [ rr:constant ?value ] . }
>WHERE { ?map rr:object ?value } ;
>
>DELETE { ?map rr:predicate ?value . }
>INSERT { ?map rr:predicateMap [ rr:constant ?value ] . }
>WHERE { ?map rr:predicate ?value } ;
>
>DELETE { ?map rr:subject ?value . }
>INSERT { ?map rr:subjectMap [ rr:constant ?value ] . }
>WHERE { ?map rr:subject ?value }
>
>The full code is simply:
>
>var dataset = new InMemoryDataset(store, R2RMLMappings.BaseUri);
>ISparqlUpdateProcessor processor = new LeviathanUpdateProcessor(dataset);
>var updateParser = new SparqlUpdateParser();
>
>processor.ProcessCommandSet(updateParser.ParseFromString(ShortcutSubmapsReplaceSparql));
>
>Is this a known problem that has already been fixed, or should I
>investigate more closely?

This is not a known issue. I would also guess that the data being used
has some bearing on the severity of the problem.  Please go ahead and
investigate, but I suspect the two things I outlined above are the
culprits here.

Rob

Re: [dotNetRDF-bugs] Scope of auto-assigned blank node IDs

From: Tomasz P. <tom...@gm...> - 2013-04-07 17:37:07

Hi Rob

I finally got back to R2RML to analyze why I am getting that memory
leak. It seems connected to the changes you had to introduce for
SPARQL 1.1.

I have determined that it happens in the GraphMatcher#GenerateMappings
method. The graphs are equal and I'm not sure what causes the problem.
As soon as TryBruteForceMapping is reached, memory consumption explodes
to gigabytes within minutes.

The low-level problem is the mappings variable in GenerateMappings,
which within a few iterations contains thousands of elements.

This problem no longer occurs on trunk. Have you introduced any fixes in
that area?

Tom

Re: [dotNetRDF-bugs] Scope of auto-assigned blank node IDs

From: Tomasz P. <tom...@gm...> - 2013-04-07 18:21:09

Hm, I was wrong actually.

I tried comparing the exact same graphs loaded from Turtle in the
dotNetRDF test project, but I got the unit test wrong.

I have added the CORE-345 bug and committed a failing test case [1].
Could you please have a look at this?

Thanks,
Tom

[1]: https://bitbucket.org/dotnetrdf/dotnetrdf/commits/branch/CORE-345

Re: [dotNetRDF-bugs] Scope of auto-assigned blank node IDs

From: Rob V. <rv...@do...> - 2013-04-09 17:35:31

Hey Tom

The problem is that graph isomorphism has no known polynomial-time
algorithm, so sometimes the only option we have is to attempt to brute
force the problem.

I've started adding some Debug.WriteLine() calls to GraphMatcher to track
down where things go wrong.

Your graphs may look trivially equal, but to the code they are not. The
reason this worked prior to 0.8.0 is that one of the things we try is a
trivial mapping (assume blank nodes have the same IDs in both graphs), so
in previous releases you would likely have hit this case and been fine.
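
As a rough sketch of what a trivial mapping amounts to (not the actual
GraphMatcher code; the Triple record and the Matches name are assumptions
for illustration): if every blank node is mapped to the node with the same
label, the two graphs are equal exactly when their triple sets are equal.

```csharp
using System;
using System.Collections.Generic;

public static class TrivialMapping
{
    // Stand-in triple type (assumption); blank nodes are written "_:label".
    public record Triple(string S, string P, string O);

    // Labels are compared as-is, so this succeeds only when both graphs
    // happen to use identical blank node IDs.
    public static bool Matches(IEnumerable<Triple> a, IEnumerable<Triple> b) =>
        new HashSet<Triple>(a).SetEquals(b);
}
```

Two graphs that differ only in labels (say _:autos1 in one and _:autos2 in
the other) fail this check even though they are isomorphic, which is when
the more expensive strategies have to take over.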

You have 33 blank nodes in the graph of which only 6 are uniquely
identifiable and mappable.  The matcher generates a candidate mapping for
the whole graph but its best effort is incorrect, so then it falls back to
brute force.  I need to dig further into whether the candidate mapping
could be improved but this is not trivial to debug and will take some time
to resolve.
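
To give a feel for the numbers: 33 - 6 leaves 27 blank nodes without a
unique match. Assuming none of them could be narrowed down any further
(an upper bound only; the real matcher may well prune candidates), the
number of one-to-one assignments to try is 27!. The MappingCount helper
below is a hypothetical name used just for this calculation.

```csharp
using System;
using System.Numerics;

public static class MappingCount
{
    // Number of one-to-one assignments of n nodes onto n nodes: n!
    public static BigInteger Factorial(int n)
    {
        BigInteger result = BigInteger.One;
        for (int i = 2; i <= n; i++)
        {
            result *= i;
        }
        return result;
    }
}
```

MappingCount.Factorial(33 - 6) is roughly 1.09e28, far more mappings than
could ever be held in memory at once.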

We may be able to reduce the "memory leak" by using yield rather than
pre-generating all possible mappings, but this is a tricky refactor. It's
been a long time since I wrote the code originally, and I remember that
doing the mapping in the yield form proved thorny at the time, so I chose
not to.  The code for generating the mappings has some slightly strange
things in it, so I really need to spend a block of time refreshing myself
on the logic there to check that it is sound before I attempt to refactor.
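
The general shape of the yield approach can be sketched as follows. This
is an illustration under simplifying assumptions, not the GraphMatcher
implementation: LazyMappings and Generate are hypothetical names, blank
nodes are plain string labels, and every source node may map to every
target node. Each candidate mapping is produced only when the caller asks
for it, so memory holds one mapping at a time rather than the whole list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyMappings
{
    // Lazily yields every one-to-one mapping from source labels to target
    // labels (assumes labels within each list are distinct).
    public static IEnumerable<Dictionary<string, string>> Generate(
        IReadOnlyList<string> source, IReadOnlyList<string> target)
    {
        if (source.Count != target.Count) yield break;

        var used = new bool[target.Count];
        var current = new string[source.Count];
        foreach (var mapping in Extend(0)) yield return mapping;

        // Assigns a target to source[i], then recurses for the rest.
        IEnumerable<Dictionary<string, string>> Extend(int i)
        {
            if (i == source.Count)
            {
                // Copy, so a caller may keep the mapping after moving on.
                yield return source
                    .Select((s, k) => (s, t: current[k]))
                    .ToDictionary(p => p.s, p => p.t);
                yield break;
            }
            for (int j = 0; j < target.Count; j++)
            {
                if (used[j]) continue;
                used[j] = true;
                current[i] = target[j];
                foreach (var mapping in Extend(i + 1)) yield return mapping;
                used[j] = false;
            }
        }
    }
}
```

A caller can stop at the first mapping that works, for example with
FirstOrDefault and a predicate that tests the mapping, and nothing beyond
that point is ever generated. The running time is still factorial in the
worst case; only the memory footprint changes.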

Rob

Re: [dotNetRDF-bugs] Scope of auto-assigned blank node IDs

From: Rob V. <rv...@do...> - 2013-04-12 17:24:42

Hey Tom

So the logic for generating the brute force mappings was completely broken
causing it to get stuck in a memory sucking spin cycle :(

I rewrote the GenerateMappings() method from scratch to use yield return
and the test will not complete within the timeout but it fails so I still
need to dig further

We may still be generating incorrect possible mappings or the logic for
brute force may be flawed elsewhere

Rob

Re: [dotNetRDF-bugs] Scope of auto-assigned blank node IDs

From: Rob V. <rv...@do...> - 2013-04-12 17:26:47

s/not/now

That should be "the test will now complete within the timeout"

Rob

Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

From: Rob V. <rv...@do...> - 2013-04-12 18:21:08

Hey Tom

This should now be fixed for your test case, though I am not 100% convinced
that brute forcing is not still broken.

What I have done to fix this is to add an intermediate step, between the
rules-based mapping and the brute-force mapping, which takes a divide and
conquer approach.

What this does is break the unmapped blank node portions of the graph into
their constituent isolated sub-graphs (those that share no blank nodes) and
then recursively call Equals() on the candidate matches for the
sub-graphs.  This approach reduces the amount of work required and the
likelihood of needing to brute force at all, though we still fall back to
brute force in the worst case.
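
For illustration only, here is a minimal sketch of the splitting step
described above. It is not the GraphMatcher source: Triple, SubGraphSplitter
and Program are hypothetical stand-ins, terms are plain strings, and a blank
node is assumed to be any term starting with "_:". It only shows how triples
can be grouped into isolated sub-graphs that share no blank nodes; the
recursive Equals() comparison of candidate sub-graphs is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an RDF triple (not a dotNetRDF type).
public record Triple(string Subject, string Predicate, string Object);

public static class SubGraphSplitter
{
    // Assumption: blank nodes are written as "_:label".
    private static bool IsBlank(string term) =>
        term.StartsWith("_:", StringComparison.Ordinal);

    // Groups triples into isolated sub-graphs. Two triples end up in the
    // same group when they are linked, directly or transitively, by a
    // shared blank node. Triples with no blank node are skipped.
    public static List<List<Triple>> Split(IEnumerable<Triple> triples)
    {
        // Union-find over blank node labels.
        var parent = new Dictionary<string, string>();

        string Find(string x)
        {
            while (parent[x] != x)
            {
                parent[x] = parent[parent[x]]; // path halving
                x = parent[x];
            }
            return x;
        }

        var tagged = new List<(Triple Triple, string Anchor)>();
        foreach (var t in triples)
        {
            // Predicates cannot be blank nodes, so only check both ends.
            var blanks = new[] { t.Subject, t.Object }.Where(IsBlank).ToList();
            if (blanks.Count == 0) continue;

            foreach (var b in blanks)
                if (!parent.ContainsKey(b)) parent[b] = b;

            // Merge the sets of every blank node that appears in this triple.
            for (int i = 1; i < blanks.Count; i++)
                parent[Find(blanks[i])] = Find(blanks[0]);

            tagged.Add((t, blanks[0]));
        }

        // Triples whose blank nodes share a root form one sub-graph.
        return tagged
            .GroupBy(p => Find(p.Anchor))
            .Select(g => g.Select(p => p.Triple).ToList())
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var triples = new[]
        {
            new Triple("_:a", "ex:p", "_:b"),
            new Triple("_:b", "ex:q", "ex:o"),
            new Triple("_:c", "ex:p", "ex:o"),
            new Triple("ex:s", "ex:p", "ex:o"), // no blank node, skipped
        };

        var parts = SubGraphSplitter.Split(triples);
        Console.WriteLine(parts.Count); // _:a/_:b form one group, _:c another
    }
}
```

Each resulting group is small enough to be compared on its own, which is
why splitting first can avoid enumerating mappings over all unmapped blank
nodes at once.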

If you can come up with any more graphs that break GraphMatcher those
would be much appreciated

Rob

Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

From: Tomasz P. <tom...@gm...> - 2013-04-12 18:24:12

Hi Rob

Thanks so much. And yes, I do have 4 or 5 cases which stumble on this
same issue. I will add all these to the test fixture.

Tom

Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

From: Rob V. <rv...@do...> - 2013-04-12 18:34:21

Those would be useful

Btw, I closed the issue branch, so please just add the tests to default.

Rob

On 4/12/13 11:23 AM, "Tomasz Pluskiewicz" <tom...@gm...> wrote:

>Hi Rob
>
>Thanks so much. And yes, I do have 4 or 5 cases which stumble on this
>same issue. I will add all these to the test fixture.
>
>Tom
>
>On Fri, Apr 12, 2013 at 8:20 PM, Rob Vesse <rv...@do...> wrote:
>> Hey Tom
>>
>> This should now be fixed for your test case, though I am not 100%
>> convinced that brute forcing is not still broken.
>>
>> What I have done to fix this is to add an intermediate step between the
>> rules-based and brute-force mapping which takes a divide-and-conquer
>> approach.
>>
>> What this does is break the unmapped blank node portions of the graph
>> into their constituent isolated sub-graphs (those that share no blank
>> nodes) and then recursively call Equals() on the candidate matches for
>> the sub-graphs.  This approach reduces the amount of work required and
>> the likelihood of needing to brute force at all, though we still fall
>> back to it in the worst case.
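
A minimal sketch of the splitting step described above, not the actual
GraphMatcher code. It assumes triples are plain string tuples and that blank
node labels start with "_:"; the names Triple, BlankNodeSubgraphs and Split
are hypothetical and exist only for this illustration. Triples that are
linked, directly or transitively, by a shared blank node end up in the same
group, and each group could then be compared on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a triple; blank node labels start with "_:".
public sealed record Triple(string Subject, string Predicate, string Object);

public static class BlankNodeSubgraphs
{
    private static bool IsBlank(string term) =>
        term.StartsWith("_:", StringComparison.Ordinal);

    // Groups triples into isolated sub-graphs. Two triples share a group when
    // they are connected through one or more shared blank nodes. Triples
    // without any blank node are skipped because they need no mapping.
    public static List<List<Triple>> Split(IEnumerable<Triple> triples)
    {
        // Union-find forest over blank node labels.
        var parent = new Dictionary<string, string>();

        string Find(string x)
        {
            while (parent[x] != x)
            {
                parent[x] = parent[parent[x]]; // path halving
                x = parent[x];
            }
            return x;
        }

        var withBlanks = new List<Triple>();
        foreach (var t in triples)
        {
            var blanks = new[] { t.Subject, t.Object }.Where(IsBlank).ToList();
            if (blanks.Count == 0) continue;

            withBlanks.Add(t);
            foreach (var b in blanks)
                if (!parent.ContainsKey(b)) parent[b] = b;

            // Subject and object are both blank: merge their components.
            if (blanks.Count == 2)
                parent[Find(blanks[0])] = Find(blanks[1]);
        }

        // Every triple belongs to the component of any blank node it mentions.
        return withBlanks
            .GroupBy(t => Find(IsBlank(t.Subject) ? t.Subject : t.Object))
            .Select(g => g.ToList())
            .ToList();
    }
}
```

Splitting first means that a brute-force search, if still needed, only has
to permute the blank nodes inside one small group rather than all unmapped
blank nodes at once.
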
>>
>> If you can come up with any more graphs that break GraphMatcher those
>> would be much appreciated
>>
>> Rob
>>
>> On 4/12/13 10:25 AM, "Rob Vesse" <rv...@do...> wrote:
>>
>>>s/not/now
>>>
>>>That should be "the test will now complete within the timeout"
>>>
>>>Rob
>>>
>>>On 4/12/13 10:23 AM, "Rob Vesse" <rv...@do...> wrote:
>>>
>>>>Hey Tom
>>>>
>>>>So the logic for generating the brute force mappings was completely
>>>>broken, causing it to get stuck in a memory-sucking spin cycle :(
>>>>
>>>>I rewrote the GenerateMappings() method from scratch to use yield
>>>>return and the test will not complete within the timeout but it fails,
>>>>so I still need to dig further.
>>>>
>>>>We may still be generating incorrect possible mappings or the logic for
>>>>brute force may be flawed elsewhere.
>>>>
>>>>Rob
>>>>
>>>>On 4/9/13 10:34 AM, "Rob Vesse" <rv...@do...> wrote:
>>>>
>>>>>Hey Tom
>>>>>
>>>>>The problem is that graph isomorphism is hard in general (no
>>>>>polynomial-time algorithm is known), so sometimes the only option we
>>>>>have is to attempt to brute force the problem.
>>>>>
>>>>>I've started adding some Debug.WriteLine() calls to GraphMatcher to
>>>>>track down where things go wrong.
>>>>>
>>>>>Your graphs may look trivially equal, but to the code they are not.
>>>>>The reason this worked prior to 0.8.0 is that one of the things we try
>>>>>is a trivial mapping (assume blank nodes have the same IDs in both
>>>>>graphs), so in previous releases you would likely have hit this case
>>>>>and been fine.
>>>>>
>>>>>You have 33 blank nodes in the graph, of which only 6 are uniquely
>>>>>identifiable and mappable.  The matcher generates a candidate mapping
>>>>>for the whole graph but its best effort is incorrect, so it then falls
>>>>>back to brute force.  I need to dig further into whether the candidate
>>>>>mapping could be improved, but this is not trivial to debug and will
>>>>>take some time to resolve.
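
To give a feel for why falling back to brute force is so expensive here, the
following is a rough worst-case count only. It assumes that the 33 - 6 = 27
remaining blank nodes are all interchangeable, which the real matcher
presumably narrows down considerably; BruteForceBound and Permutations are
hypothetical names for this illustration.

```csharp
using System.Numerics;

public static class BruteForceBound
{
    // Number of one-to-one mappings between two sets of n interchangeable
    // nodes, i.e. n factorial.
    public static BigInteger Permutations(int n)
    {
        BigInteger result = BigInteger.One;
        for (int i = 2; i <= n; i++) result *= i;
        return result;
    }
}
```

BruteForceBound.Permutations(27) evaluates to 10888869450418352160768000000,
roughly 1.09 x 10^28 candidate mappings in this assumed worst case.
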
>>>>>
>>>>>We may be able to reduce the "memory leak" by using yield rather than
>>>>>pre-generating all possible mappings, but this is a tricky refactor.
>>>>>It's been a long time since I wrote the code originally, and I remember
>>>>>that doing the mapping in the yield form proved thorny at the time, so
>>>>>I chose not to.  The code itself for generating the mappings has some
>>>>>slightly strange things in it, so I really need to spend a block of
>>>>>time refreshing myself on the logic there to check that it is sound
>>>>>before I attempt to refactor.
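
A minimal sketch of the "yield" idea mentioned above, assuming each unmapped
source blank node comes with a list of possible target blank nodes. This is
not the library's GenerateMappings() method; LazyMappings, Generate and the
candidates dictionary are hypothetical names. Mappings are produced one at a
time, so the caller can stop at the first one that works instead of holding
every possible mapping in a list.

```csharp
using System.Collections.Generic;

public static class LazyMappings
{
    // Lazily enumerates every one-to-one assignment of source blank nodes to
    // target blank nodes. Apart from the recursion state, only the mapping
    // currently under construction is kept in memory.
    public static IEnumerable<Dictionary<string, string>> Generate(
        IReadOnlyList<string> sources,
        IReadOnlyDictionary<string, IReadOnlyList<string>> candidates)
    {
        return Extend(0, new Dictionary<string, string>(), new HashSet<string>());

        IEnumerable<Dictionary<string, string>> Extend(
            int index, Dictionary<string, string> partial, HashSet<string> used)
        {
            if (index == sources.Count)
            {
                // Copy, so a caller may keep the result while we keep mutating.
                yield return new Dictionary<string, string>(partial);
                yield break;
            }

            string source = sources[index];
            foreach (string target in candidates[source])
            {
                if (!used.Add(target)) continue; // target already assigned

                partial[source] = target;
                foreach (var complete in Extend(index + 1, partial, used))
                    yield return complete;

                // Backtrack before trying the next candidate.
                partial.Remove(source);
                used.Remove(target);
            }
        }
    }
}
```

The number of mappings to try is unchanged by this; what changes is that
memory use stays roughly proportional to the number of blank nodes rather
than to the number of mappings generated so far.
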
>>>>>
>>>>>Rob
>>>>>
>>>>>On 4/7/13 11:20 AM, "Tomasz Pluskiewicz" <tom...@gm...> wrote:
>>>>>
>>>>>>Hm, I was wrong actually.
>>>>>>
>>>>>>I tried comparing the exact same graphs loaded from Turtle in
>>>>>>dotNetRDF test project but I got the unit test wrong.
>>>>>>
>>>>>>I have added the CORE-345 bug and committed a failing test case [1].
>>>>>>Could you please have a look at this?
>>>>>>
>>>>>>Thanks,
>>>>>>Tom
>>>>>>
>>>>>>[1]: https://bitbucket.org/dotnetrdf/dotnetrdf/commits/branch/CORE-345
>>>>>>
>>>>>>On Sun, Apr 7, 2013 at 7:36 PM, Tomasz Pluskiewicz <tom...@gm...> wrote:
>>>>>>> Hi Rob
>>>>>>>
>>>>>>> I finally got back to R2RML to analyze why I am getting that memory
>>>>>>> leak. It seems connected to the changes you had to introduce for
>>>>>>> SPARQL 1.1.
>>>>>>>
>>>>>>> I have determined that it happens in the
>>>>>>> GraphMatcher#GenerateMappings method. The graphs are equal and I'm
>>>>>>> not sure what causes the problem. As soon as TryBruteForceMapping is
>>>>>>> reached, memory consumption explodes to gigabytes within minutes.
>>>>>>>
>>>>>>> The low-level problem is the mappings variable in GenerateMappings,
>>>>>>> which within a few iterations contains thousands of elements.
>>>>>>>
>>>>>>> This problem no longer occurs on trunk. Have you actually been
>>>>>>> introducing any fixes around that area?
>>>>>>>
>>>>>>> Tom
>>>>>>>
>>>>>>> On Mon, Jan 14, 2013 at 12:32 PM, Rob Vesse <rv...@do...> wrote:
>>>>>>>> Comments inline:
>>>>>>>>
>>>>>>>> On 1/10/13 7:14 PM, "Tomek Pluskiewicz" <to...@pl...> wrote:
>>>>>>>>
>>>>>>>>>Hi Rob
>>>>>>>>>
>>>>>>>>>I have just updated to latest dotNetRDF available on NuGet and I'm
>>>>>>>>>experiencing two issues.
>>>>>>>>>
>>>>>>>>>1. In my unit tests I relied on the way the library assigns blank
>>>>>>>>>node identifiers: autos1, autos2 and so on. When I run the tests
>>>>>>>>>separately each one passes, but when I batch them they fail because
>>>>>>>>>in subsequent tests blank nodes are named autos2, autos3, etc.
>>>>>>>>>However, they don't share the same graph or triple store. Have you
>>>>>>>>>changed this behavior deliberately?
>>>>>>>>
>>>>>>>> Yes, this behavior changed in the 0.8.x releases. The change was
>>>>>>>> made in order to resolve a bug in SPARQL 1.1 Update support and
>>>>>>>> also uncovered a bug in graph isomorphism calculation, which was
>>>>>>>> fixed.
>>>>>>>>
>>>>>>>> You shouldn't rely on an internal implementation detail like how
>>>>>>>> the library assigns blank node identifiers.  Blank nodes should
>>>>>>>> always be identifiable by the triples they appear in, so it should
>>>>>>>> be possible to formulate API calls or SPARQL queries that validate
>>>>>>>> that you have produced the data you expected.
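
A minimal sketch of a test check that follows this advice, assuming triples
are plain string tuples rather than dotNetRDF nodes; Triple,
BlankNodeAssertions and HasConstantSubjectMap are hypothetical names. The
check finds the blank node through the triples it takes part in (the
rr:subjectMap and rr:constant shape produced by the update in this thread)
and never looks at its autosN label.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a triple; blank node labels start with "_:".
public sealed record Triple(string Subject, string Predicate, string Object);

public static class BlankNodeAssertions
{
    private const string Rr = "http://www.w3.org/ns/r2rml#";

    // True when `map` has an rr:subjectMap pointing at a blank node that in
    // turn carries rr:constant `value`. The blank node's label is irrelevant.
    public static bool HasConstantSubjectMap(
        IEnumerable<Triple> graph, string map, string value)
    {
        var triples = graph.ToList();
        return triples
            .Where(t => t.Subject == map
                        && t.Predicate == Rr + "subjectMap"
                        && t.Object.StartsWith("_:", StringComparison.Ordinal))
            .Any(link => triples.Any(t => t.Subject == link.Object
                                          && t.Predicate == Rr + "constant"
                                          && t.Object == value));
    }
}
```

A test written this way gives the same result whether the library happens
to label the new node autos1 or autos7, so it does not depend on test
ordering.
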
>>>>>>>>
>>>>>>>>>
>>>>>>>>>2. There is a bad memory leak during SPARQL execution of this:
>>>>>>>>
>>>>>>>> Define bad memory leak?
>>>>>>>>
>>>>>>>> Updates are transactional, so it may be a side effect of the
>>>>>>>> library maintaining the state necessary to roll back the
>>>>>>>> transaction should it fail or be aborted.  Also, the fact that you
>>>>>>>> are replacing constant nodes with blank nodes will assign a lot of
>>>>>>>> new identifiers, and those identifiers have to be tracked to
>>>>>>>> prevent collisions.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>PREFIX rr: <http://www.w3.org/ns/r2rml#>
>>>>>>>>>DELETE { ?map rr:graph ?value . }
>>>>>>>>>INSERT { ?map rr:graphMap [ rr:constant ?value ] . }
>>>>>>>>>WHERE { ?map rr:graph ?value } ;
>>>>>>>>>
>>>>>>>>>DELETE { ?map rr:object ?value . }
>>>>>>>>>INSERT { ?map rr:objectMap [ rr:constant ?value ] . }
>>>>>>>>>WHERE { ?map rr:object ?value } ;
>>>>>>>>>
>>>>>>>>>DELETE { ?map rr:predicate ?value . }
>>>>>>>>>INSERT { ?map rr:predicateMap [ rr:constant ?value ] . }
>>>>>>>>>WHERE { ?map rr:predicate ?value } ;
>>>>>>>>>
>>>>>>>>>DELETE { ?map rr:subject ?value . }
>>>>>>>>>INSERT { ?map rr:subjectMap [ rr:constant ?value ] . }
>>>>>>>>>WHERE { ?map rr:subject ?value }
>>>>>>>>>
>>>>>>>>>The full code is simply:
>>>>>>>>>
>>>>>>>>>var dataset = new InMemoryDataset(store, R2RMLMappings.BaseUri);
>>>>>>>>>ISparqlUpdateProcessor processor = new LeviathanUpdateProcessor(dataset);
>>>>>>>>>var updateParser = new SparqlUpdateParser();
>>>>>>>>>
>>>>>>>>>processor.ProcessCommandSet(updateParser.ParseFromString(ShortcutSubmapsReplaceSparql));
>>>>>>>>>
>>>>>>>>>Is this a known problem that has already been fixed, or should I
>>>>>>>>>investigate more closely?
>>>>>>>>
>>>>>>>> This is not a known issue. I would also guess that the data being
>>>>>>>> used would have some bearing on the severity of the problem.  Please
>>>>>>>> go ahead and investigate, but I would suspect it is the two things I
>>>>>>>> outlined above which are the culprits here.
>>>>>>>>
>>>>>>>> Rob
>>>>>>>>
>>>>>>>>>
>>>>>>>>>Thanks,
>>>>>>>>>Tom
>>>>>>>>>

Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

From: Tomasz P. <tom...@gm...> - 2013-04-12 18:56:42

I've just committed more test cases. Out of the 6, none cause an OOM
anymore, which is marvellous.

However case1 reports false but I'm positive these graphs are actually equal.

Thanks,
Tom

On Fri, Apr 12, 2013 at 8:33 PM, Rob Vesse <rv...@do...> wrote:
> Those would be useful
>
> Btw I closed the issue branch so please just add the tests to default
>
> Rob

Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

From: Rob V. <rv...@do...> - 2013-04-12 18:59:03

Ok

Can you push the commits up so I can pull them down and take a look at the
new test cases?

Rob

On 4/12/13 11:55 AM, "Tomasz Pluskiewicz" <tom...@gm...> wrote:

>I've just committed more test cases. Out of the 6, none cause an OOM
>anymore, which is marvellous.
>
>However case1 reports false but I'm positive these graphs are actually
>equal.
>
>Thanks,
>Tom
>
>On Fri, Apr 12, 2013 at 8:33 PM, Rob Vesse <rv...@do...> wrote:
>> Those would be useful
>>
>> Btw I closed the issue branch so please just add the tests to default
>>
>> Rob
>>
>> On 4/12/13 11:23 AM, "Tomasz Pluskiewicz" <tom...@gm...>
>> wrote:
>>
>>>Hi Rob
>>>
>>>Thanks so much. And yes, I do have 4 or 5 cases which stumble on this
>>>same issue. I will add all these to the test fixture.
>>>
>>>Tom
>>>
>>>On Fri, Apr 12, 2013 at 8:20 PM, Rob Vesse <rv...@do...> wrote:
>>>> Hey Tom
>>>>
>>>> This should now be fixed for your test case, though I am not 100%
>>>> convinced that brute forcing is not still broken.
>>>>
>>>> What I have done to fix this is to add an intermediate step between
>>>> the rules-based and brute-force mapping which takes a divide-and-conquer
>>>> approach.
>>>>
>>>> This breaks the unmapped blank node portions of the graph into their
>>>> constituent isolated sub-graphs (those that share no blank nodes) and
>>>> then recursively calls Equals() on the candidate matches for the
>>>> sub-graphs.  This approach reduces the amount of work required and the
>>>> likelihood of needing to brute force at all, though we still fall back
>>>> to brute force in the worst case.
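
As a rough illustration of the splitting step described above (this is not the actual GraphMatcher code), the following C# sketch groups triples into isolated sub-graphs, where two triples end up together only if they are connected through shared blank nodes. SimpleTriple, BlankNodePartitioner and the "_:" prefix convention for blank nodes are assumptions made for this example; none of dotNetRDF's own types are used, and C# 9 or later is assumed for the record syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an RDF triple; terms starting with "_:" are blank nodes.
public record SimpleTriple(string Subject, string Predicate, string Object);

public static class BlankNodePartitioner
{
    private static bool IsBlank(string term) =>
        term.StartsWith("_:", StringComparison.Ordinal);

    // Returns one list per isolated sub-graph. Triples without blank nodes are
    // skipped, since they can be compared directly and need no mapping.
    public static List<List<SimpleTriple>> Split(IEnumerable<SimpleTriple> triples)
    {
        // Union-find over blank node labels.
        var parent = new Dictionary<string, string>();

        string Find(string x)
        {
            if (!parent.ContainsKey(x)) parent[x] = x;
            while (parent[x] != x)
            {
                parent[x] = parent[parent[x]]; // path halving
                x = parent[x];
            }
            return x;
        }

        void Union(string a, string b) => parent[Find(a)] = Find(b);

        var blankTriples = triples
            .Where(t => IsBlank(t.Subject) || IsBlank(t.Object))
            .ToList();

        // A triple with two blank nodes links their components together.
        foreach (var t in blankTriples)
        {
            if (IsBlank(t.Subject) && IsBlank(t.Object)) Union(t.Subject, t.Object);
        }

        // Group every triple under the representative of one of its blank nodes.
        return blankTriples
            .GroupBy(t => Find(IsBlank(t.Subject) ? t.Subject : t.Object))
            .Select(g => g.ToList())
            .ToList();
    }
}
```

Each returned group can then be matched on its own, so a brute-force fallback only has to permute the blank nodes inside one group rather than all unmapped blank nodes at once.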
>>>>
>>>> If you can come up with any more graphs that break GraphMatcher those
>>>> would be much appreciated
>>>>
>>>> Rob
>>>>
>>>> On 4/12/13 10:25 AM, "Rob Vesse" <rv...@do...> wrote:
>>>>
>>>>>s/not/now
>>>>>
>>>>>That should be "the test will now complete within the timeout"
>>>>>
>>>>>Rob
>>>>>
>>>>>On 4/12/13 10:23 AM, "Rob Vesse" <rv...@do...> wrote:
>>>>>
>>>>>>Hey Tom
>>>>>>
>>>>>>So the logic for generating the brute force mappings was completely
>>>>>>broken, causing it to get stuck in a memory-sucking spin cycle :(
>>>>>>
>>>>>>I rewrote the GenerateMappings() method from scratch to use yield
>>>>>>return and the test will not complete within the timeout but it fails,
>>>>>>so I still need to dig further.
>>>>>>
>>>>>>We may still be generating incorrect possible mappings, or the logic
>>>>>>for brute force may be flawed elsewhere.
>>>>>>
>>>>>>Rob
>>>>>>
>>>>>>On 4/9/13 10:34 AM, "Rob Vesse" <rv...@do...> wrote:
>>>>>>
>>>>>>>Hey Tom
>>>>>>>
>>>>>>>The problem is that graph isomorphism has no known polynomial-time
>>>>>>>algorithm, so sometimes the only option we have is to attempt to
>>>>>>>brute force the problem.
>>>>>>>
>>>>>>>I've started adding some Debug.WriteLine() calls to GraphMatcher to
>>>>>>>track down where things go wrong.
>>>>>>>
>>>>>>>Your graphs may look trivially equal, but to the code they are not.
>>>>>>>The reason this worked prior to 0.8.0 is that one of the things we
>>>>>>>try is a trivial mapping (assume blank nodes have the same IDs in both
>>>>>>>graphs), so in previous releases you would likely have hit this case
>>>>>>>and been fine.
>>>>>>>
>>>>>>>You have 33 blank nodes in the graph, of which only 6 are uniquely
>>>>>>>identifiable and mappable.  The matcher generates a candidate mapping
>>>>>>>for the whole graph but its best effort is incorrect, so it then falls
>>>>>>>back to brute force.  I need to dig further into whether the candidate
>>>>>>>mapping could be improved, but this is not trivial to debug and will
>>>>>>>take some time to resolve.
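
To give a feel for the numbers mentioned above: 33 blank nodes minus 6 mapped ones leaves 27 unmapped. If none of those 27 could be narrowed down any further, a naive brute force would have up to 27! one-to-one assignments to consider. That is a worst-case upper bound under that assumption only; the thread does not say how far GraphMatcher narrows its candidates. The small C# sketch below (MappingCount is a hypothetical helper name) computes the figure.

```csharp
using System;
using System.Numerics;

public static class MappingCount
{
    // n! is the number of one-to-one assignments between two sets of n nodes.
    public static BigInteger Factorial(int n)
    {
        BigInteger result = BigInteger.One;
        for (int i = 2; i <= n; i++) result *= i;
        return result;
    }
}
```

Calling MappingCount.Factorial(33 - 6) gives 10888869450418352160768000000, roughly 1.09e28, which is why holding all candidate mappings in memory at once cannot work for a graph like this.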
>>>>>>>
>>>>>>>We may be able to reduce the "memory leak" by using yield rather than
>>>>>>>pre-generating all possible mappings, but this is a tricky refactor.
>>>>>>>It's been a long time since I wrote the code originally, and I
>>>>>>>remember that doing the mapping in the yield form proved thorny at the
>>>>>>>time, so I chose not to.  The code itself for generating the mappings
>>>>>>>has some slightly strange things in it, so I really need to spend a
>>>>>>>block of time refreshing myself on the logic there to check that it is
>>>>>>>sound before I attempt to refactor.
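
The difference between pre-generating every mapping and yielding them one at a time can be sketched as follows. This is only an illustration of the yield technique under simplifying assumptions (every left node may map to every right node, and nodes are plain strings); LazyMappings and its method names are hypothetical and are not the dotNetRDF implementation.

```csharp
using System;
using System.Collections.Generic;

public static class LazyMappings
{
    // Lazily enumerates every one-to-one mapping from 'left' to 'right'.
    // Because this is an iterator, the argument check runs on first enumeration.
    public static IEnumerable<Dictionary<string, string>> Generate(
        IReadOnlyList<string> left, IReadOnlyList<string> right)
    {
        if (left.Count != right.Count)
            throw new ArgumentException("Both sides need the same number of nodes.");

        // Fresh working state per enumeration.
        var used = new bool[right.Count];
        var chosen = new string[left.Count];
        foreach (var mapping in Permute(left, right, 0, used, chosen))
            yield return mapping;
    }

    private static IEnumerable<Dictionary<string, string>> Permute(
        IReadOnlyList<string> left, IReadOnlyList<string> right,
        int index, bool[] used, string[] chosen)
    {
        if (index == left.Count)
        {
            // Copy the current assignment so the caller can keep it safely.
            var mapping = new Dictionary<string, string>();
            for (int i = 0; i < left.Count; i++) mapping[left[i]] = chosen[i];
            yield return mapping;
            yield break;
        }

        for (int j = 0; j < right.Count; j++)
        {
            if (used[j]) continue;
            used[j] = true;
            chosen[index] = right[j];
            foreach (var m in Permute(left, right, index + 1, used, chosen))
                yield return m;
            used[j] = false; // backtrack
        }
    }
}
```

A caller that stops at the first mapping that makes the graphs equal never materialises the rest, so memory stays proportional to the number of nodes rather than to the number of possible mappings. The running time in the worst case is unchanged.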
>>>>>>>
>>>>>>>Rob
>>>>>>>
>>>>>>>On 4/7/13 11:20 AM, "Tomasz Pluskiewicz" <tom...@gm...> wrote:
>>>>>>>
>>>>>>>>Hm, I was wrong actually.
>>>>>>>>
>>>>>>>>I tried comparing the exact same graphs loaded from Turtle in the
>>>>>>>>dotNetRDF test project, but I got the unit test wrong.
>>>>>>>>
>>>>>>>>I have added the CORE-345 bug and committed a failing test case [1].
>>>>>>>>Could you please have a look at this?
>>>>>>>>
>>>>>>>>Thanks,
>>>>>>>>Tom
>>>>>>>>
>>>>>>>>[1]: https://bitbucket.org/dotnetrdf/dotnetrdf/commits/branch/CORE-345
>>>>>>>>
>>>>>>>>On Sun, Apr 7, 2013 at 7:36 PM, Tomasz Pluskiewicz
>>>>>>>><tom...@gm...> wrote:
>>>>>>>>> Hi Rob
>>>>>>>>>
>>>>>>>>> I finally got back to R2RML to analyze why I am getting that memory
>>>>>>>>> leak. It seems connected to the changes you had to introduce for
>>>>>>>>> SPARQL 1.1.
>>>>>>>>>
>>>>>>>>> I have determined that it happens in the GraphMatcher#GenerateMappings
>>>>>>>>> method. The graphs are equal and I'm not sure what causes the problem.
>>>>>>>>> As soon as TryBruteForceMapping is reached, memory consumption
>>>>>>>>> explodes to gigabytes within minutes.
>>>>>>>>>
>>>>>>>>> The low-level problem is the mappings variable in GenerateMappings,
>>>>>>>>> which within a few iterations contains thousands of elements.
>>>>>>>>>
>>>>>>>>> This problem no longer occurs on trunk. Have you actually been
>>>>>>>>> introducing any fixes around that area?
>>>>>>>>>
>>>>>>>>> Tom
>>>>>>>>>
>>>>>>>>> On Mon, Jan 14, 2013 at 12:32 PM, Rob Vesse <rv...@do...> wrote:
>>>>>>>>>> Comments inline:
>>>>>>>>>>
>>>>>>>>>> On 1/10/13 7:14 PM, "Tomek Pluskiewicz" <to...@pl...> wrote:
>>>>>>>>>>
>>>>>>>>>>>Hi Rob
>>>>>>>>>>>
>>>>>>>>>>>I have just updated to the latest dotNetRDF available on NuGet and
>>>>>>>>>>>I'm experiencing two issues.
>>>>>>>>>>>
>>>>>>>>>>>1. In my unit tests I relied on the way the library assigns blank
>>>>>>>>>>>node identifiers: autos1, autos2 and so on. When I run the tests
>>>>>>>>>>>separately each one passes, but when I batch them they fail because
>>>>>>>>>>>in subsequent tests blank nodes are named autos2, autos3, etc.
>>>>>>>>>>>However, they don't share the same graph or triple store. Have you
>>>>>>>>>>>changed this behavior deliberately?
>>>>>>>>>>
>>>>>>>>>> Yes, this behavior changed in the 0.8.x releases. The change was
>>>>>>>>>> made in order to resolve a bug in SPARQL 1.1 Update support, and it
>>>>>>>>>> also uncovered a bug in the graph isomorphism calculation, which was
>>>>>>>>>> fixed.
>>>>>>>>>>
>>>>>>>>>> You shouldn't rely on an internal implementation detail like how the
>>>>>>>>>> library assigns blank node identifiers.  Blank nodes should always
>>>>>>>>>> be identifiable by the triples they appear in, so it should be
>>>>>>>>>> possible to formulate API calls or SPARQL queries that validate that
>>>>>>>>>> you have produced the data you expected.
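
One way to follow that advice in a unit test is to look a blank node up by the triples around it instead of by its label. The sketch below assumes the dotNetRDF IGraph members GetTriplesWithSubjectPredicate and CreateUriNode, the static UriFactory.Create helper and the NodeType.Blank enum value; exact signatures can differ between library versions, so treat it as a starting point. BlankNodeAssertions and HasConstantObjectMap are hypothetical names, and the rr:objectMap / rr:constant pattern is taken from the update quoted further down.

```csharp
using System;
using System.Linq;
using VDS.RDF;

public static class BlankNodeAssertions
{
    private const string Rr = "http://www.w3.org/ns/r2rml#";

    // True when 'subject' has an rr:objectMap whose value is a blank node that
    // carries rr:constant 'expected', whatever label that blank node received.
    public static bool HasConstantObjectMap(IGraph g, INode subject, INode expected)
    {
        INode objectMap = g.CreateUriNode(UriFactory.Create(Rr + "objectMap"));
        INode constant = g.CreateUriNode(UriFactory.Create(Rr + "constant"));

        return g.GetTriplesWithSubjectPredicate(subject, objectMap)
                .Select(t => t.Object)
                .Where(o => o.NodeType == NodeType.Blank)
                .Any(b => g.GetTriplesWithSubjectPredicate(b, constant)
                           .Any(t => t.Object.Equals(expected)));
    }
}
```

A test written this way passes regardless of whether the generated node happens to be called autos1 or autos7, so it does not depend on test execution order.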
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>2. There is a bad memory leak during SPARQL execution of this:
>>>>>>>>>>
>>>>>>>>>> Define bad memory leak?
>>>>>>>>>>
>>>>>>>>>> Updates are transactional, so it may be a side effect of the library
>>>>>>>>>> maintaining the state necessary to roll back the transaction should
>>>>>>>>>> it fail or be aborted.  Also, the fact that you are replacing
>>>>>>>>>> constant nodes with blank nodes will assign a lot of new identifiers,
>>>>>>>>>> and those identifiers have to be tracked to prevent collisions.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>PREFIX rr: <http://www.w3.org/ns/r2rml#>
>>>>>>>>>>>DELETE { ?map rr:graph ?value . }
>>>>>>>>>>>INSERT { ?map rr:graphMap [ rr:constant ?value ] . }
>>>>>>>>>>>WHERE { ?map rr:graph ?value } ;
>>>>>>>>>>>
>>>>>>>>>>>DELETE { ?map rr:object ?value . }
>>>>>>>>>>>INSERT { ?map rr:objectMap [ rr:constant ?value ] . }
>>>>>>>>>>>WHERE { ?map rr:object ?value } ;
>>>>>>>>>>>
>>>>>>>>>>>DELETE { ?map rr:predicate ?value . }
>>>>>>>>>>>INSERT { ?map rr:predicateMap [ rr:constant ?value ] . }
>>>>>>>>>>>WHERE { ?map rr:predicate ?value } ;
>>>>>>>>>>>
>>>>>>>>>>>DELETE { ?map rr:subject ?value . }
>>>>>>>>>>>INSERT { ?map rr:subjectMap [ rr:constant ?value ] . }
>>>>>>>>>>>WHERE { ?map rr:subject ?value }
>>>>>>>>>>>
>>>>>>>>>>>The full code is simply:
>>>>>>>>>>>
>>>>>>>>>>>var dataset = new InMemoryDataset(store, R2RMLMappings.BaseUri);
>>>>>>>>>>>ISparqlUpdateProcessor processor = new LeviathanUpdateProcessor(dataset);
>>>>>>>>>>>var updateParser = new SparqlUpdateParser();
>>>>>>>>>>>
>>>>>>>>>>>processor.ProcessCommandSet(updateParser.ParseFromString(ShortcutSubmapsReplaceSparql));
>>>>>>>>>>>
>>>>>>>>>>>Is this a known problem that has already been fixed, or should I
>>>>>>>>>>>investigate more closely?
>>>>>>>>>>
>>>>>>>>>> This is not a known issue. I would also guess that the data being
>>>>>>>>>> used has some bearing on the severity of the problem.  Please go
>>>>>>>>>> ahead and investigate, but I suspect the two things I outlined
>>>>>>>>>> above are the culprits here.
>>>>>>>>>>
>>>>>>>>>> Rob
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Thanks,
>>>>>>>>>>>Tom
>>>>>>>>>>>

Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

From: Tomek P. <to...@pl...> - 2013-04-12 19:05:47

I did, with a little delay. Please check now.

Tom
On Apr 12, 2013 8:59 PM, "Rob Vesse" <rv...@do...> wrote:

> Ok
>
> Can you push the commits up so I can pull them down and take a look at the
> new test cases
>
> Rob
> http://www2.precog.com/precogplatform/slashdotnewsletter
> _______________________________________________
> dotNetRDF-bugs mailing list
> dot...@li...
> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs
>

Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

From: Rob V. <rv...@do...> - 2013-04-12 19:38:58

Yes, I realized that when I tried a pull again after sending the reply.

OK, so that case is bombing out on brute force mapping, which would tend to
indicate that there may still be an issue there.
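
To illustrate what brute force mapping means here, a minimal C# sketch
follows. It is not the GraphMatcher code; the class name BruteForceSketch,
the method signature and the use of plain strings for blank node IDs are all
assumptions made for the example. It lazily enumerates every one-to-one
assignment of blank node IDs from one graph to the other, so only one
candidate mapping is held in memory at a time:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; not part of dotNetRDF.
public static class BruteForceSketch
{
    // Lazily yields every one-to-one mapping from 'source' IDs onto 'target' IDs.
    public static IEnumerable<Dictionary<string, string>> GenerateMappings(
        IReadOnlyList<string> source, IReadOnlyList<string> target)
    {
        // Graphs with different numbers of blank nodes cannot be isomorphic.
        if (source.Count != target.Count) yield break;

        var current = new Dictionary<string, string>();
        var used = new bool[target.Count];
        foreach (var mapping in Extend(source, target, 0, current, used))
            yield return mapping;
    }

    private static IEnumerable<Dictionary<string, string>> Extend(
        IReadOnlyList<string> source, IReadOnlyList<string> target,
        int index, Dictionary<string, string> current, bool[] used)
    {
        if (index == source.Count)
        {
            // Copy, so the caller can keep a mapping after enumeration moves on.
            yield return new Dictionary<string, string>(current);
            yield break;
        }

        for (int i = 0; i < target.Count; i++)
        {
            if (used[i]) continue;

            // Tentatively map the next source ID to an unused target ID.
            used[i] = true;
            current[source[index]] = target[i];

            foreach (var mapping in Extend(source, target, index + 1, current, used))
                yield return mapping;

            // Backtrack before trying the next target ID.
            current.Remove(source[index]);
            used[i] = false;
        }
    }
}
```

For n unmapped blank nodes there are n! candidates (three nodes already give
6), which is why anything that shrinks the unmapped set before this point
matters so much.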

At a glance the graphs look equivalent, but I need to verify this by hand:
the sub-graphs are too large and blank-node heavy to tell easily whether they
are equal and we are just not detecting it correctly, or whether they are
genuinely non-equal.

Rob


Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

From: Rob V. <rv...@do...> - 2013-04-12 21:15:30

Hey Tom

So I validated that those graphs were indeed equal.

Having gone through that process by hand, I realized there was an additional
rules-based mapping step we could be using but weren't. With this in place we
no longer need the divide and conquer approach on any of your test cases,
which will improve performance.
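
The divide and conquer idea can be sketched roughly as follows. This is only
an illustration under assumptions, not the library implementation: triples
are plain string tuples, blank nodes are recognised by a "_:" prefix, and
SubGraphSketch.Split is a hypothetical name. It groups the triples that
contain blank nodes into isolated sub-graphs (groups that share no blank
nodes), so that each group can be matched on its own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; not part of dotNetRDF.
public static class SubGraphSketch
{
    // Assumption for this sketch: blank nodes are written as "_:id".
    private static bool IsBlank(string term) =>
        term.StartsWith("_:", StringComparison.Ordinal);

    // Splits the blank-node triples into groups that share no blank nodes.
    // Triples without any blank node are ignored.
    public static List<List<(string S, string P, string O)>> Split(
        IEnumerable<(string S, string P, string O)> triples)
    {
        // Union-find over blank node IDs.
        var parent = new Dictionary<string, string>();

        string Find(string x)
        {
            if (!parent.ContainsKey(x)) parent[x] = x;
            while (parent[x] != x)
            {
                parent[x] = parent[parent[x]]; // path halving
                x = parent[x];
            }
            return x;
        }

        var blankTriples = triples
            .Where(t => IsBlank(t.S) || IsBlank(t.O))
            .ToList();

        foreach (var t in blankTriples)
        {
            if (IsBlank(t.S)) Find(t.S);
            if (IsBlank(t.O)) Find(t.O);

            // A triple linking two blank nodes puts them in the same sub-graph.
            if (IsBlank(t.S) && IsBlank(t.O))
                parent[Find(t.S)] = Find(t.O);
        }

        // Group each triple under the representative of one of its blank nodes.
        return blankTriples
            .GroupBy(t => Find(IsBlank(t.S) ? t.S : t.O))
            .Select(g => g.ToList())
            .ToList();
    }
}
```

Because brute force cost grows factorially with the number of unmapped blank
nodes, matching several small groups separately is far cheaper than matching
one large one.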

All your test cases now pass. If you come up with any more, please go ahead
and add them.

I will look further into whether the brute force generator is generating
sensible mappings, but hopefully very few graphs should now ever have to
resort to that approach.

Rob

From:  Rob Vesse <rv...@do...>
Reply-To:  dotNetRDF Bug Report tracking and resolution
<dot...@li...>
Date:  Friday, April 12, 2013 12:37 PM
To:  dotNetRDF Bug Report tracking and resolution
<dot...@li...>
Subject:  Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

> Yes I realized that when I tried a pull again after sending the reply
> 
> Ok so that case is bombing out on brute force mapping which would tend to
> indicate that there may be an issue there still
> 
> At a glance the graphs look equivalent but I need to verify this by hand
> because the sub-graphs are too large and blank node heavy to easily verify
> whether they are equal and we are just not detecting it correctly or if they
> are non-equal
> 
> Rob
> 
> From:  Tomek Pluskiewicz <to...@pl...>
> Reply-To:  dotNetRDF Bug Report tracking and resolution
> <dot...@li...>
> Date:  Friday, April 12, 2013 12:05 PM
> To:  dotNetRDF Bug Report tracking and resolution
> <dot...@li...>
> Subject:  Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs
> 
>> 
>> I did with a little delay. Please check now.
>> 
>> Tom
>> 
>> On Apr 12, 2013 8:59 PM, "Rob Vesse" <rv...@do...> wrote:
>>> Ok
>>> 
>>> Can you push the commits up so I can pull them down and take a look at the
>>> new test cases
>>> 
>>> Rob
>>> 
>>> On 4/12/13 11:55 AM, "Tomasz Pluskiewicz" <tom...@gm...>
>>> wrote:
>>> 
>>>> >I've just committed more test cases. Out of the 6 none fail cause OOM
>>>> >anymore, which is marvellous.
>>>> >
>>>> >However case1 reports false but I'm positive these graphs are actually
>>>> >equal.
>>>> >
>>>> >Thanks,
>>>> >Tom
>>>> >
>>>> >On Fri, Apr 12, 2013 at 8:33 PM, Rob Vesse <rv...@do...> wrote:
>>>>> >> Those would be useful
>>>>> >>
>>>>> >> Btw I closed the issue branch so please just add the tests to default
>>>>> >>
>>>>> >> Rob
>>>>> >>
>>>>> >> On 4/12/13 11:23 AM, "Tomasz Pluskiewicz"
>>>>> <tom...@gm...>
>>>>> >> wrote:
>>>>> >>
>>>>>> >>>Hi Rob
>>>>>> >>>
>>>>>> >>>Thanks so much. And yes, I do have 4 or 5 cases which stumble on this
>>>>>> >>>same issue. I will add all these to the test fixture.
>>>>>> >>>
>>>>>> >>>Tom
>>>>>> >>>
>>>>>> >>>On Fri, Apr 12, 2013 at 8:20 PM, Rob Vesse <rv...@do...>
>>>>>> wrote:
>>>>>>> >>>> Hey Tom
>>>>>>> >>>>
>>>>>>> >>>> This should now be fixed for your test case though I am not 100%
>>>>>>> >>>>convinced
>>>>>>> >>>> that brute forcing is not still broken
>>>>>>> >>>>
>>>>>>> >>>> What I have done to fix this is to add an intermediate step between
>>>>>>> >>>>the
>>>>>>> >>>> rules based and brute force mapping which does a divide and conquer
>>>>>>> >>>> approach
>>>>>>> >>>>
>>>>>>> >>>> What this does is break the unmapped blank node portions of the
>>>>>>> graph
>>>>>>> >>>>into
>>>>>>> >>>> its constituent isolated sub-graphs (those that share no blank
>>>>>>> nodes)
>>>>>>> >>>>and
>>>>>>> >>>> then recursively calls Equals() on the candidate matches for the
>>>>>>> >>>> sub-graphs.  This approach reduces the amount of work required and
the
>>>>>>> >>>> likelihood of needing to brute force at all though we still fall
back
>>>>>>> >>>>in
>>>>>>> >>>> the worst case.
>>>>>>> >>>>
>>>>>>> >>>> If you can come up with any more graphs that break GraphMatcher
>>>>>>> those
>>>>>>> >>>> would be much appreciated
>>>>>>> >>>>
>>>>>>> >>>> Rob
>>>>>>> >>>>
>>>>>>> >>>> On 4/12/13 10:25 AM, "Rob Vesse" <rv...@do...> wrote:
>>>>>>> >>>>
>>>>>>>> >>>>>s/not/now
>>>>>>>> >>>>>
>>>>>>>> >>>>>That should be "the test will now complete within the timeout"
>>>>>>>> >>>>>
>>>>>>>> >>>>>Rob
>>>>>>>> >>>>>
>>>>>>>> >>>>>On 4/12/13 10:23 AM, "Rob Vesse" <rv...@do...> wrote:
>>>>>>>> >>>>>
>>>>>>>>> >>>>>>Hey Tom
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>>So the logic for generating the brute force mappings was
>>>>>>>>> completely
>>>>>>>>> >>>>>>broken
>>>>>>>>> >>>>>>causing it to get stuck in a memory sucking spin cycle :(
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>>I rewrote the GenerateMappings() method from scratch to use
yield
>>>>>>>>> >>>>>>return
>>>>>>>>> >>>>>>and the test will not complete within the timeout but it fails
so I
>>>>>>>>> >>>>>>still
>>>>>>>>> >>>>>>need to dig further
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>>We may still be generating incorrect possible mappings or the
logic
>>>>>>>>> >>>>>>for
>>>>>>>>> >>>>>>brute force may be flawed elsewhere
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>>Rob
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>>On 4/9/13 10:34 AM, "Rob Vesse" <rv...@do...> wrote:
>>>>>>>>> >>>>>>
>>>>>>>>>> >>>>>>>Hey Tom
>>>>>>>>>> >>>>>>>
>>>>>>>>>> >>>>>>>The problem is that graph isomorphism is NP-hard so sometimes the
>>>>>>>>>> >>>>>>>only option we have is to attempt to brute force the problem
>>>>>>>>>> >>>>>>>
>>>>>>>>>> >>>>>>>I've started adding some Debug.WriteLine() to GraphMatcher to
>>>>>>>>>> >>>>>>>track down where things go wrong
>>>>>>>>>> >>>>>>>
>>>>>>>>>> >>>>>>>For your graphs they may look trivially equal but to code they
>>>>>>>>>> >>>>>>>are not, the reason this worked prior to 0.8.0 is that one of the
>>>>>>>>>> >>>>>>>things we try is a trivial mapping (assume blank nodes have same
>>>>>>>>>> >>>>>>>IDs in both graphs) so in previous releases you would likely have
>>>>>>>>>> >>>>>>>hit this case and been fine.
>>>>>>>>>> >>>>>>>
>>>>>>>>>> >>>>>>>You have 33 blank nodes in the graph of which only 6 are uniquely
>>>>>>>>>> >>>>>>>identifiable and mappable.  The matcher generates a candidate
>>>>>>>>>> >>>>>>>mapping for the whole graph but its best effort is incorrect, so
>>>>>>>>>> >>>>>>>then it falls back to brute force.  I need to dig further into
>>>>>>>>>> >>>>>>>whether the candidate mapping could be improved but this is not
>>>>>>>>>> >>>>>>>trivial to debug and will take some time to resolve.
>>>>>>>>>> >>>>>>>
>>>>>>>>>> >>>>>>>We may be able to reduce the "memory leak" by using yield rather
>>>>>>>>>> >>>>>>>than pre-generating all possible mappings but this is a tricky
>>>>>>>>>> >>>>>>>refactor, it's been a long time since I wrote the code originally
>>>>>>>>>> >>>>>>>and I remember that doing the mapping in the yield form proved
>>>>>>>>>> >>>>>>>thorny at the time so I chose not to.  The code itself for
>>>>>>>>>> >>>>>>>generating the mappings has some slightly strange things in it so
>>>>>>>>>> >>>>>>>I really need to spend a block of time refreshing myself on the
>>>>>>>>>> >>>>>>>logic there to check that it is sound before I attempt to
>>>>>>>>>> >>>>>>>refactor.
>>>>>>>>>> >>>>>>>
>>>>>>>>>> >>>>>>>Rob
>>>>>>>>>> >>>>>>>
>>>>>>>>>> >>>>>>>On 4/7/13 11:20 AM, "Tomasz Pluskiewicz" <tom...@gm...> wrote:
>>>>>>>>>> >>>>>>>
>>>>>>>>>>> >>>>>>>>Hm, I was wrong actually.
>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>> >>>>>>>>I tried comparing the exact same graphs loaded from Turtle in
>>>>>>>>>>> >>>>>>>>dotNetRDF test project but I got the unit test wrong.
>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>> >>>>>>>>I have added the CORE-345 bug and committed a failing test case
>>>>>>>>>>> >>>>>>>>[1]. Could you please have a look at this?
>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>> >>>>>>>>Thanks,
>>>>>>>>>>> >>>>>>>>Tom
>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>> >>>>>>>>[1]: https://bitbucket.org/dotnetrdf/dotnetrdf/commits/branch/CORE-345
>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>> >>>>>>>>On Sun, Apr 7, 2013 at 7:36 PM, Tomasz Pluskiewicz
>>>>>>>>>>> >>>>>>>><tom...@gm...> wrote:
>>>>>>>>>>>> >>>>>>>>> Hi Rob
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>> I finally got back to R2RML to analyze why I am getting that
>>>>>>>>>>>> >>>>>>>>> memory leak. It seems connected to the changes you had to
>>>>>>>>>>>> >>>>>>>>> introduce for SPARQL 1.1.
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>> I have determined that it happens in
>>>>>>>>>>>> >>>>>>>>> GraphMatcher#GenerateMappings method. The graphs are equal and
>>>>>>>>>>>> >>>>>>>>> I'm not sure what causes the problem. As soon as
>>>>>>>>>>>> >>>>>>>>> TryBruteForceMapping is reached memory consumption explodes to
>>>>>>>>>>>> >>>>>>>>> gigabytes within minutes.
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>> The low-level problem is the mappings variable in the
>>>>>>>>>>>> >>>>>>>>> GenerateMappings, which within a few iterations contains
>>>>>>>>>>>> >>>>>>>>> thousands of elements.
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>> This problem no longer occurs on trunk. Have you actually been
>>>>>>>>>>>> >>>>>>>>> introducing any fixes around that area?
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>> Tom
>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>> >>>>>>>>> On Mon, Jan 14, 2013 at 12:32 PM, Rob Vesse <rv...@do...> wrote:
>>>>>>>>>>>>> >>>>>>>>>> Comments inline:
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>>> On 1/10/13 7:14 PM, "Tomek Pluskiewicz" <to...@pl...> wrote:
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>Hi Rob
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>I have just updated to latest dotNetRDF available on NuGet
>>>>>>>>>>>>>> >>>>>>>>>>>and I'm experiencing two issues.
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>1. In my unit tests I relied on the way the library assigns
>>>>>>>>>>>>>> >>>>>>>>>>>blank node identifiers: autos1, autos2 and so on. When I run
>>>>>>>>>>>>>> >>>>>>>>>>>the tests separately each one passes but when I batch them
>>>>>>>>>>>>>> >>>>>>>>>>>they fail because in subsequent tests blank nodes are named
>>>>>>>>>>>>>> >>>>>>>>>>>autos2, autos3, etc. However they don't share the same graph
>>>>>>>>>>>>>> >>>>>>>>>>>or triple store. Have you changed this behavior
>>>>>>>>>>>>>> >>>>>>>>>>>deliberately?
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>>> Yes this behavior changed in the 0.8.x releases, the change
>>>>>>>>>>>>> >>>>>>>>>> was made in order to resolve a bug in SPARQL 1.1 Update
>>>>>>>>>>>>> >>>>>>>>>> support and also uncovered a bug in graph isomorphism
>>>>>>>>>>>>> >>>>>>>>>> calculation which was fixed.
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>>> You shouldn't rely on an internal implementation detail like
>>>>>>>>>>>>> >>>>>>>>>> how the library assigns blank node identifiers.  Blank nodes
>>>>>>>>>>>>> >>>>>>>>>> should always be identifiable by the triples they appear in
>>>>>>>>>>>>> >>>>>>>>>> so it should be possible to formulate API calls or SPARQL
>>>>>>>>>>>>> >>>>>>>>>> queries that validate that you have produced the data you
>>>>>>>>>>>>> >>>>>>>>>> expected.
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>2. There is a bad memory leak during SPARQL execution of
>>>>>>>>>>>>>> >>>>>>>>>>>this:
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>>> Define bad memory leak?
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>>> Updates are transactional so it may be a side effect of the
>>>>>>>>>>>>> >>>>>>>>>> library maintaining the state necessary to rollback the
>>>>>>>>>>>>> >>>>>>>>>> transaction should it fail or be aborted.  Also the fact that
>>>>>>>>>>>>> >>>>>>>>>> you are replacing constant nodes with blank nodes will assign
>>>>>>>>>>>>> >>>>>>>>>> a lot of new identifiers and those identifiers have to be
>>>>>>>>>>>>> >>>>>>>>>> tracked to prevent collisions.
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>PREFIX rr: <http://www.w3.org/ns/r2rml#>
>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:graph ?value . }
>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:graphMap [ rr:constant ?value ] . }
>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:graph ?value } ;
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:object ?value . }
>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:objectMap [ rr:constant ?value ] . }
>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:object ?value } ;
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:predicate ?value . }
>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:predicateMap [ rr:constant ?value ] . }
>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:predicate ?value } ;
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:subject ?value . }
>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:subjectMap [ rr:constant ?value ] . }
>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:subject ?value }
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>The full code is simply:
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>var dataset = new InMemoryDataset(store, R2RMLMappings.BaseUri);
>>>>>>>>>>>>>> >>>>>>>>>>>ISparqlUpdateProcessor processor = new LeviathanUpdateProcessor(dataset);
>>>>>>>>>>>>>> >>>>>>>>>>>var updateParser = new SparqlUpdateParser();
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>processor.ProcessCommandSet(updateParser.ParseFromString(ShortcutSubmapsReplaceSparql));
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>Is this a known problem that has already been fixed or should
>>>>>>>>>>>>>> >>>>>>>>>>>I investigate closely?
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>>> This is not a known issue, I would also guess that the data
>>>>>>>>>>>>> >>>>>>>>>> being used would have some bearing on the severity of the
>>>>>>>>>>>>> >>>>>>>>>> problem. Please go ahead and investigate but I would suspect
>>>>>>>>>>>>> >>>>>>>>>> it is the two things I outlined above which are the culprits
>>>>>>>>>>>>> >>>>>>>>>> here.
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>>> Rob
>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>>>Thanks,
>>>>>>>>>>>>>> >>>>>>>>>>>Tom
>>>>>>>>>>>>>> >>>>>>>>>>>

Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

From: Rob V. <rv...@do...> - 2013-04-12 21:49:48

I have now fixed the brute force generator and added unit tests specifically
for it so as to verify that it does generate all possible mappings

I will close out CORE-345 since this should now be completely resolved

Rob
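
Editorial sketch for readers of the thread: the contrast discussed here, between pre-generating every possible blank node mapping and producing them on demand with yield return, can be illustrated in a few lines of C#. This is not dotNetRDF's GraphMatcher code. The class MappingSketch, its helper Extend, and the use of plain strings for blank node IDs are assumptions made purely for illustration; only the name GenerateMappings is borrowed from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; not the dotNetRDF implementation.
public static class MappingSketch
{
    // Lazily yields every one-to-one mapping from the IDs in 'source'
    // onto the IDs in 'target', one mapping at a time.
    public static IEnumerable<Dictionary<string, string>> GenerateMappings(
        IReadOnlyList<string> source, IReadOnlyList<string> target)
    {
        // With different counts no one-to-one mapping exists
        if (source.Count != target.Count) yield break;

        var used = new bool[target.Count];      // which targets are taken
        var current = new string[source.Count]; // partial assignment so far
        foreach (var mapping in Extend(0, source, target, used, current))
            yield return mapping;
    }

    // Depth-first extension of the partial assignment held in 'current'.
    private static IEnumerable<Dictionary<string, string>> Extend(
        int index, IReadOnlyList<string> source, IReadOnlyList<string> target,
        bool[] used, string[] current)
    {
        if (index == source.Count)
        {
            // Copy into a fresh dictionary so the caller may keep it
            var result = new Dictionary<string, string>();
            for (int i = 0; i < source.Count; i++) result[source[i]] = current[i];
            yield return result;
            yield break;
        }

        for (int j = 0; j < target.Count; j++)
        {
            if (used[j]) continue;
            used[j] = true;
            current[index] = target[j];
            foreach (var mapping in Extend(index + 1, source, target, used, current))
                yield return mapping;
            used[j] = false; // backtrack before trying the next target
        }
    }
}
```

Because mappings are produced one at a time, a caller can stop enumerating at the first mapping under which the two graphs compare equal, and only one candidate is held in memory at once. The number of candidates is still n! for n unmapped nodes in the worst case, so a lazy generator of this kind bounds memory use but not running time.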

From:  Rob Vesse <rv...@do...>
Reply-To:  dotNetRDF Bug Report tracking and resolution
<dot...@li...>
Date:  Friday, April 12, 2013 2:14 PM
To:  dotNetRDF Bug Report tracking and resolution
<dot...@li...>
Subject:  Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

> Hey Tom
> 
> So I validated that those graphs were indeed equal
> 
> Having gone through that process by hand I realized there was an additional
> rules based mapping step we could be using that we weren't. With this in place
> we now don't have to use the divide and conquer approach on any of your test
> cases which will improve performance.
> 
> All your test cases now pass, if you come up with any more please go ahead
> and add them.
> 
> I will try and look more to figure out if the brute force generator is
> generating sensible mappings but hopefully now very few graphs should ever
> have to resort to that approach.
> 
> Rob
> 
> From:  Rob Vesse <rv...@do...>
> Reply-To:  dotNetRDF Bug Report tracking and resolution
> <dot...@li...>
> Date:  Friday, April 12, 2013 12:37 PM
> To:  dotNetRDF Bug Report tracking and resolution
> <dot...@li...>
> Subject:  Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs
> 
>> Yes I realized that when I tried a pull again after sending the reply
>> 
>> Ok so that case is bombing out on brute force mapping which would tend to
>> indicate that there may be an issue there still
>> 
>> At a glance the graphs look equivalent but I need to verify this by hand
>> because the sub-graphs are too large and blank node heavy to easily verify
>> whether they are equal and we are just not detecting it correctly or if they
>> are non-equal
>> 
>> Rob
>> 
>> From:  Tomek Pluskiewicz <to...@pl...>
>> Reply-To:  dotNetRDF Bug Report tracking and resolution
>> <dot...@li...>
>> Date:  Friday, April 12, 2013 12:05 PM
>> To:  dotNetRDF Bug Report tracking and resolution
>> <dot...@li...>
>> Subject:  Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs
>> 
>>> 
>>> I did with a little delay. Please check now.
>>> 
>>> Tom
>>> 
>>> On Apr 12, 2013 8:59 PM, "Rob Vesse" <rv...@do...> wrote:
>>>> Ok
>>>> 
>>>> Can you push the commits up so I can pull them down and take a look at the
>>>> new test cases
>>>> 
>>>> Rob
>>>> 
>>>> On 4/12/13 11:55 AM, "Tomasz Pluskiewicz" <tom...@gm...>
>>>> wrote:
>>>> 
>>>>> >I've just committed more test cases. Out of the 6 none fail because of
>>>>> >OOM anymore, which is marvellous.
>>>>> >
>>>>> >However case1 reports false but I'm positive these graphs are actually
>>>>> >equal.
>>>>> >
>>>>> >Thanks,
>>>>> >Tom
>>>>> >
>>>>> >On Fri, Apr 12, 2013 at 8:33 PM, Rob Vesse <rv...@do...> wrote:
>>>>>> >> Those would be useful
>>>>>> >>
>>>>>> >> Btw I closed the issue branch so please just add the tests to default
>>>>>> >>
>>>>>> >> Rob
>>>>>> >>
>>>>>> >> On 4/12/13 11:23 AM, "Tomasz Pluskiewicz" <tom...@gm...> wrote:
>>>>>> >>
>>>>>>> >>>Hi Rob
>>>>>>> >>>
>>>>>>> >>>Thanks so much. And yes, I do have 4 or 5 cases which stumble on this
>>>>>>> >>>same issue. I will add all these to the test fixture.
>>>>>>> >>>
>>>>>>> >>>Tom
>>>>>>> >>>
>>>>>>> >>>On Fri, Apr 12, 2013 at 8:20 PM, Rob Vesse <rv...@do...> wrote:
>>>>>>>> >>>> Hey Tom
>>>>>>>> >>>>
>>>>>>>> >>>> This should now be fixed for your test case though I am not 100%
>>>>>>>> >>>> convinced that brute forcing is not still broken
>>>>>>>> >>>>
>>>>>>>> >>>> What I have done to fix this is to add an intermediate step between
>>>>>>>> >>>> the rules based and brute force mapping which does a divide and
>>>>>>>> >>>> conquer approach
>>>>>>>> >>>>
>>>>>>>> >>>> What this does is break the unmapped blank node portions of the
>>>>>>>> >>>> graph into its constituent isolated sub-graphs (those that share no
>>>>>>>> >>>> blank nodes) and then recursively calls Equals() on the candidate
>>>>>>>> >>>> matches for the sub-graphs.  This approach reduces the amount of
>>>>>>>> >>>> work required and the likelihood of needing to brute force at all
>>>>>>>> >>>> though we still fall back in the worst case.
>>>>>>>> >>>>
>>>>>>>> >>>> If you can come up with any more graphs that break GraphMatcher
>>>>>>>> >>>> those would be much appreciated
>>>>>>>> >>>>
>>>>>>>> >>>> Rob
>>>>>>>> >>>>
>>>>>>>> >>>> On 4/12/13 10:25 AM, "Rob Vesse" <rv...@do...> wrote:
>>>>>>>> >>>>
>>>>>>>>> >>>>>s/not/now
>>>>>>>>> >>>>>
>>>>>>>>> >>>>>That should be "the test will now complete within the timeout"
>>>>>>>>> >>>>>
>>>>>>>>> >>>>>Rob
>>>>>>>>> >>>>>
>>>>>>>>> >>>>>On 4/12/13 10:23 AM, "Rob Vesse" <rv...@do...> wrote:
>>>>>>>>> >>>>>
>>>>>>>>>> >>>>>>Hey Tom
>>>>>>>>>> >>>>>>
>>>>>>>>>> >>>>>>So the logic for generating the brute force mappings was
>>>>>>>>>> >>>>>>completely broken causing it to get stuck in a memory sucking
>>>>>>>>>> >>>>>>spin cycle :(
>>>>>>>>>> >>>>>>
>>>>>>>>>> >>>>>>I rewrote the GenerateMappings() method from scratch to use yield
>>>>>>>>>> >>>>>>return and the test will not complete within the timeout but it
>>>>>>>>>> >>>>>>fails so I still need to dig further
>>>>>>>>>> >>>>>>
>>>>>>>>>> >>>>>>We may still be generating incorrect possible mappings or the
>>>>>>>>>> >>>>>>logic for brute force may be flawed elsewhere
>>>>>>>>>> >>>>>>
>>>>>>>>>> >>>>>>Rob
>>>>>>>>>> >>>>>>
>>>>>>>>>> >>>>>>On 4/9/13 10:34 AM, "Rob Vesse" <rv...@do...> wrote:
>>>>>>>>>> >>>>>>
>>>>>>>>>>> >>>>>>>Hey Tom
>>>>>>>>>>> >>>>>>>
>>>>>>>>>>> >>>>>>>The problem is that graph isomorphism is NP-hard so sometimes the
>>>>>>>>>>> >>>>>>>only option we have is to attempt to brute force the problem
>>>>>>>>>>> >>>>>>>
>>>>>>>>>>> >>>>>>>I've started adding some Debug.WriteLine() to GraphMatcher to
>>>>>>>>>>> >>>>>>>track down where things go wrong
>>>>>>>>>>> >>>>>>>
>>>>>>>>>>> >>>>>>>For your graphs they may look trivially equal but to code they
>>>>>>>>>>> >>>>>>>are not, the reason this worked prior to 0.8.0 is that one of the
>>>>>>>>>>> >>>>>>>things we try is a trivial mapping (assume blank nodes have same
>>>>>>>>>>> >>>>>>>IDs in both graphs) so in previous releases you would likely have
>>>>>>>>>>> >>>>>>>hit this case and been fine.
>>>>>>>>>>> >>>>>>>
>>>>>>>>>>> >>>>>>>You have 33 blank nodes in the graph of which only 6 are uniquely
>>>>>>>>>>> >>>>>>>identifiable and mappable.  The matcher generates a candidate
>>>>>>>>>>> >>>>>>>mapping for the whole graph but its best effort is incorrect, so
>>>>>>>>>>> >>>>>>>then it falls back to brute force.  I need to dig further into
>>>>>>>>>>> >>>>>>>whether the candidate mapping could be improved but this is not
>>>>>>>>>>> >>>>>>>trivial to debug and will take some time to resolve.
>>>>>>>>>>> >>>>>>>
>>>>>>>>>>> >>>>>>>We may be able to reduce the "memory leak" by using yield rather
>>>>>>>>>>> >>>>>>>than pre-generating all possible mappings but this is a tricky
>>>>>>>>>>> >>>>>>>refactor, it's been a long time since I wrote the code originally
>>>>>>>>>>> >>>>>>>and I remember that doing the mapping in the yield form proved
>>>>>>>>>>> >>>>>>>thorny at the time so I chose not to.  The code itself for
>>>>>>>>>>> >>>>>>>generating the mappings has some slightly strange things in it so
>>>>>>>>>>> >>>>>>>I really need to spend a block of time refreshing myself on the
>>>>>>>>>>> >>>>>>>logic there to check that it is sound before I attempt to
>>>>>>>>>>> >>>>>>>refactor.
>>>>>>>>>>> >>>>>>>
>>>>>>>>>>> >>>>>>>Rob
>>>>>>>>>>> >>>>>>>
>>>>>>>>>>> >>>>>>>On 4/7/13 11:20 AM, "Tomasz Pluskiewicz" <tom...@gm...> wrote:
>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>> >>>>>>>>Hm, I was wrong actually.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>>I tried comparing the exact same graphs loaded from Turtle in
>>>>>>>>>>>> >>>>>>>>dotNetRDF test project but I got the unit test wrong.
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>>I have added the CORE-345 bug and committed a failing test case
>>>>>>>>>>>> >>>>>>>>[1]. Could you please have a look at this?
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>>Thanks,
>>>>>>>>>>>> >>>>>>>>Tom
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>>[1]: https://bitbucket.org/dotnetrdf/dotnetrdf/commits/branch/CORE-345
>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>> >>>>>>>>On Sun, Apr 7, 2013 at 7:36 PM, Tomasz Pluskiewicz
>>>>>>>>>>>> >>>>>>>><tom...@gm...> wrote:
>>>>>>>>>>>>> >>>>>>>>> Hi Rob
>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>> I finally got back to R2RML to analyze why I am getting that
>>>>>>>>>>>>> >>>>>>>>> memory leak. It seems connected to the changes you had to
>>>>>>>>>>>>> >>>>>>>>> introduce for SPARQL 1.1.
>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>> I have determined that it happens in
>>>>>>>>>>>>> >>>>>>>>> GraphMatcher#GenerateMappings method. The graphs are equal and
>>>>>>>>>>>>> >>>>>>>>> I'm not sure what causes the problem. As soon as
>>>>>>>>>>>>> >>>>>>>>> TryBruteForceMapping is reached memory consumption explodes to
>>>>>>>>>>>>> >>>>>>>>> gigabytes within minutes.
>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>> The low-level problem is the mappings variable in the
>>>>>>>>>>>>> >>>>>>>>> GenerateMappings, which within a few iterations contains
>>>>>>>>>>>>> >>>>>>>>> thousands of elements.
>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>> This problem no longer occurs on trunk. Have you actually been
>>>>>>>>>>>>> >>>>>>>>> introducing any fixes around that area?
>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>> Tom
>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>> >>>>>>>>> On Mon, Jan 14, 2013 at 12:32 PM, Rob Vesse <rv...@do...> wrote:
>>>>>>>>>>>>>> >>>>>>>>>> Comments inline:
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>> On 1/10/13 7:14 PM, "Tomek Pluskiewicz" <to...@pl...> wrote:
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>Hi Rob
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>I have just updated to latest dotNetRDF available on NuGet
>>>>>>>>>>>>>>> >>>>>>>>>>>and I'm experiencing two issues.
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>1. In my unit tests I relied on the way the library assigns
>>>>>>>>>>>>>>> >>>>>>>>>>>blank node identifiers: autos1, autos2 and so on. When I run
>>>>>>>>>>>>>>> >>>>>>>>>>>the tests separately each one passes but when I batch them
>>>>>>>>>>>>>>> >>>>>>>>>>>they fail because in subsequent tests blank nodes are named
>>>>>>>>>>>>>>> >>>>>>>>>>>autos2, autos3, etc. However they don't share the same graph
>>>>>>>>>>>>>>> >>>>>>>>>>>or triple store. Have you changed this behavior
>>>>>>>>>>>>>>> >>>>>>>>>>>deliberately?
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>> Yes, this behavior changed in the 0.8.x releases. The change was made
>>>>>>>>>>>>>> >>>>>>>>>> to resolve a bug in SPARQL 1.1 Update support, and it also uncovered a
>>>>>>>>>>>>>> >>>>>>>>>> bug in the graph isomorphism calculation, which was fixed.
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>> You shouldn't rely on an internal implementation detail like how the
>>>>>>>>>>>>>> >>>>>>>>>> library assigns blank node identifiers.  Blank nodes should always be
>>>>>>>>>>>>>> >>>>>>>>>> identifiable by the triples they appear in, so it should be possible to
>>>>>>>>>>>>>> >>>>>>>>>> formulate API calls or SPARQL queries that validate that you have
>>>>>>>>>>>>>> >>>>>>>>>> produced the data you expected.
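
To make that advice concrete, here is a minimal sketch of a test that identifies a blank node by the triples it appears in rather than by its label. It is plain Python over tuples, not the dotNetRDF API. The helper names (is_blank, constant_of), the ex: terms and the sample data are all made up for illustration.

```python
RR = "http://www.w3.org/ns/r2rml#"


def is_blank(term):
    """Treat any term written with a '_:' prefix as a blank node."""
    return term.startswith("_:")


def constant_of(triples, subject, map_predicate):
    """Follow subject -> map_predicate -> blank node -> rr:constant; return the constants found."""
    found = set()
    for s, p, o in triples:
        if s == subject and p == map_predicate and is_blank(o):
            for s2, p2, o2 in triples:
                if s2 == o and p2 == RR + "constant":
                    found.add(o2)
    return found


# The same data as produced by two runs that numbered their blank nodes differently.
run_alone = {
    ("ex:map", RR + "graphMap", "_:autos1"),
    ("_:autos1", RR + "constant", "ex:graph"),
}
run_in_batch = {
    ("ex:map", RR + "graphMap", "_:autos2"),
    ("_:autos2", RR + "constant", "ex:graph"),
}

for triples in (run_alone, run_in_batch):
    # The assertion never mentions a blank node label, so both runs pass.
    assert constant_of(triples, "ex:map", RR + "graphMap") == {"ex:graph"}
```

The same shape of check can be written as a SPARQL ASK query with a variable or [ ] in the blank node position.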
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>2. There is a bad memory leak during SPARQL execution of this:
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>> Define bad memory leak?
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>> Updates are transactional, so it may be a side effect of the library
>>>>>>>>>>>>>> >>>>>>>>>> maintaining the state necessary to roll back the transaction should it
>>>>>>>>>>>>>> >>>>>>>>>> fail or be aborted.  Also, the fact that you are replacing constant nodes
>>>>>>>>>>>>>> >>>>>>>>>> with blank nodes will assign a lot of new identifiers, and those
>>>>>>>>>>>>>> >>>>>>>>>> identifiers have to be tracked to prevent collisions.
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>PREFIX rr: <http://www.w3.org/ns/r2rml#>
>>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:graph ?value . }
>>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:graphMap [ rr:constant ?value ] . }
>>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:graph ?value } ;
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:object ?value . }
>>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:objectMap [ rr:constant ?value ] . }
>>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:object ?value } ;
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:predicate ?value . }
>>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:predicateMap [ rr:constant ?value ] . }
>>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:predicate ?value } ;
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:subject ?value . }
>>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:subjectMap [ rr:constant ?value ] . }
>>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:subject ?value }
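
Each of the four operations above rewrites a shortcut property into its map form. Because the INSERT template contains a blank node ([ ... ]), SPARQL 1.1 Update creates a fresh blank node for every solution of the WHERE clause. The sketch below mimics that in plain Python over tuples. It is not how Leviathan executes the update, and the helper names, label format and ex: sample data are invented for illustration.

```python
from itertools import count

RR = "http://www.w3.org/ns/r2rml#"
_fresh = count(1)


def new_blank_node():
    """Mint a blank node label that has not been used before (label format is arbitrary)."""
    return f"_:b{next(_fresh)}"


def expand_shortcut(triples, shortcut, map_property):
    """Replace each (?map shortcut ?value) with (?map map_property [ rr:constant ?value ])."""
    matches = [(s, o) for s, p, o in triples if p == RR + shortcut]  # WHERE
    result = {t for t in triples if t[1] != RR + shortcut}           # DELETE
    for subject, value in matches:                                   # INSERT, once per solution
        node = new_blank_node()
        result.add((subject, RR + map_property, node))
        result.add((node, RR + "constant", value))
    return result


data = {
    ("ex:m1", RR + "graph", "ex:g"),
    ("ex:m2", RR + "graph", "ex:g"),
    ("ex:m1", RR + "subject", "ex:s"),
}
for shortcut, map_property in [("graph", "graphMap"), ("object", "objectMap"),
                               ("predicate", "predicateMap"), ("subject", "subjectMap")]:
    data = expand_shortcut(data, shortcut, map_property)

blank_nodes = {o for _, p, o in data if p.endswith("Map")}
print(len(data), len(blank_nodes))  # 6 3
```

Three shortcut triples become six triples and three new blank nodes, so the number of identifiers to track grows with the number of matches.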
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>The full code is simply:
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>var dataset = new InMemoryDataset(store, R2RMLMappings.BaseUri);
>>>>>>>>>>>>>>> >>>>>>>>>>>ISparqlUpdateProcessor processor = new LeviathanUpdateProcessor(dataset);
>>>>>>>>>>>>>>> >>>>>>>>>>>var updateParser = new SparqlUpdateParser();
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>processor.ProcessCommandSet(updateParser.ParseFromString(ShortcutSubmapsReplaceSparql));
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>Is this a known problem that has already been fixed, or should I
>>>>>>>>>>>>>>> >>>>>>>>>>>investigate more closely?
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>> This is not a known issue. I would also guess that the data being used
>>>>>>>>>>>>>> >>>>>>>>>> has some bearing on the severity of the problem.  Please go ahead and
>>>>>>>>>>>>>> >>>>>>>>>> investigate, but I suspect the two things I outlined above are the
>>>>>>>>>>>>>> >>>>>>>>>> culprits here.
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>> >>>>>>>>>> Rob
>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>Thanks,
>>>>>>>>>>>>>>> >>>>>>>>>>>Tom
>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>> >>>>>>>>>>>------------------------------------------------------------------------------
>>>>>>>>>>>>>>> >>>>>>>>>>>Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
>>>>>>>>>>>>>>> >>>>>>>>>>>MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
>>>>>>>>>>>>>>> >>>>>>>>>>>with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
>>>>>>>>>>>>>>> >>>>>>>>>>>MVPs and experts. ON SALE this month only -- learn more at:
>>>>>>>>>>>>>>> >>>>>>>>>>>http://p.sf.net/sfu/learnmore_122712
>>>>>>>>>>>>>>> >>>>>>>>>>>_______________________________________________
>>>>>>>>>>>>>>> >>>>>>>>>>>dotNetRDF-bugs mailing list
>>>>>>>>>>>>>>> >>>>>>>>>>>dot...@li...
>>>>>>>>>>>>>>> >>>>>>>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs