|
From: Tomek P. <to...@pl...> - 2013-01-10 19:15:24
|
Hi Rob

I have just updated to the latest dotNetRDF available on NuGet and I'm experiencing two issues.

1. In my unit tests I relied on the way the library assigns blank node identifiers: autos1, autos2 and so on. When I run the tests separately each one passes, but when I batch them they fail because in subsequent tests the blank nodes are named autos2, autos3, etc. However, they don't share the same graph or triple store. Have you changed this behavior deliberately?

2. There is a bad memory leak during SPARQL execution of this:

PREFIX rr: <http://www.w3.org/ns/r2rml#>

DELETE { ?map rr:graph ?value . }
INSERT { ?map rr:graphMap [ rr:constant ?value ] . }
WHERE { ?map rr:graph ?value } ;

DELETE { ?map rr:object ?value . }
INSERT { ?map rr:objectMap [ rr:constant ?value ] . }
WHERE { ?map rr:object ?value } ;

DELETE { ?map rr:predicate ?value . }
INSERT { ?map rr:predicateMap [ rr:constant ?value ] . }
WHERE { ?map rr:predicate ?value } ;

DELETE { ?map rr:subject ?value . }
INSERT { ?map rr:subjectMap [ rr:constant ?value ] . }
WHERE { ?map rr:subject ?value }

The full code is simply:

var dataset = new InMemoryDataset(store, R2RMLMappings.BaseUri);
ISparqlUpdateProcessor processor = new LeviathanUpdateProcessor(dataset);
var updateParser = new SparqlUpdateParser();
processor.ProcessCommandSet(updateParser.ParseFromString(ShortcutSubmapsReplaceSparql));

Is this a known problem that has already been fixed, or should I investigate more closely?

Thanks,
Tom
|
From: Rob V. <rv...@do...> - 2013-01-14 11:34:05
|
Comments inline:

On 1/10/13 7:14 PM, "Tomek Pluskiewicz" <to...@pl...> wrote:

>1. In my unit tests I relied on the way the library assigns blank node
>identifiers: autos1, autos2 and so on. When I run the tests separately
>each one passes, but when I batch them they fail because in subsequent
>tests the blank nodes are named autos2, autos3, etc. However, they don't
>share the same graph or triple store. Have you changed this behavior
>deliberately?

Yes, this behavior changed in the 0.8.x releases. The change was made in order to resolve a bug in SPARQL 1.1 Update support, and it also uncovered a bug in graph isomorphism calculation, which was fixed.

You shouldn't rely on an internal implementation detail like how the library assigns blank node identifiers. Blank nodes should always be identifiable by the triples they appear in, so it should be possible to formulate API calls or SPARQL queries that validate that you have produced the data you expected.

>2. There is a bad memory leak during SPARQL execution of this:

Define "bad memory leak"?

Updates are transactional, so it may be a side effect of the library maintaining the state necessary to roll back the transaction should it fail or be aborted. Also, the fact that you are replacing constant nodes with blank nodes will assign a lot of new identifiers, and those identifiers have to be tracked to prevent collisions.

>Is this a known problem that has already been fixed, or should I
>investigate more closely?

This is not a known issue. I would also guess that the data being used has some bearing on the severity of the problem. Please go ahead and investigate, but I suspect the two things I outlined above are the culprits here.

Rob
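
The advice above, to identify blank nodes by the triples they appear in rather than by their labels, can be illustrated with a minimal sketch. It deliberately does not use the dotNetRDF API: triples are modelled as plain string tuples, any term starting with "_:" is treated as a blank node, and the class name, method name and the ex: terms are all hypothetical. A real test would express the same check through the library's own API calls or a SPARQL query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for an RDF graph (NOT the dotNetRDF API):
// a triple is a (subject, predicate, object) tuple of strings and
// any term that starts with "_:" is treated as a blank node.
public static class BlankNodeStructureCheck
{
    public static bool IsBlank(string term) =>
        term.StartsWith("_:", StringComparison.Ordinal);

    // True if some blank node B exists such that the data contains
    // (map, linkPredicate, B) and (B, constantPredicate, value),
    // whatever label B happens to carry.
    public static bool HasConstantSubmap(
        IEnumerable<(string S, string P, string O)> triples,
        string map, string linkPredicate, string constantPredicate, string value)
    {
        var list = triples.ToList();
        return list
            .Where(t => t.S == map && t.P == linkPredicate && IsBlank(t.O))
            .Any(link => list.Any(t =>
                t.S == link.O && t.P == constantPredicate && t.O == value));
    }

    public static void Main()
    {
        const string rr = "http://www.w3.org/ns/r2rml#";

        // The same data with different auto-assigned labels, as happens
        // when tests run separately versus in a batch.
        var firstRun = new[]
        {
            ("ex:map", rr + "graphMap", "_:autos1"),
            ("_:autos1", rr + "constant", "ex:g"),
        };
        var batchedRun = new[]
        {
            ("ex:map", rr + "graphMap", "_:autos2"),
            ("_:autos2", rr + "constant", "ex:g"),
        };

        // Both checks succeed because neither depends on the label.
        Console.WriteLine(HasConstantSubmap(firstRun, "ex:map", rr + "graphMap", rr + "constant", "ex:g"));
        Console.WriteLine(HasConstantSubmap(batchedRun, "ex:map", rr + "graphMap", rr + "constant", "ex:g"));
    }
}
```

An assertion written this way keeps passing when the identifier counter starts from a different value, which is the situation described in point 1.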
|
From: Tomasz P. <tom...@gm...> - 2013-04-07 17:37:07
|
Hi Rob

I finally got back to R2RML to analyze why I am getting that memory leak. It seems connected to the changes you had to introduce for SPARQL 1.1.

I have determined that it happens in the GraphMatcher#GenerateMappings method. The graphs are equal and I'm not sure what causes the problem. As soon as TryBruteForceMapping is reached, memory consumption explodes to gigabytes within minutes.

The low-level problem is the mappings variable in GenerateMappings, which within a few iterations contains thousands of elements.

This problem no longer occurs on trunk. Have you actually been introducing any fixes around that area?

Tom
|
From: Tomasz P. <tom...@gm...> - 2013-04-07 18:21:09
|
Hm, I was wrong actually.

I tried comparing the exact same graphs loaded from Turtle in the dotNetRDF test project, but I got the unit test wrong.

I have filed the CORE-345 bug and committed a failing test case [1]. Could you please have a look at this?

Thanks,
Tom

[1]: https://bitbucket.org/dotnetrdf/dotnetrdf/commits/branch/CORE-345
|
From: Rob V. <rv...@do...> - 2013-04-09 17:35:31
|
Hey Tom

The problem is that graph isomorphism is computationally hard (no polynomial-time algorithm is known for the general case), so sometimes the only option we have is to attempt to brute force the problem.

I've started adding some Debug.WriteLine() calls to GraphMatcher to track down where things go wrong.

Your graphs may look trivially equal, but to the code they are not. The reason this worked prior to 0.8.0 is that one of the things we try is a trivial mapping (assume blank nodes have the same IDs in both graphs), so in previous releases you would likely have hit this case and been fine.

You have 33 blank nodes in the graph, of which only 6 are uniquely identifiable and mappable. The matcher generates a candidate mapping for the whole graph, but its best effort is incorrect, so it then falls back to brute force. I need to dig further into whether the candidate mapping could be improved, but this is not trivial to debug and will take some time to resolve.

We may be able to reduce the "memory leak" by using yield rather than pre-generating all possible mappings, but this is a tricky refactor. It's been a long time since I wrote the code originally, and I remember that doing the mapping in the yield form proved thorny at the time, so I chose not to. The code for generating the mappings has some slightly strange things in it, so I really need to spend a block of time refreshing myself on the logic there to check that it is sound before I attempt to refactor.

Rob
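
To see why the brute force fallback is so expensive, a rough count helps. The sketch below assumes, purely for illustration, that the 27 blank nodes left over after the 6 unique matches (33 - 6) cannot be told apart at all, in which case there are 27! candidate bijections, about 1.09 × 10^28. If the leftover nodes can be split into smaller groups of mutually indistinguishable nodes, the count drops to the product of the factorials of the group sizes. The class name, method names and the 9/9/9 grouping are hypothetical; none of this is dotNetRDF code, and the real matcher may prune the search differently.

```csharp
using System;
using System.Linq;
using System.Numerics;

public static class MappingSearchSpace
{
    // n! computed with BigInteger so large values do not overflow.
    public static BigInteger Factorial(int n)
    {
        BigInteger result = BigInteger.One;
        for (int i = 2; i <= n; i++)
        {
            result *= i;
        }
        return result;
    }

    // Number of bijections to try when the unmatched blank nodes fall into
    // groups of mutually indistinguishable nodes: the product of k! over
    // all group sizes k.
    public static BigInteger CandidateMappings(params int[] groupSizes) =>
        groupSizes.Aggregate(BigInteger.One, (acc, k) => acc * Factorial(k));

    public static void Main()
    {
        int unmatched = 33 - 6; // figures quoted in the message above

        // Worst case: one group of 27 indistinguishable nodes.
        Console.WriteLine(CandidateMappings(unmatched));

        // Hypothetical grouping into three groups of 9 nodes each.
        Console.WriteLine(CandidateMappings(9, 9, 9));
    }
}
```

Even the grouped figure is far too large to materialise in a list, which is consistent with memory growing to gigabytes when every candidate is pre-generated.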
|
From: Rob V. <rv...@do...> - 2013-04-12 17:24:42
|
Hey Tom

So the logic for generating the brute force mappings was completely broken, causing it to get stuck in a memory-sucking spin cycle :(

I rewrote the GenerateMappings() method from scratch to use yield return and the test will not complete within the timeout but it fails, so I still need to dig further.

We may still be generating incorrect possible mappings, or the logic for brute force may be flawed elsewhere.

Rob
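
The general idea of producing candidate mappings with yield return instead of building them all up front can be sketched as follows. This is not the actual GraphMatcher code: the class, the method names, the node labels and the validity test are hypothetical, and the sketch simply enumerates every bijection between two equally sized lists of blank node labels. The point is that only the mapping currently being built is held in memory, and the consumer can stop at the first candidate that works.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyMappings
{
    // Lazily yields every bijection from sources onto targets.
    // Nothing is generated until the caller asks for the next mapping.
    public static IEnumerable<Dictionary<string, string>> Generate(
        IReadOnlyList<string> sources, IReadOnlyList<string> targets)
    {
        return Extend(sources, targets, 0,
            new Dictionary<string, string>(), new bool[targets.Count]);
    }

    private static IEnumerable<Dictionary<string, string>> Extend(
        IReadOnlyList<string> sources, IReadOnlyList<string> targets,
        int index, Dictionary<string, string> partial, bool[] used)
    {
        if (index == sources.Count)
        {
            // Hand the caller a snapshot; 'partial' is reused while backtracking.
            yield return new Dictionary<string, string>(partial);
            yield break;
        }

        for (int i = 0; i < targets.Count; i++)
        {
            if (used[i]) continue;

            used[i] = true;
            partial[sources[index]] = targets[i];

            foreach (var mapping in Extend(sources, targets, index + 1, partial, used))
            {
                yield return mapping;
            }

            // Backtrack before trying the next target for this source.
            partial.Remove(sources[index]);
            used[i] = false;
        }
    }

    public static void Main()
    {
        var left = new[] { "_:a1", "_:a2", "_:a3" };
        var right = new[] { "_:b1", "_:b2", "_:b3" };

        // Hypothetical stand-in for "does this mapping make the graphs equal?"
        Func<Dictionary<string, string>, bool> isValid =
            m => m["_:a1"] == "_:b2" && m["_:a2"] == "_:b3";

        // Enumeration stops as soon as a valid mapping is found.
        var first = Generate(left, right).First(isValid);
        foreach (var source in left)
        {
            Console.WriteLine(source + " -> " + first[source]);
        }
    }
}
```

Lazy generation bounds memory use but not running time: if no early candidate is valid, the consumer still walks the whole search space, so a better initial candidate mapping matters as much as the iterator form.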
|
From: Rob V. <rv...@do...> - 2013-04-12 17:26:47
|
s/not/now That should be "the test will now complete within the timeout" Rob |
|
From: Rob V. <rv...@do...> - 2013-04-12 18:21:08
|
Hey Tom This should now be fixed for your test case though I am not 100% convinced that brute forcing is not still broken What I have done to fix this is to add an intermediate step between the rules based and brute force mapping which does a divide and conquer approach What this does is break the unmapped blank node portions of the graph into its constituent isolated sub-graphs (those that share no blank nodes) and then recursively calls Equals() on the candidate matches for the sub-graphs. This approach reduces the amount of work required and the likelihood of needing to brute force at all though we still fall back in the worst case. If you can come up with any more graphs that break GraphMatcher those would be much appreciated Rob |
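The splitting step of the divide and conquer approach described in the message above might look roughly like the following sketch. It is an assumption-laden illustration, not the code that went into GraphMatcher: `SubGraphSplitter` and `SimpleTriple` are hypothetical types, triples are plain strings, and a term is treated as a blank node when it starts with `_:`. Triples that share a blank node, directly or through a chain of other triples, end up in the same group, using a small union-find. The recursive Equals() call on candidate sub-graph pairs is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; not the GraphMatcher implementation.
public static class SubGraphSplitter
{
    // A triple of plain strings; blank nodes are assumed to be written "_:label".
    public sealed class SimpleTriple
    {
        public readonly string Subject;
        public readonly string Predicate;
        public readonly string Object;

        public SimpleTriple(string s, string p, string o)
        {
            Subject = s; Predicate = p; Object = o;
        }

        public IEnumerable<string> BlankNodes()
        {
            if (this.Subject.StartsWith("_:", StringComparison.Ordinal)) yield return this.Subject;
            if (this.Object.StartsWith("_:", StringComparison.Ordinal)) yield return this.Object;
        }
    }

    // Union-find lookup with path compression; unknown nodes become their own root.
    private static string Find(Dictionary<string, string> parent, string node)
    {
        string p;
        if (!parent.TryGetValue(node, out p)) { parent[node] = node; return node; }
        if (p == node) return node;
        string root = Find(parent, p);
        parent[node] = root;
        return root;
    }

    // Groups blank-node triples into isolated sub-graphs (groups sharing no blank nodes).
    public static List<List<SimpleTriple>> Split(IEnumerable<SimpleTriple> triples)
    {
        var parent = new Dictionary<string, string>();
        var withBlanks = new List<SimpleTriple>();

        foreach (var t in triples)
        {
            var blanks = t.BlankNodes().ToList();
            if (blanks.Count == 0) continue; // ground triples need no blank node mapping
            withBlanks.Add(t);

            // Merge the sets of all blank nodes appearing in this triple.
            string root = Find(parent, blanks[0]);
            foreach (var b in blanks.Skip(1))
                parent[Find(parent, b)] = root;
        }

        return withBlanks
            .GroupBy(t => Find(parent, t.BlankNodes().First()))
            .Select(g => g.ToList())
            .ToList();
    }
}
```

Each group can then be matched against candidate groups from the other graph independently. Under these assumptions the brute-force fallback, when it is still needed, only has to permute the blank nodes inside one small group rather than across the whole graph.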
|
From: Tomasz P. <tom...@gm...> - 2013-04-12 18:24:12
|
Hi Rob Thanks so much. And yes, I do have 4 or 5 cases which stumble on this same issue. I will add all these to the test fixture. Tom |
|
From: Rob V. <rv...@do...> - 2013-04-12 18:34:21
|
Those would be useful. Btw, I closed the issue branch, so please just add the tests to default. Rob |
|
From: Tomasz P. <tom...@gm...> - 2013-04-12 18:56:42
|
I've just committed more test cases. Out of the 6, none fail with OOM anymore, which is marvellous. However, case1 reports false, but I'm positive these graphs are actually equal. Thanks, Tom |
|
From: Rob V. <rv...@do...> - 2013-04-12 18:59:03
|
Ok. Can you push the commits up so I can pull them down and take a look at the new test cases? Rob
There is a bad memory leak in during SPARQL execution of >>>>>>>>>>>this: >>>>>>>>>> >>>>>>>>>> Define bad memory leak? >>>>>>>>>> >>>>>>>>>> Updates are transactional so it may be a side effect of the >>>>>>>>>>library >>>>>>>>>> maintaining the state necessary to rollback the transaction >>>>>>>>>>should >>>>>>>>>>it >>>>>>>>>>fail >>>>>>>>>> or be aborted. Also the fact that you are replacing constant >>>>>>>>>>nodes >>>>>>>>>>with >>>>>>>>>> blank nodes will assign a lot of new identifiers and those >>>>>>>>>>identifiers >>>>>>>>>> have to be tracked to prevent collisions. >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>PREFIX rr: <http://www.w3.org/ns/r2rml#> >>>>>>>>>>>DELETE { ?map rr:graph ?value . } >>>>>>>>>>>INSERT { ?map rr:graphMap [ rr:constant ?value ] . } >>>>>>>>>>>WHERE { ?map rr:graph ?value } ; >>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:object ?value . } >>>>>>>>>>>INSERT { ?map rr:objectMap [ rr:constant ?value ] . } >>>>>>>>>>>WHERE { ?map rr:object ?value } ; >>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:predicate ?value . } >>>>>>>>>>>INSERT { ?map rr:predicateMap [ rr:constant ?value ] . } >>>>>>>>>>>WHERE { ?map rr:predicate ?value } ; >>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:subject ?value . } >>>>>>>>>>>INSERT { ?map rr:subjectMap [ rr:constant ?value ] . } >>>>>>>>>>>WHERE { ?map rr:subject ?value } >>>>>>>>>>> >>>>>>>>>>>The full code is simply: >>>>>>>>>>> >>>>>>>>>>>var dataset = new InMemoryDataset(store, R2RMLMappings.BaseUri); >>>>>>>>>>> ISparqlUpdateProcessor processor = new >>>>>>>>>>>LeviathanUpdateProcessor(dataset); >>>>>>>>>>> var updateParser = new SparqlUpdateParser(); >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>processor.ProcessCommandSet(updateParser.ParseFromString(Shortcu >>>>>>>>>>>tS >>>>>>>>>>>ub >>>>>>>>>>>m >>>>>>>>>>>a >>>>>>>>>>>p >>>>>>>>>>>sRe >>>>>>>>>>>placeSparql)); >>>>>>>>>>> >>>>>>>>>>>Is this a know problem and has been already fixed or should I >>>>>>>>>>>investigate closely? 
>>>>>>>>>> >>>>>>>>>> This is not a known issue, I would also guess that the data >>>>>>>>>>being >>>>>>>>>>used >>>>>>>>>> would have some bearing on the severity of the problem. Please >>>>>>>>>>go >>>>>>>>>>ahead >>>>>>>>>> and investigate but I would suspect it is the two things I >>>>>>>>>>outlined >>>>>>>>>>above >>>>>>>>>> which are the culprits here. >>>>>>>>>> >>>>>>>>>> Rob >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>Thanks, >>>>>>>>>>>Tom >>>>>>>>>>> >>>>>>>>>>>---------------------------------------------------------------- >>>>>>>>>>>-- >>>>>>>>>>>-- >>>>>>>>>>>- >>>>>>>>>>>- >>>>>>>>>>>- >>>>>>>>>>>--- >>>>>>>>>>>---- >>>>>>>>>>>Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, >>>>>>>>>>>CSS, >>>>>>>>>>>MVC, Windows 8 Apps, JavaScript and much more. Keep your skills >>>>>>>>>>>current >>>>>>>>>>>with LearnDevNow - 3,200 step-by-step video tutorials by >>>>>>>>>>>Microsoft >>>>>>>>>>>MVPs and experts. ON SALE this month only -- learn more at: >>>>>>>>>>>http://p.sf.net/sfu/learnmore_122712 >>>>>>>>>>>_______________________________________________ >>>>>>>>>>>dotNetRDF-bugs mailing list >>>>>>>>>>>dot...@li... >>>>>>>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>----------------------------------------------------------------- >>>>>>>>>>-- >>>>>>>>>>-- >>>>>>>>>>- >>>>>>>>>>- >>>>>>>>>>- >>>>>>>>>>------ >>>>>>>>>> Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, >>>>>>>>>>CSS, >>>>>>>>>> MVC, Windows 8 Apps, JavaScript and much more. Keep your skills >>>>>>>>>>current >>>>>>>>>> with LearnDevNow - 3,200 step-by-step video tutorials by >>>>>>>>>>Microsoft >>>>>>>>>> MVPs and experts. SALE $99.99 this month only -- learn more at: >>>>>>>>>> http://p.sf.net/sfu/learnmore_122412 >>>>>>>>>> _______________________________________________ >>>>>>>>>> dotNetRDF-bugs mailing list >>>>>>>>>> dot...@li... 
>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>> >>>>>>>>------------------------------------------------------------------- >>>>>>>>-- >>>>>>>>-- >>>>>>>>- >>>>>>>>- >>>>>>>>- >>>>>>>>---- >>>>>>>>Minimize network downtime and maximize team effectiveness. >>>>>>>>Reduce network management and security costs.Learn how to hire >>>>>>>>the most talented Cisco Certified professionals. Visit the >>>>>>>>Employer Resources Portal >>>>>>>>http://www.cisco.com/web/learning/employer_resources/index.html >>>>>>>>_______________________________________________ >>>>>>>>dotNetRDF-bugs mailing list >>>>>>>>dot...@li... >>>>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>-------------------------------------------------------------------- >>>>>>>-- >>>>>>>-- >>>>>>>- >>>>>>>- >>>>>>>---- >>>>>>>Precog is a next-generation analytics platform capable of advanced >>>>>>>analytics on semi-structured data. The platform includes APIs for >>>>>>>building >>>>>>>apps and a phenomenal toolset for data science. Developers can use >>>>>>>our toolset for easy data analysis & visualization. Get a free >>>>>>>account! >>>>>>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>>>_______________________________________________ >>>>>>>dotNetRDF-bugs mailing list >>>>>>>dot...@li... >>>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>--------------------------------------------------------------------- >>>>>>-- >>>>>>-- >>>>>>- >>>>>>---- >>>>>>Precog is a next-generation analytics platform capable of advanced >>>>>>analytics on semi-structured data. The platform includes APIs for >>>>>>building >>>>>>apps and a phenomenal toolset for data science. Developers can use >>>>>>our toolset for easy data analysis & visualization. Get a free >>>>>>account! 
>>>>>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>>_______________________________________________ >>>>>>dotNetRDF-bugs mailing list >>>>>>dot...@li... >>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>---------------------------------------------------------------------- >>>>>-- >>>>>-- >>>>>---- >>>>>Precog is a next-generation analytics platform capable of advanced >>>>>analytics on semi-structured data. The platform includes APIs for >>>>>building >>>>>apps and a phenomenal toolset for data science. Developers can use >>>>>our toolset for easy data analysis & visualization. Get a free >>>>>account! >>>>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>_______________________________________________ >>>>>dotNetRDF-bugs mailing list >>>>>dot...@li... >>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>> >>>> >>>> >>>> >>>> >>>> >>>>----------------------------------------------------------------------- >>>>-- >>>>----- >>>> Precog is a next-generation analytics platform capable of advanced >>>> analytics on semi-structured data. The platform includes APIs for >>>>building >>>> apps and a phenomenal toolset for data science. Developers can use >>>> our toolset for easy data analysis & visualization. Get a free >>>>account! >>>> http://www2.precog.com/precogplatform/slashdotnewsletter >>>> _______________________________________________ >>>> dotNetRDF-bugs mailing list >>>> dot...@li... >>>> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>> >>>------------------------------------------------------------------------ >>>-- >>>---- >>>Precog is a next-generation analytics platform capable of advanced >>>analytics on semi-structured data. The platform includes APIs for >>>building >>>apps and a phenomenal toolset for data science. Developers can use >>>our toolset for easy data analysis & visualization. Get a free account! 
>>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>_______________________________________________ >>>dotNetRDF-bugs mailing list >>>dot...@li... >>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >> >> >> >> >> >> >>------------------------------------------------------------------------- >>----- >> Precog is a next-generation analytics platform capable of advanced >> analytics on semi-structured data. The platform includes APIs for >>building >> apps and a phenomenal toolset for data science. Developers can use >> our toolset for easy data analysis & visualization. Get a free account! >> http://www2.precog.com/precogplatform/slashdotnewsletter >> _______________________________________________ >> dotNetRDF-bugs mailing list >> dot...@li... >> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs > >-------------------------------------------------------------------------- >---- >Precog is a next-generation analytics platform capable of advanced >analytics on semi-structured data. The platform includes APIs for building >apps and a phenomenal toolset for data science. Developers can use >our toolset for easy data analysis & visualization. Get a free account! >http://www2.precog.com/precogplatform/slashdotnewsletter >_______________________________________________ >dotNetRDF-bugs mailing list >dot...@li... >https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs |
|
From: Tomek P. <to...@pl...> - 2013-04-12 19:05:47
|
I did with a little delay. Please check now. Tom |
|
From: Rob V. <rv...@do...> - 2013-04-12 19:38:58
|
Yes, I realized that when I tried a pull again after sending the reply. OK, so that case is bombing out on brute force mapping, which would tend to indicate that there may still be an issue there. At a glance the graphs look equivalent, but I need to verify this by hand: the sub-graphs are too large and blank-node heavy to easily tell whether they are equal and we are just not detecting it correctly, or whether they are genuinely non-equal. Rob From: Tomek Pluskiewicz <to...@pl...> Reply-To: dotNetRDF Bug Report tracking and resolution <dot...@li...> Date: Friday, April 12, 2013 12:05 PM To: dotNetRDF Bug Report tracking and resolution <dot...@li...> Subject: Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs > > I did with a little delay. Please check now. > > Tom > > On Apr 12, 2013 8:59 PM, "Rob Vesse" <rv...@do...> wrote: >> Ok >> >> Can you push the commits up so I can pull them down and take a look at the >> new test cases >> >> Rob >> >> On 4/12/13 11:55 AM, "Tomasz Pluskiewicz" <tom...@gm...> >> wrote: >> >>> >I've just committed more test cases. Out of the 6 none fail cause OOM >>> >anymore, which is marvellous. >>> > >>> >However case1 reports false but I'm positive these graphs are actually >>> >equal. >>> > >>> >Thanks, >>> >Tom >>> > >>> >On Fri, Apr 12, 2013 at 8:33 PM, Rob Vesse <rv...@do...> wrote: >>>> >> Those would be useful >>>> >> >>>> >> Btw I closed the issue branch so please just add the tests to default >>>> >> >>>> >> Rob >>>> >> >>>> >> On 4/12/13 11:23 AM, "Tomasz Pluskiewicz" <tom...@gm...> >>>> >> wrote: >>>> >> >>>>> >>>Hi Rob >>>>> >>> >>>>> >>>Thanks so much. And yes, I do have 4 or 5 cases which stumble on this >>>>> >>>same issue. I will add all these to the test fixture.
>>>>> >>> >>>>> >>>Tom >>>>> >>> >>>>> >>>On Fri, Apr 12, 2013 at 8:20 PM, Rob Vesse <rv...@do...> >>>>> wrote: >>>>>> >>>> Hey Tom >>>>>> >>>> >>>>>> >>>> This should now be fixed for your test case though I am not 100% >>>>>> >>>>convinced >>>>>> >>>> that brute forcing is not still broken >>>>>> >>>> >>>>>> >>>> What I have done to fix this is to add an intermediate step between >>>>>> >>>>the >>>>>> >>>> rules based and brute force mapping which does a divide and conquer >>>>>> >>>> approach >>>>>> >>>> >>>>>> >>>> What this does is break the unmapped blank node portions of the >>>>>> graph >>>>>> >>>>into >>>>>> >>>> its constituent isolated sub-graphs (those that share no blank >>>>>> nodes) >>>>>> >>>>and >>>>>> >>>> then recursively calls Equals() on the candidate matches for the >>>>>> >>>> sub-graphs. This approach reduces the amount of work required and the >>>>>> >>>> likelihood of needing to brute force at all though we still fall back >>>>>> >>>>in >>>>>> >>>> the worst case. 
>>>>>> >>>> >>>>>> >>>> If you can come up with any more graphs that break GraphMatcher >>>>>> those >>>>>> >>>> would be much appreciated >>>>>> >>>> >>>>>> >>>> Rob >>>>>> >>>> >>>>>> >>>> On 4/12/13 10:25 AM, "Rob Vesse" <rv...@do...> wrote: >>>>>> >>>> >>>>>>> >>>>>s/not/now >>>>>>> >>>>> >>>>>>> >>>>>That should be "the test will now complete within the timeout" >>>>>>> >>>>> >>>>>>> >>>>>Rob >>>>>>> >>>>> >>>>>>> >>>>>On 4/12/13 10:23 AM, "Rob Vesse" <rv...@do...> wrote: >>>>>>> >>>>> >>>>>>>> >>>>>>Hey Tom >>>>>>>> >>>>>> >>>>>>>> >>>>>>So the logic for generating the brute force mappings was >>>>>>>> completely >>>>>>>> >>>>>>broken >>>>>>>> >>>>>>causing it to get stuck in a memory sucking spin cycle :( >>>>>>>> >>>>>> >>>>>>>> >>>>>>I rewrote the GenerateMappings() method from scratch to use yield >>>>>>>> >>>>>>return >>>>>>>> >>>>>>and the test will not complete within the timeout but it fails so I >>>>>>>> >>>>>>still >>>>>>>> >>>>>>need to dig further >>>>>>>> >>>>>> >>>>>>>> >>>>>>We may still be generating incorrect possible mappings or the logic >>>>>>>> >>>>>>for >>>>>>>> >>>>>>brute force may be flawed elsewhere >>>>>>>> >>>>>> >>>>>>>> >>>>>>Rob >>>>>>>> >>>>>> >>>>>>>> >>>>>>On 4/9/13 10:34 AM, "Rob Vesse" <rv...@do...> wrote: >>>>>>>> >>>>>> >>>>>>>>> >>>>>>>Hey Tom >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>>The problem is that graph isomorphism is NP-hard so sometimes the >>>>>>>>> >>>>>>>only >>>>>>>>> >>>>>>>option we have is to attempt to brute force the problem >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>>I've started added some Debug.WriteLine() to GraphMatcher to track >>>>>>>>> >>>>>>>down >>>>>>>>> >>>>>>>where things go wrong >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>>For your graphs they may look trivially equal but to code they are >>>>>>>>> >>>>>>>not, >>>>>>>>> >>>>>>>the reason this worked prior to 0.8.0 is that one of the things we >>>>>>>>> >>>>>>>try >>>>>>>>> >>>>>>>is >>>>>>>>> >>>>>>>a trivial mapping (assume blank nodes have same IDs in both 
>>>>>>>>> graphs) >>>>>>>>> >>>>>>>so >>>>>>>>> >>>>>>>in >>>>>>>>> >>>>>>>previous releases you would likely have hit this case and been fine. >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>>You have 33 blank nodes in the graph of which only 6 are >>>>>>>>> uniquely >>>>>>>>> >>>>>>>identifiable and mappable. The matcher generates a candidate >>>>>>>>> >>>>>>>mapping >>>>>>>>> >>>>>>>for >>>>>>>>> >>>>>>>the whole graph but its best effort is incorrect, so then it falls >>>>>>>>> >>>>>>>back >>>>>>>>> >>>>>>>to >>>>>>>>> >>>>>>>brute force. I need to dig further into whether the candidate >>>>>>>>> >>>>>>>mapping >>>>>>>>> >>>>>>>could be improved but this is not trivial to debug and will take >>>>>>>>> >>>>>>>some >>>>>>>>> >>>>>>>time >>>>>>>>> >>>>>>>to resolve. >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>>We may be able to reduce the "memory leak" by using yield rather >>>>>>>>> >>>>>>>than >>>>>>>>> >>>>>>>pre-generating all possible mapping but this is a tricky >>>>>>>>> refactor, >>>>>>>>> >>>>>>>it's >>>>>>>>> >>>>>>>been a long time since I wrote the code originally and I >>>>>>>>> remember >>>>>>>>> >>>>>>>that >>>>>>>>> >>>>>>>doing the mapping in the yield form proved thorny at the time so I >>>>>>>>> >>>>>>>chose >>>>>>>>> >>>>>>>not to. The code itself for generating the mappings has some >>>>>>>>> >>>>>>>slightly >>>>>>>>> >>>>>>>strange things in it so I really need to spend a block of time >>>>>>>>> >>>>>>>refreshing >>>>>>>>> >>>>>>>myself on the logic there to check that it is sound before I >>>>>>>>> attempt >>>>>>>>> >>>>>>>to >>>>>>>>> >>>>>>>refactor. >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>>Rob >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>>On 4/7/13 11:20 AM, "Tomasz Pluskiewicz" >>>>>>>>> >>>>>>><tom...@gm...> >>>>>>>>> >>>>>>>wrote: >>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>>>Hm, I was wrong actually. >>>>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>I tried comparing the exact same graphs loaded from Turtle in >>>>>>>>>> >>>>>>>>dotNetRDF test project but I got the unit test wrong. 
>>>>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>I have added the CORE-345 bug and committed a failing test case >>>>>>>>>> >>>>>>>>[1]. >>>>>>>>>> >>>>>>>>Could you please have a look at this? >>>>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>Thanks, >>>>>>>>>> >>>>>>>>Tom >>>>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>[1]: >>>>>>>>>> >>>>>>>>https://bitbucket.org/dotnetrdf/dotnetrdf/commits/branch/CORE-345 >>>>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>On Sun, Apr 7, 2013 at 7:36 PM, Tomasz Pluskiewicz >>>>>>>>>> >>>>>>>><tom...@gm...> wrote: >>>>>>>>>>> >>>>>>>>> Hi Rob >>>>>>>>>>> >>>>>>>>> >>>>>>>>>>> >>>>>>>>> I finally got back to R2RML to analyze why I am getting that >>>>>>>>>>> >>>>>>>>>memory >>>>>>>>>>> >>>>>>>>> leak. It seems connected to the changes you had to >>>>>>>>>>> introduce for >>>>>>>>>>> >>>>>>>>> SPARQL 1.1. >>>>>>>>>>> >>>>>>>>> >>>>>>>>>>> >>>>>>>>> I have determined that it happens in >>>>>>>>>>> >>>>>>>>>GraphMatcher#GenerateMappings >>>>>>>>>>> >>>>>>>>> method. The graphs are equal and I'm not sure what causes the >>>>>>>>>>> >>>>>>>>>problem. >>>>>>>>>>> >>>>>>>>> As soon as TryBruteForceMapping is reached memory >>>>>>>>>>> consumption >>>>>>>>>>> >>>>>>>>>explodes >>>>>>>>>>> >>>>>>>>> to gigabytes within minutes. >>>>>>>>>>> >>>>>>>>> >>>>>>>>>>> >>>>>>>>> The low-level problem is the mappings variable in the >>>>>>>>>>> >>>>>>>>> GenerateMappings, which within a few iteration contains thousands >>>>>>>>>>> >>>>>>>>>of >>>>>>>>>>> >>>>>>>>> elements. >>>>>>>>>>> >>>>>>>>> >>>>>>>>>>> >>>>>>>>> This problem no longer occurs on trunk. Have you actually been >>>>>>>>>>> >>>>>>>>> introducing any fixes around that area? 
>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>> >>>>>>>>> Tom >>>>>>>>>>> >>>>>>>>> >>>>>>>>>>> >>>>>>>>> On Mon, Jan 14, 2013 at 12:32 PM, Rob Vesse >>>>>>>>>>> >>>>>>>>><rv...@do...> >>>>>>>>>>> >>>>>>>>>wrote: >>>>>>>>>>>> >>>>>>>>>> Comments inline: >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> On 1/10/13 7:14 PM, "Tomek Pluskiewicz" >>>>>>>>>>>> <to...@pl...> >>>>>>>>>>>> >>>>>>>>>>wrote: >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>Hi Rob >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>I have just updated to latest dotNetRDF available on NuGet and >>>>>>>>>>>>> >>>>>>>>>>>I'm >>>>>>>>>>>>> >>>>>>>>>>>experiencing two issues. >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>1. In my unit tests I relied on the way the library assigns >>>>>>>>>>>>> >>>>>>>>>>>blank >>>>>>>>>>>>> >>>>>>>>>>>node >>>>>>>>>>>>> >>>>>>>>>>>identifiers: autos1, autos2 and so on. When I run the tests >>>>>>>>>>>>> >>>>>>>>>>>separately >>>>>>>>>>>>> >>>>>>>>>>>each one passes but when I batch them they fail because in >>>>>>>>>>>>> >>>>>>>>>>>subsequent >>>>>>>>>>>>> >>>>>>>>>>>tests blank nodes are name autos2, autos3, etc. However they >>>>>>>>>>>>> >>>>>>>>>>>don't >>>>>>>>>>>>> >>>>>>>>>>>share the same graph or triple store. Have you changed this >>>>>>>>>>>>> >>>>>>>>>>>behavior >>>>>>>>>>>>> >>>>>>>>>>>delbierately? >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> Yes this behavior changed in the 0.8.x releases, the change was >>>>>>>>>>>> >>>>>>>>>>made >>>>>>>>>>>> >>>>>>>>>>in >>>>>>>>>>>> >>>>>>>>>> order to resolve a bug in SPARQL 1.1 Update support and also >>>>>>>>>>>> >>>>>>>>>>uncovered >>>>>>>>>>>> >>>>>>>>>>a >>>>>>>>>>>> >>>>>>>>>> bug in graph isomorphism calculation which was fixed. >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> You shouldn't rely on an internal implementation detail like how >>>>>>>>>>>> >>>>>>>>>>the >>>>>>>>>>>> >>>>>>>>>> library assigns blank node identifiers. 
Blank nodes should >>>>>>>>>>>> >>>>>>>>>>always >>>>>>>>>>>> >>>>>>>>>>be >>>>>>>>>>>> >>>>>>>>>> identifiable by the triples they appear in so it should be >>>>>>>>>>>> >>>>>>>>>>possible >>>>>>>>>>>> >>>>>>>>>>to >>>>>>>>>>>> >>>>>>>>>> formulate API calls or SPARQL queries that validate that you >>>>>>>>>>>> >>>>>>>>>>have >>>>>>>>>>>> >>>>>>>>>>produced >>>>>>>>>>>> >>>>>>>>>> the data you expected. >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>2. There is a bad memory leak in during SPARQL >>>>>>>>>>>>> execution of >>>>>>>>>>>>> >>>>>>>>>>>this: >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> Define bad memory leak? >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> Updates are transactional so it may be a side effect of the >>>>>>>>>>>> >>>>>>>>>>library >>>>>>>>>>>> >>>>>>>>>> maintaining the state necessary to rollback the >>>>>>>>>>>> transaction >>>>>>>>>>>> >>>>>>>>>>should >>>>>>>>>>>> >>>>>>>>>>it >>>>>>>>>>>> >>>>>>>>>>fail >>>>>>>>>>>> >>>>>>>>>> or be aborted. Also the fact that you are replacing constant >>>>>>>>>>>> >>>>>>>>>>nodes >>>>>>>>>>>> >>>>>>>>>>with >>>>>>>>>>>> >>>>>>>>>> blank nodes will assign a lot of new identifiers and those >>>>>>>>>>>> >>>>>>>>>>identifiers >>>>>>>>>>>> >>>>>>>>>> have to be tracked to prevent collisions. >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>PREFIX rr: <http://www.w3.org/ns/r2rml#> >>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:graph ?value . } >>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:graphMap [ rr:constant ?value ] . } >>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:graph ?value } ; >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:object ?value . } >>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:objectMap [ rr:constant ?value ] . } >>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:object ?value } ; >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:predicate ?value . 
} >>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:predicateMap [ rr:constant ?value ] . } >>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:predicate ?value } ; >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:subject ?value . } >>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:subjectMap [ rr:constant ?value ] . } >>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:subject ?value } >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>The full code is simply: >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>var dataset = new InMemoryDataset(store, >>>>>>>>>>>>> R2RMLMappings.BaseUri); >>>>>>>>>>>>> >>>>>>>>>>> ISparqlUpdateProcessor processor = new >>>>>>>>>>>>> >>>>>>>>>>>LeviathanUpdateProcessor(dataset); >>>>>>>>>>>>> >>>>>>>>>>> var updateParser = new >>>>>>>>>>>>> SparqlUpdateParser(); >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>processor.ProcessCommandSet(updateParser.ParseFromString(Shortcu >>>>>>>>>>>>> >>>>>>>>>>>tS >>>>>>>>>>>>> >>>>>>>>>>>ub >>>>>>>>>>>>> >>>>>>>>>>>m >>>>>>>>>>>>> >>>>>>>>>>>a >>>>>>>>>>>>> >>>>>>>>>>>p >>>>>>>>>>>>> >>>>>>>>>>>sRe >>>>>>>>>>>>> >>>>>>>>>>>placeSparql)); >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>Is this a know problem and has been already fixed or should I >>>>>>>>>>>>> >>>>>>>>>>>investigate closely? >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> This is not a known issue, I would also guess that the data >>>>>>>>>>>> >>>>>>>>>>being >>>>>>>>>>>> >>>>>>>>>>used >>>>>>>>>>>> >>>>>>>>>> would have some bearing on the severity of the problem. Please >>>>>>>>>>>> >>>>>>>>>>go >>>>>>>>>>>> >>>>>>>>>>ahead >>>>>>>>>>>> >>>>>>>>>> and investigate but I would suspect it is the two things I >>>>>>>>>>>> >>>>>>>>>>outlined >>>>>>>>>>>> >>>>>>>>>>above >>>>>>>>>>>> >>>>>>>>>> which are the culprits here. 
>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> Rob >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>Thanks, >>>>>>>>>>>>> >>>>>>>>>>>Tom >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>---------------------------------------------------------------- >>>>>>>>>>>>> >>>>>>>>>>>-- >>>>>>>>>>>>> >>>>>>>>>>>-- >>>>>>>>>>>>> >>>>>>>>>>>- >>>>>>>>>>>>> >>>>>>>>>>>- >>>>>>>>>>>>> >>>>>>>>>>>- >>>>>>>>>>>>> >>>>>>>>>>>--- >>>>>>>>>>>>> >>>>>>>>>>>---- >>>>>>>>>>>>> >>>>>>>>>>>Master Visual Studio, SharePoint, SQL, ASP.NET >>>>>>>>>>>>> <http://ASP.NET> , C# 2012, HTML5, >>>>>>>>>>>>> >>>>>>>>>>>CSS, >>>>>>>>>>>>> >>>>>>>>>>>MVC, Windows 8 Apps, JavaScript and much more. Keep >>>>>>>>>>>>> your skills >>>>>>>>>>>>> >>>>>>>>>>>current >>>>>>>>>>>>> >>>>>>>>>>>with LearnDevNow - 3,200 step-by-step video tutorials by >>>>>>>>>>>>> >>>>>>>>>>>Microsoft >>>>>>>>>>>>> >>>>>>>>>>>MVPs and experts. ON SALE this month only -- learn more at: >>>>>>>>>>>>> >>>>>>>>>>>http://p.sf.net/sfu/learnmore_122712 >>>>>>>>>>>>> >>>>>>>>>>>_______________________________________________ >>>>>>>>>>>>> >>>>>>>>>>>dotNetRDF-bugs mailing list >>>>>>>>>>>>> >>>>>>>>>>>dot...@li... >>>>>>>>>>>>> >>>>>>>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>----------------------------------------------------------------- >>>>>>>>>>>> >>>>>>>>>>-- >>>>>>>>>>>> >>>>>>>>>>-- >>>>>>>>>>>> >>>>>>>>>>- >>>>>>>>>>>> >>>>>>>>>>- >>>>>>>>>>>> >>>>>>>>>>- >>>>>>>>>>>> >>>>>>>>>>------ >>>>>>>>>>>> >>>>>>>>>> Master Visual Studio, SharePoint, SQL, ASP.NET >>>>>>>>>>>> <http://ASP.NET> , C# 2012, HTML5, >>>>>>>>>>>> >>>>>>>>>>CSS, >>>>>>>>>>>> >>>>>>>>>> MVC, Windows 8 Apps, JavaScript and much more. 
Keep your skills >>>>>>>>>>>> >>>>>>>>>>current >>>>>>>>>>>> >>>>>>>>>> with LearnDevNow - 3,200 step-by-step video tutorials by >>>>>>>>>>>> >>>>>>>>>>Microsoft >>>>>>>>>>>> >>>>>>>>>> MVPs and experts. SALE $99.99 this month only -- learn more at: >>>>>>>>>>>> >>>>>>>>>> http://p.sf.net/sfu/learnmore_122412 >>>>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>>>> >>>>>>>>>> dotNetRDF-bugs mailing list >>>>>>>>>>>> >>>>>>>>>> dot...@li... >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>------------------------------------------------------------------- >>>>>>>>>> >>>>>>>>-- >>>>>>>>>> >>>>>>>>-- >>>>>>>>>> >>>>>>>>- >>>>>>>>>> >>>>>>>>- >>>>>>>>>> >>>>>>>>- >>>>>>>>>> >>>>>>>>---- >>>>>>>>>> >>>>>>>>Minimize network downtime and maximize team effectiveness. >>>>>>>>>> >>>>>>>>Reduce network management and security costs.Learn how to hire >>>>>>>>>> >>>>>>>>the most talented Cisco Certified professionals. Visit the >>>>>>>>>> >>>>>>>>Employer Resources Portal >>>>>>>>>> >>>>>>>>http://www.cisco.com/web/learning/employer_resources/index.html >>>>>>>>>> >>>>>>>>_______________________________________________ >>>>>>>>>> >>>>>>>>dotNetRDF-bugs mailing list >>>>>>>>>> >>>>>>>>dot...@li... >>>>>>>>>> >>>>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>> >>>>>>>>> >>>>>>>-------------------------------------------------------------------- >>>>>>>>> >>>>>>>-- >>>>>>>>> >>>>>>>-- >>>>>>>>> >>>>>>>- >>>>>>>>> >>>>>>>- >>>>>>>>> >>>>>>>---- >>>>>>>>> >>>>>>>Precog is a next-generation analytics platform capable of >>>>>>>>> advanced >>>>>>>>> >>>>>>>analytics on semi-structured data. The platform includes APIs for >>>>>>>>> >>>>>>>building >>>>>>>>> >>>>>>>apps and a phenomenal toolset for data science. 
Developers can use >>>>>>>>> >>>>>>>our toolset for easy data analysis & visualization. Get a free >>>>>>>>> >>>>>>>account! >>>>>>>>> >>>>>>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>>>>> >>>>>>>_______________________________________________ >>>>>>>>> >>>>>>>dotNetRDF-bugs mailing list >>>>>>>>> >>>>>>>dot...@li... >>>>>>>>> >>>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>> >>>>>> >>>>>>>> >>>>>> >>>>>>>> >>>>>> >>>>>>>> >>>>>> >>>>>>>> >>>>>> >>>>>>>> >>>>>>--------------------------------------------------------------------- >>>>>>>> >>>>>>-- >>>>>>>> >>>>>>-- >>>>>>>> >>>>>>- >>>>>>>> >>>>>>---- >>>>>>>> >>>>>>Precog is a next-generation analytics platform capable of >>>>>>>> advanced >>>>>>>> >>>>>>analytics on semi-structured data. The platform includes APIs for >>>>>>>> >>>>>>building >>>>>>>> >>>>>>apps and a phenomenal toolset for data science. Developers can use >>>>>>>> >>>>>>our toolset for easy data analysis & visualization. Get a free >>>>>>>> >>>>>>account! >>>>>>>> >>>>>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>>>> >>>>>>_______________________________________________ >>>>>>>> >>>>>>dotNetRDF-bugs mailing list >>>>>>>> >>>>>>dot...@li... >>>>>>>> >>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> >>>>>---------------------------------------------------------------------- >>>>>>> >>>>>-- >>>>>>> >>>>>-- >>>>>>> >>>>>---- >>>>>>> >>>>>Precog is a next-generation analytics platform capable of advanced >>>>>>> >>>>>analytics on semi-structured data. The platform includes APIs for >>>>>>> >>>>>building >>>>>>> >>>>>apps and a phenomenal toolset for data science. Developers can use >>>>>>> >>>>>our toolset for easy data analysis & visualization. Get a free >>>>>>> >>>>>account! 
>>>>>>> >>>>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>>> >>>>>_______________________________________________ >>>>>>> >>>>>dotNetRDF-bugs mailing list >>>>>>> >>>>>dot...@li... >>>>>>> >>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> >>>> >>>>>> >>>>----------------------------------------------------------------------- >>>>>> >>>>-- >>>>>> >>>>----- >>>>>> >>>> Precog is a next-generation analytics platform capable of advanced >>>>>> >>>> analytics on semi-structured data. The platform includes APIs for >>>>>> >>>>building >>>>>> >>>> apps and a phenomenal toolset for data science. Developers can use >>>>>> >>>> our toolset for easy data analysis & visualization. Get a free >>>>>> >>>>account! >>>>>> >>>> http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>> >>>> _______________________________________________ >>>>>> >>>> dotNetRDF-bugs mailing list >>>>>> >>>> dot...@li... >>>>>> >>>> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>> >>> >>>>> >>>------------------------------------------------------------------------ >>>>> >>>-- >>>>> >>>---- >>>>> >>>Precog is a next-generation analytics platform capable of advanced >>>>> >>>analytics on semi-structured data. The platform includes APIs for >>>>> >>>building >>>>> >>>apps and a phenomenal toolset for data science. Developers can use >>>>> >>>our toolset for easy data analysis & visualization. Get a free account! >>>>> >>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>> >>>_______________________________________________ >>>>> >>>dotNetRDF-bugs mailing list >>>>> >>>dot...@li... 
>>>>> >>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >>------------------------------------------------------------------------- >>>> >>----- >>>> >> Precog is a next-generation analytics platform capable of advanced >>>> >> analytics on semi-structured data. The platform includes APIs for >>>> >>building >>>> >> apps and a phenomenal toolset for data science. Developers can use >>>> >> our toolset for easy data analysis & visualization. Get a free account! >>>> >> http://www2.precog.com/precogplatform/slashdotnewsletter >>>> >> _______________________________________________ >>>> >> dotNetRDF-bugs mailing list >>>> >> dot...@li... >>>> >> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>> > >>> >-------------------------------------------------------------------------- >>> >---- >>> >Precog is a next-generation analytics platform capable of advanced >>> >analytics on semi-structured data. The platform includes APIs for building >>> >apps and a phenomenal toolset for data science. Developers can use >>> >our toolset for easy data analysis & visualization. Get a free account! >>> >http://www2.precog.com/precogplatform/slashdotnewsletter >>> >_______________________________________________ >>> >dotNetRDF-bugs mailing list >>> >dot...@li... >>> >https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >> >> >> >> >> >> ----------------------------------------------------------------------------->> - >> Precog is a next-generation analytics platform capable of advanced >> analytics on semi-structured data. The platform includes APIs for building >> apps and a phenomenal toolset for data science. Developers can use >> our toolset for easy data analysis & visualization. Get a free account! >> http://www2.precog.com/precogplatform/slashdotnewsletter >> _______________________________________________ >> dotNetRDF-bugs mailing list >> dot...@li... 
>> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs > ------------------------------------------------------------------------------ > Precog is a next-generation analytics platform capable of advanced analytics > on semi-structured data. The platform includes APIs for building apps and a > phenomenal toolset for data science. Developers can use our toolset for easy > data analysis & visualization. Get a free account! > http://www2.precog.com/precogplatform/slashdotnewsletter______________________ > _________________________ dotNetRDF-bugs mailing list > dot...@li... > https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs |
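The divide and conquer step Rob describes above, breaking the blank node portion of a graph into isolated sub-graphs that share no blank nodes, can be sketched as a connected-components pass over the triples. This is an illustration under stated assumptions, not the GraphMatcher implementation: triples are plain string tuples, blank nodes are recognised by a `_:` prefix, and the names SubGraphSketch and Split are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubGraphSketch
{
    private static bool IsBlank(string term) =>
        term.StartsWith("_:", StringComparison.Ordinal);

    // Groups the triples that mention blank nodes into sub-graphs such that
    // no blank node appears in more than one group.
    public static List<List<(string S, string P, string O)>> Split(
        IEnumerable<(string S, string P, string O)> triples)
    {
        var parent = new Dictionary<string, string>();

        // Union-find lookup with path halving.
        string Find(string node)
        {
            if (!parent.ContainsKey(node)) parent[node] = node;
            while (parent[node] != node)
            {
                parent[node] = parent[parent[node]];
                node = parent[node];
            }
            return node;
        }

        var blankTriples = new List<(string S, string P, string O)>();
        foreach (var triple in triples)
        {
            bool subjectBlank = IsBlank(triple.S);
            bool objectBlank = IsBlank(triple.O);
            if (!subjectBlank && !objectBlank) continue; // ground triple

            blankTriples.Add(triple);
            if (subjectBlank) Find(triple.S);
            if (objectBlank) Find(triple.O);
            // Two blank nodes in one triple belong to the same sub-graph.
            if (subjectBlank && objectBlank) parent[Find(triple.S)] = Find(triple.O);
        }

        return blankTriples
            .GroupBy(t => Find(IsBlank(t.S) ? t.S : t.O))
            .Select(g => g.ToList())
            .ToList();
    }
}
```

Each resulting sub-graph can then be compared against candidate sub-graphs of the other graph on its own, so any remaining brute force search is confined to a small set of blank nodes rather than all of them at once.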
|
From: Rob V. <rv...@do...> - 2013-04-12 21:15:30
|
Hey Tom

So I validated that those graphs were indeed equal.

Having gone through that process by hand, I realized there was an additional
rules-based mapping step we could be using that we weren't. With this in place
we no longer have to use the divide and conquer approach on any of your test
cases, which will improve performance.

All your test cases now pass; if you come up with any more, please go ahead and
add them.

I will try to look further into whether the brute force generator is generating
sensible mappings, but hopefully very few graphs should now ever have to resort
to that approach.

Rob

From: Rob Vesse <rv...@do...>
Reply-To: dotNetRDF Bug Report tracking and resolution <dot...@li...>
Date: Friday, April 12, 2013 12:37 PM
To: dotNetRDF Bug Report tracking and resolution <dot...@li...>
Subject: Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs

> Yes I realized that when I tried a pull again after sending the reply
>
> Ok so that case is bombing out on brute force mapping which would tend to
> indicate that there may be an issue there still
>
> At a glance the graphs look equivalent but I need to verify this by hand
> because the sub-graphs are too large and blank node heavy to easily verify
> whether they are equal and we are just not detecting it correctly or if they
> are non-equal
>
> Rob
>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>> Rob >>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>Tom >>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>---------------------------------------------------------------- >>>>>>>>>>>>>> >>>>>>>>>>>-- >>>>>>>>>>>>>> >>>>>>>>>>>-- >>>>>>>>>>>>>> >>>>>>>>>>>- >>>>>>>>>>>>>> >>>>>>>>>>>- >>>>>>>>>>>>>> >>>>>>>>>>>- >>>>>>>>>>>>>> >>>>>>>>>>>--- >>>>>>>>>>>>>> >>>>>>>>>>>---- >>>>>>>>>>>>>> >>>>>>>>>>>Master Visual Studio, SharePoint, SQL, ASP.NET >>>>>>>>>>>>>> <http://ASP.NET> , C# 2012, HTML5, >>>>>>>>>>>>>> >>>>>>>>>>>CSS, >>>>>>>>>>>>>> >>>>>>>>>>>MVC, Windows 8 Apps, JavaScript and much more. Keep >>>>>>>>>>>>>> your skills >>>>>>>>>>>>>> >>>>>>>>>>>current >>>>>>>>>>>>>> >>>>>>>>>>>with LearnDevNow - 3,200 step-by-step video tutorials by >>>>>>>>>>>>>> >>>>>>>>>>>Microsoft >>>>>>>>>>>>>> >>>>>>>>>>>MVPs and experts. ON SALE this month only -- learn more at: >>>>>>>>>>>>>> >>>>>>>>>>>http://p.sf.net/sfu/learnmore_122712 >>>>>>>>>>>>>> >>>>>>>>>>>_______________________________________________ >>>>>>>>>>>>>> >>>>>>>>>>>dotNetRDF-bugs mailing list >>>>>>>>>>>>>> >>>>>>>>>>>dot...@li... >>>>>>>>>>>>>> >>>>>>>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>----------------------------------------------------------------- >>>>>>>>>>>>> >>>>>>>>>>-- >>>>>>>>>>>>> >>>>>>>>>>-- >>>>>>>>>>>>> >>>>>>>>>>- >>>>>>>>>>>>> >>>>>>>>>>- >>>>>>>>>>>>> >>>>>>>>>>- >>>>>>>>>>>>> >>>>>>>>>>------ >>>>>>>>>>>>> >>>>>>>>>> Master Visual Studio, SharePoint, SQL, ASP.NET >>>>>>>>>>>>> <http://ASP.NET> , C# 2012, HTML5, >>>>>>>>>>>>> >>>>>>>>>>CSS, >>>>>>>>>>>>> >>>>>>>>>> MVC, Windows 8 Apps, JavaScript and much more. 
Keep >>>>>>>>>>>>> your skills >>>>>>>>>>>>> >>>>>>>>>>current >>>>>>>>>>>>> >>>>>>>>>> with LearnDevNow - 3,200 step-by-step video tutorials by >>>>>>>>>>>>> >>>>>>>>>>Microsoft >>>>>>>>>>>>> >>>>>>>>>> MVPs and experts. SALE $99.99 this month only -- learn more at: >>>>>>>>>>>>> >>>>>>>>>> http://p.sf.net/sfu/learnmore_122412 >>>>>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>>>>> >>>>>>>>>> dotNetRDF-bugs mailing list >>>>>>>>>>>>> >>>>>>>>>> dot...@li... >>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>>>>> >>>>>>>> >>>>>>>>>>> >>>>>>>>------------------------------------------------------------------- >>>>>>>>>>> >>>>>>>>-- >>>>>>>>>>> >>>>>>>>-- >>>>>>>>>>> >>>>>>>>- >>>>>>>>>>> >>>>>>>>- >>>>>>>>>>> >>>>>>>>- >>>>>>>>>>> >>>>>>>>---- >>>>>>>>>>> >>>>>>>>Minimize network downtime and maximize team effectiveness. >>>>>>>>>>> >>>>>>>>Reduce network management and security costs.Learn how to hire >>>>>>>>>>> >>>>>>>>the most talented Cisco Certified professionals. Visit the >>>>>>>>>>> >>>>>>>>Employer Resources Portal >>>>>>>>>>> >>>>>>>>http://www.cisco.com/web/learning/employer_resources/index.html >>>>>>>>>>> >>>>>>>>_______________________________________________ >>>>>>>>>>> >>>>>>>>dotNetRDF-bugs mailing list >>>>>>>>>>> >>>>>>>>dot...@li... >>>>>>>>>>> >>>>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>> >>>>>>>>>> >>>>>>>-------------------------------------------------------------------- >>>>>>>>>> >>>>>>>-- >>>>>>>>>> >>>>>>>-- >>>>>>>>>> >>>>>>>- >>>>>>>>>> >>>>>>>- >>>>>>>>>> >>>>>>>---- >>>>>>>>>> >>>>>>>Precog is a next-generation analytics platform capable of >>>>>>>>>> advanced >>>>>>>>>> >>>>>>>analytics on semi-structured data. The platform includes APIs for >>>>>>>>>> >>>>>>>building >>>>>>>>>> >>>>>>>apps and a phenomenal toolset for data science. 
Developers can use >>>>>>>>>> >>>>>>>our toolset for easy data analysis & visualization. Get a free >>>>>>>>>> >>>>>>>account! >>>>>>>>>> >>>>>>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>>>>>> >>>>>>>_______________________________________________ >>>>>>>>>> >>>>>>>dotNetRDF-bugs mailing list >>>>>>>>>> >>>>>>>dot...@li... >>>>>>>>>> >>>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> >>>>>>>>> >>>>>>--------------------------------------------------------------------- >>>>>>>>> >>>>>>-- >>>>>>>>> >>>>>>-- >>>>>>>>> >>>>>>- >>>>>>>>> >>>>>>---- >>>>>>>>> >>>>>>Precog is a next-generation analytics platform capable of >>>>>>>>> advanced >>>>>>>>> >>>>>>analytics on semi-structured data. The platform includes APIs for >>>>>>>>> >>>>>>building >>>>>>>>> >>>>>>apps and a phenomenal toolset for data science. Developers can use >>>>>>>>> >>>>>>our toolset for easy data analysis & visualization. Get a free >>>>>>>>> >>>>>>account! >>>>>>>>> >>>>>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>>>>> >>>>>>_______________________________________________ >>>>>>>>> >>>>>>dotNetRDF-bugs mailing list >>>>>>>>> >>>>>>dot...@li... >>>>>>>>> >>>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>>> >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>> >>>>>>>> >>>>>---------------------------------------------------------------------- >>>>>>>> >>>>>-- >>>>>>>> >>>>>-- >>>>>>>> >>>>>---- >>>>>>>> >>>>>Precog is a next-generation analytics platform capable of advanced >>>>>>>> >>>>>analytics on semi-structured data. The platform includes APIs for >>>>>>>> >>>>>building >>>>>>>> >>>>>apps and a phenomenal toolset for data science. Developers can use >>>>>>>> >>>>>our toolset for easy data analysis & visualization. Get a free >>>>>>>> >>>>>account! 
>>>>>>>> >>>>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>>>> >>>>>_______________________________________________ >>>>>>>> >>>>>dotNetRDF-bugs mailing list >>>>>>>> >>>>>dot...@li... >>>>>>>> >>>>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>>> >>>> >>>>>>> >>>> >>>>>>> >>>> >>>>>>> >>>> >>>>>>> >>>> >>>>>>> >>>> >>>>>>> >>>>----------------------------------------------------------------------- >>>>>>> >>>>-- >>>>>>> >>>>----- >>>>>>> >>>> Precog is a next-generation analytics platform capable of advanced >>>>>>> >>>> analytics on semi-structured data. The platform includes APIs for >>>>>>> >>>>building >>>>>>> >>>> apps and a phenomenal toolset for data science. Developers can use >>>>>>> >>>> our toolset for easy data analysis & visualization. Get a free >>>>>>> >>>>account! >>>>>>> >>>> http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>>> >>>> _______________________________________________ >>>>>>> >>>> dotNetRDF-bugs mailing list >>>>>>> >>>> dot...@li... >>>>>>> >>>> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>>> >>> >>>>>> >>>------------------------------------------------------------------------ >>>>>> >>>-- >>>>>> >>>---- >>>>>> >>>Precog is a next-generation analytics platform capable of advanced >>>>>> >>>analytics on semi-structured data. The platform includes APIs for >>>>>> >>>building >>>>>> >>>apps and a phenomenal toolset for data science. Developers can use >>>>>> >>>our toolset for easy data analysis & visualization. Get a free >>>>>> account! >>>>>> >>>http://www2.precog.com/precogplatform/slashdotnewsletter >>>>>> >>>_______________________________________________ >>>>>> >>>dotNetRDF-bugs mailing list >>>>>> >>>dot...@li... 
>>>>>> >>>https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> >>------------------------------------------------------------------------- >>>>> >>----- >>>>> >> Precog is a next-generation analytics platform capable of advanced >>>>> >> analytics on semi-structured data. The platform includes APIs for >>>>> >>building >>>>> >> apps and a phenomenal toolset for data science. Developers can use >>>>> >> our toolset for easy data analysis & visualization. Get a free account! >>>>> >> http://www2.precog.com/precogplatform/slashdotnewsletter >>>>> >> _______________________________________________ >>>>> >> dotNetRDF-bugs mailing list >>>>> >> dot...@li... >>>>> >> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>>> > >>>> >-------------------------------------------------------------------------- >>>> >---- >>>> >Precog is a next-generation analytics platform capable of advanced >>>> >analytics on semi-structured data. The platform includes APIs for building >>>> >apps and a phenomenal toolset for data science. Developers can use >>>> >our toolset for easy data analysis & visualization. Get a free account! >>>> >http://www2.precog.com/precogplatform/slashdotnewsletter >>>> >_______________________________________________ >>>> >dotNetRDF-bugs mailing list >>>> >dot...@li... >>>> >https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >>> >>> >>> >>> >>> >>> ---------------------------------------------------------------------------- >>> -- >>> Precog is a next-generation analytics platform capable of advanced >>> analytics on semi-structured data. The platform includes APIs for building >>> apps and a phenomenal toolset for data science. Developers can use >>> our toolset for easy data analysis & visualization. Get a free account! 
>>> http://www2.precog.com/precogplatform/slashdotnewsletter >>> _______________________________________________ >>> dotNetRDF-bugs mailing list >>> dot...@li... >>> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs >> ----------------------------------------------------------------------------- >> - Precog is a next-generation analytics platform capable of advanced >> analytics on semi-structured data. The platform includes APIs for building >> apps and a phenomenal toolset for data science. Developers can use our >> toolset for easy data analysis & visualization. Get a free account! >> http://www2.precog.com/precogplatform/slashdotnewsletter_____________________ >> __________________________ dotNetRDF-bugs mailing list >> dot...@li...://lists.sourceforge.net/lists/listi >> nfo/dotnetrdf-bugs > ------------------------------------------------------------------------------ > Precog is a next-generation analytics platform capable of advanced analytics > on semi-structured data. The platform includes APIs for building apps and a > phenomenal toolset for data science. Developers can use our toolset for easy > data analysis & visualization. Get a free account! > http://www2.precog.com/precogplatform/slashdotnewsletter______________________ > _________________________ dotNetRDF-bugs mailing list > dot...@li... > https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs |
|
From: Rob V. <rv...@do...> - 2013-04-12 21:49:48
|
I have now fixed the brute force generator and added unit tests specifically for it to verify that it generates all possible mappings. I will close out CORE-345 since this should now be completely resolved. Rob From: Rob Vesse <rv...@do...> Reply-To: dotNetRDF Bug Report tracking and resolution <dot...@li...> Date: Friday, April 12, 2013 2:14 PM To: dotNetRDF Bug Report tracking and resolution <dot...@li...> Subject: Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs > Hey Tom > > So I validated that those graphs were indeed equal. > > Having gone through that process by hand, I realized there was an additional rules-based mapping step we could be using that we weren't. With this in place we no longer have to use the divide and conquer approach on any of your test cases, which will improve performance. > > All your test cases now pass; if you come up with any more, please go ahead and add them. > > I will try to look further to figure out whether the brute force generator is generating sensible mappings, but hopefully very few graphs should now ever have to resort to that approach.
> > Rob > > From: Rob Vesse <rv...@do...> > Reply-To: dotNetRDF Bug Report tracking and resolution > <dot...@li...> > Date: Friday, April 12, 2013 12:37 PM > To: dotNetRDF Bug Report tracking and resolution > <dot...@li...> > Subject: Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs > >> Yes I realized that when I tried a pull again after sending the reply >> >> Ok so that case is bombing out on brute force mapping which would tend to >> indicate that there may be an issue there still >> >> At a glance the graphs look equivalent but I need to verify this by hand >> because the sub-graphs are too large and blank node heavy to easily verify >> whether they are equal and we are just not detecting it correctly or if they >> are non-equal >> >> Rob >> >> From: Tomek Pluskiewicz <to...@pl...> >> Reply-To: dotNetRDF Bug Report tracking and resolution >> <dot...@li...> >> Date: Friday, April 12, 2013 12:05 PM >> To: dotNetRDF Bug Report tracking and resolution >> <dot...@li...> >> Subject: Re: [dotNetRDF-bugs] Scope of autassigned bland node IDs >> >>> >>> I did with a little delay. Please check now. >>> >>> Tom >>> >>> On Apr 12, 2013 8:59 PM, "Rob Vesse" <rv...@do...> wrote: >>>> Ok >>>> >>>> Can you push the commits up so I can pull them down and take a look at the >>>> new test cases >>>> >>>> Rob >>>> >>>> On 4/12/13 11:55 AM, "Tomasz Pluskiewicz" <tom...@gm...> >>>> wrote: >>>> >>>>> >I've just committed more test cases. Out of the 6 none fail cause OOM >>>>> >anymore, which is marvellous. >>>>> > >>>>> >However case1 reports false but I'm positive these graphs are actually >>>>> >equal. 
>>>>> > >>>>> >Thanks, >>>>> >Tom >>>>> > >>>>> >On Fri, Apr 12, 2013 at 8:33 PM, Rob Vesse <rv...@do...> wrote: >>>>>> >> Those would be useful >>>>>> >> >>>>>> >> Btw I closed the issue branch so please just add the tests to default >>>>>> >> >>>>>> >> Rob >>>>>> >> >>>>>> >> On 4/12/13 11:23 AM, "Tomasz Pluskiewicz" >>>>>> <tom...@gm...> >>>>>> >> wrote: >>>>>> >> >>>>>>> >>>Hi Rob >>>>>>> >>> >>>>>>> >>>Thanks so much. And yes, I do have 4 or 5 cases which stumble on this >>>>>>> >>>same issue. I will add all these to the test fixture. >>>>>>> >>> >>>>>>> >>>Tom >>>>>>> >>> >>>>>>> >>>On Fri, Apr 12, 2013 at 8:20 PM, Rob Vesse <rv...@do...> >>>>>>> wrote: >>>>>>>> >>>> Hey Tom >>>>>>>> >>>> >>>>>>>> >>>> This should now be fixed for your test case though I am not 100% >>>>>>>> >>>>convinced >>>>>>>> >>>> that brute forcing is not still broken >>>>>>>> >>>> >>>>>>>> >>>> What I have done to fix this is to add an intermediate step >>>>>>>> between >>>>>>>> >>>>the >>>>>>>> >>>> rules based and brute force mapping which does a divide and >>>>>>>> conquer >>>>>>>> >>>> approach >>>>>>>> >>>> >>>>>>>> >>>> What this does is break the unmapped blank node portions of the >>>>>>>> graph >>>>>>>> >>>>into >>>>>>>> >>>> its constituent isolated sub-graphs (those that share no blank >>>>>>>> nodes) >>>>>>>> >>>>and >>>>>>>> >>>> then recursively calls Equals() on the candidate matches for the >>>>>>>> >>>> sub-graphs. This approach reduces the amount of work required and the >>>>>>>> >>>> likelihood of needing to brute force at all though we still fall back >>>>>>>> >>>>in >>>>>>>> >>>> the worst case. 
>>>>>>>> >>>> >>>>>>>> >>>> If you can come up with any more graphs that break GraphMatcher >>>>>>>> those >>>>>>>> >>>> would be much appreciated >>>>>>>> >>>> >>>>>>>> >>>> Rob >>>>>>>> >>>> >>>>>>>> >>>> On 4/12/13 10:25 AM, "Rob Vesse" <rv...@do...> wrote: >>>>>>>> >>>> >>>>>>>>> >>>>>s/not/now >>>>>>>>> >>>>> >>>>>>>>> >>>>>That should be "the test will now complete within the timeout" >>>>>>>>> >>>>> >>>>>>>>> >>>>>Rob >>>>>>>>> >>>>> >>>>>>>>> >>>>>On 4/12/13 10:23 AM, "Rob Vesse" <rv...@do...> wrote: >>>>>>>>> >>>>> >>>>>>>>>> >>>>>>Hey Tom >>>>>>>>>> >>>>>> >>>>>>>>>> >>>>>>So the logic for generating the brute force mappings was >>>>>>>>>> completely >>>>>>>>>> >>>>>>broken >>>>>>>>>> >>>>>>causing it to get stuck in a memory sucking spin cycle :( >>>>>>>>>> >>>>>> >>>>>>>>>> >>>>>>I rewrote the GenerateMappings() method from scratch to use yield >>>>>>>>>> >>>>>>return >>>>>>>>>> >>>>>>and the test will not complete within the timeout but it fails so I >>>>>>>>>> >>>>>>still >>>>>>>>>> >>>>>>need to dig further >>>>>>>>>> >>>>>> >>>>>>>>>> >>>>>>We may still be generating incorrect possible mappings or the logic >>>>>>>>>> >>>>>>for >>>>>>>>>> >>>>>>brute force may be flawed elsewhere >>>>>>>>>> >>>>>> >>>>>>>>>> >>>>>>Rob >>>>>>>>>> >>>>>> >>>>>>>>>> >>>>>>On 4/9/13 10:34 AM, "Rob Vesse" <rv...@do...> wrote: >>>>>>>>>> >>>>>> >>>>>>>>>>> >>>>>>>Hey Tom >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>>The problem is that graph isomorphism is NP-hard so sometimes the >>>>>>>>>>> >>>>>>>only >>>>>>>>>>> >>>>>>>option we have is to attempt to brute force the problem >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>>I've started added some Debug.WriteLine() to GraphMatcher to track >>>>>>>>>>> >>>>>>>down >>>>>>>>>>> >>>>>>>where things go wrong >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>>For your graphs they may look trivially equal but to code >>>>>>>>>>> they are >>>>>>>>>>> >>>>>>>not, >>>>>>>>>>> >>>>>>>the reason this worked prior to 0.8.0 is that one of the >>>>>>>>>>> things 
we >>>>>>>>>>> >>>>>>>try >>>>>>>>>>> >>>>>>>is >>>>>>>>>>> >>>>>>>a trivial mapping (assume blank nodes have same IDs in both >>>>>>>>>>> graphs) >>>>>>>>>>> >>>>>>>so >>>>>>>>>>> >>>>>>>in >>>>>>>>>>> >>>>>>>previous releases you would likely have hit this case and >>>>>>>>>>> been fine. >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>>You have 33 blank nodes in the graph of which only 6 are >>>>>>>>>>> uniquely >>>>>>>>>>> >>>>>>>identifiable and mappable. The matcher generates a candidate >>>>>>>>>>> >>>>>>>mapping >>>>>>>>>>> >>>>>>>for >>>>>>>>>>> >>>>>>>the whole graph but its best effort is incorrect, so then it falls >>>>>>>>>>> >>>>>>>back >>>>>>>>>>> >>>>>>>to >>>>>>>>>>> >>>>>>>brute force. I need to dig further into whether the >>>>>>>>>>> candidate >>>>>>>>>>> >>>>>>>mapping >>>>>>>>>>> >>>>>>>could be improved but this is not trivial to debug and will take >>>>>>>>>>> >>>>>>>some >>>>>>>>>>> >>>>>>>time >>>>>>>>>>> >>>>>>>to resolve. >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>>We may be able to reduce the "memory leak" by using yield rather >>>>>>>>>>> >>>>>>>than >>>>>>>>>>> >>>>>>>pre-generating all possible mapping but this is a tricky >>>>>>>>>>> refactor, >>>>>>>>>>> >>>>>>>it's >>>>>>>>>>> >>>>>>>been a long time since I wrote the code originally and I >>>>>>>>>>> remember >>>>>>>>>>> >>>>>>>that >>>>>>>>>>> >>>>>>>doing the mapping in the yield form proved thorny at the time so I >>>>>>>>>>> >>>>>>>chose >>>>>>>>>>> >>>>>>>not to. The code itself for generating the mappings has some >>>>>>>>>>> >>>>>>>slightly >>>>>>>>>>> >>>>>>>strange things in it so I really need to spend a block of time >>>>>>>>>>> >>>>>>>refreshing >>>>>>>>>>> >>>>>>>myself on the logic there to check that it is sound before I >>>>>>>>>>> attempt >>>>>>>>>>> >>>>>>>to >>>>>>>>>>> >>>>>>>refactor. 
>>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>>Rob >>>>>>>>>>> >>>>>>> >>>>>>>>>>> >>>>>>>On 4/7/13 11:20 AM, "Tomasz Pluskiewicz" >>>>>>>>>>> >>>>>>><tom...@gm...> >>>>>>>>>>> >>>>>>>wrote: >>>>>>>>>>> >>>>>>> >>>>>>>>>>>> >>>>>>>>Hm, I was wrong actually. >>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>> >>>>>>>>I tried comparing the exact same graphs loaded from Turtle in >>>>>>>>>>>> >>>>>>>>dotNetRDF test project but I got the unit test wrong. >>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>> >>>>>>>>I have added the CORE-345 bug and committed a failing test case >>>>>>>>>>>> >>>>>>>>[1]. >>>>>>>>>>>> >>>>>>>>Could you please have a look at this? >>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>> >>>>>>>>Thanks, >>>>>>>>>>>> >>>>>>>>Tom >>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>> >>>>>>>>[1]: >>>>>>>>>>>> >>>>>>>>https://bitbucket.org/dotnetrdf/dotnetrdf/commits/branch/CORE-345 >>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>> >>>>>>>>On Sun, Apr 7, 2013 at 7:36 PM, Tomasz Pluskiewicz >>>>>>>>>>>> >>>>>>>><tom...@gm...> wrote: >>>>>>>>>>>>> >>>>>>>>> Hi Rob >>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>> >>>>>>>>> I finally got back to R2RML to analyze why I am getting that >>>>>>>>>>>>> >>>>>>>>>memory >>>>>>>>>>>>> >>>>>>>>> leak. It seems connected to the changes you had to >>>>>>>>>>>>> introduce for >>>>>>>>>>>>> >>>>>>>>> SPARQL 1.1. >>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>> >>>>>>>>> I have determined that it happens in >>>>>>>>>>>>> >>>>>>>>>GraphMatcher#GenerateMappings >>>>>>>>>>>>> >>>>>>>>> method. The graphs are equal and I'm not sure what >>>>>>>>>>>>> causes the >>>>>>>>>>>>> >>>>>>>>>problem. >>>>>>>>>>>>> >>>>>>>>> As soon as TryBruteForceMapping is reached memory >>>>>>>>>>>>> consumption >>>>>>>>>>>>> >>>>>>>>>explodes >>>>>>>>>>>>> >>>>>>>>> to gigabytes within minutes. >>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>> >>>>>>>>> The low-level problem is the mappings variable in the >>>>>>>>>>>>> >>>>>>>>> GenerateMappings, which within a few iteration contains thousands >>>>>>>>>>>>> >>>>>>>>>of >>>>>>>>>>>>> >>>>>>>>> elements. 
>>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>> >>>>>>>>> This problem no longer occurs on trunk. Have you >>>>>>>>>>>>> actually been >>>>>>>>>>>>> >>>>>>>>> introducing any fixes around that area? >>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>> >>>>>>>>> Tom >>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>> >>>>>>>>> On Mon, Jan 14, 2013 at 12:32 PM, Rob Vesse >>>>>>>>>>>>> >>>>>>>>><rv...@do...> >>>>>>>>>>>>> >>>>>>>>>wrote: >>>>>>>>>>>>>> >>>>>>>>>> Comments inline: >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>> On 1/10/13 7:14 PM, "Tomek Pluskiewicz" >>>>>>>>>>>>>> <to...@pl...> >>>>>>>>>>>>>> >>>>>>>>>>wrote: >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>Hi Rob >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>I have just updated to latest dotNetRDF available on NuGet and >>>>>>>>>>>>>>> >>>>>>>>>>>I'm >>>>>>>>>>>>>>> >>>>>>>>>>>experiencing two issues. >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>1. In my unit tests I relied on the way the library assigns >>>>>>>>>>>>>>> >>>>>>>>>>>blank >>>>>>>>>>>>>>> >>>>>>>>>>>node >>>>>>>>>>>>>>> >>>>>>>>>>>identifiers: autos1, autos2 and so on. When I run the tests >>>>>>>>>>>>>>> >>>>>>>>>>>separately >>>>>>>>>>>>>>> >>>>>>>>>>>each one passes but when I batch them they fail because in >>>>>>>>>>>>>>> >>>>>>>>>>>subsequent >>>>>>>>>>>>>>> >>>>>>>>>>>tests blank nodes are name autos2, autos3, etc. >>>>>>>>>>>>>>> However they >>>>>>>>>>>>>>> >>>>>>>>>>>don't >>>>>>>>>>>>>>> >>>>>>>>>>>share the same graph or triple store. Have you >>>>>>>>>>>>>>> changed this >>>>>>>>>>>>>>> >>>>>>>>>>>behavior >>>>>>>>>>>>>>> >>>>>>>>>>>delbierately? 
>>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>> Yes this behavior changed in the 0.8.x releases, the change was >>>>>>>>>>>>>> >>>>>>>>>>made >>>>>>>>>>>>>> >>>>>>>>>>in >>>>>>>>>>>>>> >>>>>>>>>> order to resolve a bug in SPARQL 1.1 Update support and also >>>>>>>>>>>>>> >>>>>>>>>>uncovered >>>>>>>>>>>>>> >>>>>>>>>>a >>>>>>>>>>>>>> >>>>>>>>>> bug in graph isomorphism calculation which was fixed. >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>> You shouldn't rely on an internal implementation >>>>>>>>>>>>>> detail like how >>>>>>>>>>>>>> >>>>>>>>>>the >>>>>>>>>>>>>> >>>>>>>>>> library assigns blank node identifiers. Blank nodes should >>>>>>>>>>>>>> >>>>>>>>>>always >>>>>>>>>>>>>> >>>>>>>>>>be >>>>>>>>>>>>>> >>>>>>>>>> identifiable by the triples they appear in so it should be >>>>>>>>>>>>>> >>>>>>>>>>possible >>>>>>>>>>>>>> >>>>>>>>>>to >>>>>>>>>>>>>> >>>>>>>>>> formulate API calls or SPARQL queries that validate that you >>>>>>>>>>>>>> >>>>>>>>>>have >>>>>>>>>>>>>> >>>>>>>>>>produced >>>>>>>>>>>>>> >>>>>>>>>> the data you expected. >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>2. There is a bad memory leak in during SPARQL >>>>>>>>>>>>>>> execution of >>>>>>>>>>>>>>> >>>>>>>>>>>this: >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>> Define bad memory leak? >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>> Updates are transactional so it may be a side effect of the >>>>>>>>>>>>>> >>>>>>>>>>library >>>>>>>>>>>>>> >>>>>>>>>> maintaining the state necessary to rollback the >>>>>>>>>>>>>> transaction >>>>>>>>>>>>>> >>>>>>>>>>should >>>>>>>>>>>>>> >>>>>>>>>>it >>>>>>>>>>>>>> >>>>>>>>>>fail >>>>>>>>>>>>>> >>>>>>>>>> or be aborted. Also the fact that you are replacing constant >>>>>>>>>>>>>> >>>>>>>>>>nodes >>>>>>>>>>>>>> >>>>>>>>>>with >>>>>>>>>>>>>> >>>>>>>>>> blank nodes will assign a lot of new identifiers and those >>>>>>>>>>>>>> >>>>>>>>>>identifiers >>>>>>>>>>>>>> >>>>>>>>>> have to be tracked to prevent collisions. 
>>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>PREFIX rr: <http://www.w3.org/ns/r2rml#> >>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:graph ?value . } >>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:graphMap [ rr:constant ?value ] . } >>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:graph ?value } ; >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:object ?value . } >>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:objectMap [ rr:constant ?value ] . } >>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:object ?value } ; >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:predicate ?value . } >>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:predicateMap [ rr:constant ?value ] . } >>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:predicate ?value } ; >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>DELETE { ?map rr:subject ?value . } >>>>>>>>>>>>>>> >>>>>>>>>>>INSERT { ?map rr:subjectMap [ rr:constant ?value ] . } >>>>>>>>>>>>>>> >>>>>>>>>>>WHERE { ?map rr:subject ?value } >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>The full code is simply: >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>var dataset = new InMemoryDataset(store, >>>>>>>>>>>>>>> R2RMLMappings.BaseUri); >>>>>>>>>>>>>>> >>>>>>>>>>> ISparqlUpdateProcessor processor = new >>>>>>>>>>>>>>> >>>>>>>>>>>LeviathanUpdateProcessor(dataset); >>>>>>>>>>>>>>> >>>>>>>>>>> var updateParser = new >>>>>>>>>>>>>>> SparqlUpdateParser(); >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>processor.ProcessCommandSet(updateParser.ParseFromStr>>>>>>>>>>>>>>> ing(Shortcu >>>>>>>>>>>>>>> >>>>>>>>>>>tS >>>>>>>>>>>>>>> >>>>>>>>>>>ub >>>>>>>>>>>>>>> >>>>>>>>>>>m >>>>>>>>>>>>>>> >>>>>>>>>>>a >>>>>>>>>>>>>>> >>>>>>>>>>>p >>>>>>>>>>>>>>> >>>>>>>>>>>sRe >>>>>>>>>>>>>>> >>>>>>>>>>>placeSparql)); >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>Is this a know problem and has been already fixed or should I >>>>>>>>>>>>>>> 
>>>>>>>>>>>investigate closely? >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>> This is not a known issue, I would also guess that the data >>>>>>>>>>>>>> >>>>>>>>>>being >>>>>>>>>>>>>> >>>>>>>>>>used >>>>>>>>>>>>>> >>>>>>>>>> would have some bearing on the severity of the >>>>>>>>>>>>>> problem. Please >>>>>>>>>>>>>> >>>>>>>>>>go >>>>>>>>>>>>>> >>>>>>>>>>ahead >>>>>>>>>>>>>> >>>>>>>>>> and investigate but I would suspect it is the two things I >>>>>>>>>>>>>> >>>>>>>>>>outlined >>>>>>>>>>>>>> >>>>>>>>>>above >>>>>>>>>>>>>> >>>>>>>>>> which are the culprits here. >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>> Rob >>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>Tom >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>----------------------------------------------------->>>>>>>>>>>>>>> ----------- >>>>>>>>>>>>>>> >>>>>>>>>>>-- >>>>>>>>>>>>>>> >>>>>>>>>>>-- >>>>>>>>>>>>>>> >>>>>>>>>>>- >>>>>>>>>>>>>>> >>>>>>>>>>>- >>>>>>>>>>>>>>> >>>>>>>>>>>- >>>>>>>>>>>>>>> >>>>>>>>>>>--- >>>>>>>>>>>>>>> >>>>>>>>>>>---- >>>>>>>>>>>>>>> >>>>>>>>>>>Master Visual Studio, SharePoint, SQL, ASP.NET >>>>>>>>>>>>>>> <http://ASP.NET> , C# 2012, HTML5, >>>>>>>>>>>>>>> >>>>>>>>>>>CSS, >>>>>>>>>>>>>>> >>>>>>>>>>>MVC, Windows 8 Apps, JavaScript and much more. Keep >>>>>>>>>>>>>>> your skills >>>>>>>>>>>>>>> >>>>>>>>>>>current >>>>>>>>>>>>>>> >>>>>>>>>>>with LearnDevNow - 3,200 step-by-step video tutorials by >>>>>>>>>>>>>>> >>>>>>>>>>>Microsoft >>>>>>>>>>>>>>> >>>>>>>>>>>MVPs and experts. ON SALE this month only -- learn more at: >>>>>>>>>>>>>>> >>>>>>>>>>>http://p.sf.net/sfu/learnmore_122712 >>>>>>>>>>>>>>> >>>>>>>>>>>_______________________________________________ >>>>>>>>>>>>>>> >>>>>>>>>>>dotNetRDF-bugs mailing list >>>>>>>>>>>>>>> >>>>>>>>>>>dot...@li... 
> _______________________________________________
> dotNetRDF-bugs mailing list
> dot...@li...
> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-bugs |