Re: [jgrapht-users] some potential speed improvements for containsEdge(u, v) and getEdge(u, v)

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

I agree with Joris. Implement the current setup and solve a performance 
collapse if and when it causes issues.

-- 
Information System on Graph Classes and their Inclusions (ISGCI)
http://www.graphclasses.org

On 10 April 2016 9:07:25 pm Joris Kinable <de...@gm...> wrote:

> I've done some testing to measure the impact of frequent object creation.
> *Algorithm setup:*
> 1. Implementation where VertexPair objects are frequently created and
> immediately discarded afterwards (see
> https://github.com/jgrapht/jgrapht/pull/188).
> 2. Implementation where I use a ModifiableVertexPair as proposed by Ernst.
> In this implementation I created a single ModifiableVertexPair indexPair.
> Each time an edge is queried in the Map<ModifiableVertexPair<V>,
> ArrayUnenforcedSet<E>> touchingVerticesToEdgeMap,
>  indexPair.updatePair(source, target) gets invoked first. This entirely
> avoids the creation of new VertexPairs. To ensure that read operations are
> thread-save, I made the getEdge(u,v) and getAllEdges(u,v) synchronized.
> Here's the implementation:
> https://github.com/jkinable/jgrapht/blob/SpecificsOptimization_experimental/jgrapht-core/src/main/java/org/jgrapht/graph/specifics/FastLookupDirectedSpecifics_experimental.java
> . If you want to try it yourself or see my implementation, download this
> branch:
> https://github.com/jkinable/jgrapht/tree/SpecificsOptimization_experimental
>
> *Experiment setup:*
> java version "1.8.0_77"
> Java(TM) SE Runtime Environment (build 1.8.0_77-b03)
>
> Each experiment is repeated 5 times, with 5 warmup iterations each.
>
> *Experiments:*
> 1. ConstructGraph: Build 100 graphs, each consisting of 1000 vertices and
> 100000 edges
> 2. UseGraph: Build 100 graphs, run 3 different algorithms on each graph,
> randomly destroy part of the graph (triggering containsEdge(u,v) and
> removeEdge(u,v)).
>
> Results:
>
> Benchmark
>                   Mode  Cnt      Score      Error  Units
> ConstructGraph(objectCreation)                  avgt    5  11313.907 ±
>  216.430  ms/op
> UseGraph(objectCreation)                  avgt    5  33234.987 ± 1009.527
>  ms/op
> ConstructGraph(IndexPair)  avgt    5  11230.629 ±  146.290  ms/op
> UseGraph(IndexPair)  avgt    5  32295.133 ± 1208.719  ms/op
>
> Here objectCreation uses a DirectedSpecifics implementation where
> short-lived VertexPairs are created. IndexPair uses a DirectedSpecifics
> implementation where objects are reused.
> I repeated these experiment several times, never was there a clear winner.
>
> Conclusion:
> I was unable to create a scenario where reusing the objects was
> significantly more efficient than recreating them. Quite an interesting
> result if you ask me. Consequently, I would propose to continue with my
> pull request: https://github.com/jgrapht/jgrapht/pull/188
>
>
> br,
>
> Joris Kinable
>
>
> On Sun, Apr 10, 2016 at 7:24 AM, H.N. de Ridder <hnr...@gr...>
> wrote:
>
>> On Sat, Apr 09, 2016 at 08:25:15PM -0700, John Sichi wrote:
>>
>> Yes, that's true. Thread-unsafety is a k.o.-criterium. I tested with 64bit
>> icedtea 2.6.5. I'm short on time right now, but I'll see in a week or so
>> whether I can repeat the tests with java8 and also if I can pinpoint when
>> memory gets tight and what happens then.
>>
>> > Hmmm...the problem with the reusable pair object is that it makes even
>> > reads thread-unsafe (unless it's implemented via a thread-local
>> variable).
>> > I'm not 100% sure, but I think the last time we analyzed it, the graph
>> data
>> > structures themselves were all OK for concurrent reads (though not of
>> > course for concurrent reads+writes).
>> >
>> > It might be good to repeat the benchmarking with the latest JVM.
>> >
>> > On Sat, Apr 9, 2016 at 11:14 AM, H.N. de Ridder <
>> hnr...@gr...>
>> > wrote:
>> >
>> > > On Fri, Apr 08, 2016 at 02:36:37AM -0700, John Sichi wrote:
>> > > > Regarding short-lived VertexPairs:  long ago I learned to stop
>> worrying
>> > > and
>> > > > love the Eden approach of the garbage collector.  Think about how
>> many
>> > > > iterators are created and forgotten during a typical algorithm
>> > > execution...
>> > >
>> > > But iterators typically are used more than once, here the new pair is
>> used
>> > > only for a single hash lookup.
>> > >
>> > > Several years ago I ran into the same performance issue with JgraphT
>> and
>> > > as a
>> > > quick hack created a wrapper class like this (and then forget about it
>> > > until
>> > > this discussion):
>> > >
>> > > public class CacheGraph<V,E> extends ListenableDirectedGraph<V,E>
>> > >         implements GraphListener<V,E> {
>> > >     private DirectedGraph<V,E> graph;   /** The graph we're caching */
>> > >     private HashMap<V,V> nodeCache;
>> > >     private HashMap<Pair,E> edgeCache;
>> > >     private Pair reusablePair;          /** For looking up edges */
>> > >
>> > > The rest of the class is about as you'd expect. Note that Pair,
>> different
>> > > from
>> > > org.jgrapht.util.VertexPair, is modifiable. I used this class to time
>> the
>> > > impact of allocating a new Pair() when performing an edge lookup and
>> in my
>> > > application it adds about 9% to the running time. And this is when
>> there
>> > > is no
>> > > shortage of memory. When there is, the shit can really hit the fan. So
>> I
>> > > suggest using a reusable pair object and not reallocating one for every
>> > > test.
>> > >
>> > > Regards,
>> > >    Ernst
>> > >
>> > > >
>> > > > On Wed, Apr 6, 2016 at 4:18 PM, Joris Kinable <de...@gm...>
>> wrote:
>> > > >
>> > > > > John,
>> > > > >
>> > > > > If I understand you correct, you would propose to do the following:
>> > > > > To the Specifics class, you would add:
>> > > > > Map<VertexPair<V,V>,Set<E>> vertexPairToEdgeMap=....;
>> > > > >
>> > > > > Then in DirectedSpecifics.addEdgeToTouchingVertices(E e) you would
>> > > *add*:
>> > > > > VertexPair<V,V> pair=new VertexPair<>(source, target);
>> > > > > if(!vertexPairToEdgeMap.containsKey(pair))
>> > > vertexPairToEdgeMap.put(pair,
>> > > > > new LinkedHashSet<>());
>> > > > > vertexPairToEdgeMap.get(pair).put(e);
>> > > > >
>> > > > > And then the current DirectedSpecifics.getEdge(V source, V target)
>> > > would
>> > > > > be *replaced* by:
>> > > > > VertexPair<V,V> pair=new VertexPair<>(source, target);
>> > > > > if(!vertexPairToEdgeMap.containsKey(pair) ||
>> > > > > vertexPairToEdgeMap.get(pair).isEmpty()) return null;
>> > > > > else return vertexPairToEdgeMap.get(pair).iterator().next();
>> > > > >
>> > > > > This would be possible, but then the
>> SimpleIdentityDirectedGraphTest
>> > > would
>> > > > > fail. (In this test, the Vertex objects are changed *after* they
>> are
>> > > > > created and added to graph, so an IdentityHashMap is required, see
>> the
>> > > test
>> > > > > for details). I'm not sure how to resolve this. An application can
>> > > indeed
>> > > > > override createDirectedSpecifics, but both Specifics and
>> > > > > DirectedSpecifics are resp. private and protected, so you cannot
>> > > provide a
>> > > > > custom DirectedSpecifics implementation which avoid this issue.
>> > > > >
>> > > > > Also, would the frequent construction of short-lived VertexPair
>> objects
>> > > > > not be an issue, i.e. a new VertexPair would be created each time
>> > > getEdge(V
>> > > > > source, V target) is invoked.
>> > > > >
>> > > > > Joris
>> > > > >
>> > > > > On Wed, Apr 6, 2016 at 5:25 PM, John Sichi <js...@gm...>
>> wrote:
>> > > > >
>> > > > >> Per the last option mentioned by Joris, I don't think any
>> > > equals/hashCode
>> > > > >> override is required for any object.
>> > > > >>
>> > > > >> My preference is a hybrid of his second and third options.
>> > > > >>
>> > > > >> We would add an additional graph-level map inside of Specifics
>> (not
>> > > > >> vertex-level inside of EdgeContainer), where the key is
>> VertexPair,
>> > > and the
>> > > > >> value is Edge (or Set<Edge> for multigraphs).  When adding an
>> edge,
>> > > we add
>> > > > >> a corresponding entry.  When removing an edge, we remove the
>> entry (or
>> > > > >> remove the edge from the set for multigraphs, deleting the entry
>> once
>> > > the
>> > > > >> set is empty).  This allows us to optimize both containsEdge and
>> > > getEdge
>> > > > >> (which is slightly different depending on directed vs undirected).
>> > > > >>
>> > > > >> The default implementation would be this new indexing.  If an
>> > > application
>> > > > >> wants to revert to the old approach to save memory, this would be
>> > > possible
>> > > > >> by overriding the createDirectedSpecifics and
>> > > createUndirectedSpecifics
>> > > > >> methods.
>> > > > >>
>> > > > >>
>> > > > >> On Wed, Apr 6, 2016 at 2:01 PM, Rushang Karia <
>> > > rus...@ho...>
>> > > > >> wrote:
>> > > > >>
>> > > > >>>
>> > > > >>> Wouldn't this be solved by checking for references as well in
>> > > addition
>> > > > >>> to the equals()?
>> > > > >>>
>> > > > >>> ----- Reply message -----
>> > > > >>> From: "Aidan Delaney" <ai...@on...>
>> > > > >>> To: "jgr...@li..." <
>> > > > >>> jgr...@li...>
>> > > > >>> Subject: [jgrapht-users] some potential speed improvements for
>> > > > >>> containsEdge(u, v) and getEdge(u, v)
>> > > > >>> Date: Wed, Apr 6, 2016 01:33
>> > > > >>>
>> > > > >>>
>> > > > >>> Dear all,
>> > > > >>> As far as I recall (and I'm happy to be wrong) overriding hash
>> code
>> > > for
>> > > > >>> Edge leads to Bad Things (tm).  JGraphT is a general graph
>> > > > >>> implementation.  As such, it doesn't assume that you're graph is
>> not
>> > > a
>> > > > >>> multigraph.
>> > > > >>>         In the JGraphT implementation, if we add edge e between
>> > > vertex
>> > > > >>> v1 and v2 we -- unsurprisingly -- have an edge between v1 and v2.
>> > > If I
>> > > > >>> create another edge, e', which is between v1 and v2 and
>> identical to
>> > > e
>> > > > >>> in all other respects, then my graph has two edges, e and e',
>> between
>> > > > >>> v1 and v2.  By overriding hashCode, e and e' would be equal
>> edges and
>> > > > >>> you'd only have one edge between v1 and v2.
>> > > > >>>         I hope the above helps.  I'll read it again after my
>> first
>> > > > >>> coffee to see if it's actually intelligible.
>> > > > >>>
>> > > > >>> --
>> > > > >>> Dr Aidan Delaney
>> > > > >>> Principal Lecturer
>> > > > >>> Computing, Engineering & Maths
>> > > > >>> University of Brighton
>> > > > >>>
>> > > > >>> @aidandelaney
>> > > > >>>
>> > >
>> > > --
>> > > Information System on Graph Classes and their Inclusions (ISGCI)
>> > > http://www.graphclasses.org
>> > >
>> > >
>> > >
>> ------------------------------------------------------------------------------
>> > > Find and fix application performance issues faster with Applications
>> > > Manager
>> > > Applications Manager provides deep performance insights into multiple
>> > > tiers of
>> > > your business applications. It resolves application problems quickly
>> and
>> > > reduces your MTTR. Get your free trial!
>> http://pubads.g.doubleclick.net/
>> > > gampad/clk?id=1444514301&iu=/ca-pub-7940484522588532
>> > > _______________________________________________
>> > > jgrapht-users mailing list
>> > > jgr...@li...
>> > > https://lists.sourceforge.net/lists/listinfo/jgrapht-users
>> > >
>>
>> --
>> Information System on Graph Classes and their Inclusions (ISGCI)
>> http://www.graphclasses.org
>>
>>
>> ------------------------------------------------------------------------------
>> Find and fix application performance issues faster with Applications
>> Manager
>> Applications Manager provides deep performance insights into multiple
>> tiers of
>> your business applications. It resolves application problems quickly and
>> reduces your MTTR. Get your free trial! http://pubads.g.doubleclick.net/
>> gampad/clk?id=1444514301&iu=/ca-pub-7940484522588532
>> _______________________________________________
>> jgrapht-users mailing list
>> jgr...@li...
>> https://lists.sourceforge.net/lists/listinfo/jgrapht-users
>>
>