From: Jody G. <jga...@re...> - 2007-03-20 23:56:17
|
Hi Sanjay - I am into the code review part now :-) The following caught my eye for two reasons: a) it had clone in it b) I could not understand the comment without the help of Richard > public DirectPositionImpl(final DirectPosition POSITION) { > this.crs = POSITION.getCoordinateReferenceSystem(); > // Comment by Sanjay > // VORSICHT: Die folgende Codezeile verursachte, dass das > selbe Objekt > // (double Array) verwendet wurde; folglich wurde z.B. beim > // Envelope Min und Max Position auf die selben Koordinaten > zugegriffen. > // this.coordinate=p.getCoordinates(); > // Bitte um kenntnisnahme und berücksichtigung in sourcen: > Arrays müssen > // explizit kopiert werden, nur elementare Datentypen werden > automatisch > // von Java neu erzeugt, alles andere sind nur Referenzen > // TODO Das Klonen sollte in die Factory verlagert werden > > // the above seems to say that the array must be explicitly copied > // but the direct position javadocs say that getCoordinates > produces > // a copy ... so we should be good without the clone. > this.coordinate = POSITION.getCoordinates().clone() > } The getCoordiantes() method is actually defined to return a copy (if you really want to edit in place you need to use the getOrdinate/setOrdinate methods.. > /** > * Returns the sequence of numbers that hold the coordinate of this > * position in its reference system. > * > * @return A copy of the coordinates. Changes in the returned > array will not be reflected > * back in this {@code DirectPosition} object. > */ > @UML(identifier="coordinates", obligation=MANDATORY, > specification=ISO_19107) > double[] getCoordinates(); I am patching your code to agree with the interface here - that means the getCoordiantes() method will now clone. This is a classic performance trade off and I believe GeoAPI is forcing us to jump the wrong way. Reading for an array of ordinates is done way more often; it is okay to specify the array returned is not to be edited - but forcing it to be a copy every time is a bit painful. Copying on creation time is preferred (ie. PositionFactory.createPosition( Position origional ) ) I am not creative enough right now to word things so that: a) GeoAPI does not force us to chose the "clone on use" solution b) GeoAPI also still works (can copy from one DirectPosition implementation to another ) Suggestions welcome; for now I am fixing the code to be correct but slower. Jody |
From: Martin D. <mar...@ge...> - 2007-03-23 16:23:12
|
Jody Garnett a écrit : > Reading for an array of ordinates is done way more often; it is okay to > specify the array returned is not to be edited - but forcing it to be a > copy every time is a bit painful. My assumption was that reading would be performed by DirectPosition.getOrdinate(i) and not DirectPosition.getOrdinates()[i] The former should not have the copy overhead. I believe that looping over getOrdinate(int) should be close in performance to looping directly over array elements in modern JVM. In some implementations (e.g. an implementation that computes ordinates on the fly using some algorithm, instead of storing them), getOrdinate(int) may be more performant than getOrdinates() since the later forces immediate computation of every coordinates. The intend was to encourage usage of getOrdinate(int) over getOrdinates() in order to avoid the performance issue raised. However I admit that forcing array clone in getOrdinates() is a questionable decision. The rational was: * Precedence given to data integrity over 'getOrdinates()' performance, especially since I believe that this performance concern can be avoided with usage of 'getOrdinate(int)' instead (but I admit that I didn't benchmarked that). * A garantee that 'getOrdinates()' always clone the array can help performance in some cases. Use case inspired from Geotools code: - We want an array of coordinates with the intend to modify it for computation purpose (without modifying the original DirectPosition), or we want to protect the array from future DirectPosition changes. - If DirectPosition.getOrdinates() is garanteed to not return the backing array, then we can work directly on this array. If we don't have this garantee, then we must conservatively clone the array in very cases. - Cloning the returned array is useless if the implementation cloned the array or was forced to returns a new array anyway (for example because the coordinates were computed on the fly). An other way to see the idea is: * Use DirectPosition.getOrdinates() when you want coordinates in an array protected from changes in the most efficient way. We can not have as much efficiency if DirectPosition.getOrdinates() is not garantee to returns a new array on every call. * Use DirectPosition.getOrdinate(int) if you want some coordinate values without the cost of cloning an array. With the trivial implementation and using HotSpot compiler, I believe that it is as efficient than iterating directly though array elements (but I didn't benchmarked - I just expect HotSpot inlining to work as explained on Sun web site). Martin |
From: Jody G. <jga...@re...> - 2007-03-23 17:10:18
|
Martin Desruisseaux wrote: > Jody Garnett a écrit : >> Reading for an array of ordinates is done way more often; it is okay >> to specify the array returned is not to be edited - but forcing it to >> be a copy every time is a bit painful. > My assumption was that reading would be performed by > > DirectPosition.getOrdinate(i) > > and not > > DirectPosition.getCoordinates()[i] > > The former should not have the copy overhead. I believe that looping > over getOrdinate(int) should be close in performance to looping > directly over array elements in modern JVM. In some implementations > (e.g. an implementation that computes ordinates on the fly using some > algorithm, instead of storing them), getOrdinate(int) may be more > performant than getOrdinates() since the later forces immediate > computation of every coordinates. I agree with your reasoning - I have found lots of client code that just grabs the getCoordinates() so they get a nice comfortable array ... but if they do not read the javadocs they are doomed to poor performance. So we are agreed the interface is correct - getCoordinates() *must* copy and setOrdinate( index, value ) must be the only way to edit. > The intend was to encourage usage of getOrdinate(int) over > getOrdinates() in order to avoid the performance issue raised. However > I admit that forcing array clone in getOrdinates() is a questionable > decision. The rational was: > > * Precedence given to data integrity over 'getOrdinates()' performance, > especially since I believe that this performance concern can be avoided > with usage of 'getOrdinate(int)' instead (but I admit that I didn't > benchmarked that). > > * A garantee that 'getOrdinates()' always clone the array can help > performance > in some cases. Use case inspired from Geotools code: > > - We want an array of coordinates with the intend to modify it for > computation purpose (without modifying the original DirectPosition), > or we want to protect the array from future DirectPosition changes. > > - If DirectPosition.getOrdinates() is garanteed to not return the > backing > array, then we can work directly on this array. If we don't have > this > garantee, then we must conservatively clone the array in very cases. > > - Cloning the returned array is useless if the implementation > cloned the > array or was forced to returns a new array anyway (for example > because > the coordinates were computed on the fly). Good stuff Martin > An other way to see the idea is: > > * Use DirectPosition.getOrdinates() when you want coordinates in an array > protected from changes in the most efficient way. We can not have as > much > efficiency if DirectPosition.getOrdinates() is not garantee to > returns a > new array on every call. > > * Use DirectPosition.getOrdinate(int) if you want some coordinate values > without the cost of cloning an array. With the trivial > implementation and > using HotSpot compiler, I believe that it is as efficient than > iterating > directly though array elements (but I didn't benchmarked - I just > expect > HotSpot inlining to work as explained on Sun web site). > So I was correct in fixing Sanjay's implementation - I am copying some of your points into the javadocs of DirectPosition. Jody |
From: Jody G. <jga...@re...> - 2007-03-23 17:18:20
|
Aside - here is the conflicting javadocs for getCoordinates(); > * Returns the sequence of numbers that hold the coordinate of this > * position in its reference system. > * > * @return A copy of the coordinates. Changes in the returned > array will not be reflected > * back in this {@code DirectPosition} object. The description indicates that the returned double[] *is* the sequence of numbers that hold the coordinates of this position The @return indicates that this is "a copy of the coordinates" Based on your email I will revise the javadocs to indicate a copy is used in all cases. Cheers, Jody Martin Desruisseaux wrote: > Jody Garnett a écrit : >> Reading for an array of ordinates is done way more often; it is okay >> to specify the array returned is not to be edited - but forcing it to >> be a copy every time is a bit painful. > > My assumption was that reading would be performed by > > DirectPosition.getOrdinate(i) > > and not > > DirectPosition.getOrdinates()[i] > > The former should not have the copy overhead. I believe that looping > over getOrdinate(int) should be close in performance to looping > directly over array elements in modern JVM. In some implementations > (e.g. an implementation that computes ordinates on the fly using some > algorithm, instead of storing them), getOrdinate(int) may be more > performant than getOrdinates() since the later forces immediate > computation of every coordinates. > > The intend was to encourage usage of getOrdinate(int) over > getOrdinates() in order to avoid the performance issue raised. However > I admit that forcing array clone in getOrdinates() is a questionable > decision. The rational was: > > * Precedence given to data integrity over 'getOrdinates()' performance, > especially since I believe that this performance concern can be avoided > with usage of 'getOrdinate(int)' instead (but I admit that I didn't > benchmarked that). > > * A garantee that 'getOrdinates()' always clone the array can help > performance > in some cases. Use case inspired from Geotools code: > > - We want an array of coordinates with the intend to modify it for > computation purpose (without modifying the original DirectPosition), > or we want to protect the array from future DirectPosition changes. > > - If DirectPosition.getOrdinates() is garanteed to not return the > backing > array, then we can work directly on this array. If we don't have > this > garantee, then we must conservatively clone the array in very cases. > > - Cloning the returned array is useless if the implementation > cloned the > array or was forced to returns a new array anyway (for example > because > the coordinates were computed on the fly). > > > An other way to see the idea is: > > * Use DirectPosition.getOrdinates() when you want coordinates in an array > protected from changes in the most efficient way. We can not have as > much > efficiency if DirectPosition.getOrdinates() is not garantee to > returns a > new array on every call. > > * Use DirectPosition.getOrdinate(int) if you want some coordinate values > without the cost of cloning an array. With the trivial > implementation and > using HotSpot compiler, I believe that it is as efficient than > iterating > directly though array elements (but I didn't benchmarked - I just > expect > HotSpot inlining to work as explained on Sun web site). > > Martin |
From: Martin D. <mar...@ge...> - 2007-03-23 18:49:47
|
Jody Garnett a écrit : > The description indicates that the returned double[] *is* the sequence > of numbers that hold the coordinates of this position > The @return indicates that this is "a copy of the coordinates" > > Based on your email I will revise the javadocs to indicate a copy is > used in all cases. Thanks :) Martin |