
Can GeographicLib Be Made Faster But Less Accurate?

  • Anonymous

    Anonymous - 2016-02-15

    Is there a way to make GeographicLib run faster, but less
    accurately? Accuracy to the nearest meter is sufficient for
    my purposes. Is there a flag I can set or mode I can specify
    to have the library do less work? I only need to do
    distance and bearing calculations on the surface of the Earth
    and on no other shapes, if that helps eliminate any extra
    code logic that may be used.

    I've implemented the Vincenty approach and it is a bit faster
    on average, but I am looking for a more reliable approach, and
    even Vincenty is more accurate than I need.

    I believe calculations on a spherical model are not accurate
    enough in many cases, if the results in the accepted answer
    here:

    https://gis.stackexchange.com/questions/25494/how-accurate-is-approximating-the-earth-as-a-sphere

    ...are to be believed.

    Does anyone have any suggestions?

     
  • Charles Karney

    Charles Karney - 2016-02-16

    Some questions:

    1. Please give more details on the calculations you're trying to do.
      Presumably you're doing millions of such calculations for this
      question to come up?
    2. Have you run your code with a profiler to verify that it is the
      distance calculations that take the time and not, say, disk I/O,
      decoding DMS to decimal degrees, performing map projections, etc.?
    3. What's your target performance?
     
  • Anonymous

    Anonymous - 2016-02-16

    Presumably you're doing millions of such calculations for this
    question to come up?

    Yes. To clarify the types of calculations we typically do, in
    the case of the Inverse problem for example, most of our
    calculations will be between points that are within 1000
    nautical miles of each other. In order to minimize the number
    of expensive calculations we do, we first do a very rough
    calculation to determine if the more expensive calculation is
    necessary at all. If the rough distance is below some
    threshold, say 1000 nautical miles, we then do the more
    expensive calculation. We usually do not need to calculate
    very long distances on the surface of the Earth with
    significant accuracy--only the closer distances to 1 meter
    accuracy or so. But there are occasional longer cases, so I
    suspect that I cannot replace our expensive calculation with,
    say, a spherical-Earth algorithm because of the inaccuracy it
    apparently has.
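
    In rough outline the screen looks like the sketch below, with
    GeographicLib's Inverse standing in for the expensive step; the
    equirectangular estimate, the 5% padding, and all the names here are
    illustrative only, not our actual code:

        import net.sf.geographiclib.Geodesic;
        import net.sf.geographiclib.GeodesicMask;

        public class ScreenedDistance {
            private static final double R = 6371000.0;         // mean Earth radius, m
            private static final double LIMIT = 1000 * 1852.0; // 1000 NM in meters

            // Cheap equirectangular estimate; good to a few percent at these
            // ranges away from the poles (a real screen needs more care there).
            static double rough(double lat1, double lon1, double lat2, double lon2) {
                double dlat = Math.toRadians(lat2 - lat1);
                double dlon = Math.toRadians(Math.IEEEremainder(lon2 - lon1, 360.0));
                double x = dlon * Math.cos(Math.toRadians((lat1 + lat2) / 2));
                return R * Math.hypot(dlat, x);
            }

            // Accurate distance in meters, or NaN if the pair is clearly out of range.
            static double distance(double lat1, double lon1, double lat2, double lon2) {
                if (rough(lat1, lon1, lat2, lon2) > 1.05 * LIMIT)  // pad for the rough error
                    return Double.NaN;
                return Geodesic.WGS84.Inverse(lat1, lon1, lat2, lon2,
                                              GeodesicMask.DISTANCE).s12;
            }
        }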

    Have you run your code with a profiler to verify that it
    is the distance calculations that take the time and not,
    say, disk I/O, decoding DMS to decimal degrees, performing
    map projections, etc?

    Yes, we've profiled our application while it used the Vincenty
    approach and the methods doing the range and bearing
    calculations account for about 15% of the CPU's time. There
    is some I/O when logging to a file, especially at the FINEST
    log level, but the CPU time is consistent whether logging is
    turned on or not. Besides the optional logging, there is no
    disk I/O in this interaction because all data is streamed
    in from a socket and all results streamed out to a remote
    machine. There are no networking delays from the local LAN.
    All the geo-calculations are being done in RAM. All values
    passed to the methods in question are doubles in decimal
    degrees and, in the case of our Vincenty implementation, are
    converted to radians and back. I am not sure what you mean
    about map projections. Our application passes its results on
    to other machines for display. Our code is written
    in Java and I have considered writing a Java Native Interface
    wrapper to a C/C++ implementation, but did not want to go
    down the less portable path unless no other more efficient
    geo-algorithms are available.

    What's your target performance?

    We've been using our Vincenty implementation for more than 10
    years now and while it's been working fine, I've been tasked
    with trying to optimize any "low hanging fruit." I'm doing
    a preliminary assessment of what is available to see if
    there is a more efficient alternative to our overkill Vincenty
    implementation. That said, any easily implemented efficiency
    improvement is worthwhile. To us, GeographicLib is an
    example of an easy drop-in replacement.

     
  • Charles Karney

    Charles Karney - 2016-02-16

    Oh, buried in your reply is the fact that you're using Java! Here are
    some things to try:

    • Call Inverse with the outmask flag, Geodesic.DISTANCE |
      Geodesic.AZIMUTH. I don't expect this to have any effect. But at
      least, you're signaling exactly what you want.

    • If you are computing way-points, there are substantial savings to be
      had (a factor of 2 or more) using the GeodesicLine class (sketched
      just after this list). But that's not what you said you were doing.

    • Possibly calling C++ from Java will help. I would be interested in
      any timing comparisons.

    • If you are using C++, you can set GEOGRAPHICLIB_GEODESIC_ORDER
      to 3 in Geodesic.hpp. This is the smallest allowed value and this
      will increase the truncation error from 7.5e-13 m (for order = 6) to
      3.7e-5 m for the WGS84 ellipsoid. This is the same order as
      Vincenty's methods, but the truncation errors with Vincenty are
      larger, 9.1e-5 m. I expect this will result only in a modest speed up
      (maybe 10%).
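
    A minimal Java sketch of the GeodesicLine case (the point, azimuth, and
    spacing names here are just placeholders):

        import net.sf.geographiclib.Geodesic;
        import net.sf.geographiclib.GeodesicData;
        import net.sf.geographiclib.GeodesicLine;
        import net.sf.geographiclib.GeodesicMask;

        class Waypoints {
            // Way-points every ds meters along the geodesic from (lat1, lon1)
            // with azimuth azi1; the expensive setup is done once per line.
            static double[][] along(double lat1, double lon1, double azi1,
                                    double ds, int n) {
                GeodesicLine line = Geodesic.WGS84.Line(lat1, lon1, azi1);
                double[][] pts = new double[n + 1][2];
                for (int i = 0; i <= n; i++) {
                    GeodesicData g = line.Position(i * ds,
                        GeodesicMask.LATITUDE | GeodesicMask.LONGITUDE);
                    pts[i][0] = g.lat2;
                    pts[i][1] = g.lon2;
                }
                return pts;
            }
        }

    The series for the line is set up once when it is constructed; each
    Position call along it is then comparatively cheap.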

    Most likely the real savings will come from rethinking your whole
    approach. All you have told me is that you have millions of distances
    to calculate and you're not interested in the result if the distance is
    greater than 1000 NM. So I can't really offer much guidance.

    However, if you're looking for the closest point in a set of points to a
    given point, there are huge savings to be had by organizing your data
    into a quadtree or some similar data structure. This comes up when
    computing the median maritime boundary and it essentially changes an
    O(N²) computation into an O(N) one.
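
    As an illustration only, here is a toy version of that idea, with a flat
    fixed-size grid standing in for a real quadtree (the choice of cell size,
    pole handling, and date-line wrap-around are all glossed over):

        import java.util.ArrayList;
        import java.util.HashMap;
        import java.util.List;
        import java.util.Map;

        class GridIndex {
            private final double cellDeg;                   // cell size in degrees
            private final Map<Long, List<double[]>> cells = new HashMap<>();

            GridIndex(double cellDeg) { this.cellDeg = cellDeg; }

            private long key(double lat, double lon) {
                long i = (long) Math.floor(lat / cellDeg);
                long j = (long) Math.floor(lon / cellDeg);
                return i * 100000L + j;
            }

            void add(double lat, double lon) {
                cells.computeIfAbsent(key(lat, lon), k -> new ArrayList<>())
                     .add(new double[] { lat, lon });
            }

            // Candidates in the 3x3 block of cells around (lat, lon).  If the
            // query radius is no bigger than one cell, every true neighbor is
            // in this set; each candidate still gets an exact Inverse() check.
            List<double[]> nearby(double lat, double lon) {
                List<double[]> out = new ArrayList<>();
                for (int di = -1; di <= 1; di++)
                    for (int dj = -1; dj <= 1; dj++) {
                        List<double[]> cell =
                            cells.get(key(lat + di * cellDeg, lon + dj * cellDeg));
                        if (cell != null) out.addAll(cell);
                    }
                return out;
            }
        }

    The point is that the exact geodesic calculation is only applied to the
    handful of candidates each query returns, not to all N points.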

     

    Last edit: Charles Karney 2016-02-16
  • Anonymous

    Anonymous - 2016-02-16

    Thanks for the suggestions. I found the masks in the GeodesicMask class, not
    Geodesic, and they indeed did not make any noticeable difference, but the change
    at least makes our intentions known and may help if future development on your
    end can utilize this information. That said, are you thinking of parallelizing any of
    the Java code to take advantage of multiple cores in today's CPUs? Obviously
    you are aware of this, but for the sake of others reading this: using lambdas,
    which were introduced in Java 1.8, you can parallelize with much less effort.
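
    For reference, the call now looks roughly like this (a sketch with
    placeholder variable names):

        import net.sf.geographiclib.Geodesic;
        import net.sf.geographiclib.GeodesicData;
        import net.sf.geographiclib.GeodesicMask;

        class RangeBearing {
            static GeodesicData rangeAndBearing(double lat1, double lon1,
                                                double lat2, double lon2) {
                // Ask only for the distance (s12) and azimuths (azi1, azi2).
                return Geodesic.WGS84.Inverse(lat1, lon1, lat2, lon2,
                        GeodesicMask.DISTANCE | GeodesicMask.AZIMUTH);
            }
        }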

    While I like the prospect, we'll likely not go to the effort of using JNI unless
    I am mandated to do so.

    I agree that rethinking our approach is likely the best way forward, but that entails
    man-years of work refactoring our code. The idea of storing data in a quadtree or
    other similar data structure for more efficient access to points of interest is a good
    one, but there is the cost of continuously adjusting the locations of the points
    in the structure. The points represent continuously moving vehicles/vessels,
    often in close proximity to one another, so I envision an insert, move, or remove
    once per position report, of which we are getting tens of millions per day. Then, a
    reassessment would need to be done for all the nearest neighbors that were
    affected by the change(s). Even so, I suspect this is still better than what we're
    doing now.

    Unless I am allowed to pursue this avenue of improvement, it too will likely not
    be implemented in the near term. That said, are you aware of any faster, less
    accurate distance and bearing algorithms that may suit our needs? I realize
    that asking this of you is awkward, since it means recommending a "competitor," but
    I appreciate any advice you can give--even if it is to avoid other algorithms for
    whatever reason.

    By the way, thank you for spending this time helping me out.

     
  • Charles Karney

    Charles Karney - 2016-02-16

    Surely parallelization can be done by the end-user of GeographicLib.
    Once the Geodesic object is constructed, its member functions are all
    "const" functions. So multiple threads can safely use the same Geodesic
    object. (And of course it would be only slightly less efficient to give
    each thread its own private Geodesic object.)
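
    For example, from the Java side the caller-level parallelism could look
    like the sketch below; the list of coordinate quadruples is just a
    placeholder for however you hold your data, and it assumes the Java
    Geodesic object can likewise be shared once constructed:

        import java.util.List;
        import net.sf.geographiclib.Geodesic;
        import net.sf.geographiclib.GeodesicMask;

        class ParallelDistances {
            // Each element of pairs is {lat1, lon1, lat2, lon2} in decimal degrees.
            static double[] distances(List<double[]> pairs) {
                // All worker threads share the one already-constructed Geodesic.WGS84.
                return pairs.parallelStream()
                    .mapToDouble(p -> Geodesic.WGS84.Inverse(p[0], p[1], p[2], p[3],
                                                             GeodesicMask.DISTANCE).s12)
                    .toArray();
            }
        }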

    See

    http://geographiclib.sourceforge.net/html/geodesic.html#geodshort

    for a discussion of Bowring's method of computing short geodesics. This
    is a reasonably principled way of reducing the problem to the closest
    great circle problem.

    My own take on such approximate methods is that you waste a lot of
    (human) time ascertaining that the accuracy is sufficient and worrying
    whether this still holds for the next application. So I sleep better at
    night using an "exact" method.

     
