
Can GeographicLib Be Made Faster But Less Accurate?

  • Anonymous

    Anonymous - 2016-02-15

    Is there a way to make GeographicLib run faster, but less
    accurately? Accuracy to the nearest meter is sufficient for
    my purposes. Is there a flag I can set or mode I can specify
    to have the library do less work? I only need to do
    distance and bearing calculations on the surface of the Earth
    and on no other shapes, if that helps eliminate any extra
    code logic that may be used.

    I've implemented the Vincenty approach and it is a bit faster
    on average, but I am looking for a more reliable approach, and
    even Vincenty is more accurate than I need.

    I believe calculations on a spherical model are not accurate
    enough in many cases, if the results in the accepted answer
    here:

    https://gis.stackexchange.com/questions/25494/how-accurate-is-approximating-the-earth-as-a-sphere

    ...are to be believed.

    Does anyone have any suggestions?

     
  • Charles Karney

    Charles Karney - 2016-02-16

    Some questions:

    1. Please give more details on the calculations you're trying to do.
      Presumably you're doing millions of such calculations for this
      question to come up?
    2. Have you run your code with a profiler to verify that it is the
      distance calculations that take the time and not, say, disk I/O,
      decoding DMS to decimal degrees, performing map projections, etc.?
    3. What's your target performance?
     
  • Anonymous

    Anonymous - 2016-02-16

    Presumably you're doing millions of such calculations for this
    question to come up?

    Yes. To clarify the types of calculations we typically do, in
    the case of the Inverse problem for example, most of our
    calculations will be between points that are within 1000
    nautical miles of each other. In order to minimize the number
    of expensive calculations we do, we first do a very rough
    calculation to determine if the more expensive calculation is
    necessary at all. If the rough distance is below some
    threshold, say 1000 nautical miles, we then do the more
    expensive calculation. We usually do not need to calculate
    very long distances on the surface of the Earth with
    significant accuracy--only the closer distances to 1 meter
    accuracy or so. But there are occasional longer cases, so I
    suspect that I cannot replace our expensive calculation with,
    say, a spherical-Earth algorithm because of the inaccuracy it
    apparently has.
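
    In rough outline the screen looks like the sketch below, with
    GeographicLib's Inverse standing in for the expensive step; the
    equirectangular estimate, the 5% padding, and all the names here are
    illustrative only, not our actual code:

        import net.sf.geographiclib.Geodesic;
        import net.sf.geographiclib.GeodesicMask;

        public class ScreenedDistance {
            private static final double R = 6371000.0;         // mean Earth radius, m
            private static final double LIMIT = 1000 * 1852.0; // 1000 NM in meters

            // Cheap equirectangular estimate; good to a few percent at these
            // ranges away from the poles (a real screen needs more care there).
            static double rough(double lat1, double lon1, double lat2, double lon2) {
                double dlat = Math.toRadians(lat2 - lat1);
                double dlon = Math.toRadians(Math.IEEEremainder(lon2 - lon1, 360.0));
                double x = dlon * Math.cos(Math.toRadians((lat1 + lat2) / 2));
                return R * Math.hypot(dlat, x);
            }

            // Accurate distance in meters, or NaN if the pair is clearly out of range.
            static double distance(double lat1, double lon1, double lat2, double lon2) {
                if (rough(lat1, lon1, lat2, lon2) > 1.05 * LIMIT)  // pad for the rough error
                    return Double.NaN;
                return Geodesic.WGS84.Inverse(lat1, lon1, lat2, lon2,
                                              GeodesicMask.DISTANCE).s12;
            }
        }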

    Have you run your code with a profiler to verify that it
    is the distance calculations that take the time and not,
    say, disk I/O, decoding DMS to decimal degrees, performing
    map projections, etc?

    Yes, we've profiled our application while it used the Vincenty
    approach and the methods doing the range and bearing
    calculations account for about 15% of the CPU's time. There
    is some I/O when logging to a file, especially at the FINEST
    log level, but the CPU time is consistent whether logging is
    turned on or not. Besides the optional logging, there is no
    disk I/O in this interaction because all data is streamed
    in from a socket and all results streamed out to a remote
    machine. There are no networking delays from the local LAN.
    All the geo-calculations are being done in RAM. All values
    passed to the methods in question are doubles in decimal
    degrees and, in the case of our Vincenty implementation, are
    converted to radians and back. I am not sure what you mean
    about map projections. Our application passes its results on
    to other machines for display. Our code is written
    in Java and I have considered writing a Java Native Interface
    wrapper to a C/C++ implementation, but did not want to go
    down the less portable path unless no other more efficient
    geo-algorithms are available.

    What's your target performance?

    We've been using our Vincenty implementation for more than 10
    years now and while it's been working fine, I've been tasked
    with trying to optimize any "low hanging fruit." I'm doing
    a preliminary assessment of what is available to see if
    there is a more efficient alternative to our overkill Vincenty
    implementation. That said, any easily implemented efficiency
    improvement is worthwhile. To us, GeographicLib is an
    example of an easy drop-in replacement.

     
  • Charles Karney

    Charles Karney - 2016-02-16

    Oh, buried in your reply is the fact that you're using Java! Here are
    some things to try:

    • Call Inverse with the outmask flag, Geodesic.DISTANCE |
      Geodesic.AZIMUTH. I don't expect this to have any effect. But at
      least, you're signaling exactly what you want.

    • If you are computing way-points, there are substantial savings to be
      had (a factor of 2 or more) using the GeodesicLine class (sketched
      just after this list). But that's not what you said you were doing.

    • Possibly calling C++ from Java will help. I would be interested in
      any timing comparisons.

    • If you are using C++, you can set GEOGRAPHICLIB_GEODESIC_ORDER
      to 3 in Geodesic.hpp. This is the smallest allowed value and this
      will increase the truncation error from 7.5e-13 m (for order = 6) to
      3.7e-5 m for the WGS84 ellipsoid. This is the same order as
      Vincenty's methods, but the truncation errors with Vincenty are
      larger, 9.1e-5 m. I expect this will result only in a modest speed up
      (maybe 10%).
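
    A minimal Java sketch of the GeodesicLine case (the point, azimuth, and
    spacing names here are just placeholders):

        import net.sf.geographiclib.Geodesic;
        import net.sf.geographiclib.GeodesicData;
        import net.sf.geographiclib.GeodesicLine;
        import net.sf.geographiclib.GeodesicMask;

        class Waypoints {
            // Way-points every ds meters along the geodesic from (lat1, lon1)
            // with azimuth azi1; the expensive setup is done once per line.
            static double[][] along(double lat1, double lon1, double azi1,
                                    double ds, int n) {
                GeodesicLine line = Geodesic.WGS84.Line(lat1, lon1, azi1);
                double[][] pts = new double[n + 1][2];
                for (int i = 0; i <= n; i++) {
                    GeodesicData g = line.Position(i * ds,
                        GeodesicMask.LATITUDE | GeodesicMask.LONGITUDE);
                    pts[i][0] = g.lat2;
                    pts[i][1] = g.lon2;
                }
                return pts;
            }
        }

    The series for the line is set up once when it is constructed; each
    Position call along it is then comparatively cheap.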

    Most likely the real savings will come from rethinking your whole
    approach. All you have told me is that you have millions of distances
    to calculate and you're not interested in the result if the distance is
    greater than 1000 NM. So I can't really offer much guidance.

    However, if you're looking for the closest point in a set of points to a
    given point, there are huge savings to be had by organizing your data
    into a quadtree or some similar data structure. This comes up when
    computing the median maritime boundary and it essentially changes an
    O(N²) computation into an O(N) one.
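
    As an illustration only, here is a toy version of that idea, with a flat
    fixed-size grid standing in for a real quadtree (the choice of cell size,
    pole handling, and date-line wrap-around are all glossed over):

        import java.util.ArrayList;
        import java.util.HashMap;
        import java.util.List;
        import java.util.Map;

        class GridIndex {
            private final double cellDeg;                   // cell size in degrees
            private final Map<Long, List<double[]>> cells = new HashMap<>();

            GridIndex(double cellDeg) { this.cellDeg = cellDeg; }

            private long key(double lat, double lon) {
                long i = (long) Math.floor(lat / cellDeg);
                long j = (long) Math.floor(lon / cellDeg);
                return i * 100000L + j;
            }

            void add(double lat, double lon) {
                cells.computeIfAbsent(key(lat, lon), k -> new ArrayList<>())
                     .add(new double[] { lat, lon });
            }

            // Candidates in the 3x3 block of cells around (lat, lon).  If the
            // query radius is no bigger than one cell, every true neighbor is
            // in this set; each candidate still gets an exact Inverse() check.
            List<double[]> nearby(double lat, double lon) {
                List<double[]> out = new ArrayList<>();
                for (int di = -1; di <= 1; di++)
                    for (int dj = -1; dj <= 1; dj++) {
                        List<double[]> cell =
                            cells.get(key(lat + di * cellDeg, lon + dj * cellDeg));
                        if (cell != null) out.addAll(cell);
                    }
                return out;
            }
        }

    The point is that the exact geodesic calculation is only applied to the
    handful of candidates each query returns, not to all N points.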

     

    Last edit: Charles Karney 2016-02-16
  • Anonymous

    Anonymous - 2016-02-16

    Thanks for the suggestions. I found the masks in the GeodesicMask class, not
    Geodesic, and they indeed did not make any noticeable difference, but the change
    at least makes our intentions known and may help if future development on your
    end can utilize this information. That said, are you thinking of parallelizing any of
    the Java code to take advantage of multiple cores in today's CPUs? Obviously
    you are aware of this, but for the sake of others reading this: using lambdas,
    which were introduced in Java 1.8, you can parallelize with much less effort.
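
    For reference, the call now looks roughly like this (a sketch with
    placeholder variable names):

        import net.sf.geographiclib.Geodesic;
        import net.sf.geographiclib.GeodesicData;
        import net.sf.geographiclib.GeodesicMask;

        class RangeBearing {
            static GeodesicData rangeAndBearing(double lat1, double lon1,
                                                double lat2, double lon2) {
                // Ask only for the distance (s12) and azimuths (azi1, azi2).
                return Geodesic.WGS84.Inverse(lat1, lon1, lat2, lon2,
                        GeodesicMask.DISTANCE | GeodesicMask.AZIMUTH);
            }
        }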

    While I like the prospect, we'll likely not go to the effort of using JNI unless
    I am mandated to do so.

    I agree that rethinking our approach is likely the best way forward, but that entails
    man-years of work refactoring our code. The idea of storing data in a quadtree or
    other similar data structure for more efficient access to points of interest is a good
    one, but there is the cost of continuously adjusting the locations of the points
    in the structure. The points represent continuously moving vehicles/vessels,
    often in close proximity to one another, so I envision an insert, move, or remove
    once per position report, of which we are getting tens of millions per day. Then, a
    reassessment would need to be done for all the nearest neighbors that were
    affected by the change(s). Even so, I suspect this is still better than what we're
    doing now.

    Unless I am allowed to pursue this avenue of improvement, it too will likely not
    be implemented in the near term. That said, are you aware of any faster, less
    accurate distance and bearing algorithms that may suit our needs? I realize
    that asking this of you is awkward, since it means recommending a "competitor," but
    I appreciate any advice you can give--even if it is to avoid other algorithms for
    whatever reason.

    By the way, thank you for spending this time helping me out.

     
  • Charles Karney

    Charles Karney - 2016-02-16

    Surely parallelization can be done by the end-user of GeographicLib.
    Once the Geodesic object is constructed, its member functions are all
    "const" functions. So multiple threads can safely use the same Geodesic
    object. (And of course it would be only slightly less efficient to give
    each thread its own private Geodesic object.)
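
    For example, from the Java side the caller-level parallelism could look
    like the sketch below; the list of coordinate quadruples is just a
    placeholder for however you hold your data, and it assumes the Java
    Geodesic object can likewise be shared once constructed:

        import java.util.List;
        import net.sf.geographiclib.Geodesic;
        import net.sf.geographiclib.GeodesicMask;

        class ParallelDistances {
            // Each element of pairs is {lat1, lon1, lat2, lon2} in decimal degrees.
            static double[] distances(List<double[]> pairs) {
                // All worker threads share the one already-constructed Geodesic.WGS84.
                return pairs.parallelStream()
                    .mapToDouble(p -> Geodesic.WGS84.Inverse(p[0], p[1], p[2], p[3],
                                                             GeodesicMask.DISTANCE).s12)
                    .toArray();
            }
        }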

    See

    http://geographiclib.sourceforge.net/html/geodesic.html#geodshort

    for a discussion of Bowring's method of computing short geodesics. This
    is a reasonably principled way of reducing the problem to the closest
    great circle problem.

    My own take on such approximate methods is that you waste a lot of
    (human) time ascertaining that the accuracy is sufficient and worrying
    whether this still holds for the next application. So I sleep better at
    night using an "exact" method.

     
