#2780 segfault in updateLocal

Milestone: 2020.4
Status: NeedInfo
Priority: Medium
Updated: 2023-04-27
Created: 2022-11-19
Creator: eatdirt
Private: No

After a few hours of flight, I am getting a regular segfault there (AI deactivated):

Thread 1 "fgfs" received signal SIGSEGV, Segmentation fault.
0x00007efd09196ce2 in SGTime::updateLocal(SGGeod const&, SGPath const&) ()
   from /lib64/libSimGearCore.so.2020.4.0
(gdb) bt
#0  0x00007efd09196ce2 in SGTime::updateLocal(SGGeod const&, SGPath const&) ()
   from /lib64/libSimGearCore.so.2020.4.0
#1  0x0000000000b02122 in TimeManager::updateLocalTime() ()
#2  0x0000000000b041c8 in TimeManager::update(double) ()
#3  0x00007efd09182e46 in SGSubsystemGroup::Member::update(double) ()
   from /lib64/libSimGearCore.so.2020.4.0
#4  0x00007efd09182ef9 in SGSubsystemGroup::updateMembers(int, double) ()
   from /lib64/libSimGearCore.so.2020.4.0
#5  0x00007efd0917ecaa in SGSubsystemMgr::update(double) ()
   from /lib64/libSimGearCore.so.2020.4.0
#6  0x0000000000ce45fd in fgMainLoop() ()
#7  0x0000000000c4863c in fgOSMainLoop() ()
#8  0x0000000000ce999f in fgMainInit(int, char**) ()
#9  0x0000000000545d6e in main ()

Discussion

  • eatdirt

    eatdirt - 2022-11-20

    Another run (2020.4.0, forgot to say), the same segfault with slightly more info:

    Thread 1 "fgfs" received signal SIGSEGV, Segmentation fault.
    SGTime::updateLocal (this=this@entry=0x4c93b70, aLocation=..., root=...)
        at /usr/src/debug/simgear-2020.4.0-17.mga8.x86_64/simgear/timing/sg_time.cxx:241
    241     description = nearestTz->getDescription();
    (gdb) bt
    #0  SGTime::updateLocal (this=this@entry=0x4c93b70, aLocation=..., root=...)
        at /usr/src/debug/simgear-2020.4.0-17.mga8.x86_64/simgear/timing/sg_time.cxx:241
    #1  0x0000000000b02122 in TimeManager::updateLocalTime (this=0x4caa3f0)
        at /usr/src/debug/flightgear-2020.4.0-17.mga8.x86_64/src/Time/TimeManager.cxx:565
    #2  0x0000000000b041c8 in TimeManager::update (this=0x4caa3f0, dt=<optimized out>)
        at /usr/src/debug/flightgear-2020.4.0-17.mga8.x86_64/src/Time/TimeManager.cxx:508
    #3  0x00007fc629aa2e46 in SGSubsystemGroup::Member::update (this=0x4c93930, delta_time_sec=<optimized out>)
        at /usr/src/debug/simgear-2020.4.0-17.mga8.x86_64/simgear/structure/subsystem_mgr.cxx:873
    #4  0x00007fc629aa2ef9 in SGSubsystemGroup::updateMembers (this=0x2548800, loopCount=0, delta_time_sec=0.025000000000000001)
        at /usr/src/debug/simgear-2020.4.0-17.mga8.x86_64/simgear/structure/subsystem_mgr.cxx:436
    #5  0x00007fc629a9ecaa in SGSubsystemMgr::update (this=0x2549000, delta_time_sec=0.025000000000000001)
        at /usr/src/debug/simgear-2020.4.0-17.mga8.x86_64/simgear/structure/subsystem_mgr.cxx:1027
    #6  0x0000000000ce45fd in fgMainLoop () at /usr/src/debug/flightgear-2020.4.0-17.mga8.x86_64/src/Main/main.cxx:162
    #7  0x0000000000c4863c in fgOSMainLoop () at /usr/src/debug/flightgear-2020.4.0-17.mga8.x86_64/src/Viewer/fg_os_osgviewer.cxx:464
    #8  0x0000000000ce999f in fgMainInit (argc=<optimized out>, argv=0x7ffd7855d0b8)
        at /usr/src/debug/flightgear-2020.4.0-17.mga8.x86_64/src/Main/main.cxx:791
    #9  0x0000000000545d6e in main (argc=10, argv=0x7ffd7855d0b8) at /usr/src/debug/flightgear-2020.4.0-17.mga8.x86_64/src/Main/bootstrap.cxx:371
    
     
  • eatdirt

    eatdirt - 2022-11-20

    Mmmm, if SGGeod is returning junk, that could trigger the problem.

    Might this be related to ticket 2764? https://sourceforge.net/p/flightgear/codetickets/2764/

     
  • eatdirt

    eatdirt - 2022-11-20
    SGGeod location(aLocation);
    if (!aLocation.isValid()) {
        location = SGGeod();
    }
    SGTimeZone* nearestTz = static_tzContainer->getNearest(location);
    
     

    Last edit: eatdirt 2022-11-20
  • eatdirt

    eatdirt - 2022-11-20

    In math/SGGeod.hxx, a comment says that a new SGGeod() object is, by default, created invalid for historical reasons. So these lines suggest that an invalid aLocation is a problem, but if the problem is being invalid, replacing it with a new "location" object is not going to help!

    Edit: That seems to be fine: the constructor sets invalid to true but resets lon/lat/elev to 0. However, isValid is not checking for NaN on elevation, and that might be the culprit.

    I am getting this bug in space, so maybe elevation is what makes aLocation look valid while it is not. But I never had this on 2018.x.y, so why now?
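
    A minimal sketch of the kind of check hypothesized here, one that also rejects a NaN elevation. The `GeodSketch` struct and `isFullyValid` function are hypothetical stand-ins written for illustration; they are not SimGear's SGGeod or its isValid.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for a geodetic position; NOT SimGear's SGGeod.
struct GeodSketch {
    double lonDeg;
    double latDeg;
    double elevM;
};

// True only if all three components are finite and lon/lat are in range.
bool isFullyValid(const GeodSketch& g) {
    if (!std::isfinite(g.lonDeg) || !std::isfinite(g.latDeg) || !std::isfinite(g.elevM)) {
        return false;
    }
    return g.lonDeg >= -180.0 && g.lonDeg <= 180.0 &&
           g.latDeg >= -90.0 && g.latDeg <= 90.0;
}

// Small self-check: an orbital position passes, a NaN elevation does not.
void checkExamples() {
    const GeodSketch orbit{-69.0, 12.0, 532000.0};
    const GeodSketch badElev{-69.0, 12.0, std::numeric_limits<double>::quiet_NaN()};
    assert(isFullyValid(orbit));
    assert(!isFullyValid(badElev));
}
```

    A check of this shape only tests the NaN hypothesis; it says nothing about whether the time zone lookup itself can fail for a valid position.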

     

    Last edit: eatdirt 2022-11-20
  • James Turner

    James Turner - 2022-11-20

    Line 241 is:
    description = nearestTz->getDescription();

    I'm 99% sure the problem is 'nearestTz' being null or invalid. Can you add a check and log message before line 241, something like: if (nearestTz == nullptr) { SG_LOG(SG_GENERAL, SG_ALERT, "No timezone"); return; }

    ... and see if this fixes the crash, and if the log message is printed?

     
  • eatdirt

    eatdirt - 2022-11-21

    You're right, as usual :)

    I have it running under gdb, and location and aLocation are indeed perfectly fine.

    And indeed, nearestTz is null, since it is assigned the result of this call:

    print static_tzContainer->getNearest(location)
    $5 = (SGTimeZone *) 0x0
    

    It's interesting, there is a conditional in the SGTime::init() before trying to getDescription(), but not in updateLocal().

     

    Last edit: eatdirt 2022-11-21
  • James Turner

    James Turner - 2022-11-22

    If we bail out of updateLocal, the sim local time is going to get weird: should we maybe default to UTC instead? Do you see any problems in your testing if you just early-return from updateLocal?

     
  • eatdirt

    eatdirt - 2022-11-23

    I've checked a hard bailout:

    if (nearestTz) {
        description = nearestTz->getDescription();
    } else {
        // Debug output: print both positions, then skip this update.
        std::cout << " location " << location << "  aLocation " << aLocation;
        return;
    }
    

    and that happens very rarely. So, if you want to skip the whole of updateLocal above a certain altitude, UTC would be the choice (note, however, that all the other functions, such as sidereal time, are required for space flight). But if the idea is to bail out only when that pointer is null, I would leave the local time as it is: it recovers on the next iteration and the time zone remains fine.
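
    The "leave the local time as it is" option can be sketched as below: when the lookup returns null, the previous description is kept and the miss is counted, so a later successful lookup simply overwrites it. `TimeZoneSketch`, `LocalTimeState` and `applyNearestZone` are hypothetical names for illustration only, not SimGear types.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a time zone record; NOT SimGear's SGTimeZone.
struct TimeZoneSketch {
    std::string description;
};

// Hypothetical holder for the last known zone; starts out as UTC.
struct LocalTimeState {
    std::string zoneDescription = "UTC";
    int missedLookups = 0;
};

// Applies the lookup result. On a null result the previous description is
// kept, so the state recovers as soon as a later lookup succeeds.
// Returns true if the description was updated.
bool applyNearestZone(LocalTimeState& state, const TimeZoneSketch* nearest) {
    assert(state.missedLookups >= 0);
    if (nearest == nullptr) {
        ++state.missedLookups;  // remember the miss, keep the old zone
        return false;
    }
    state.zoneDescription = nearest->description;
    return true;
}
```

    With this shape, a state that has never seen a successful lookup stays on its UTC default, which also covers the "default to UTC" suggestion for that case.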

    Now, why that pointer is becoming null seems to be a bug elsewhere?

    PS: This bug seems to be triggered by a very peculiar situation for the Space Shuttle, it happens soon after we simulate a second orbiting target (here the Hubble Space Telescope), I haven't figured out why, but we stress things by doing this :)

    NB: Unrelated to this bug, space flights also stress the tile manager. I have megabytes of this in the logs (the terrain is not displayed, as it is clipped and replaced by earthview). Maybe there could be a way to stop this above a certain altitude as well.

    1291.33 [INFO]:terrain   /home/eatdirt/perso/flightgear/BUILD/flightgear-2020.4.0/src/Scenery/tilemgr.cxx:285: sched_tile: new STG tile entry for:-20:0, -18:0
    
     
  • James Turner

    James Turner - 2022-11-24

    There's definitely an underlying bug that we ever get a null tz value, but fixing that may be trickier so let's just bail out for now, if the situation is so obscure.

    About the other message, maybe start a separate discussion on disabling tile loading when earthview is active? I agree that probably makes sense.

     
  • James Turner

    James Turner - 2022-11-24
    • status: New --> NeedInfo
    • assigned_to: James Turner
     
  • James Turner

    James Turner - 2022-11-24

    Pushed the workaround to next now. It logs the problem SGGeod, so I'm curious to see what values occur. Leaving this as NeedInfo for now, until the real issue of the nearest-TZ lookup failing is fixed.

     
  • eatdirt

    eatdirt - 2022-11-29

    Here we go. Maybe the Shuttle, being fast, just has a higher probability of crossing a region where the timezone data has a bug? It is really only this: no error after passing this zone, no error before!

    190.00 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.3416deg, lat = 12.325deg, elev = 532016m
      190.18 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.3269deg, lat = 12.3174deg, elev = 532015m
      190.46 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.3116deg, lat = 12.3095deg, elev = 532014m
      190.72 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.2973deg, lat = 12.3022deg, elev = 532013m
      190.99 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.283deg, lat = 12.2948deg, elev = 532012m
      191.28 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.2682deg, lat = 12.2872deg, elev = 532010m
      191.59 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.2516deg, lat = 12.2787deg, elev = 532009m
      191.90 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.235deg, lat = 12.2701deg, elev = 532008m
      192.21 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.218deg, lat = 12.2614deg, elev = 532007m
      192.53 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.2037deg, lat = 12.254deg, elev = 532006m
      192.74 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.1893deg, lat = 12.2466deg, elev = 532004m
      193.04 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.1732deg, lat = 12.2383deg, elev = 532003m
      193.32 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.158deg, lat = 12.2305deg, elev = 532002m
      193.61 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.1428deg, lat = 12.2226deg, elev = 532001m
      193.93 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.1262deg, lat = 12.2141deg, elev = 532000m
      194.27 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.1083deg, lat = 12.2049deg, elev = 531998m
      194.55 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.0931deg, lat = 12.197deg, elev = 531997m
      194.86 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.0761deg, lat = 12.1883deg, elev = 531996m
      195.16 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.0599deg, lat = 12.18deg, elev = 531995m
      195.50 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.042deg, lat = 12.1707deg, elev = 531993m
      195.94 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.0192deg, lat = 12.1589deg, elev = 531992m
      196.21 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -69.0031deg, lat = 12.1506deg, elev = 531990m
      196.55 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.9852deg, lat = 12.1414deg, elev = 531989m
      196.87 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.9682deg, lat = 12.1326deg, elev = 531988m
      197.22 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.9494deg, lat = 12.1229deg, elev = 531987m
      197.52 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.9333deg, lat = 12.1146deg, elev = 531985m
      197.85 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.9159deg, lat = 12.1056deg, elev = 531984m
      198.15 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.9015deg, lat = 12.0982deg, elev = 531983m
      198.40 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.8859deg, lat = 12.0901deg, elev = 531982m
      198.65 [INFO]:nasal      CDR2: Button 1
      198.65 [INFO]:nasal      Button routing to MEDSOmsMps CDR2
      198.67 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.8716deg, lat = 12.0827deg, elev = 531981m
      199.04 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.8532deg, lat = 12.0732deg, elev = 531980m
      199.31 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.8376deg, lat = 12.0651deg, elev = 531979m
      199.63 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.8201deg, lat = 12.0561deg, elev = 531977m
      199.90 [INFO]:nasal      CDR2: Button 4
      199.90 [INFO]:nasal      Button routing to MainMenu CDR2
      199.96 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.8027deg, lat = 12.0471deg, elev = 531976m
      199.97 [INFO]:nasal      DPS update CDR2
      200.29 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.7884deg, lat = 12.0397deg, elev = 531975m
      200.47 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.7736deg, lat = 12.0321deg, elev = 531974m
      200.75 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.7593deg, lat = 12.0247deg, elev = 531973m
      201.01 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.745deg, lat = 12.0173deg, elev = 531972m
      201.29 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.7303deg, lat = 12.0096deg, elev = 531971m
      201.61 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.7137deg, lat = 12.0011deg, elev = 531970m
      202.05 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.6918deg, lat = 11.9897deg, elev = 531968m
      202.33 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.6753deg, lat = 11.9812deg, elev = 531967m
      202.64 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.661deg, lat = 11.9738deg, elev = 531966m
      202.86 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.6458deg, lat = 11.9659deg, elev = 531965m
      203.14 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.6311deg, lat = 11.9582deg, elev = 531964m
      203.43 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.6159deg, lat = 11.9504deg, elev = 531963m
      203.69 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.6016deg, lat = 11.9429deg, elev = 531962m
      204.01 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.5859deg, lat = 11.9348deg, elev = 531961m
      204.30 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.5716deg, lat = 11.9274deg, elev = 531960m
      204.53 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.5564deg, lat = 11.9195deg, elev = 531959m
      204.81 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.5421deg, lat = 11.9121deg, elev = 531958m
      205.15 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.5247deg, lat = 11.9031deg, elev = 531957m
      205.41 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.5095deg, lat = 11.8952deg, elev = 531956m
      205.69 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.4952deg, lat = 11.8878deg, elev = 531955m
      205.99 [ALRT]:environment SGTime::updateLocal: Timezone not found for location: lon = -68.4787deg, lat = 11.8792deg, elev = 531954m
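
    To narrow down the affected region from a log like the one above, the lon/lat values can be pulled out of each "Timezone not found" line and reduced to a bounding box. This is a sketch assuming C++17 and the exact "lon = ...deg, lat = ...deg" text shown in these lines; `parseLocation` and `boundingBox` are hypothetical helpers, not part of FlightGear.

```cpp
#include <algorithm>
#include <cassert>
#include <exception>
#include <optional>
#include <string>
#include <vector>

struct LonLat {
    double lonDeg;
    double latDeg;
};

struct Box {
    double minLon, maxLon, minLat, maxLat;
};

// Extracts lon/lat from a line containing "lon = <x>deg, lat = <y>deg".
// Returns nullopt for lines without both keys (e.g. the nasal INFO lines).
std::optional<LonLat> parseLocation(const std::string& line) {
    const std::string lonKey = "lon = ";
    const std::string latKey = "lat = ";
    const auto lonPos = line.find(lonKey);
    const auto latPos = line.find(latKey);
    if (lonPos == std::string::npos || latPos == std::string::npos) {
        return std::nullopt;
    }
    try {
        // std::stod parses the leading number and stops at "deg".
        return LonLat{std::stod(line.substr(lonPos + lonKey.size())),
                      std::stod(line.substr(latPos + latKey.size()))};
    } catch (const std::exception&) {
        return std::nullopt;
    }
}

// Bounding box of all parsable locations; nullopt if no line matched.
std::optional<Box> boundingBox(const std::vector<std::string>& lines) {
    std::optional<Box> box;
    for (const auto& line : lines) {
        const auto loc = parseLocation(line);
        if (!loc) continue;
        if (!box) {
            box = Box{loc->lonDeg, loc->lonDeg, loc->latDeg, loc->latDeg};
            continue;
        }
        box->minLon = std::min(box->minLon, loc->lonDeg);
        box->maxLon = std::max(box->maxLon, loc->lonDeg);
        box->minLat = std::min(box->minLat, loc->latDeg);
        box->maxLat = std::max(box->maxLat, loc->latDeg);
    }
    assert(!box || (box->minLon <= box->maxLon && box->minLat <= box->maxLat));
    return box;
}
```

    The resulting box only covers the part of the failing region that this particular ground track crossed, so it is a lower bound on its extent.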
    
     
  • James Turner

    James Turner - 2022-11-30

    Just checking: this location is in the Caribbean, off the north-west tip of Curaçao. I'm wondering if our timezone data has an edge case there?

     
  • James Turner

    James Turner - 2022-11-30

    I added a unit test for this: (-69.5, 12.0) works (gets a TZ of America/Caracas), while (-69.0, 12.0) does not find a matching time zone and trips the error. I suspect this is a bug in the zone-detect input data we use, but I'm not sure how we get a fix there.

    Importantly, this has nothing to do with the Shuttle: if you flew the UFO or anything else in this area, you would get the same crash.

     
  • eatdirt

    eatdirt - 2022-11-30

    Well found! I remember a binary file being loaded in the code (timezone16.bin); maybe it is missing data, but checking all this would require some work (FGDATA/Timezone has some info though).

     
  • James Turner

    James Turner - 2022-12-01

    I've pushed an updated timezone16.bin to FGData, which fixes the problem for me (in this location). I will backport that to 2020.3.

     
  • xDraconian

    xDraconian - 2023-04-27
    • labels: --> Segfault, SimGear
     