|
From: Patrick C. <pat...@gm...> - 2020-08-09 02:44:10
|
And all I did was ask. Apparently the right way. Wow. Thanks Scott. Will study this in detail later tonight. Exercises for the student: 0. Checkout Flightgear and Simgear and fgdata as of commits dated 8/8/2020. Do a build. 1. Replicate the valgrind run locally using Scott's valgrind infocation, look for the given example. 2. Run in debugger (probably gdbgui). set breakpoints to illustrate the example further. 3. Try to confirm Scott's conclusion. 4. Look for other examples of raw pointer usage in flightgear/simgear code. 5. Find out more about raw vs shared_ptr and the advantages/disadvantages of each 6. Observe response from other flightgear-devel participants to Scott's recommendation. Will start working on this list tomorrow. -Pat https://giphy.com/gifs/D3OdaKTGlpTBC On Sat, Aug 8, 2020 at 5:59 PM Scott <sc...@gm...> wrote: > Per Pat's request, let me share my thought process on analyzing a memory > leak. If you find this useful, please let me know if you would like me to > do another walk-through. > > Let's start with the basics: > > Running valgrind (which defaults to using the *memcheck* tool). I > used the below command to capture the report: > > *valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes > -s ./flightgear/bin/fgfs --aircraft=ufo --airport=PHTO --fg-root=$FG_ROOT* > Let's focus on just one leak that was detected > > On 8/7/20 11:21 AM, Scott wrote: > > 01 ==9803== 99,418,032 bytes in 654,066 blocks are still reachable in > loss record 15,500 of 15,501 > 02 ==9803== at 0x483BE8E: operator new(unsigned long) > (vg_replace_malloc.c:342) > 03 ==9803== by 0xE3F2F0: > SGPropertyNode::getChild(std::__cxx11::basic_string<char, > std::char_traits<char>, std::allocator<char> > const&, int, bool) > (props.cxx:1199) > 04 ==9803== by 0xE4A5CB: copyProperties (props_io.cxx:808) > 05 ==9803== by 0xE4A5CB: copyProperties(SGPropertyNode const*, > SGPropertyNode*) (props_io.cxx:788) > 06 ==9803== by 0xE4A58E: copyProperties (props_io.cxx:813) > 07 ==9803== by 0xE4A58E: copyProperties(SGPropertyNode const*, > SGPropertyNode*) (props_io.cxx:788) > 08 ==9803== by 0xE4A58E: copyProperties (props_io.cxx:813) > 09 ==9803== by 0xE4A58E: copyProperties(SGPropertyNode const*, > SGPropertyNode*) (props_io.cxx:788) > 10 ==9803== by 0xF9BCD4: > simgear::effect::mergePropertyTrees(SGPropertyNode*, SGPropertyNode const*, > SGPropertyNode const*) (makeEffect.cxx:88) > 11 ==9803== by 0xF9DFB6: simgear::makeEffect(SGPropertyNode*, bool, > simgear::SGReaderWriterOptions const*) (makeEffect.cxx:211) > 12 ==9803== by 0xFA1EE4: > SGMaterial::buildEffectProperties(simgear::SGReaderWriterOptions const*) > (mat.cxx:537) > 13 ==9803== by 0xFA7155: SGMaterial::SGMaterial(osgDB::Options const*, > SGPropertyNode const*, SGPropertyNode*, std::vector<SGRect<float>, > std::allocator<SGRect<float> > >*, SGSharedPtr<SGCondition const>) > (mat.cxx:121) > 14 ==9803== by 0xFA9E44: SGMaterialLib::load(SGPath const&, SGPath > const&, SGPropertyNode*) (matlib.cxx:136) > 15 ==9803== by 0xC1144D: fgCreateSubsystems(bool) (fg_init.cxx:957) > 16 ==9803== by 0xC3C5F6: fgIdleFunction() (main.cxx:363) > > First thing I do is look at the big picture: > > You read the report from bottom up. > fgIdleFunction was where things started. > SGPropertyNode::getChild() is where the memory was > allocated. > > 0xC1???? - 0xC3???? = Appears to be code within > FlightGear > 0xE3???? - 0xFA???? = Appears to be code within > SimGear > 0x48???? = Appears to be code within Valgrind > > ==9803== is the thread id. Since all are identical > we know this is single-threaded. > > [Line 01] 99,418.032 total bytes have leaked > 654,066 blocks represents how many times memory was > allocated. > total/blocks = 99,418,032 / 654,066 = *152 bytes* > allocated 654,066 times. > > conclusion: we're looking for a leak of ~152 bytes. > This same code (memory usage) is repeatedly used, eventually leading to > leaking nearly 100 MB of memory. The longer the flight, the more memory > being lost. > > [Line 03] props.cxx; line 1199 is the source code location where > the leaked memory was allocated. > > > SGPropertyNode * > SGPropertyNode::getChild (const std::string& name, int index, bool create) > { > #if PROPS_STANDALONE > const char *n = name.c_str(); > int pos = find_child(n, n + strlen(n), index, _children); > if (pos >= 0) { > return _children[pos]; > #else > SGPropertyNode* node = getExistingChild(name.begin(), name.end(), index); > if (node) { > return node; > #endif > } else if (create) { > SGPropertyNode* node = new SGPropertyNode(name, index, this); <== > [Line 1199] > _children.push_back(node); > fireChildAdded(node); > return node; > } else { > return 0; > } > } > > conclusion: SGPropertyNode is the resource being > leaked. We know the resource is heap allocated. > The resource is being captured > via raw pointer. > Raw pointer is added to container > _children > Raw pointer is shared with > fireChildAdded() > Raw pointer is shared with the > calling code (via return statement) > We are within the SGPropertyNode > class, calling a method by the name of getChild, and we are creating a new > object as ourselves > We pass a reference to > self (this) as the third argument of the constructor > So we know we are dealing > with a parent-child relationship > My expectation is that the > parent will trigger the deletion of the children when the parent is > destructed - this is reinforced by the fact that the new object is added to > the _children container > > The way this raw pointer is being > distributed makes is a good candidate for std::shared_ptr (shared > ownership) or std::unique_ptr (sole ownership). > > Let's next look to validate our theory that during parent destruction, the > _children container will be iterated to also destroy each child object. > > /** > * Destructor. > */ > SGPropertyNode::~SGPropertyNode () > { > // zero out all parent pointers, else they might be dangling > for (unsigned i = 0; i < _children.size(); ++i) > _children[i]->_parent = 0; > clearValue(); > > if (_listeners) { > vector<SGPropertyChangeListener*>::iterator it; > for (it = _listeners->_items.begin(); it != _listeners->_items.end(); > ++it) > (*it)->unregister_property(this); > delete _listeners; > } > } > > conclusion: The _children contain is in fact iterated, > but not to destroy the child objects, but only to clear its *_parent* > data member. > Appears that destruction of the > parent container is intentionally orphaning the children. > Let's examine that clearValue() > method to see if it destructs the children objects. > > void > SGPropertyNode::clearValue () > { > if (_type == props::ALIAS) { > put(_value.alias); > _value.alias = 0; > } else if (_type != props::NONE) { > switch (_type) { > case props::BOOL: > _local_val.bool_val = SGRawValue<bool>::DefaultValue(); > break; > case props::INT: > _local_val.int_val = SGRawValue<int>::DefaultValue(); > break; > case props::LONG: > _local_val.long_val = SGRawValue<long>::DefaultValue(); > break; > case props::FLOAT: > _local_val.float_val = SGRawValue<float>::DefaultValue(); > break; > case props::DOUBLE: > _local_val.double_val = SGRawValue<double>::DefaultValue(); > break; > case props::STRING: > case props::UNSPECIFIED: > if (!_tied) { > delete [] _local_val.string_val; > } > _local_val.string_val = 0; > break; > default: // avoid compiler warning > break; > } > delete _value.val; > _value.val = 0; > } > _tied = false; > _type = props::NONE; > } > > conclusion: Nope. Nothing related to memory > management of the _children here > > Let's return back to when the resource's raw pointer was shared with > fireChildAdded() method. Based on the name, I would assume the usage > within the method is transient (i.e. short lived and temporary). > > void > SGPropertyNode::fireChildAdded (SGPropertyNode * child) > { > fireChildAdded(this, child); > } > > conclusion: As expected. Let's follow the flow. > > > void > SGPropertyNode::fireChildAdded (SGPropertyNode * parent, > SGPropertyNode * child) > { > forEachListener( > parent, > _listeners, > [&](SGPropertyChangeListener* listener) { > listener->childAdded(parent, child);} > ); > if (_parent != 0) > _parent->fireChildAdded(parent, child); > } > > conclusion: As expected. Dead end. > > Last thing to research is what does the calling code do with the raw > pointer that was shared with it via the > *return node; *We need to examine [Line 04] to determine the code which > called getChild() > > /** > * Copy one property tree to another. > * > * @param in The source property tree. > * @param out The destination property tree. > * @return true if all properties were copied, false if some failed > * (for example, if the property's value is tied read-only). > */ > bool > copyProperties (const SGPropertyNode *in, SGPropertyNode *out) > { > using namespace simgear; > bool retval = copyPropertyValue(in, out); > if (!retval) { > return false; > } > > // copy the attributes. > out->setAttributes( in->getAttributes() ); > > // Next, copy the children. > int nChildren = in->nChildren(); > for (int i = 0; i < nChildren; i++) { > const SGPropertyNode * in_child = in->getChild(i); > SGPropertyNode * out_child = out->getChild(in_child->getNameString(), > in_child->getIndex(), > false); > if (!out_child) > { > out_child = out->getChild(in_child->getNameString(), > in_child->getIndex(), > true); > } > > if (out_child && !copyProperties(in_child, out_child)) > retval = false; > } > > return retval; > } > > conclusion: We have a recursive function happening. > We can see that reflected in the valgrind trace - copyProperties is called > several times. > The part we are most interested in > is whether this method continues to share the node when it returns. > The return type is *bool* and not > *SGPropertyNode**, so this is the end of the line. > > Final thoughts: > > We're seen that no code is taking responsibility for destruction of > the child nodes. > The severance of the parent-child relationship, thus making the child > nodes orphans may have been done intentionally to enable the copying of the > SGPropertyNode within the PropertyTree. > So if the same SGPropertyNode can be copied to multiple locations > within the PropertyTree, how will I know when that node is no longer used > by any parent node? > That's exactly what the std::shared_ptr was designed for. It > reference counts every usage and automatically destructs the resource when > the last usage is terminated. > > Recommend fix: > > Convert the raw pointer to shared_ptr > Destruct the _children when the parent is destructed > The shared_ptr will own the responsibility to free the resources when > the last parent is destroyed. > > Let's get a second opinion from our veteran core developers to ensure we > haven't overlooked an important aspect of the property tree. > > James, Stuart, Richard - thoughts? > > > > _______________________________________________ > Flightgear-devel mailing list > Fli...@li... > https://lists.sourceforge.net/lists/listinfo/flightgear-devel > |