Looping over every PDB file in a directory and reading each
one into the same Chemistry::MacroMol variable causes RAM
usage to grow linearly with the number of files read. This
should not happen, since the variable is reused.
The following snippet should reproduce this behaviour,
provided the working directory contains several PDB files:
use Chemistry::File::PDB;
use Chemistry::MacroMol;
# The directory should have several pdb files
my @files = <*.pdb>;
for my $file (@files) {
    my $structure = Chemistry::MacroMol->read( $file );
}
This also happens when $structure is undef'ed explicitly.
To confirm the leak and find out which modules are actually
causing it, I used Devel::Leak::Object with a one-atom PDB
file.
This is the content of the file 'atom.pdb':
ATOM 1 N VAL L 1 9.069 116.493 216.070 12 1 1.63 -0.15
This is the code to detect the circular references:
use Devel::Leak::Object qw( GLOBAL_bless );
use Chemistry::File::PDB;
use Chemistry::MacroMol;
my $structure = Chemistry::MacroMol->read( 'atom.pdb' );
And these are the detected objects that were not
garbage-collected:
Chemistry::Atom 1
Chemistry::Domain 1
Config 1
FileHandle 3
Math::VectorReal 1
More information comes from running diagnostics with the
module Devel::Cycle, which shows explicitly what the
circular references are. I used the following code:
use Devel::Cycle;
use Chemistry::File::PDB;
use Chemistry::MacroMol;
my $structure = Chemistry::MacroMol->read( 'atom.pdb' );
find_cycle($structure);
which produced a 63-line output with 8 cycles (attached).
Repeating the analysis with 'find_weakened_cycle' in place
of 'find_cycle' produced a 127-line output with 18 cycles
(attached). I am aware that the weakened cycles are
intentional and do not cause the leak, but the cycles
reported by 'find_cycle' might be the cause of the observed
memory increase.
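For readers unfamiliar with the mechanism suspected here, the following is a minimal sketch in plain Perl, using only the core module Scalar::Util. The Node class and the build_pair helper are hypothetical stand-ins, not the real Chemistry::* classes, and the sketch makes no claim about where the actual cycles in those modules are. It only shows that two objects holding strong references to each other are never freed when they go out of scope, and that weakening one direction lets both be freed.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Counts how many Node objects have been destroyed so far.
my $destroyed = 0;

{
    # Hypothetical class for illustration only; not part of Chemistry::*.
    package Node;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub DESTROY { $destroyed++ }
}

# Builds a parent and a child that reference each other, then lets both
# lexicals go out of scope. If $weaken_back_ref is true, the child's
# reference to the parent is weakened, so it no longer keeps it alive.
sub build_pair {
    my ($weaken_back_ref) = @_;
    my $parent = Node->new( name => 'parent' );
    my $child  = Node->new( name => 'child', parent => $parent );
    $parent->{child} = $child;                      # parent -> child (strong)
    weaken( $child->{parent} ) if $weaken_back_ref; # child -> parent (weak)
    return;
}

build_pair(0);
my $freed_with_strong_cycle = $destroyed;    # 0: both objects leaked

$destroyed = 0;
build_pair(1);
my $freed_with_weak_back_ref = $destroyed;   # 2: both objects freed

print "strong cycle:  $freed_with_strong_cycle object(s) freed\n";
print "weak back-ref: $freed_with_weak_back_ref object(s) freed\n";
```

If the strong cycles reported by 'find_cycle' are indeed the cause, each file read in the loop would leave objects behind in the same way as the first call above.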
Circular references detected by Devel::Cycle when parsing the file atom.pdb
Sorry, the line inside the loop of the first code snippet
should be:
my $structure = Chemistry::MacroMol->read( $file );