Was there a strong reason for not allowing in-place modification with the operators, with the exception of ncrename?
It has become a real performance burden to create temporary files for every modification, small or large. Our files can be in the gigabytes, and some post-processing can affect 100 GB at the least. I understand the risk of an occasional failure, but it's one I'm willing to take. Thoughts or comments?
No strong reason besides caution.
I'll put it on the wish list.
We'd be happy to accept a patch that allows
manual override of this "feature".
By the way, ncrename also opens a temporary file.
All the operators do. ncrename is distinct in that it
guesses that fl_out = fl_in unless otherwise specified.
But all operators actually perform their computations
on a temporary copy of fl_out. Only when fl_out_tmp
is closed is it moved to fl_out.
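The write-to-temporary-then-rename workflow described above can be sketched in shell (the file names and contents here are illustrative, not NCO's actual internals):

```shell
#!/bin/sh
# Sketch of the pattern: write everything to a temporary copy,
# then move it into place only once it is complete.
fl_out="foo.nc"
fl_out_tmp="${fl_out}.tmp"   # temporary copy of fl_out

# The operator writes all of its output to the temporary file...
echo "output data" > "$fl_out_tmp"

# ...and only when that copy is finished is it renamed onto fl_out,
# so an interrupted run never leaves a half-written fl_out behind.
mv "$fl_out_tmp" "$fl_out"
```

The safety this buys is that `mv` within one filesystem is a rename, so fl_out is either the old complete file or the new complete file, never a partial one.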
Is there a definite performance gain from modifying an existing file rather than creating a new one? According to the NetCDF documentation:
"This header has no usable extra space; [...] A disadvantage of this organization is that any operation on a netCDF dataset that requires the header to grow (or, less likely, to shrink), for example adding new dimensions or new variables, requires moving the data by copying it. "
My understanding of the file structure is:
header | fixed-size variables | record variables
So a change in the size of any of these requires copying the data.
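As a toy illustration of that layout (not netCDF itself, just any file whose data immediately follows its header), growing the header leaves no room in place, so every byte after it must be copied:

```shell
#!/bin/sh
# Toy layout: a 4-byte header followed directly by the data.
printf 'HDR1' >  layout.bin
printf 'DATA' >> layout.bin

# To grow the header to 8 bytes, the data behind it cannot stay
# where it is: it must be copied into a new file after the new header.
{ printf 'HDR1pad_'; tail -c 4 layout.bin; } > layout.bin.new
mv layout.bin.new layout.bin
```

This is the same cost the NetCDF documentation warns about: adding dimensions or variables grows the header, which shifts, and therefore copies, everything that follows it.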
Good point. I forgot about that aspect.
I remember reading somewhere that netCDF no longer
needs to rewrite the entire file when the header shrinks.
But it still does anytime the header grows.
FYI: this feature is now in the main trunk, enabled with the -no_tmp_fl switch:
A. Bypass intermediate temporary files
ncks -no_tmp_fl ~/nco/data/in.nc ~/foo.nc
Only took eight years :)