Hi,

First, thanks for developing the NCO library!

I am currently trying to use it to concatenate several files (about 2000). Each file is a tile/slice on a lat, lon grid with several variables (about 20), so I am concatenating first over longitudes and then over latitudes. I am using ncrcat because the size of each tile is not always the same. So basically I proceed as:
for number in {0..40}; do
    npad=$(printf %02d $number)
    echo $npad '...processing loop--------------------------------'
    for number2 in {0..48}; do
        n2pad=$(printf %02d $number2)
        file='slice_'$n2pad'_'$npad'.nc'
        new="${file/.nc/_rec.nc}"
        echo ' Adjusting '$file' to '$new
        # Delete unwanted variables and dim3
        ncks -O -h -x -C -v vars90,vars70,dim3 $file $new
        # Ensure that we reorder to have the aggregating dimension first
        ncpdq -O -h -a lon,lat $new $new
        # Aggregating dimension is defined as unlimited
        ncks -O -h --mk_rec_dmn lon $new $new
    done
    # Now we should have a list of *_rec.nc files to aggregate on the lon dim,
    # so we aggregate all these files on longitude
    ncrcat -O -h *_rec.nc lon_added.nc
    echo 'ok aggregation lon'  # this works perfectly!!
    # Now we revert lon to a typical non-record dimension
    ncks --fix_rec_dmn lon lon_added.nc lon_added_fix.nc
    # Now we reorder the variables to be lat,lon
    ncpdq -O -h -a lat,lon lon_added_fix.nc lon_added_$npad.nc
    # And finally we define lat as the record/unlimited dimension
    ncks --mk_rec_dmn lat 'lon_added_'$npad'.nc' 'lon_added_'$npad'_rec_lat.nc'
    rm lon_added.nc lon_added_fix.nc
    rm slice_*_rec.nc  # we clean all the files before the next loop step
done
# All the files named 'lon_added_'$npad'_rec_lat.nc' can be aggregated on lat,
# and no other files match the glob pattern '*_rec_lat.nc', so
ncrcat *_rec_lat.nc latlon_added_temp.nc
# Now we revert lat to a typical non-record dimension
ncks --fix_rec_dmn lat latlon_added_temp.nc latlon_added_fix.nc
# Finally we ensure that the file is in lat,lon order
ncpdq -O -h -a lat,lon latlon_added_fix.nc latlon_added.nc
Each file is large (about 4 GB), but I simply tried to aggregate just two of them, with dimensions:
dimensions(sizes): lat(440), lon(43151)
dimensions(sizes): lat(441), lon(43151)
By changing the first loop to
for number in {0..1}; do
I waited more than 4 hours and it never finished. I also tried outside of any loop, with the same result. I tried compressing two files that I know are mostly zeros (which shrinks them from about 4 GB to about 400 MB each), and they also seem impossible to concatenate.
Maybe I am doing something wrong, or is it simply a very slow process?
Thanks in advance,
Ramiro.
Last edit: R. Checa-Garcia 2018-09-06
Your question is too intricate for me to follow all the information. However, a few general points about large files are in order. 1. Read the manual about the --no_tmp_fl option and use it if warranted. 2. netCDF4 chunking is a two-edged sword. You might try converting to netCDF3 first and then concatenating. 3. It looks from your script like you have an advanced understanding of NCO. Feel free to post a narrower question, realizing there may be no better answer than the two suggestions I just made.
cz
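A minimal sketch of how suggestions 1 and 2 might combine, assuming the slice_*_rec.nc files produced by the script above (the _nc3 output names are hypothetical):

# Convert each tile to netCDF3 classic format to sidestep netCDF4 chunking
for ff in slice_*_rec.nc; do
    ncks -O -3 $ff "${ff/.nc/_nc3.nc}"
done
# Concatenate without writing through an intermediate temporary file
ncrcat -O -h --no_tmp_fl slice_*_nc3.nc lon_added.nc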
Thanks Charlie for the advice.

Sorry for the large question I submitted. The narrower question would be: is there any option that could potentially accelerate the concatenation of large netCDF files over two record dimensions (first one, then the other)? I understand from your reply that netCDF3 might be faster, which is very useful information for me. I will also read the manual regarding --no_tmp_fl.
Thanks,
PS. I don't know whether I can safely concatenate netCDF files that are also compressed.
It's safe to proceed. But the compression may be slowing things down greatly. Converting to netCDF3 (and/or de-compressing with ncks -L 0) might speed things up considerably.
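For instance, de-compression or format conversion might look like this (a minimal sketch; the tile filenames are illustrative):

# Rewrite every variable with deflation disabled (deflate level 0)
ncks -O -L 0 tile_compressed.nc tile_uncompressed.nc
# Or convert to netCDF3 classic, which stores data uncompressed
ncks -O -3 tile_compressed.nc tile_nc3.nc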