Hi, I just wanted to confirm that what I'm doing is what I want to do. I have lots of files, each containing a winter of data for 1 variable. I want to combine these to obtain the climatology so I can subtract it out. As an example, let's say I have air.8485.nc, air.8586.nc and air.8687.nc, each file containing daily data for 160 days. So I run:
ncea air.8485.nc air.8586.nc air.8687.nc air.avg.nc
My question is: is it averaging at each grid point and time across the 3 input files? Meaning each 4-d grid point (lat, lon, level, day of year) is treated separately? From reading the documentation I think that's correct, I just want to make sure.
BTW, thanks Charlie for answering my earlier question about combining files.
Happy holidays.
Brent A. McDaniel
Dept of Earth and Atmospheric Sciences
Ga Tech
A follow-up to my question above. I went ahead and tried doing this for the 40 winters that I have, which comprise about 2.3 gigs of data. This causes a memory problem:
ncea: ERROR Unable to malloc() 27163008*2 bytes in var_get()
I'm running this on a Linux box with 128 megs of RAM and lots of free disk space (~25 gigs). If I increased the swap space to >3 gigs, would it work then?
Thanks for the help!
Brent
Well, I'll continue my thread here as I try (and fail) at more things. I tried ncea with fewer than my original list of input files (I started with all 40, thinking big!). I tried it with various combinations, down to only 2 input files. The input files look like this:
-rw-rw-r-- 1 gte328r gte328r 54330040 Dec 14 17:13 air.7778.nc
So 2 of them together are around 110 megs. Looking at my memory info, I have 66 megs of memory free and 113 megs of swap available. How much more do I need to do this?
Thanks in advance.
First, to clarify: ncea memory usage scales with
input file size, not with the number of files (which
is arbitrary). Thus reducing the number of files
will have no effect. Read the section on memory
usage in the manual and it will become clear
what ncea is doing. Basically, ncea must hold
the running total of every variable considered
in memory and, on top of that, the current variable
from the current file. There is no way to
perform ensemble averaging more efficiently
than this without opening each file more than
once, and opening each file only once is a design
decision deeply embedded in NCO.
If memory allocation continues to fail, consider
subsetting the input files by variable (e.g.,
ncea -v wind in1.nc in2.nc out.nc). You may
be able to do ensemble averaging of all required
variables separately and then construct the
combined output file with ncks -A.
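To illustrate why the number of files does not matter, here is a minimal Python sketch of a running-total ensemble average. It is not NCO source code; `ensemble_mean` and `fake_winters` are hypothetical names, and the generator stands in for reading one variable from each input file. At any moment only two buffers exist: the running total and the current member.

```python
from typing import Iterable, Iterator, List


def ensemble_mean(members: Iterable[List[float]]) -> List[float]:
    """Average same-length arrays element by element, one member at a time."""
    total: List[float] = []  # buffer 1: running total of the variable
    count = 0
    for current in members:  # buffer 2: the variable from the current file
        if count == 0:
            total = [0.0] * len(current)
        elif len(current) != len(total):
            raise ValueError("all members must have the same number of values")
        for i, value in enumerate(current):
            total[i] += value
        count += 1
    if count == 0:
        raise ValueError("need at least one member")
    return [t / count for t in total]


def fake_winters(n_files: int, n_values: int) -> Iterator[List[float]]:
    """Hypothetical stand-in for reading one variable from each input file."""
    for k in range(n_files):
        yield [float(k)] * n_values


# Same two buffers whether 3 or 40 members are averaged.
mean_of_3 = ensemble_mean(fake_winters(3, 4))
mean_of_40 = ensemble_mean(fake_winters(40, 4))
print(mean_of_3)
print(mean_of_40)
```

Each buffer is as large as one variable in one file, so the peak memory depends on the file size and stays the same for 2 files or 40.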
The error message seems to indicate that you would
need at least 54 megabytes per variable to get
ncea to successfully malloc() at this point in the
code. Look under "memory usage" in the manual
for a description of ncea memory usage. It is
very memory intensive: it requires about
2*input_file_size to operate successfully on an
entire file.
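The arithmetic behind these numbers can be checked with a short Python sketch. The byte counts are taken from the error message and the `ls -l` listing in this thread; `estimate_peak_bytes` is a hypothetical helper that only applies the 2*input_file_size rule of thumb quoted above, so treat its result as a rough estimate and not a guarantee.

```python
# Size of the failed request, as printed in the error: 27163008*2 bytes
requested_bytes = 27163008 * 2

# Size of one input file, from the ls -l listing in this thread
file_size_bytes = 54330040


def estimate_peak_bytes(input_file_bytes: int, factor: int = 2) -> int:
    """Rule-of-thumb estimate: about factor * input file size (assumed factor=2)."""
    return input_file_bytes * factor


requested_mb = requested_bytes / 1e6
peak_mb = estimate_peak_bytes(file_size_bytes) / 1e6

print(f"failed malloc request: {requested_bytes} bytes (~{requested_mb:.1f} MB)")
print(f"rule-of-thumb peak for one file: ~{peak_mb:.1f} MB")
```

The failed request is roughly the size of one input file, which matches the picture of one full-size buffer per variable.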
Preface to responses:
I've been travelling without internet access
for the last week, hence the delay.
ncea will create an output file the same
size as one of the input files. Each value
in the output file will be the average of
the values at the corresponding location
in the input files. In your case this means
each (lat, lon, level, day) point in the
output is the average of that same point
over the three input winters. So no spatial
averaging is performed, and the 160 days
are kept separate.
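As a toy illustration of that behaviour, the Python sketch below averages three tiny "winters" point by point and then subtracts the result to get anomalies. The temperature values are made up, the file names are used only as labels (nothing is read from disk), and `climatology` is a hypothetical helper, not part of NCO. Each winter is laid out as [day][grid point].

```python
from typing import Dict, List

Winter = List[List[float]]  # indexed as [day][grid point]

# Made-up values: 2 days x 2 grid points per winter
winters: Dict[str, Winter] = {
    "air.8485.nc": [[270.0, 280.0], [271.0, 281.0]],
    "air.8586.nc": [[272.0, 282.0], [273.0, 283.0]],
    "air.8687.nc": [[274.0, 284.0], [275.0, 285.0]],
}


def climatology(members: List[Winter]) -> Winter:
    """Average across members at each (day, grid point); shape is unchanged."""
    n = len(members)
    return [
        [sum(values) / n for values in zip(*same_day_rows)]
        for same_day_rows in zip(*members)
    ]


clim = climatology(list(winters.values()))

# Subtract the climatology from one winter to get its anomalies.
anomaly_8485 = [
    [value - mean for value, mean in zip(row, clim_row)]
    for row, clim_row in zip(winters["air.8485.nc"], clim)
]

print(clim)
print(anomaly_8485)
```

The output has the same 2 x 2 shape as each input: days are not averaged together and neither are grid points.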