We have been trying to use ncremap in NCO 4.6.1. We had been using it just fine in 4.6.0, but now it appears to be broken.
We are trying to run this command:
ncremap -s SCRIP_INPUT -i INPUT_FILE -E "--line_type greatcircle" -d DESTINATION_FILE -o OUTPUT_FILE
Like so. Notice all the stray "message is" output:
tmp/test_esmf_regrid> ncremap -s ~/noback/Cubed_Sphere_Grids/PE2880x17280-CF.nc4 -i TOTPRES.c2880_1d.nc4 -E "--line_type greatcircle" -d example_nr_file.nc4 -o output.nc4 -D 3
dbg: alg_opt = bilinear
dbg: cln_flg = Yes
dbg: dbg_lvl = 3
dbg: drc_in = /home/bmauer/noback/tmp/test_esmf_regrid
dbg: drc_out = .
dbg: drc_tmp = /gpfsm/dnb02/tdirs/login/discover18.16366.bmauer
dbg: dst_fl = example_nr_file.nc4
dbg: gaa_sng = --gaa remap_script=ncremap --gaa remap_hostname=discover18.prv.cube --gaa remap_version="4.6.1"
message
is
dbg: grd_dst = /gpfsm/dnb02/tdirs/login/discover18.16366.bmauer/ncremap_tmp_grd_dst.nc.pid18035
dbg: grd_sng = --rgr grd_ttl='Default internally-generated grid' --rgr grid=/gpfsm/dnb02/tdirs/login/discover18.16366.bmauer/ncremap_tmp_grd_dst.nc.pid18035 --rgr latlon=100,100 --rgr snwe=30.0,70.0,-130.0,-90.0
dbg: grd_src = /home/bmauer/noback/Cubed_Sphere_Grids/PE2880x17280-CF.nc4
dbg: hdr_pad = 1000
dbg: job_nbr = 2
dbg: in_fl = TOTPRES.c2880_1d.nc4
dbg: map_fl = /gpfsm/dnb02/tdirs/login/discover18.16366.bmauer/ncremap_tmp_map_esmf_bilinear.nc.pid18035
dbg: map_mk = Yes
dbg: mlt_map = Yes
dbg: mpi_flg = No
dbg: nco_opt = -D 3 -O --no_tmp_fl --gaa remap_script=ncremap --gaa remap_hostname=discover18.prv.cube --gaa remap_version="4.6.1"
message
is --hdr_pad=1000
dbg: nd_nbr = 1
dbg: out_fl = output.nc4
dbg: par_typ = nil
dbg: spt_pid = 18035
dbg: thr_nbr = 2
dbg: unq_sfx = .pid18035
dbg: var_lst =
dbg: var_rgr =
dbg: wgt_usr =
Asked to regrid 1 files:
TOTPRES.c2880_1d.nc4
NCO regridder invoked with command:
ncremap -s /home/bmauer/noback/Cubed_Sphere_Grids/PE2880x17280-CF.nc4 -i TOTPRES.c2880_1d.nc4 -E --line_type greatcircle -d example_nr_file.nc4 -o output.nc4 -D 3
Started processing at Thu Oct 27 16:12:43 EDT 2016.
Running remap script ncremap from directory /gpfsm/dswdev/mathomp4/Baselibs/GMAO-Baselibs-5_0_2/x86_64-unknown-linux-gnu/ifort_16.0.3.210-intelmpi_5.1.3.210/Linux/bin
NCO version "4.6.1"
message
is from directory /gpfsm/dswdev/mathomp4/Baselibs/GMAO-Baselibs-5_0_2/x86_64-unknown-linux-gnu/ifort_16.0.3.210-intelmpi_5.1.3.210/Linux/bin
Input files in or relative to directory /home/bmauer/noback/tmp/test_esmf_regrid
Intermediate/temporary files written to directory /gpfsm/dnb02/tdirs/login/discover18.16366.bmauer
Output files to directory .
Destination grid will be inferred from data-file
ncks -D 3 -O --no_tmp_fl --gaa remap_script=ncremap --gaa remap_hostname=discover18.prv.cube --gaa remap_version="4.6.1" message is --hdr_pad=1000 --rgr nfr=y --rgr grid=/gpfsm/dnb02/tdirs/login/discover18.16366.bmauer/ncremap_tmp_grd_dst.nc.pid18035 example_nr_file.nc4 /gpfsm/dnb02/tdirs/login/discover18.16366.bmauer/ncremap_grd_tmp.nc.pid18035
ncks: ERROR received 4 filenames; need no more than two
whereas with NCO 4.6.0:
/ford1/share/gmao_SIteam/Baselibs/TmpBaselibs/GMAO-Baselibs-5_0_1_with_NCO460/x86_64-unknown-linux-gnu/gfortran_6.1.0-openmpi_1.10.2/Linux/bin/ncremap -i moist_72.nc4 -d pchem_144.nc4 -o yaya.nc4 -D 3
dbg: alg_opt = bilinear
dbg: cln_flg = Yes
dbg: dbg_lvl = 3
dbg: drc_in = /home/mathomp4/ncremap
dbg: drc_out = .
dbg: drc_tmp = /tmp
dbg: dst_fl = pchem_144.nc4
dbg: gaa_sng = --gaa remap_script=ncremap --gaa remap_hostname=anvil.gsfc.nasa.gov --gaa remap_version="4.6.0"
dbg: grd_dst = /tmp/ncremap_tmp_grd_dst.nc.pid22640
dbg: grd_sng = --rgr grd_ttl='Default internally-generated grid' --rgr grid=/tmp/ncremap_tmp_grd_dst.nc.pid22640 --rgr latlon=100,100 --rgr snwe=30.0,70.0,-130.0,-90.0
dbg: grd_src = /tmp/ncremap_tmp_grd_src.nc.pid22640
dbg: hdr_pad = 1000
dbg: job_nbr = 2
dbg: in_fl = moist_72.nc4
dbg: map_fl = /tmp/ncremap_tmp_map_esmf_bilinear.nc.pid22640
dbg: map_mk = Yes
dbg: mlt_map = Yes
dbg: mpi_flg = No
dbg: nco_opt = -D 3 -O --no_tmp_fl --gaa remap_script=ncremap --gaa remap_hostname=anvil.gsfc.nasa.gov --gaa remap_version="4.6.0" --hdr_pad=1000
dbg: nd_nbr = 1
dbg: out_fl = yaya.nc4
dbg: par_typ = nil
dbg: spt_pid = 22640
dbg: thr_nbr = 2
dbg: unq_sfx = .pid22640
dbg: var_lst =
dbg: var_rgr =
dbg: wgt_usr =
Asked to regrid 1 files:
moist_72.nc4
NCO regridder invoked with command:
ncremap -i moist_72.nc4 -d pchem_144.nc4 -o yaya.nc4 -D 3
ncremap: Removing PET0.RegridWeightGen.Log file from current directory before running
Started processing at Thu Oct 27 16:01:41 EDT 2016.
NCO ncremap version is "4.6.0"
Destination grid will be inferred from data-file
ncks -D 3 -O --no_tmp_fl --gaa remap_script=ncremap --gaa remap_hostname=anvil.gsfc.nasa.gov --gaa remap_version="4.6.0" --hdr_pad=1000 --rgr nfr=y --rgr grid=/tmp/ncremap_tmp_grd_dst.nc.pid22640 pchem_144.nc4 /tmp/ncremap_grd_tmp.nc.pid22640
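The stray words "message is" in the failing ncks command line would be enough on their own to explain the "received 4 filenames" error: ncks would see message, is, and the two real file names as four positional arguments. Below is a minimal sketch of how a multi-line value can split into extra arguments when it is expanded without quotes. The count_args helper and the file names in.nc and out.nc are made up for illustration, and the sketch does not reproduce how ncremap actually assembles its command.

```shell
#!/bin/sh
# Print how many arguments a command would receive.
count_args() { echo "$#"; }

# A version string polluted with two extra lines (value assumed for illustration).
nco_vrs='"4.6.1"
message
is'

# Unquoted expansion: the shell splits the value on whitespace,
# so "message" and "is" become separate arguments.
unquoted=$(count_args --gaa remap_version=${nco_vrs} in.nc out.nc)

# Quoted expansion: the whole value stays inside one argument.
quoted=$(count_args --gaa "remap_version=${nco_vrs}" in.nc out.nc)

echo "unquoted: ${unquoted} arguments"
echo "quoted:   ${quoted} arguments"
```

The unquoted call reports 6 arguments and the quoted call 4. The two extra words are what a program expecting one input and one output file would count as surplus file names.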
As a first test, please download and try the current ncremap
http://dust.ess.uci.edu/tmp/ncremap
and let us know how well it works...
cz
Charlie,
Nope. That didn't solve it. Same issue.
I tried out some different tests on my end to see if I'm building NCO oddly. First, I built a Baselibs that doesn't have the dashes in it that seemed to trigger this: https://sourceforge.net/p/nco/discussion/9830/thread/8409f58d/?limit=25#372b.
That did not help. I then built a Baselibs but reverted back to NCO 4.6.0. That worked! That's a data point!
Next, is it ncremap? Well, no! I copied the ncremap from 4.6.0 and 4.6.1 as well as your latest one. If the ncks it finds in the path is from 4.6.0, success; if it's from 4.6.1, failure. The ncremap script itself always works.
So, it looks like ncks is the culprit. Honestly, this Baselibs version isn't in production yet, so probably no one has tried to run ncks yet, or, if they did, never with the options ncremap requires that trigger this.
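The swap test described above can be sketched roughly as follows. The helper name try_with_ncks and the directory in the usage comment are hypothetical. The idea is only to run the same command with a different directory placed first in PATH and report whether it succeeded.

```shell
#!/bin/sh
# Run a command with a given directory placed first in PATH and report the result.
# $1 = directory that contains the ncks to test; remaining arguments = command to run.
try_with_ncks() {
    bin_dir=$1
    shift
    # The subshell keeps the PATH change from leaking into the caller.
    if ( PATH="${bin_dir}:${PATH}"; export PATH; "$@" ) >/dev/null 2>&1; then
        echo "${bin_dir}: success"
    else
        echo "${bin_dir}: failure"
    fi
}

# Example usage (placeholder path and file names):
#   try_with_ncks /path/to/nco-4.6.0/bin ncremap -i in.nc4 -d dst.nc4 -o out.nc4
```

Because the script under test stays the same and only the directory changes, a differing result points at the binaries found in that directory rather than at the script.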
Charlie, let me know what you'd like me to try now.
For some more info, from a good run with -D 3 I see:
From a bad run:
Gentlemen,
It appears to me that, on your system, 4.6.1 builds a "corrupt" version string into the ncks executable.
Please execute
nco_vrs=$(ncks --version 2>&1 >/dev/null | grep NCO | awk '{print $5}')
with each version of ncks. I suspect the "message is" string comes from the version string that NCO reports. This becomes an argument to the ncks command and breaks ncremap. This string is baked into ncks at compile time. If you verify this then we can devise a fix. It may be possible that 4.6.2-alpha02 (the latest) does not have this problem....
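Here is a small sketch of how that pipeline can pick up extra words. It replaces ncks with a stand-in function that prints a version banner followed by two error lines on stderr. The function name fake_ncks_version is made up, and the sample text is modeled on NCO error messages; which lines a given build actually prints is an assumption.

```shell
#!/bin/sh
# Stand-in for a misbehaving `ncks --version`: banner plus error lines on stderr.
# The wording is sample text, not captured from a specific build.
fake_ncks_version() {
    cat >&2 <<'EOF'
NCO netCDF Operators version "4.6.1" last modified 2016/08/08
nco_err_exit(): ERROR Short NCO-generated message (usually name of function that triggered error): nco_sng_cnv_err()
nco_err_exit(): ERROR Error code is 0. This indicates an error occurred in NCO code or in a system call, not in the netCDF layer.
EOF
}

# Same extraction as the command above: keep stderr, discard stdout,
# then take the fifth field of every line that contains "NCO".
nco_vrs=$(fake_ncks_version 2>&1 >/dev/null | grep NCO | awk '{print $5}')

printf '%s\n' "$nco_vrs"
```

With this sample input the result has three lines: "4.6.1", message, and is. Every line containing "NCO" contributes its fifth word, so error text that mentions NCO ends up in the version string.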
Charlie,
You are correct:
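I will also note that 4.6.0 seems to have avoided the weird -Baselibs- error:
[mathomp4@anvil src]$ ncks --version
NCO netCDF Operators version "4.6.0" built by mathomp4 on anvil.gsfc.nasa.gov at Oct 28 2016 08:58:18
ncks version "4.6.0"
[mathomp4@anvil ncremap]$ ncks --version
NCO netCDF Operators version "4.6.1" last modified 2016/08/08 built Oct 28 2016 on anvil.gsfc.nasa.gov by mathomp4
ncks: WARNING cvs_vrs_prs() reports nco_sng_ptr == NULL
nco_sng_cnv_err(): ERROR an NCO function or main program attempted to convert the user-defined string "-Baselibs-" to an integer-type using the standard C-library function "strtol()". This function stopped converting the input string when it encountered the illegal (i.e., non-numeric or non-integer) character '-'. This probably indicates a syntax error by the user. Please check the argument syntax and re-try the command. Exiting...
nco_err_exit(): ERROR Short NCO-generated message (usually name of function that triggered error): nco_sng_cnv_err()
nco_err_exit(): ERROR Error code is 0. This indicates an error occurred in NCO code or in a system call, not in the netCDF layer.
nco_err_exit(): ERROR NCO will now exit with system call exit(EXIT_FAILURE)
Note that the only differences between these builds are:
The NCO version. One is 4.6.1, the other 4.6.0.
The directories they were built in. The 4.6.0 build is located in:
[mathomp4@anvil src]$ which ncks
/ford1/share/gmao_SIteam/Baselibs/TmpBaselibs/GMAO_Baselibs_5_0_2_with_NCO460/x86_64-unknown-linux-gnu/ifort_16.0.2.181-openmpi_1.10.2/Linux/bin/ncks
[mathomp4@anvil ncremap]$ which ncks
/ford1/share/gmao_SIteam/Baselibs/TmpBaselibs/GMAO_Baselibs_5_0_2/x86_64-unknown-linux-gnu/ifort_16.0.2.181-openmpi_1.10.2/Linux/bin/ncks
That is it. Same compilers, computer, everything.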
So, I took a look around and I have a possibility. Namely:
Could cvs_Name be causing all this? My 4.6.1 build came from a CVS checkout of our Baselibs. Meanwhile, the 4.6.0 was from your tarball on git. Since it was a tarball, the CVS keywords were not expanded unlike when I checked out Baselibs.
I'm going to try a CVS checkout of our Baselibs but with -kk during the checkout. If that solves it, well, good enough for me, I suppose. It's not like I really care about CVS keyword substitution when I build a model!
Yuuuuuuuuuup. By checking out our Baselibs with -kk, ncremap works again. All for a keyword substitution!
Once again, CVS keywords get me. I'm beginning to think I should just add checkout -kk to my .cvsrc.
Thanks for working with us on this, Charlie. I suppose as we move to git here, I wonder if those CVS keywords even do anything anymore? Maybe I can strip them from our code in a joyful manner knowing they'll never bother again...
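For anyone who does want to neutralize expanded keywords in an existing tree, here is a rough sketch using sed. It collapses expanded forms such as "$Name: tag $" back to the bare "$Name$", which is roughly what a -kk checkout leaves in the files. The function name collapse_cvs_keywords, the sample C line, and the tag value are assumptions for illustration. $Log$ is deliberately left out because its expansion spans multiple lines.

```shell
#!/bin/sh
# Collapse expanded CVS keywords ("$Keyword: value $") back to "$Keyword$".
# Reads stdin and writes stdout; $Log$ is not handled.
collapse_cvs_keywords() {
    sed -E 's/\$(Author|Date|Header|Id|Locker|Name|RCSfile|Revision|Source|State):[^$]*\$/$\1$/g'
}

# Hypothetical source line; the tag name is assumed for illustration.
printf '%s\n' 'const char cvs_Name[] = "$Name: GMAO-Baselibs-5_0_2 $";' | collapse_cvs_keywords
```

Run on the sample line, this prints the same declaration with "$Name$" in place of the expanded tag. Applying it across a source tree is best tried on a copy first.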
Glad you found how to fix it. You may be the only site building NCO with CVS keyword expansion. I should remove the tokens. It's on the list.
cz
We build some libraries that are old enough that CVS was still a thing when they were around. Surprising this never happened any place else! Oh CVS...