MAN page from Fedora 26 netcdf-mpich-126.96.36.199-4.fc26.i686.rpm
Section: UNIDATA UTILITIES (1)
nccopy - Copy a netCDF file, optionally changing format, compression, or chunking in the output.
nccopy [-k kind_name] [-kind_code] [-d n] [-s] [-c chunkspec] [-u] [-w] [-[v|V] var1,...] [-[g|G] grp1,...] [-m bufsize] [-h chunk_cache] [-e cache_elems] [-r] infile outfile
The nccopy utility copies an input netCDF file in any supported format variant to an output netCDF file, optionally converting the output to any compatible netCDF format variant, compressing the data, or rechunking the data. For example, if built with the netCDF-3 library, a netCDF classic file may be copied to a netCDF 64-bit offset file, permitting larger variables. If built with the netCDF-4 library, a netCDF classic file may be copied to a netCDF-4 file or to a netCDF-4 classic model file as well, permitting data compression, efficient schema changes, larger variable sizes, and use of other netCDF-4 features.
If no output format is specified, with either -k kind_name or -kind_code, then the output will use the same format as the input, unless the input is classic or 64-bit offset and either chunking or compression is specified, in which case the output will be netCDF-4 classic model format. Attempting some kinds of format conversion will result in an error, if the conversion is not possible. For example, an attempt to copy a netCDF-4 file that uses features of the enhanced model, such as groups or variable-length strings, to any of the other kinds of netCDF formats that use the classic model will result in an error.
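The default-format rule above can be restated as a small decision function. This is only an illustrative Python sketch, not nccopy's implementation; the function name, its parameters, and the kind strings are hypothetical, and the error cases for impossible conversions are not modeled.

```python
from typing import Optional

# Format kinds that follow the classic storage layout (names are illustrative).
CLASSIC_KINDS = {"classic", "64-bit offset"}


def default_output_kind(input_kind: str,
                        requested_kind: Optional[str] = None,
                        chunking: bool = False,
                        compression: bool = False) -> str:
    """Pick the output format kind following the rule described above."""
    # An explicit -k kind_name or -kind_code request always wins.
    if requested_kind is not None:
        return requested_kind
    # Classic and 64-bit offset files cannot hold chunked or compressed
    # data, so the output becomes netCDF-4 classic model.
    if input_kind in CLASSIC_KINDS and (chunking or compression):
        return "netCDF-4 classic model"
    # Otherwise the output uses the same format as the input.
    return input_kind


print(default_output_kind("classic"))
print(default_output_kind("64-bit offset", compression=True))
print(default_output_kind("netCDF-4", chunking=True))
```

The three calls cover the cases in the paragraph: a plain copy keeps the format, a compressed copy of a 64-bit offset file switches to netCDF-4 classic model, and a netCDF-4 input stays netCDF-4.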
nccopy also serves as an example of a generic netCDF-4 program, with its ability to read any valid netCDF file and handle nested groups, strings, and user-defined types, including arbitrarily nested compound types, variable-length types, and data of any valid netCDF-4 type.
If DAP support was enabled when nccopy was built, the file name may specify a DAP URL. This may be used to convert data on DAP servers to local netCDF files.
- -k kind_name
- Use format name to specify the kind of file to be created and, by inference, the data model (i.e. netcdf-3 (classic) or netcdf-4 (enhanced)). The possible arguments are:
- 'nc3' or 'classic' => netCDF classic format
- 'nc6' or '64-bit offset' => netCDF 64-bit format
- 'nc4' or 'netCDF-4' => netCDF-4 format (enhanced data model)
- 'nc7' or 'netCDF-4 classic model' => netCDF-4 classic model format
- Note: The old format numbers '1', '2', '3', '4', equivalent to the format names 'nc3', 'nc6', 'nc4', or 'nc7' respectively, are also still accepted but deprecated, due to easy confusion between format numbers and format names.
- -kind_code
- Use format numeric code (instead of format name) to specify the kind of file to be created and, by inference, the data model (i.e. netcdf-3 (classic) versus netcdf-4 (enhanced)). The numeric codes are:
- 3 => netCDF classic format
- 6 => netCDF 64-bit format
- 4 => netCDF-4 format (enhanced data model)
- 7 => netCDF-4 classic model format
- The numeric code "7" is used because "7=3+4", specifying the format that uses the netCDF-3 data model for compatibility with the netCDF-4 storage format for performance. Credit is due to NCO for use of these numeric codes instead of the old and confusing format numbers.
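To make the relationship between format names, numeric codes, and the deprecated format numbers concrete, here is a small Python lookup sketch. The dictionaries simply restate the lists above; the names KIND_CODES, DEPRECATED_NUMBERS, and code_for are hypothetical and are not part of nccopy.

```python
# Format names and the numeric code each one corresponds to.
KIND_CODES = {
    "nc3": 3, "classic": 3,
    "nc6": 6, "64-bit offset": 6,
    "nc4": 4, "netCDF-4": 4,
    "nc7": 7, "netCDF-4 classic model": 7,
}

# Deprecated format numbers and the format names they stand for.
DEPRECATED_NUMBERS = {"1": "nc3", "2": "nc6", "3": "nc4", "4": "nc7"}


def code_for(kind_name: str) -> int:
    """Translate a -k argument (name or deprecated number) to a numeric code."""
    name = DEPRECATED_NUMBERS.get(kind_name, kind_name)
    return KIND_CODES[name]


# The deprecated number '3' means netCDF-4 (code 4), while the numeric
# code 3 means netCDF classic. This is the confusion the note refers to.
print(code_for("3"))
print(code_for("classic"))
```

The last two lines show why the old numbers are easy to misread: the same digit selects different formats depending on which scheme is in use.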
- -d n
- For netCDF-4 output, including netCDF-4 classic model, specify deflation level (level of compression) for variable data output. 0 corresponds to no compression and 9 to maximum compression, with higher levels of compression requiring marginally more time to compress or uncompress than lower levels. Compression achieved may also depend on output chunking parameters. If this option is specified for a classic format or 64-bit offset format input file, it is not necessary to also specify that the output should be netCDF-4 classic model, as that will be the default. If this option is not specified and the input file has compressed variables, the compression will still be preserved in the output, using the same chunking as in the input by default.
- Note that nccopy requires all variables to be compressed using the same compression level, but the API has no such restriction. With a program you can customize compression for each variable independently.
- -s
- For netCDF-4 output, including netCDF-4 classic model, specify shuffling of variable data bytes before compression or after decompression. Shuffling refers to interlacing of bytes in a chunk so that the first bytes of all values are contiguous in storage, followed by all the second bytes, and so on, which often improves compression. This option is ignored unless a non-zero deflation level is specified. Using -d0 to specify no deflation on input data that has been compressed and shuffled turns off both compression and shuffling in the output.
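The byte rearrangement that shuffling performs can be sketched in a few lines of Python. This is a stand-alone illustration using only the standard library, not the filter the netCDF-4 library actually applies; the function names shuffle and unshuffle are hypothetical.

```python
import struct
import zlib


def shuffle(data: bytes, width: int) -> bytes:
    """Group byte 0 of every value, then byte 1 of every value, and so on."""
    if len(data) % width:
        raise ValueError("data length must be a multiple of the value width")
    return b"".join(data[i::width] for i in range(width))


def unshuffle(data: bytes, width: int) -> bytes:
    """Invert shuffle(), restoring the original value-by-value byte order."""
    count = len(data) // width
    out = bytearray(len(data))
    for i in range(width):
        out[i::width] = data[i * count:(i + 1) * count]
    return bytes(out)


# Four little-endian 32-bit integers: 1, 2, 3, 4.
raw = struct.pack("<4i", 1, 2, 3, 4)
print(shuffle(raw, 4).hex())  # low-order bytes first, then the zero bytes

# Compare deflated sizes of a longer, slowly varying series.
series = struct.pack("<1000i", *range(1000, 2000))
print(len(zlib.compress(series, 1)), len(zlib.compress(shuffle(series, 4), 1)))
```

Grouping bytes of equal significance tends to produce long runs of similar bytes, which is why deflation often does better on shuffled data; how much it helps depends on the data.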
- -u
- Convert any unlimited size dimensions in the input to fixed size dimensions in the output. This can speed up variable-at-a-time access, but slow down record-at-a-time access to multiple variables along an unlimited dimension.
- -w
- Keep output in memory (as a diskless netCDF file) until output is closed, at which time output file is written to disk. This can greatly speed up operations such as converting unlimited dimension to fixed size (-u option), chunking, rechunking, or compressing the input. It requires that available memory is large enough to hold the output file. This option may provide a larger speedup than careful tuning of the -m, -h, or -e options, and it's certainly a lot simpler.
- -c chunkspec
- For netCDF-4 output, including netCDF-4 classic model, specify chunking (multidimensional tiling) for variable data in the output. This is useful to specify the units of disk access, compression, or other filters such as checksums. Changing the chunking in a netCDF file can also greatly speed up access, by choosing chunk shapes that are appropriate for the most common access patterns.
- The chunkspec argument is a string of comma-separated associations, each specifying a dimension name, a '/' character, and optionally the corresponding chunk length for that dimension. No blanks should appear in the chunkspec string, except possibly escaped blanks that are part of a dimension name. A chunkspec names at least one dimension, and may omit dimensions which are not to be chunked or for which the default chunk length is desired. If a dimension name is followed by a '/' character but no subsequent chunk length, the actual dimension length is assumed. If copying a classic model file to a netCDF-4 output file and not naming all dimensions in the chunkspec, unnamed dimensions will also use the actual dimension length for the chunk length. An example of a chunkspec for variables that use 'm' and 'n' dimensions might be 'm/100,n/200' to specify 100 by 200 chunks. To see the chunking resulting from copying with a chunkspec, use the '-s' option of ncdump on the output file.
- The chunkspec '/' that omits all dimension names and corresponding chunk lengths specifies that no chunking is to occur in the output, so can be used to unchunk all the chunked variables.
- As an I/O optimization, nccopy has a threshold for the minimum size of non-record variables that get chunked, currently 8192 bytes. In the future, use of this threshold and its size may be settable in an option.
- Note that nccopy requires variables that share a dimension to also share the chunk size associated with that dimension, but the programming interface has no such restriction. If you need to customize chunking for variables independently, you will need to use the library API in a custom utility program.
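The chunkspec syntax described above is simple enough to illustrate with a short Python parser. This is a sketch written from the description only, not nccopy's own parser: the function names are hypothetical, escaped blanks or commas in dimension names are not handled, and resolve_chunks applies only the rule stated for classic model input, where unnamed dimensions use their full length.

```python
from typing import Dict, Optional


def parse_chunkspec(spec: str) -> Dict[str, Optional[int]]:
    """Parse 'dim/len,dim/len,...' into {dimension: chunk length or None}."""
    if spec == "/":
        return {}  # the special chunkspec that requests no chunking
    chunks: Dict[str, Optional[int]] = {}
    for item in spec.split(","):
        name, slash, length = item.rpartition("/")
        if not slash or not name:
            raise ValueError("expected 'name/length' or 'name/': %r" % item)
        # A missing length means "use the actual dimension length".
        chunks[name] = int(length) if length else None
    return chunks


def resolve_chunks(spec: str, dim_lengths: Dict[str, int]) -> Dict[str, int]:
    """Fill in omitted lengths and unnamed dimensions with full lengths."""
    parsed = parse_chunkspec(spec)
    unknown = set(parsed) - set(dim_lengths)
    if unknown:
        raise ValueError("unknown dimensions: %s" % sorted(unknown))
    return {name: parsed.get(name) or full
            for name, full in dim_lengths.items()}


print(parse_chunkspec("m/100,n/200"))
print(resolve_chunks("lat/40,lon/", {"time": 1000, "lat": 1000, "lon": 500}))
```

In the second call, 'time' is unnamed and 'lon' has no length, so both fall back to their full dimension lengths while 'lat' uses the requested 40.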
- -v var1,...
- The output will include data values for the specified variables, in addition to the declarations of all dimensions, variables, and attributes. One or more variables must be specified by name in the comma-delimited list following this option. The list must be a single argument to the command, hence cannot contain unescaped blanks or other white space characters. The named variables must be valid netCDF variables in the input-file. A variable within a group in a netCDF-4 file may be specified with an absolute path name, such as '/GroupA/GroupA2/var'. Use of a relative path name such as 'var' or 'grp/var' specifies all matching variable names in the file. The default, without this option, is to include data values for all variables in the output.
- -V var1,...
- The output will include only the specified variables, but all dimensions and global or group attributes. One or more variables must be specified by name in the comma-delimited list following this option. The list must be a single argument to the command, hence cannot contain unescaped blanks or other white space characters. The named variables must be valid netCDF variables in the input-file. A variable within a group in a netCDF-4 file may be specified with an absolute path name, such as '/GroupA/GroupA2/var'. Use of a relative path name such as 'var' or 'grp/var' specifies all matching variable names in the file. The default, without this option, is to include all variables in the output.
- -g grp1,...
- The output will include data values only for the specified groups. One or more groups must be specified by name in the comma-delimited list following this option. The list must be a single argument to the command. The named groups must be valid netCDF groups in the input-file. The default, without this option, is to include data values for all groups in the output.
- -G grp1,...
- The output will include only the specified groups. One or more groups must be specified by name in the comma-delimited list following this option. The list must be a single argument to the command. The named groups must be valid netCDF groups in the input-file. The default, without this option, is to include all groups in the output.
- -m bufsize
- An integer or floating-point number that specifies the size, in bytes, of the copy buffer used to copy large variables. A suffix of K, M, G, or T multiplies the copy buffer size by one thousand, million, billion, or trillion, respectively. The default is 5 Mbytes, but will be increased if necessary to hold at least one chunk of netCDF-4 chunked variables in the input file. You may want to specify a value larger than the default for copying large files over high latency networks. Using the '-w' option may provide better performance, if the output fits in memory.
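As an illustration of the size notation described above, the following Python sketch converts a value such as '5M' or '1.5G' to a byte count using the decimal multipliers the text gives. It is not nccopy's parser; the function name parse_size is hypothetical, and the sketch assumes only the uppercase suffixes K, M, G, and T named in the text.

```python
# Decimal multipliers, as stated: thousand, million, billion, trillion.
MULTIPLIERS = {"K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_size(text: str) -> int:
    """Convert an integer or floating-point size with optional suffix to bytes."""
    text = text.strip()
    factor = 1
    if text and text[-1] in MULTIPLIERS:
        factor = MULTIPLIERS[text[-1]]
        text = text[:-1]
    # round() avoids losing a byte to floating-point representation error.
    return round(float(text) * factor)


print(parse_size("5M"))    # the default copy buffer size, in bytes
print(parse_size("1.5G"))
print(parse_size("8192"))
```

Note that the multipliers are powers of ten, so '5M' is five million bytes rather than five mebibytes.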
- -h chunk_cache
- For netCDF-4 output, including netCDF-4 classic model, an integer or floating-point number that specifies the size in bytes of chunk cache allocated for each chunked variable. This is not a property of the file, but merely a performance tuning parameter for avoiding compressing or decompressing the same data multiple times while copying and changing chunk shapes. A suffix of K, M, G, or T multiplies the chunk cache size by one thousand, million, billion, or trillion, respectively. The default is 4.194304 Mbytes (or whatever was specified for the configure-time constant CHUNK_CACHE_SIZE when the netCDF library was built). Ideally, the nccopy utility should accept only one memory buffer size and divide it optimally between a copy buffer and chunk cache, but no general algorithm for computing the optimum chunk cache size has been implemented yet. Using the '-w' option may provide better performance, if the output fits in memory.
- -e cache_elems
- For netCDF-4 output, including netCDF-4 classic model, specifies number of chunks that the chunk cache can hold. A suffix of K, M, G, or T multiplies the number of chunks that can be held in the cache by one thousand, million, billion, or trillion, respectively. This is not a property of the file, but merely a performance tuning parameter for avoiding compressing or decompressing the same data multiple times while copying and changing chunk shapes. The default is 1009 (or whatever was specified for the configure-time constant CHUNK_CACHE_NELEMS when the netCDF library was built). Ideally, the nccopy utility should determine an optimum value for this parameter, but no general algorithm for computing the optimum number of chunk cache elements has been implemented yet.
- -r
- Read netCDF classic or 64-bit offset input file into a diskless netCDF file in memory before copying. Requires that input file be small enough to fit into memory. For nccopy, this doesn't seem to provide any significant speedup, so may not be a useful option.
Make a copy of foo1.nc, a netCDF file of any type, to foo2.nc, a netCDF file of the same type:
- nccopy foo1.nc foo2.nc
Note that the above copy will not be as fast as use of cp or other simple copy utility, because the file is copied using only the netCDF API. If the input file has extra bytes after the end of the netCDF data, those will not be copied, because they are not accessible through the netCDF interface. If the original file was generated in "No fill" mode so that fill values are not stored for padding for data alignment, the output file may have different padding bytes.
Convert a netCDF-4 classic model file, compressed.nc, that uses compression, to a netCDF-3 file classic.nc:
- nccopy -k classic compressed.nc classic.nc
Note that 'nc3' could be used instead of 'classic'.
Download the variable 'time_bnds' and its associated attributes from an OPeNDAP server and copy the result to a netCDF file named 'tb.nc':
- nccopy 'http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz?time_bnds' tb.nc
Note that URLs that name specific variables as command-line arguments should generally be quoted, to avoid the shell interpreting special characters such as '?'.
Compress all the variables in the input file foo.nc, a netCDF file of any type, to the output file bar.nc:
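When the nccopy command line is assembled by a script rather than typed by hand, the quoting can be generated instead of written manually. The Python sketch below uses the standard shlex module to build the command string shown above; it only constructs and prints the string and does not run nccopy.

```python
import shlex

url = "http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz?time_bnds"
argv = ["nccopy", url, "tb.nc"]

# shlex.join (Python 3.8+) quotes only the arguments that need it, so the
# URL containing '?' is wrapped in single quotes and the rest is left bare.
command = shlex.join(argv)
print(command)
```

Passing the argument list directly to a process-launching call that bypasses the shell, such as subprocess.run(argv), avoids the quoting question altogether.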
- nccopy -d1 foo.nc bar.nc
If foo.nc was a classic or 64-bit offset netCDF file, bar.nc will be a netCDF-4 classic model netCDF file, because the classic and 64-bit offset format variants don't support compression. If foo.nc was a netCDF-4 file with some variables compressed using various deflation levels, the output will also be a netCDF-4 file of the same type, but all the variables, including any uncompressed variables in the input, will now use deflation level 1.
Assume the input data includes gridded variables that use time, lat, lon dimensions, with 1000 times by 1000 latitudes by 1000 longitudes, and that the time dimension varies most slowly. Also assume that users want quick access to data at all times for a small set of lat-lon points. Accessing data for 1000 times would typically require accessing 1000 disk blocks, which may be slow.
Reorganizing the data into chunks on disk that have all the time in each chunk for a few lat and lon coordinates would greatly speed up such access. To chunk the data in the input file slow.nc, a netCDF file of any type, to the output file fast.nc, you could use:
- nccopy -c time/1000,lat/40,lon/40 slow.nc fast.nc
to specify data chunks of 1000 times, 40 latitudes, and 40 longitudes. If you had enough memory to contain the output file, you could speed up the rechunking operation significantly by creating the output in memory before writing it to disk on close:
- nccopy -w -c time/1000,lat/40,lon/40 slow.nc fast.nc
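The effect of the rechunking in this example can be estimated with a little arithmetic. The Python sketch below counts how many chunks a rectangular read touches for a given chunk shape. It is a simplified model, not a measurement: the function name chunks_touched is hypothetical, and the original layout is approximated as one chunk per time step, which is an assumption standing in for the "1000 disk blocks" mentioned above.

```python
from typing import Sequence


def chunks_touched(start: Sequence[int], count: Sequence[int],
                   chunk_shape: Sequence[int]) -> int:
    """Number of chunks intersected by the hyperslab start/count."""
    total = 1
    for first_index, extent, chunk_len in zip(start, count, chunk_shape):
        first_chunk = first_index // chunk_len
        last_chunk = (first_index + extent - 1) // chunk_len
        total *= last_chunk - first_chunk + 1
    return total


# Dimension order is (time, lat, lon), each of length 1000.
time_series = ((0, 500, 500), (1000, 1, 1))   # all times at one lat-lon point
single_map = ((0, 0, 0), (1, 1000, 1000))     # one time step, whole grid

by_time_step = (1, 1000, 1000)   # assumed model of the original layout
rechunked = (1000, 40, 40)       # the chunkspec time/1000,lat/40,lon/40

print(chunks_touched(*time_series, by_time_step))  # 1000
print(chunks_touched(*time_series, rechunked))     # 1
print(chunks_touched(*single_map, rechunked))      # 625
```

Under this model the time-series read drops from 1000 chunks to 1, while a read of the whole grid at a single time now touches 625 chunks, so the chunk shape should be chosen for the access pattern that matters most.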