Computational Science Community Wiki

netCDF data compression

By default WRF writes its data files in netCDF-3 format (or in 64-bit offset format when the files could be larger than 2 GB). Neither format includes data compression, so the files can be much larger than necessary. The files can be zipped for storage, but this makes the data inconvenient to access. Instead of using external compression you can convert the files to netCDF-4 format (given the right libraries), which has built-in data compression options.



The conversion can be done with the nccopy utility, which is distributed with the netCDF library:

nccopy -d [n] -s [original] [new_file]

The -d flag turns on compression, where [n] is the compression (deflate) level, between 1 (low) and 9 (high). The -s flag turns on shuffling, which reorders the bytes of the data values so that bytes of the same significance are stored together; this usually lets the compressor reduce the file size further. If compression and/or shuffling is used then nccopy will automatically choose netCDF-4 as the format for the new file. Otherwise, specify the output format with the -k flag (this is also how to convert files back to netCDF-3 format).
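As a rough illustration of how these flags combine, here is a minimal Python sketch that assembles an nccopy command line and, optionally, runs it. The function names nccopy_command and run_nccopy are hypothetical helpers invented for this example, not part of netCDF; only the -d, -s and -k flags described above are used, and the sketch assumes nccopy is on your PATH.

```python
import shutil
import subprocess


def nccopy_command(src, dst, level=None, shuffle=False, kind=None):
    """Build the argument list for an nccopy call (nothing is executed).

    level   -- deflate level for -d, between 1 (low) and 9 (high)
    shuffle -- add the -s flag when True
    kind    -- output format name passed to -k, e.g. "classic"
    """
    cmd = ["nccopy"]
    if level is not None:
        if not 1 <= level <= 9:
            raise ValueError("deflate level must be between 1 and 9")
        cmd += ["-d", str(level)]
    if shuffle:
        cmd.append("-s")
    if kind is not None:
        cmd += ["-k", kind]
    cmd += [src, dst]
    return cmd


def run_nccopy(src, dst, **options):
    """Run nccopy; raises if the tool is missing or exits with an error."""
    if shutil.which("nccopy") is None:
        raise FileNotFoundError("nccopy not found on PATH")
    subprocess.run(nccopy_command(src, dst, **options), check=True)
```

For example, run_nccopy("wrfout.nc", "wrfout_d5.nc", level=5, shuffle=True) would compress a file at level 5 with shuffling, and run_nccopy("wrfout_d5.nc", "wrfout_nc3.nc", kind="classic") would convert it back to netCDF-3 classic format. The file names here are placeholders.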

Results and Comparisons

For a 64-bit offset data file of size 1.4 GB we achieved the following reductions in file size:

(DL - these results are for data files that include a large number of empty data arrays. The presence of these may skew the results, perhaps making the savings greater than they would otherwise be. I recommend performing the same tests on your own data files before deciding which level of compression to use.)
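To repeat this kind of comparison on your own files, one possible approach is to compress a file at several levels with nccopy and then compare the sizes on disk. The sketch below only does the size comparison; size_reduction is a hypothetical helper written for this example, and the file names in the usage note are placeholders.

```python
import os


def size_reduction(original_path, compressed_path):
    """Return the percentage by which the compressed file is smaller."""
    original = os.path.getsize(original_path)
    compressed = os.path.getsize(compressed_path)
    if original == 0:
        raise ValueError("original file is empty")
    return 100.0 * (original - compressed) / original
```

For instance, after creating wrfout_d1.nc and wrfout_d9.nc from wrfout.nc with nccopy -d 1 and nccopy -d 9, calling size_reduction("wrfout.nc", "wrfout_d1.nc") and size_reduction("wrfout.nc", "wrfout_d9.nc") gives the two savings as percentages, which can be weighed against how long each conversion took.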