Doug Lowe (8/6/2011): We need a method of ensuring that we know (or can quickly determine) what settings and model versions were used to generate each model output file that we store. So I think we need a checklist of the chunks of information that have to be stored with each output file.
The information that we need to store is:
- model code (and computing environment) used
- input files
- domain setup files
- MOZART emission files
- namelist options:
  - for running real.exe
  - for running wrf.exe
Storing the namelist options should be straightforward: we simply copy the namelist.input file that was used to a safe location.
To record information on the model code I think we can use the SVN server. As long as we only use code that has been committed to the SVN server, we can create an info file with the SVN revision number and some architecture information in order to keep track of this.
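A minimal sketch of how such an info file might be generated (the WRF_SRC path and the run_info.txt file name are assumptions, not fixed conventions, and svnversion must be on the PATH for the revision to be recorded):

```shell
# Record the SVN revision and architecture alongside a model run.
WRF_SRC=${WRF_SRC:-/path/to/WRFCHEM}   # hypothetical checkout location
{
  echo "svn revision : $(svnversion "$WRF_SRC" 2>/dev/null || echo unknown)"
  echo "architecture : $(uname -m)"
  echo "date (UTC)   : $(date -u '+%Y-%m-%d %H:%M')"
} > run_info.txt
```

The same script could also copy namelist.input into the archive location at the same time.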
The main information I don't know how to store is the input file info. Could someone (Steve?) suggest a manner in which we can record this information with traceability (and without using large amounts of disk space)?
- Compiled model executable (copy from the /run directory in the WRFCHEM source directory)
- Input files:
  - Emission data files
  - Meteorological data files
- Run settings:
  - namelist.input (see the namelist page)
- Emission data files:
The emission data files contain hourly emissions data, split into two files (although these can be stored as a single data file).
When first running WRF-Chem you must:
- Make sure all of the files needed for WRF-Chem to run, listed above, are in the working directory.
- Set restart to .false. in the namelist.input file.
- Edit the values of the start_* and end_* variables to give the simulation length you desire. The run_* variables can be left at 0.
- interval_seconds should equal the time period between met_em* files, in seconds; e.g. for met files every 6 hours, set this to 21600.
- Set history_interval to the time interval between data outputs, in minutes. restart_interval is the time between outputs of restart files (also in minutes).
- frames_per_outfile is the number of output times per file. We recommend setting this to 1.
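Pulled together, the &time_control settings above might look like this (the dates and intervals are illustrative only, and the inline "!" annotations may need removing in a real namelist.input):

```fortran
&time_control
 start_year         = 2010, start_month = 07, start_day = 19, start_hour = 00,
 end_year           = 2010, end_month   = 07, end_day   = 21, end_hour   = 00,
 interval_seconds   = 21600,   ! met_em* files every 6 hours
 history_interval   = 60,      ! output data every 60 minutes
 frames_per_outfile = 1,       ! one output time per wrfout file
 restart            = .false.,
 restart_interval   = 720,     ! 720 rather than 1440 (see the restart notes)
/
```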
Using Restart Files
To use a restart file you must:
- set the value of restart to .true.
- set the values of the start_* variables to match the restart file that you wish to use
You can change the values of history_interval and restart_interval when using restart.
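For example (the restart file date here is illustrative; the inline "!" annotation may need removing in a real namelist.input):

```fortran
&time_control
 restart     = .true.,
 start_year  = 2007, start_month = 07,
 start_day   = 20,   start_hour  = 23,   ! matches wrfrst_d01_2007-07-20_23:00:00
/
```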
Issues with restart files:
- Restart files can only be used if they are at 11am or 11pm: the emissions are updated at the next hour after the run starts, so the restart must fall within the hour before 12am or 12pm.
- restart_interval shouldn't be set to 1440 minutes - DL has seen the restart file being written after 23 hours (instead of 24 hours) when doing this. Instead use either 720 or 360 minutes in order to reach 24 hours.
Solution (DL 18/7/2011):
- Symbolic links to the emissions file can be created for each hour of the day, e.g.:
- ln -s wrfchemi_d01_2007-07-20_00:00:00 wrfchemi_d01_2007-07-20_01:00:00
- WRFCHEM will then search through the emissions file for the appropriate time, allowing restart files to be used at any time of the day.
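A loop over the 23 remaining hours might look like this (the date and file name follow the example above; the touch creating a stand-in emissions file is only there so the sketch is self-contained):

```shell
# Create hourly symbolic links all pointing at a single daily emissions file.
base="wrfchemi_d01_2007-07-20_00:00:00"
touch "$base"                      # stand-in for the real emissions file
for hh in $(seq -w 1 23); do
  ln -sf "$base" "wrfchemi_d01_2007-07-20_${hh}:00:00"
done
```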
Reinitialising Meteorology with Previous Chemistry
When running large domains it is recommended to reinitialise the meteorology roughly every 3-7 days (depending on the size of the domain and the meteorological conditions), as the meteorology within the domain will diverge from the operational/reanalysis data over time. WRF-Chem has options to reinitialise the meteorology while using the chemistry data from the previous day's WRF-Chem run:
Make the following changes to the namelist &time_control section:
auxinput12_inname = "wrf_chem_input",
io_form_auxinput12 = 2,
- Create a link to the wrfout* or wrfrst* file from which you want to take your chemistry information, e.g.:
- ln -s restart_files/domain1/wrfrst_d01_2010-07-19_00:00:00 wrf_chem_input_d01
- If you have more than one domain then you must do this for each domain (replacing d01 with d02, d03, etc.).
- Run real.exe to generate wrfbdy_d*, wrfinput_d*, etc files.
- Make a copy of the wrfinput_d* files.
Once you have run real.exe with these settings, the wrfinput_d01 file it creates will contain the chemistry information from your previous model run. Make sure that you take a copy of this file before running the mozbc script, as mozbc will overwrite the chemistry data in that file with data from MOZART (or MACC).
- Run mozbc (or maccbc) script to apply MOZART (or MACC) chemistry inputs to wrfbdy_d* and wrfinput_d* files
- Copy your original wrfinput_d* file back to replace the wrfinput_d* which was modified by mozbc (or maccbc).
- Use ncview to check everything looks OK before running wrf.exe.
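The whole sequence, as a shell sketch (file names follow the text; real.exe and mozbc are commented out and their outputs stood in for with touch, so only the copy/restore pattern is demonstrated):

```shell
mkdir -p restart_files/domain1
ln -sf restart_files/domain1/wrfrst_d01_2010-07-19_00:00:00 wrf_chem_input_d01
# ./real.exe                          # would write wrfinput_d01 with the old chemistry
touch wrfinput_d01                    # stand-in for the real.exe output
cp wrfinput_d01 wrfinput_d01.prechem  # copy BEFORE mozbc overwrites the chemistry
# ./mozbc < mozbc.inp                 # would overwrite chemistry with MOZART data
cp wrfinput_d01.prechem wrfinput_d01  # restore previous-run chemistry
```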
Nesting using ndown
The process of nesting using ndown is covered in the WRF users manual - so the instructions here are supplemental to that documentation.
For running off-line nesting using ndown it's best to use separate folders for each domain. Within each of these folders you should link to the met_em and wrfchemi input files for the relevant domain as if they are domain 1 (i.e. ln -s /src_fldr/met_em.d02.2010-07-10_00:00:00.nc met_em.d01.2010-07-10_00:00:00.nc)
The procedural summary is:
- Run real.exe in the inner domain folder to get wrfinput_d01 and wrflowinput_d01 (if needed)
- Move wrfinput_d01 from the inner domain folder to the outer domain folder (as wrfndi_d02) - leave wrflowinput_d01 where it is
- Run ndown.exe in the outer domain folder to get wrfinput_d02 and wrfbdy_d02 (making sure your wrfout_d01* files covering the period of interest are still in this folder)
- Move wrfinput_d02 and wrfbdy_d02 from the outer domain folder to the inner domain folder, renaming them wrfinput_d01 and wrfbdy_d01
- Run wrf.exe in the inner domain folder
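The file movements in that summary, as shell (inner/ and outer/ are hypothetical folder names; the executables are commented out and their outputs stood in for with touch):

```shell
mkdir -p inner outer
# (in inner/) ./real.exe              -> wrfinput_d01 (and wrflowinput_d01 if needed)
touch inner/wrfinput_d01
mv inner/wrfinput_d01 outer/wrfndi_d02        # leave wrflowinput_d01 in inner/
# (in outer/) ./ndown.exe             -> wrfinput_d02, wrfbdy_d02
touch outer/wrfinput_d02 outer/wrfbdy_d02
mv outer/wrfinput_d02 inner/wrfinput_d01
mv outer/wrfbdy_d02   inner/wrfbdy_d01
# (in inner/) ./wrf.exe
```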
Key namelist.input requirements are:
- namelist for running real.exe in inner domain folder:
  - in &time_control, interval_seconds should be set to the standard interval you use for your met_em* files (usually 21600 seconds - 6 hours)
  - modify the namelist file so that the domain 2 settings are now used for domain 1 (and set max_dom to 1)
  - in particular, make sure that grid_id, parent_id, i_parent_start, and j_parent_start are copied across, as these need to be in the wrfndi_d02 file for ndown to work properly
- namelist for running ndown.exe in outer domain folder:
  - in &time_control, interval_seconds should be set to the output interval of the wrfout_d01* files (usually 3600 seconds)
- namelist for running wrf.exe in inner domain folder:
  - in &time_control, interval_seconds should be set to the same interval as you used for ndown.exe (this is the only major difference from the namelist used for running real.exe)
WRF outputs the time taken for each model step to the rsl.error.0000 file. These timings can provide guidance on the speed-up of the model on different architectures, and with different numbers of nodes and cores.
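A quick way to pull these timings out of the log (the sample line is taken from the listings below and stands in for a real rsl.error.0000 file, so the command can be demonstrated end-to-end):

```shell
# Stand-in log with one "Timing for main" line copied from the listings below.
echo 'Timing for main: time 2010-07-19_11:56:00 on domain 1: 20.97302 elapsed seconds.' > rsl.error.0000
# Field 5 is the model time, field 9 the elapsed wall-clock seconds.
grep 'Timing for main' rsl.error.0000 | awk '{print $5, $9}'
# -> 2010-07-19_11:56:00 20.97302
```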
My educated guess (DL) is that:
- the 1st step contains the aerosol module calculations (which have a timestep of 4 minutes)
- the 2nd step contains the gas-phase chemistry step (also a timestep of 4 minutes; this has the most variable time, longer around dawn/dusk)
- the remaining 6 steps are transport (which has a timestep of 30 seconds).
From the timings for the specific model run given below we see that:
- the decrease in computational time of the aerosol module matches the increase in computational power
- the decrease in computational time of the transport steps is ~1.5 for every doubling of computational power
- the overall decrease in computational time between the 16-node and 64-node jobs is ~3.2, for a 4x increase in computational power --- it is more efficient to run on 16 nodes, but the saving is not immense.
HECToR phase2b, 64 nodes (24 cores per node) - organic aerosol model, 400x380x27 UK domain (9/6/2011)
- Timing for main: time 2010-07-19_11:56:00 on domain 1: 20.97302 elapsed seconds.
- Timing for main: time 2010-07-19_11:56:30 on domain 1: 4.08100 elapsed seconds.
- Timing for main: time 2010-07-19_11:57:00 on domain 1: 2.53178 elapsed seconds.
- Timing for main: time 2010-07-19_11:57:30 on domain 1: 2.19795 elapsed seconds.
- Timing for main: time 2010-07-19_11:58:00 on domain 1: 2.29149 elapsed seconds.
- Timing for main: time 2010-07-19_11:58:30 on domain 1: 2.26509 elapsed seconds.
- Timing for main: time 2010-07-19_11:59:00 on domain 1: 2.21798 elapsed seconds.
- Timing for main: time 2010-07-19_11:59:30 on domain 1: 2.44287 elapsed seconds.
HECToR phase2b, 32 nodes (24 cores per node) - organic aerosol model, 400x380x27 UK domain (9/6/2011)
- Timing for main: time 2010-07-17_12:48:00 on domain 1: 38.08195 elapsed seconds.
- Timing for main: time 2010-07-17_12:48:30 on domain 1: 10.85686 elapsed seconds.
- Timing for main: time 2010-07-17_12:49:00 on domain 1: 3.61491 elapsed seconds.
- Timing for main: time 2010-07-17_12:49:30 on domain 1: 3.58132 elapsed seconds.
- Timing for main: time 2010-07-17_12:50:00 on domain 1: 3.59975 elapsed seconds.
- Timing for main: time 2010-07-17_12:50:30 on domain 1: 3.56546 elapsed seconds.
- Timing for main: time 2010-07-17_12:51:00 on domain 1: 3.60200 elapsed seconds.
- Timing for main: time 2010-07-17_12:51:30 on domain 1: 3.57059 elapsed seconds.
HECToR phase2b, 16 nodes (24 cores per node) - organic aerosol model, 400x380x27 UK domain (9/6/2011)
- Timing for main: time 2010-07-17_12:56:00 on domain 1: 79.58258 elapsed seconds.
- Timing for main: time 2010-07-17_12:56:30 on domain 1: 12.23233 elapsed seconds.
- Timing for main: time 2010-07-17_12:57:00 on domain 1: 5.52302 elapsed seconds.
- Timing for main: time 2010-07-17_12:57:30 on domain 1: 5.52994 elapsed seconds.
- Timing for main: time 2010-07-17_12:58:00 on domain 1: 5.49635 elapsed seconds.
- Timing for main: time 2010-07-17_12:58:30 on domain 1: 5.52519 elapsed seconds.
- Timing for main: time 2010-07-17_12:59:00 on domain 1: 5.49668 elapsed seconds.
- Timing for main: time 2010-07-17_12:59:30 on domain 1: 5.54156 elapsed seconds.