CESM1.2 Data Model v8: User's Guide
Input Data Streams
Overview
An input data stream is a time series of input data files in which all the fields in the stream are located in the same data file and share the same spatial and temporal coordinates (i.e., they are all on the same grid and share the same time axis). Normally a time axis has a uniform dt, but this is not a requirement.
The data models can have multiple input streams.
The data for one stream may be all in one file or may be spread over several files. For example, 50 years of monthly average data might be contained all in one data file or it might be spread over 50 files, each containing one year of data.
The data models can loop over stream data -- repeatedly cycle over some subset of an input stream's time axis. When looping, the models can only loop over whole years. For example, an input stream might have SST data for years 1950 through 2000, but a model could loop over the data for years 1960 through 1980. A model cannot loop over partial years, for example, from 1950-Feb-10 through 1980-Mar-15.
The input data must be in a netCDF file, and the time axis in that file must be CF-1.0 compliant.
There are two main categories of information that the data models need to know about a stream:
data that describes what a user wants -- what streams to use and how to use them -- things that can be changed by a user.
data that describes the stream data -- meta-data about the inherent properties of the data itself -- things that cannot be changed by a user.
Generally, information about what streams a user wants to use and how to use them is input via the strdata ("stream data") Fortran namelist, while meta-data that describes the stream data itself is found in an xml-like text file called a "stream description file."
Stream Data
The strdata (short for "stream data") input is set via a Fortran namelist called shr_strdata_nml. That namelist, the strdata datatype, and the associated methods are contained in the share source code file, models/csm_share/shr/shr_strdata_mod.F90. In general, strdata input defines an array of input streams and the operations to perform on those streams. Therefore, many namelist inputs are arrays of character strings. Entries with the same index in different variables are associated with each other. For instance, the mapalgo(1) spatial interpolation will be performed between streams(1) and the target domain.
The following variables are available in the strdata namelist.
dataMode - component specific mode
domainFile - final domain
streams - input files
vectors - paired vector field names
fillalgo - fill algorithm
fillmask - fill mask
fillread - fill mapping file to read
fillwrite - fill mapping file to write
mapalgo - spatial interpolation algorithm
mapmask - spatial interpolation mask
mapread - spatial interpolation mapping file to read
mapwrite - spatial interpolation mapping file to write
tintalgo - time interpolation algorithm
taxMode - time interpolation mode
dtlimit - delta time axis limit
The set of shr_strdata_nml namelist keywords is the same for all data models. As a result, the namelist documentation of any data model can be used to view a full description. For example, see the stream-specific namelist settings.
Specifying What Streams to Use
The data models have a namelist variable that specifies which input streams to use and, for each input stream, the name of the corresponding stream description file, what years of data to use, and how to align the input stream time axis with the model run time axis. This input is set in the strdata namelist input.
General format:
&shr_strdata_nml
streams = 'stream1.txt year_align year_first year_last ',
'stream2.txt year_align year_first year_last ',
...
'streamN.txt year_align year_first year_last '
/
where:
streamN.txt
the stream description file, a plain text file containing details about the input stream (see below)
year_first
the first year of data that will be used
year_last
the last year of data that will be used
year_align
a model year that will be aligned with data for year_first
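The alignment rule above can be sketched in a few lines of Python. This is only an illustration under the assumption that the stream is cycled over whole years (as with taxmode = 'cycle'); the function name data_year is hypothetical and this is not the code used in shr_strdata_mod.F90.

```python
def data_year(model_year, year_align, year_first, year_last):
    """Return the data year used for a model year when looping over whole years.

    Assumption: model year `year_align` maps to data year `year_first`, and
    the data years year_first..year_last repeat cyclically from there.
    """
    n_years = year_last - year_first + 1
    if n_years < 1:
        raise ValueError("year_last must not precede year_first")
    # Python's % always returns a non-negative result for a positive divisor,
    # so model years before year_align also wrap into the valid range.
    return year_first + (model_year - year_align) % n_years


# Example with year_align=1895, year_first=1948, year_last=1972 (25 data years).
for year in (1895, 1919, 1920):
    print(year, "->", data_year(year, 1895, 1948, 1972))
```

Under this assumption, model year 1895 reads data year 1948, model year 1919 reads data year 1972, and model year 1920 wraps back to data year 1948.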
The stream text files for a given data model mode are automatically generated by the corresponding data model build-namelist with preset names. As an example, consider the following datm_atm_in file (which would appear in both $CASEROOT/CaseDocs and $RUNDIR):
datamode = 'CLMNCEP'
domainfile = '/glade/proj3/cseg/inputdata/share/domains/domain.lnd.fv1.9x2.5_gx1v6.090206.nc'
dtlimit = 1.5,1.5,1.5,1.5
fillalgo = 'nn','nn','nn','nn'
fillmask = 'nomask','nomask','nomask','nomask'
mapalgo = 'bilinear','bilinear','bilinear','bilinear'
mapmask = 'nomask','nomask','nomask','nomask'
streams = "datm.streams.txt.CLM_QIAN.Solar 1895 1948 1972 ",
"datm.streams.txt.CLM_QIAN.Precip 1895 1948 1972 ",
"datm.streams.txt.CLM_QIAN.TPQW 1895 1948 1972 ",
"datm.streams.txt.presaero.trans_1850-2000 1849 1849 2006"
taxmode = 'cycle','cycle','cycle','cycle'
tintalgo = 'coszen','nearest','linear','linear'
vectors = 'null'
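To make the index association concrete, here is a minimal Python sketch that splits each streams entry into its four parts and pairs it with the tintalgo and mapalgo entries of the same index. The StreamEntry type and parse_stream_entry helper are hypothetical names introduced for illustration; the data models read these values through the Fortran namelist, not through code like this.

```python
from typing import NamedTuple


class StreamEntry(NamedTuple):
    description_file: str
    year_align: int
    year_first: int
    year_last: int


def parse_stream_entry(entry):
    """Split 'file year_align year_first year_last' into typed fields."""
    fields = entry.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 whitespace-separated fields, got {len(fields)}")
    name, align, first, last = fields
    return StreamEntry(name, int(align), int(first), int(last))


# Values copied from the datm_atm_in example above.
streams = [
    "datm.streams.txt.CLM_QIAN.Solar 1895 1948 1972 ",
    "datm.streams.txt.CLM_QIAN.Precip 1895 1948 1972 ",
    "datm.streams.txt.CLM_QIAN.TPQW 1895 1948 1972 ",
    "datm.streams.txt.presaero.trans_1850-2000 1849 1849 2006",
]
tintalgo = ["coszen", "nearest", "linear", "linear"]
mapalgo = ["bilinear", "bilinear", "bilinear", "bilinear"]

# Entries with the same index belong together.
for entry, tint, mapping in zip(streams, tintalgo, mapalgo):
    stream = parse_stream_entry(entry)
    print(f"{stream.description_file}: tintalgo={tint}, mapalgo={mapping}, "
          f"data years {stream.year_first}-{stream.year_last}")
```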
As is discussed in the CESM1.2 User's Guide, to change the contents of datm_atm_in, you can edit $CASEROOT/user_nl_datm to change any of the above settings EXCEPT FOR THE NAMES datm.streams.txt.CLM_QIAN.Solar, datm.streams.txt.CLM_QIAN.Precip, datm.streams.txt.CLM_QIAN.TPQW and datm.streams.txt.presaero.trans_1850-2000. Note that any namelist variable from shr_strdata_nml and datm_nml can be modified by adding the appropriate keyword/value pairs to user_nl_datm. As an example, the following could be the contents of $CASEROOT/user_nl_datm:
!------------------------------------------------------------------------
! Users should ONLY USE user_nl_datm to change namelists variables
! Users should add all user specific namelist changes below in the form of
! namelist_var = new_namelist_value
! Note that any namelist variable from shr_strdata_nml and datm_nml can
! be modified below using the above syntax
! User preview_namelists to view (not modify) the output namelist in the
! directory $CASEROOT/CaseDocs
! To modify the contents of a stream txt file, first use preview_namelists
! to obtain the contents of the stream txt files in CaseDocs, and then
! place a copy of the modified stream txt file in $CASEROOT with the string
! user_ prepended.
!------------------------------------------------------------------------
streams = "datm.streams.txt.CLM_QIAN.Solar 1895 1948 1900 ",
"datm.streams.txt.CLM_QIAN.Precip 1895 1948 1900 ",
"datm.streams.txt.CLM_QIAN.TPQW 1895 1948 1900 ",
"datm.streams.txt.presaero.trans_1850-2000 1849 1849 2006"
and the contents of shr_strdata_nml (in both $CASEROOT/CaseDocs and $RUNDIR) would be
datamode = 'CLMNCEP'
domainfile = '/glade/proj3/cseg/inputdata/share/domains/domain.lnd.fv1.9x2.5_gx1v6.090206.nc'
dtlimit = 1.5,1.5,1.5,1.5
fillalgo = 'nn','nn','nn','nn'
fillmask = 'nomask','nomask','nomask','nomask'
mapalgo = 'bilinear','bilinear','bilinear','bilinear'
mapmask = 'nomask','nomask','nomask','nomask'
streams = "datm.streams.txt.CLM_QIAN.Solar 1895 1948 1900 ",
"datm.streams.txt.CLM_QIAN.Precip 1895 1948 1900 ",
"datm.streams.txt.CLM_QIAN.TPQW 1895 1948 1900 ",
"datm.streams.txt.presaero.trans_1850-2000 1849 1849 2006"
taxmode = 'cycle','cycle','cycle','cycle'
tintalgo = 'coszen','nearest','linear','linear'
vectors = 'null'
As is discussed in the User's Guide, you should use preview_namelists to view (not modify) the output namelist in CaseDocs.
Stream Description File
The stream description file is not a Fortran namelist, but a locally built xml-like parsing implementation. Sometimes it is called a "stream dot-text file" because it has a ".txt." in the filename. Stream description files contain data that specifies the names of the fields in the stream, the names of the input data files, and the file system directory where the data files are located. In addition, a few other options are available such as the time axis offset parameter.
In CESM1.2, each data model's build-namelist utility (e.g. models/atm/datm/bld/build-namelist) automatically generates these stream description files. The directory contents of each data model will look like the following (using DATM as an example)
models/atm/datm/bld/build-namelist
models/atm/datm/bld/namelist_files/namelist_definition_datm.xml
models/atm/datm/bld/namelist_files/namelist_defaults_datm.xml
The namelist_definition_datm.xml file defines all the namelist variables and associated groups. The namelist_defaults_datm.xml file provides the out-of-the-box settings for the target data model and target stream. build-namelist uses these two files to construct the stream files for the given compset settings. You can modify the generated stream files for your particular needs by doing the following:
Call setup OR preview_namelists.
Copy the relevant description file from $CASEROOT/CaseDocs to $CASEROOT and prepend the string "user_" to the filename. Make the copy writable. For example, assuming you are in $CASEROOT:
cp CaseDocs/datm.streams.txt.CLM_QIAN.Solar user_datm.streams.txt.CLM_QIAN.Solar
chmod u+w user_datm.streams.txt.CLM_QIAN.Solar
Edit user_datm.streams.txt.CLM_QIAN.Solar with your desired changes.
Be sure not to put any tab characters in the file: use spaces instead.
In contrast to other user_nl_xxx files, be sure to set all relevant data model settings in the xml files, issue the preview_namelists command, and THEN edit the user_datm.streams.txt.CLM_QIAN.Solar file.
Once you have created a user_xxx.streams.txt.* file, further modifications to the relevant data model settings in the xml files will be ignored.
If you later realize that you need to change some settings in an xml file, you should remove the user_xxx.streams.txt.* file(s), make the modifications in the xml file, rerun preview_namelists, and then reintroduce your modifications into new user_xxx.streams.txt.* file(s).
Call preview_namelists.
Verify that your changes do indeed appear in the resultant stream description file CaseDocs/datm.streams.txt.CLM_QIAN.Solar. These changes will also appear in $RUNDIR/datm.streams.txt.CLM_QIAN.Solar.
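The no-tabs rule from the editing step is easy to violate without noticing, so it can help to check the edited file before rerunning preview_namelists. The following Python sketch reports which lines contain a tab character. The helper name lines_with_tabs is hypothetical and is not part of the CESM scripts.

```python
def lines_with_tabs(text):
    """Return the 1-based numbers of all lines that contain a tab character."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if "\t" in line
    ]


# A short made-up fragment in which the second line was indented with a tab.
sample = "<filePath>\n\t/some/data/dir\n</filePath>\n"
print(lines_with_tabs(sample))
```

To check a real file, read its contents first, for example with open("user_datm.streams.txt.CLM_QIAN.Solar").read(), and pass the resulting string to the function.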
The data elements found in the stream description file are:
dataSource
A comment about the source of the data -- always set to GENERIC in CESM1.2 and not used by the model. This is there only for backwards compatibility.
fieldInfo
Information about the field data for this stream...
variableNames
A list of the field variable names. This is a paired list with the name of the variable in the netCDF file on the left and the name of the corresponding model variable on the right. This is the list of fields to read in from the data file. There may be other fields in the file that are not read in (i.e., they will not be used).
filePath
The file system directory where the data files are located.
fileNames
The list of data files to use. If there is more than one file, the files must be in chronological order; that is, the dates in the time axis of the first file are before the dates in the time axis of the second file.
tInterpAlgo
This option is obsolete and no longer performs a function. The time interpolation algorithm is controlled by the strdata namelist variables tintalgo and taxMode.
offset
The offset allows a user to shift the time axis of a data stream by a fixed and constant number of seconds. For instance, if a data set contains daily average data with timestamps for the data at the end of the day, it might be appropriate to shift the time axis by 12 hours so the data is taken to be at the middle of the day instead of the end of the day. This feature supports only simple shifts in seconds as a way of correcting input data time axes without having to modify the input data time axis manually. This feature does not support more complex shifts such as end of month to mid-month. But in conjunction with the time interpolation methods in the strdata input, hopefully most user needs can be accommodated with the two settings. Note that a positive offset advances the input data time axis forward by that number of seconds.
The data models advance in time discretely. At a given time, they read or derive fields from input files. Those input files have data on a discrete time axis as well. Each data point in the input files is associated with a discrete time (as opposed to a time interval). Depending on whether you pick lower, upper, nearest, linear, or coszen, the data in the input file will be "interpolated" to the time in the model.
The offset shifts the time axis of the input data by the given number of seconds. For example, if the input data is at 0, 3600, 7200, and 10800 seconds (hourly) and you set an offset of 1800, then the input data will be placed at times 1800, 5400, 9000, and 12600. A model at time 3600 using linear interpolation would use the data at "n=2" with an offset of 0, but would use the data at "n=(1+2)/2" with an offset of 1800. Here n=2 is the 2nd data point in the time list 0, 3600, 7200, 10800, and n=(1+2)/2 is the average of the 1st and 2nd data points in that list, because model time 3600 falls midway between the shifted times 1800 and 5400. With an offset of -1800, the model would instead use the average of the 2nd and 3rd data points. The offset can be positive or negative.
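The effect of the offset on linear time interpolation can be worked through numerically. The Python sketch below shifts the hourly time axis from the example and interpolates at model time 3600. The function interp_linear and the data values 10, 20, 30, 40 are made up for illustration; this is not the data models' interpolation code, and it covers only the linear option.

```python
from bisect import bisect_right


def interp_linear(times, values, t, offset=0):
    """Linearly interpolate values to model time t after shifting times by offset."""
    shifted = [x + offset for x in times]
    if not shifted[0] <= t <= shifted[-1]:
        raise ValueError("model time lies outside the shifted data time axis")
    upper = bisect_right(shifted, t)
    if upper == len(shifted):
        # t coincides with the last data time.
        return float(values[-1])
    lower = upper - 1
    weight = (t - shifted[lower]) / (shifted[upper] - shifted[lower])
    return (1 - weight) * values[lower] + weight * values[upper]


times = [0, 3600, 7200, 10800]   # hourly data times in seconds
values = [10.0, 20.0, 30.0, 40.0]  # hypothetical data, one value per time

print(interp_linear(times, values, 3600))                # offset 0
print(interp_linear(times, values, 3600, offset=1800))   # data shifted later
print(interp_linear(times, values, 3600, offset=-1800))  # data shifted earlier
```

With an offset of 0, model time 3600 coincides with the 2nd data point. With an offset of 1800, the shifted times are 1800, 5400, 9000, and 12600, so model time 3600 falls halfway between the 1st and 2nd points. With an offset of -1800, it falls halfway between the 2nd and 3rd points.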
domainInfo
Information about the domain data for this stream...
variableNames
A list of the domain variable names. This is a paired list with the name of the variable in the netCDF file on the left and the name of the corresponding model variable on the right. The data models require five variables in this list. The names of the model's variables (the names on the right) must be "time," "lon," "lat," "area," and "mask."
filePath
The file system directory where the domain data file is located.
fileNames
The name of the domain data file. Often the domain data is located in the same file as the field data (above), in which case the name of the domain file can simply be the name of the first field data file. Sometimes the field data files do not contain the domain data required by the data models. In this case, one new file can be created that contains the required data.
Actual example:
<stream>
   <dataSource>
      GENERIC
   </dataSource>
   <domainInfo>
      <variableNames>
         time   time
         lon    lon
         lat    lat
         area   area
         mask   mask
      </variableNames>
      <filePath>
         /glade/proj3/cseg/inputdata/atm/datm7/NYF
      </filePath>
      <fileNames>
         nyf.ncep.T62.050923.nc
      </fileNames>
   </domainInfo>
   <fieldInfo>
      <variableNames>
         dn10   dens
         slp_   pslv
         q10    shnum
         t_10   tbot
         u_10   u
         v_10   v
      </variableNames>
      <filePath>
         /glade/proj3/cseg/inputdata/atm/datm7/NYF
      </filePath>
      <offset>
         0
      </offset>
      <fileNames>
         nyf.ncep.T62.050923.nc
      </fileNames>
   </fieldInfo>
</stream>
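For quick inspection outside the model, the paired variableNames lists can be pulled out of a stream description file with a few lines of Python. This sketch assumes the file happens to be well-formed XML, as the example above is. The guide describes the format only as xml-like and parsed by a locally built implementation, so real files may not always load with a standard XML library. The function name read_variable_pairs is hypothetical.

```python
import xml.etree.ElementTree as ET


def read_variable_pairs(stream_text, section):
    """Return (file_name, model_name) pairs from <section>/<variableNames>.

    `section` is "domainInfo" or "fieldInfo". Assumes well-formed XML.
    """
    root = ET.fromstring(stream_text.strip())
    block = root.find(f"{section}/variableNames")
    if block is None or block.text is None:
        raise ValueError(f"no variableNames found under {section}")
    pairs = []
    for line in block.text.splitlines():
        fields = line.split()
        if len(fields) == 2:  # skip blank lines
            pairs.append((fields[0], fields[1]))
    return pairs


# Shortened copy of the example above (two field variables only).
SAMPLE = """
<stream>
   <domainInfo>
      <variableNames>
         time   time
         lon    lon
         lat    lat
         area   area
         mask   mask
      </variableNames>
   </domainInfo>
   <fieldInfo>
      <variableNames>
         dn10   dens
         t_10   tbot
      </variableNames>
   </fieldInfo>
</stream>
"""

print(read_variable_pairs(SAMPLE, "fieldInfo"))

# The data models require these five model-side domain names.
model_names = {model for _, model in read_variable_pairs(SAMPLE, "domainInfo")}
print(model_names == {"time", "lon", "lat", "area", "mask"})
```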
namelist_definition_datm
https://github.com/E3SM-Project/E3SM/blob/master/components/data_comps/datm/cime_config/namelist_definition_datm.xml#L21
namelist_definition_datm.xml