Automated Rapid Climatological Azimuth Deviation Estimation
This project aims to understand the feasibility of a simplified rapid azimuth deviation prediction scheme for improving the source location when implementing an association and location cross-bearings, grid-search method like IMS_vASC [Matoza et al., 2017]. This implies using "offline", portable, and available for research purposes tools, so the calculations could be performed by any scientist in their personal workstation machine, but could also scale to a large research computation center.
The project itself is a collection of Python scripts plus some Fortran 90 layers that communicate with
available open source climatologies and a 3D ray-tracing algorithm.
ARCADE_main.py
contains the functions that serve to find and estimate the azimuth deviations, representing the core of the used methodology.
ARCADE is cabable of different types of ray propagation modelling. The most basic model is range independent (spherically layered atmosphere), meaning that uses average atmospheric descriptions in height for a source to receiver (station) direction slice ('profile'). This case uses only climatologies to determine the properties of the atmosphere for a given datetime and profile. This case can be tuned to use "perturbed" climatological profiles.
The second range intependent modeling type is a hybrid calculation that merges ECMWF ERA 5 descriptions for the lower ~80 km, with climatologies for the upper atmosphere. This case needs internet connection, and could fail if the download process hangs.
The third type of modeling is using range dependent climatological atmospheric descriptions. This case is more realistic than the range independent, as it estimates the 3D atmosphere by interpolating nodal columns in height that describe the area where the profile is. Note that this case is not 'rapid' anymore, so it should be used as a better, more accurate estimate of the atmospheric effect on the propagating signal. In theory it will take at least 10 times more time to calculate each one of this cases. Pertubed profileas have not been implemented in this case.
The fourth type of modeling is a range dependent hybrid set of descriptions as in the range independent hybrid case, but for nodal columns that merge ECMWF ERA 5 descriptions from 0 to ~80 km, with climatologies for >80 km heights. This case also depends on an internet connection, so it's the most realistic case, but could take much more time (or even hang). Use it when you want to narrow down some effects, or you want more accurate values after identiying interesting cases.
- De Negri, R., & Matoza, R. S. (2023). Rapid location of remote volcanic infrasound using 3D ray tracing and empirical climatologies: Application to the 2011 Cordón Caulle and 2015 Calbuco eruptions, Chile. Journal of Geophysical Research: Solid Earth, 128, e2022JB025735. https://doi.org/10.1029/2022JB025735
- De Negri, R. S., Rose, K. M., Matoza, R. S., Hupe, P., & Ceranna, L. (2022). Long-range multi-year infrasonic detection of eruptive activity at Mount Michael volcano, South Sandwich Islands. Geophysical Research Letters, 49, e2021GL096061. https://doi.org/10.1029/2021GL096061
-
All external repos (see below) in place and hopefully individually tested (through makefiles inside each one).
-
Fortran and c++ compilers (check
makefile
to define Fortran compiler in first line)-
For NRMLSIS2.0 you'll need gfortran version 4.8.5 or 6.3.0 or 8.1.0.
Note for Ubuntu: this means using Ubuntu 20.04 to be able to install the gfortran-8 libraries. I have not figured out a way to install older libraries in the latest stable version (22.04) without risking your workstation.
-
For HWM14 you'll need fftw3 (libfftw3-dev in Ubuntu; fftw-3 with port in MacOS).
-
-
Python 3.9 Conda: follow instructions here https://docs.conda.io/en/latest/miniconda.html
- Create the
arcade
envinroment with$ conda env create -f environment_cross_platform.yml
- When running the calculations, activate the environment with
$ conda activate arcade
. NOTE: If using hybrid models, you need to install the dependencies here too (see below).
- Create the
-
A Dockerfile and Docker image are available to ease the installation process. Please refer to the section Using Docker if you want to use this solution.
The climatologies used are the Horizontal Wind Model (HWM14) [Drob et al., 2014] to estimate horizonal winds in height, and NRLMSIS2.0 [Emmert et al., 2021] to estimate other properties of the atmosphere in height, like density, temperature and pressure. The 3D ray-tracing algorithm used is infraGA [Blom and Waxler, 2012].
The method has two main external components: the atmospheric model (HWM14+NRLMSIS2.0) and the 3D ray tracing algorithm (infraGA/geoAC). Each one of these components has an associated repository that needs to be inside repos
folder, with the names as defined in the makefile
. That is, ./repos/HWM14
, ./repos/NRLMSIS2.0
and ./repos/infraGA-master
. Below are the instructions to access each one of the components.
NRLMSIS 2.0 Code and all data samples used in this work are available at https://map.nrl.navy.mil/map/pub/nrl/NRLMSIS/NRLMSIS2.0 The associated publication is Emmert, J. T., Drob, D. P., Picone, J. M., Siskind, D. E., Jones, M., Mlynczak, M. G., et al. (2021). NRLMSIS 2.0: A whole-atmosphere empirical model of temperature and neutral species densities. Earth and Space Science, 8, e2020EA001321. https://doi.org/10.1029/2020EA001321
Available to download from the supporting information of Drob, D. P., Emmert, J. T., Meriwether, J. W., Makela, J. J., Doornbos, E., Conde, M., Hernandez, G., Noto, J., Zawdie, K. A., McDonald, S. E., et al. (2015), An update to the Horizontal Wind Model (HWM): The quiet time thermosphere, Earth and Space Science, 2, 301– 319, doi:10.1002/2014EA000089.
Available from the public seosmoacoustic repository of Los Alamos National Laboratory https://github.com/LANL-Seismoacoustics/infraGA
In addition to the empirical climatologies (NRMSLSIS2.0/HWM14), you can also use hybrid atmospheric descriptions that are a combination of ERA 5 ECMWF below ~80 km and empirical climatologies at higher altitudes. These models were used to compare and validate the climatological estimations, but can still be used for realistic aproaches.
The hybrid models need the Climate Data Store (CDS) infrastructure and API module, cdsapi
, provided by the European Centre for Medium-Range Weather Forecasts (ECMWF, https://www.ecmwf.int/). Please visit https://cds.climate.copernicus.eu/api-how-to to ensure the module is installed in your system, and properly configured API user and key to be able to download the atmospheric descriptions.
In order to decode the downloaded data and use it automatically for ray tracing with infraGA, it is necessary to install ecCodes
provided by ECMWF. Please follow the instructions here https://confluence.ecmwf.int/display/ECC/ecCodes+installation. I recommend to use the Python binding installation (https://confluence.ecmwf.int/display/UDOC/How+to+install+ecCodes+with+Python+bindings+in+conda+-+ecCodes+FAQ).
- Inside repo, run
$ make all-prep
. - Modify the input files
arcade_config.toml
anddiscretize_parameters.toml
to suit your model. (Currently it is setup with an example run for Puyehue-Cordon Caulle to IS02 on 2011-06-04 at 19:00:00 UTC) - Run
$ make run-arcade
to calculate the azimuth deviations. - After the modeling ends, the input configuration files and output figures
and results will be saved in the directory
path
/name
, which is set inarcade_config.toml
. Inside./output
, the text fileazimuth_deviation_table.txt
has the summary of results.
Here you set up the ARCADE_main.py run parameters.
run_type
:norm
- will use default non-perturbed atmospheric descriptionspert
- will use perturbed atmospheric descriptionsuse_thermo
:true
orfalse
, determines if using thermospheric arrivals or notmax_run
: maximum number of tries before giving up in searching for arrivals near each target stationatten_th
: attenuation threshold filter. Negative number representing the minimum attenuaton an arrival can have before not being considered in the estimations.min_dist_arrv
: distance in km that sets the maximum distance at which an arrival can be to be associated with a station in the first search process (later it gets reduces to 5 km)daz
: maximum distance in degrees to consider an arrival associated with a station. Measures azimuth difference.dphi
: determines the aperture to launch rays around an azimuth valuescale
: this parameter scales the azimuth steps (the bigger the smaller the step). CheckARCADE_main.py
to find relationship.thresh
: this is the final distance in degrees to declare the search as completed.k
: parameter involved in iterative search to look for arrivals around station before reachingthresh
.
Below [launch_parameters]
some other parameters related with the specific launch are set. These are related with the 3D ray tracing process:
incl_min
: minimum vertical launch angle in degreesincl_max
: maximum vertical launch angle in degreesincl_step
: step in degrees for launch angles, determine also the number of rays that are launcheddrng
: maximum extra range given a source-station geographical distance to modelbounces
: maximum number of 'bounces' to model. A bounce is then the ray reaches a turning height and comes back to the ground.src_alt
: altitude of source in kilometerswrite_atmo
:false
ortrue
, depending if you want to write the atmospheric profiles (it slows down calculations so starts asfalse
).calc_amp
:false
ortrue
, if calculating the amplitudes for the rays. As it slows down calculations so it's should befalse
for testing.perc_cpu
: from all available cpu/threads, how many will you use? Calculated as number of cpus/perc_cpu, so 1=100% of total, 2=50% of total, 3=33.333% of total and so on.
Below [project_info]
a couple of parameters about the project name and location are set. These are important to save the results and running parameters after the modeling has been completed (check wrap_proj.sh
).
name
: name of project as string between quotes. Example "Test_0".path
: path of project as string. Example:/home/yourname/Desktop/
. The project will be save inside this path after running with the name set inname
. That is: `/home/yourname/Desktop/Test_0".
Here you set up the parameters used in the discretization process. That is, when the area is defined thorugh the station locations, souce locations, grid steps, etc.
year
: year as YYYY. Example2011
.doys
: list of days of year to be calculated. Example[123, 124, 125, 145]
. It can also be just one day like[155]
.sec
: time of day in seconds. Example:68400
(19:00).hmin
: minimum altitude considered in kilometers for atmospheric descriptionshmax
: maximum altitude in kilometersdh
: altitude step to calculate atmospheric descriptions in kilometersf107a
: determines the solar activity for the climatologies, set as 'moderate' by default (value150
).f107
: related withf107a
, set as same value.apd
: determines the geomagnetic activity for the climatologies, set as 'quiet' by default (value4
).aph
: related withapd
, set as same number.sou_pos
: list of source locations in latitude-longitude pairs. Example:[[-41.33, -72.62], [-40.59, -72.117]]
. It can be just one source as[[-40.59, -72.117]]
.sta_pos
: list of station locations. Analogous tosou_pos
.sta_nam
: list of names of stations. Example:["IS02", "IS08", "IS09"]
. It should correspond tosta_pos
.sta_skp
: Not used in this version. List of station numbers from list you want to skip in the plotting process.ds
: step in kilometers to discretize along each path for calculating the average per height
Below [range_dependent]
there are parameters to determine is using a range-dependent calculation or not plus the grid discretization.
use_rng_dep
:false
ortrue
. By default is false as the range-repdendent calculations where not used in the paper associated with this repo. It is also much slower than the range-independent cases.dlat
anddlon
: the step size in degrees for the grids.
Below [ecmwf]
there are parameters related with the use of ERA 5 ECMWF data to construct the hybrid atmospheric descriptions.
use_ecmwf
:false
ortrue
, depending if using hybrid model. Of coure,true
means using it, whilefalse
not using it.auto_area
:true
orfalse
. Set asfalse
if setting the area to be downloaded manually ortrue
if determined by the source and station locations.min_lon
,max_lon
,min_lat
,max_lat
: sets the area ifauto_area
isfalse
.dlon
,dlat
: sets the grid step of the data. It depends on ERA 5 so I recomment 1 degree.h1
: sets minimum altitude to merge with climatologies.h2
: sets maximum altitude to merge with climatologies. The merging process starts ath1
with puse ERA 5 values, and ends ath2
with puse climatology values.
This is the main summary of results. It contains a header and each row represents the results of a specific source-station pair. Each column of this file is described below:
- Column 1,
Year
: the year as YYYY. - Column 2,
DOY
: Day of Year as three digit numbed (1 to 356). - Column 3,
SouNum
: source number starting from 1. Follows order fromsou_pos
fromdiscretize_parameters.toml
. - Column 4,
StaNum
: station number starting from 1. Follows order fromsta_pos
fromdiscretize_parameters.toml
. - Column 5,
TrueBaz
: geographical (true) backazimuth from station to particular source in degrees. - Column 6,
BazDevS
: stratospheric-only calculated backazimuth deviation from true. - Column 7,
#BazDS
: number of stratospheric ground intercepts found for this calculation. - Column 8,
BazDevT
: thermospheric-only calculated backazimuth deviation from true. - Column 9,
#BazDT
: number of thermospheric ground intercepts found for this calculation. - Column 10,
BazDevA
: average ofBazDevS
andBazDevT
to calculate the azimuth deviation estimation. - Column 11,
StdBDA
: standard deviation ofBazDevA
. - Column 12,
Ill
:True
if particular source-station pair could not be found, ofFalse
in the contrary case. A successful calculation should haveFalse
in this column.
Example: calculating the expected azimuth deviation at IS41 for infrasound signals coming from 2011 Puyehue-Cordón Caulle, Chile eruption
- For this example you should keep the configuration file
./input/arcade_config.toml
untouched, except for the last line (l. 52). Set the parameterpath = "FILLME"
with the path in your workstation where you would like to save your results (Example:path = /Users/rodrum/Desktop/
).
- Note 1: the project name is set in line 51 as
name = "PCCVC-Clim-Norm-StratoThermo"
. This will be the folder name that will contain your results. - Note 2:
perc_cpu
(line 47) is set as1
, which means that in theory the calculations will use all the cores in your workstation. You don't have to worry about this as this example calculates just one source to one station, which will use only one core (each profile takes one core).
- In the parameters file
./input/discretize_parameters.toml
you need to uncomment the coordinates of the station IS41 (or I41PY), and comment all the other predefined station locations. That is, comment line 26 and uncomment line 33. Everything else is already set-up for the example. - Once the input files have been modified and saved in place, just run
make run-arcade
and wait... With a 3.5 Ghz Intel Core i7 processor, this calculation should take about 19 minutes. - When the calculations are completed, you should have a folder named
name
in the path defined withpath
in the config. file./input/arcade_config.toml
.
The results are organized as follows:
input/
will contain the same configuration files used for the model to allow for reproducibility. It should also contain the filesdoys.txt
,sources.txt
, andstations.txt
that are automatically generated. These files are merely used for plotting during the process and not really necessary as the values come from the filesarcade_config.toml
anddiscretize_parameters.toml
.output/
-
azimuth_deviation_table.txt
- file containing the summary of the azimuth deviations. The colums are explained in detail in the previous section. The file should contain these two lines:Year DOY SouNum StaNum TrueBaz BazDevS #BazDS BazDevT #BazDT BazDevA StdBDA Ill 2011 155 1 1 217.16 6.65 15 6.67 17 6.66 6.37e-02 False
Meaning that the average azimuth deviation is 6.66 degrees, calculated with 15 stratospheric arrivals and 17 thermospheric arrivals.
-
figures/
: this folder contains useful figures that serve to have a visual idea of the locations of the source and station as well as the atmospheric winds for each profile.glob_XXX_YYYYY_ZZZZ.png
- a global map of the location of the station numberZZZZ
and source numberYYYYY
. In this example the numbers should be0001
and00001
for the station and source, respectively.XXX
describes the day of the year number (DOY) and in this case is set as155
(June 4 of 2011).prof_XXX_YYYYY_ZZZZ.png
- Top: a map that contains the last figure in the top. Middle: Temperature (K), zonal winds(E-W direction, m/s), meridional winds (N-S direction, m/s), density (g/cm^3), and pressure (mbar) in height. Bottom: along-profile (source-station) winds, and across-winds (perpendicular to source-station).arrv_XXX_YYYYY_ZZZZ.png
- Snapshot of last iteration of ground intecept (arrivals) for the profile. Top-left: colored by Transmission Loss (dB). Top-right: colored by travel time (hours). Bottom-left: colored by celerity (km/s). Bottom-right: colored by turnin height (km).
-
nodes/
contains files related with the calculations, basically middle-ground files to help obtain atmospheric descriptions with HWM14/NRMLSIS2.0 -
proc/
contains important output of the calculations that can be used lated to a deeper analysisrun_arcade_out.txt
- is the output of the main calculations. Check this file if anything goes wrong and the calculations fail. It should also contain the time it took to calculate the model in the last three lines (is the output oftime
in Unix).prof_XXX_YYYYY_ZZZZ_SOMENAME.txt
- contains the output of each iteration search done with ARCADE. The nameSOMENAME
will change depending in the kind of model you choose to calculate. In this case it will be just_out
, as we are using raw empirical climatologies. Use this file to check how the iterations turn out.arrv/
- contains the arrivals for each iteration and profile (output of 3d rat tracing infraGA). As the number of files could be huge, this feature should be disabled for large combinations of sources and stations. The format of each file name isNUM-A_prof_XXX_YYYYY_ZZZZ.arrivals.dat
, whereNUM
is the run number (iteration number), andA
could be1
or2
depending on the launch azimuth used to enclose the target station (see PAPER TBD). Use these files to understand the iteration process, or even make a video as each files is a snapshot of the resulting 3d ray tracing calculations. Note that the plotarrv_XXX_YYYYY_ZZZZ.png
is actually the last snapshot of the profileXXX_YYYYY_ZZZZ
.rays/
- contains the 3D ray paths for each iteration and profile (also output of infraGA). The format is similar to the files inarrv/
. You can use this files to plot the 3D rays.
-
profiles/
contains the atmospheric descriptions for the source-station direction. The files.met
are the descriptions that infraGA uses to run 3D ray tracing.
-
We published a Docker image to simplify the installation process of ARCADE. By using Docker, you can quickly set up the required environment without worrying about the dependency installations.
Before you begin, ensure that you have Docker installed on your system. You can download and install Docker by following the instructions provided in the Docker documentation.
- Pull the Docker image from the GitHub Container Registry (ghcr.io) by running the following command in your terminal:
docker pull ghcr.io/username/docker-image:tag
- Once the Docker image is successfully downloaded, you can create a new container and start the software using the following command:
docker run -it --name arcade ghcr.io/username/docker-image:tag
This will start interactively a new docker container based on the previously downloaded image and named "arcade". All the dependencies are already installed and you can start using the software. Nano is installed by defaults and you can use it to modify the input files. **Note that you don't need to run `conda activate arcade` before `make run-arcade`, the environment is activated by defaults.**
- After exiting the container, you can restart it using:
docker start -i arcade
To retrieve the results of ARCADE, you have two solutions.
- Using
docker cp
: This solution should be used if you want to retrieve results from small runs or only the result table. - Using bind mounts: This solution should be used if you want to retrieve all the data from the output folder in case of long runs.
In a terminal type:
docker cp arcade:/app/results/path/to/folder /path/to/destination
This will copy the file or folder specified to the given destination.
docker cp
can be quite especially when copying large folders/files. We propose another solution: bind mounts.
Note that bind mounts may slow down file access on MacOS and Windows with WSL2 platforms.
Bind mounts allow the docker container to directly write on the host instead of inside the docker container. To use a bind mount, you need to run the container with the following command:
docker run -it --name arcade --mount type=bind,source=/path/on/the/host,target=/app/results ghcr.io/username/docker-image:tag
- Robin Matoza: concept, analisis, and many ideas
- Jeremy Francoeur: proofreading, testing