# Vector Data Mapping
## Introduction
This part will introduce you to the geographic boundaries of our air quality study area, as well as the locations of key pollution and meteorological sensors. We close with a presentation of point source pollution locations, such as factories and other manufacturing facilities.
## Environment Setup
The following packages must be installed on your computer before proceeding.
* `tmap`, for flexible thematic mapping
* `sf`, for spatial vector data manipulation
* `dplyr`, for `data.frame` manipulation
Load the packages listed above, then set `tmap` to interactive mode for this tutorial.
```{r message=TRUE, warning=FALSE}
library(tmap)
library(sf)
library(dplyr)
tmap_mode('view')
```
## County Boundaries
The Chicago metropolitan area is the extent of the air quality study area. The study area is centered on Chicago, a major American city of 2.7 million people (2018) located on the southwestern shore of Lake Michigan, within Cook County. However, the study spans a total of 21 counties along Lake Michigan, covering northeastern Illinois, northwestern Indiana, and southern Wisconsin. Also included in the study area is the Midwestern city of Milwaukee, approximately 100 miles north of Chicago.
The first order of business is to load spatial data. The 21 county boundaries are contained in an ESRI Shapefile ending in `.shp`. The code snippet below imports the shapefile as an `sf` object in R.
```{r message=FALSE, warning=FALSE, results = 'hide'}
counties = sf::st_read('./data/LargeAreaCounties/LargeAreaCounties.shp')
```
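To see what `st_read()` returns without relying on the project data, here is a minimal sketch that writes a toy polygon to a temporary file and reads it back. The polygon, its attribute values, and the temporary file are hypothetical stand-ins, not part of the study data.

```{r}
library(sf)

# Hypothetical toy polygon standing in for a county boundary (not the real data)
square = sf::st_polygon(list(rbind(
  c(-88, 41), c(-87, 41), c(-87, 42), c(-88, 42), c(-88, 41)
)))
toy = sf::st_sf(
  COUNTYNAME = "Toy", STATE = "IL",
  geometry = sf::st_sfc(square, crs = 4326)
)

# Write to a temporary GeoJSON file, then read it back as in the chunk above
path = tempfile(fileext = ".geojson")
sf::st_write(toy, path, quiet = TRUE)
toy_read = sf::st_read(path, quiet = TRUE)

class(toy_read)      # an sf object is also a data.frame
sf::st_crs(toy_read) # coordinate reference system stored with the file
names(toy_read)      # attribute columns plus the geometry column
```

The same three checks can be run on `counties` to confirm the number of features, the coordinate reference system, and the available attribute columns before mapping.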
Next, the spatial data is plotted using the `tmap` map making package. Navigate through this interactive map using the mouse and scroll wheel. Counties located in the same state are colored similarly. Hover over a county and a popup will appear displaying the two-letter state abbreviation. Notice how the study area extends to areas beyond just the city boundaries of Chicago.
```{r}
tm_shape(counties) +
tm_borders() +
tm_text("COUNTYNAME", size = 0.7, along.lines = TRUE) +
tm_fill(col = "STATE", alpha = 0.5)
```
## State Boundaries
To produce accurate analyses across the 21-county study area, we need to understand the wider region in which these counties sit, because region-wide weather and pollution patterns affect the local pollution levels observed in the study area. Since our counties are situated in the Midwestern region of the United States, data are collected across four U.S. states: Illinois, Indiana, Michigan, and Wisconsin.
As before, data on the geographic boundaries of these four states can be loaded into R as shown below.
```{r message=FALSE, warning=FALSE, results = 'hide'}
states = sf::st_read('./data/FourStates/FourStates.shp')
```
The code snippet below produces an interactive map of the four states. For reference, the original 21-county study area is shown in gray at the center of the map. Each state is drawn in a different color for easy differentiation. This map gives a sense of how much air quality data must be collected just to predict pollution levels in our relatively small 21-county study area.
```{r}
tm_shape(states) +
tm_borders() +
tm_text("NAME", size = 0.8, auto.placement = TRUE) +
tm_fill(col = "NAME", alpha = 0.5) +
tm_shape(counties) +
tm_fill(col = "black", alpha = 0.25) + tm_borders(col = "black", alpha = 0.25)
```
```{r, include = FALSE}
basemap = tm_shape(states) +
tm_borders() +
tm_text("NAME", size = 0.8, auto.placement = TRUE) +
tm_fill(col = "NAME", alpha = 0.5) +
tm_shape(counties) +
tm_fill(col = "black", alpha = 0.25) + tm_borders(col = "black", alpha = 0.25)
```
## EPA Particulate Matter Stations
Over 127 individual PM2.5 pollution monitoring stations are located across the four-state data collection area. These sensors are maintained by state-level environmental protection agencies, and the collected data are reported to the federal Environmental Protection Agency (EPA). Some sensors are digital and provide pollution observations at least every day, and sometimes every hour. However, the vast majority are analog and only report data every three to twelve days. Analog sensors work by drawing air through a paper wafer designed to trap particulate matter. For each observation, a technician removes the wafer from the instrument and installs a new one for the next observation. The used wafer is then sent to a lab where it is weighed, creating an estimate of average PM2.5 concentration during the monitoring period.
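The weighing step comes down to simple arithmetic: the mass gained by the wafer divided by the volume of air drawn through it. The sketch below uses made-up numbers; the wafer weights, flow rate, and sampling duration are illustrative assumptions, not values from the sensors in this dataset.

```{r}
# Hypothetical wafer weights in micrograms (illustrative values only)
weight_before_ug = 150000
weight_after_ug  = 150288

# Assumed sampling setup: 16.67 litres of air per minute for 24 hours
flow_l_per_min = 16.67
duration_min   = 24 * 60

# Volume of air sampled, converted from litres to cubic metres
volume_m3 = flow_l_per_min * duration_min / 1000

# Average PM2.5 concentration over the monitoring period, in ug/m^3
pm25_ug_m3 = (weight_after_ug - weight_before_ug) / volume_m3
round(pm25_ug_m3, 1)
```

Because the wafer accumulates particles over the whole sampling window, the result is an average for that window and carries no information about hour-to-hour variation.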
Location information of sensors that were active between 2014 and 2018 is imported into R from a pre-compiled `.geojson` file.
```{r message=FALSE, warning=FALSE, results = 'hide'}
sensors = sf::st_read('./data/PM25_4States_2014.2018_POINTLOCS.geojson') %>%
  dplyr::mutate(rec_duration = as.numeric(lastRec - firstRec))
```
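The `mutate()` step above derives each sensor's recording duration by subtracting two date columns. A minimal sketch of that calculation on a toy table follows. It assumes `firstRec` and `lastRec` are `Date` columns, which depends on how the file is read, and uses `difftime()` with explicit units so the result is unambiguously in days. The `site` column and all values are hypothetical.

```{r}
library(dplyr)

# Hypothetical sensor records; real column types depend on how the file is read
toy_sensors = data.frame(
  site     = c("A", "B"),
  firstRec = as.Date(c("2014-01-01", "2016-06-15")),
  lastRec  = as.Date(c("2018-12-31", "2017-06-15"))
)

# difftime() with explicit units makes the unit of the result unambiguous
toy_sensors = toy_sensors %>%
  dplyr::mutate(
    rec_duration = as.numeric(difftime(lastRec, firstRec, units = "days"))
  )

toy_sensors$rec_duration
```

If the date columns arrive as character strings instead, they would need to be converted with `as.Date()` before the subtraction.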
The map below is identical to the earlier map of the four-state area, except that it now includes the locations of EPA sensors. The colored circles show the number of sensors in each local region. Zoom into the map using the scroll wheel to discover the exact locations of EPA pollution sensors across the four-state area, each represented by the EPA logo. Notice how they are concentrated around large urban areas. This is by design, as the EPA's mission is to monitor pollution levels where people live and work, which is mostly in and around cities. The area around Chicago contains approximately 17 sensors, while the Milwaukee area has 6 sensors. Clicking on a given sensor reveals details about when the sensor was active. Note that not all sensors were active during the entire five-year study period between 2014 and 2018.
```{r}
basemap +
tm_shape(sensors) +
tm_markers(shape = tmap_icons('https://github.com/GeoDaCenter/OpenAirQ-toolkit/blob/master/data/assets/EPA_logo.png?raw=true'))
```
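Sensor counts for an area, such as those quoted above, can be checked with a point-in-polygon count. The following is a minimal sketch on toy data; the polygon and point coordinates are hypothetical and do not correspond to real sensors or county boundaries.

```{r}
library(sf)

# Hypothetical area polygon and sensor points (toy coordinates, not real data)
area = sf::st_sfc(
  sf::st_polygon(list(rbind(
    c(-88, 41), c(-87, 41), c(-87, 42), c(-88, 42), c(-88, 41)
  ))),
  crs = 4326
)
toy_points = sf::st_as_sf(
  data.frame(id = 1:3, lon = c(-87.5, -87.2, -85.0), lat = c(41.5, 41.8, 43.0)),
  coords = c("lon", "lat"), crs = 4326
)

# st_intersects() lists, for each polygon, the indices of the points inside it
hits = sf::st_intersects(area, toy_points)
lengths(hits) # number of points inside each polygon
```

Applied to the real objects, the same pattern would be `lengths(sf::st_intersects(counties, sensors))`, provided both layers share a coordinate reference system.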
## Weather Stations
Air quality is directly affected by regional and local weather patterns, so a dense network of weather stations is an important tool in our data toolbox. High temporal resolution weather data were sourced from a large network of ground-based weather stations typically located at airports. These sensors form the Automated Surface Observing System (ASOS) and provide hourly data on many weather characteristics, such as temperature, pressure, wind velocity, and wind direction. ASOS sensors are maintained by a variety of agencies, including the National Oceanic and Atmospheric Administration (NOAA), the Federal Aviation Administration (FAA), and the Department of Defense (DoD). Their primary purpose is to provide real-time, accurate weather information for pilots landing at or departing from a given airport. Additionally, they serve as primary weather sensors for a variety of NOAA weather models. All sensors are digital, with data being continuously monitored in real time for errors and inconsistencies.
The locations of all sensors in the study area are loaded as shown below.
```{r message=FALSE, warning=FALSE, results = 'hide'}
asos = sf::st_read('./data/4States_ASOS_2018_Locations.geojson')
```
The map below shows the distribution of sensors in the four-state data collection area, which contains a total of 249 sensors. As with the EPA sensor map, use the scroll wheel to zoom in and the mouse to explore the map. Individual sensors are represented with a blue airplane icon. Clicking on an airport reveals its three-letter airport identifier and full name. As these weather sensors are used for aviation purposes, they report data continuously throughout the day once installed.
```{r}
basemap +
tm_shape(asos) +
tm_markers(shape = tmap_icons('https://github.com/GeoDaCenter/OpenAirQ-toolkit/blob/master/data/assets/airport_icon.png?raw=true'))
```
## Point Sources of Pollution
The last vector dataset in our toolkit is the set of point source emission locations, sourced from the EPA National Emissions Inventory of 2014. These locations are known sources of pollution, including factories, processing plants, heavy industrial installations, power plants, major HVAC systems, and more. Each location comes with information on the quantity of pollution released at the site per year. However, these numbers are self-reported by the polluting firm and are likely underestimates of the true pollution occurring at that location. We load the location dataset as shown in the code snippet below. Each pollution source is represented as a point feature.
```{r message=FALSE, warning=FALSE, results = 'hide'}
points.pollution = sf::st_read('./data/Point_Source_Emissions_4States_2014.geojson')
```
This map shows the spatial distribution of point source emissions across the four-state area. There are over 33,000 individual locations in the inventory. Navigate through this interactive map by clicking on the colored circles to zoom into smaller and smaller areas. Each individual point source is represented by a blue icon. Click on this icon to reveal more information about pollution at that location, including the firm creating the pollution and the amount of pollution generated per year at that site.
```{r}
basemap +
tm_shape(points.pollution) +
tm_markers()
```
## Further Resources {-}
You made it to the end of the first chapter, great job! If you would like to learn more about how to source these datasets, see Appendix E. Additional resources describing the origin of these datasets are provided below.
* See the [EPA's air quality information page](https://www.epa.gov/outdoor-air-quality-data/air-data-basic-information) for more information on air quality measuring in the United States.
* Refer to [Iowa State University](https://mesonet.agron.iastate.edu/ASOS/) for more information on the ASOS weather station network.
* See the [EPA's National Emissions Inventory homepage](https://www.epa.gov/air-emissions-inventories/2014-national-emissions-inventory-nei-data) for more information on point source emissions.
* More information about mapping in R using `tmap` and other packages can be found in [Chapter 8 Making maps with R](https://geocompr.robinlovelace.net/adv-map.html) from Geocomputation with R by Robin Lovelace, Jakub Nowosad, and Jannes Muenchow.