-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy path09-sensor2raster.Rmd
105 lines (75 loc) · 6.63 KB
/
09-sensor2raster.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
# Merging Satellite and Point Sensor Data
There's a fundamental challenge in our project to merge AOD data and predictor variables because the data capture techniques are very different. The satellite-based AOD data is measured continuously across the surface of the earth on a 1km by 1km grid system. Meanwhile, sensor data is only captured locally at the sensor location. Therefore, a method to interpolate local sensor data to generate a continuous surface of data is required.
An 'Optimized' IDW interpolation was used to estimate sensor values across a 1km by 1km grid system. This method takes into account the sensor locations and value for each variable to estimate values in grid cells without a sensor based on a linear interpolation of nearby sensor values. The specific number of sensors to take into account and the distance decay power function were optimized by medinimizing the RMSE (Error). This method was adapted from an [RSpatial Tutorial](https://rspatial.org/raster/analysis/4-interpolation.html#calfornia-air-pollution-data) on IDW interpolation with pollution and weather data.
To simplify implementation and replication, the entire workflow was coded in R and bundled into a packaged named `sensor2raster`. The next sections demonstrate how to apply this package to sensor data.
```{r message=FALSE, warning=FALSE}
# devtools::install_local('../data/sensor2raster_0.4.tar.gz')
library(sensor2raster, quietly = TRUE)
```
```{r include=FALSE}
weather.data = readr::read_csv('./data/sensor2raster/2014_ASOS_Data.csv.gz')
#AOD.grid = sensor2raster::read_raster('https://uchicago.box.com/shared/static/itoegqbp37lrmkdzqobvyx7dw2h61u0m.zip')
AOD.grid = raster::raster('./data/AOD_21Counties_MasterGrid/AOD_21Counties_MasterGrid.grd')
```
## Generate Rasters
Creating raster surfaces is easy using the `sensor2raster` function. This function takes the raw output from the `riem` or `aqsr` packages and identifies the sensor locations and data values to interpolate. The underlying IDW interpolation is performed by the `gstat` package.
The code chunk below demonstrates how to take ASOS weather data and convert it to a Raster format. The ``weather data`` variable is a data frame containing temperature data measured at airports in the Greater Chicago and Milwaukee Areas in 2018. We also pass the ``AOD.grid``, a RasterLayer object representing the grid cells where we want to predict temperature. These grid cells correspond exactly to the pixels of satellite AOD data.
```{r echo=TRUE, message=FALSE, warning=FALSE, results = 'hide'}
temp.rasters = sensor2raster(sensor.data = weather.data, # Input the raw data.frame from riem
data.type = 'ASOS', # Specify data type ('ASOS' or 'EPA')
reference.grid = AOD.grid, # Grid to interpolate over
subvariable = 'tmpf') # Column name of variable to interpolate
```
```{r echo=FALSE, fig.height=12, fig.width=12, message=FALSE, warning=FALSE}
library(dplyr)
library(sf)
library(sp)
library(rasterVis)
Lake.MI = st_read('./data/Lake_Michigan_Shoreline.geojson', quiet = T) %>% st_transform(4326) %>% as('Spatial')
counties = st_read('./data/LargeAreaCounties/LargeAreaCounties.shp', quiet = T) %>% st_transform(4326)
data = temp.rasters[[1]]
names(data) = month.name
levelplot(data %>% mask(counties),
par.settings = BuRdTheme,
at = seq(minValue(data) %>% min() - .1, maxValue(data) %>% max() + 2, by = 2),
main = 'Evolution of Monthly Average Temperature (ºF) in 2018 across the Chicago/Milwaukee Metro Areas'
) + layer(sp.polygons(Lake.MI, fill = 'cadetblue2', col = 'transparent')) + layer(sp.lines(as(counties, 'Spatial'), alpha = .5))
```
## Export to CSV
While Raster data is helpful for spatial data analysis and geovisualizations, it is sometimes helpful to store the interpolation in a non-spatial format. The ```grid2csv``` function allows you to convert Raster data to CSV either cell-by-cell, or by aggregating to a vector geometry.
The exported data.frame is halfway between Long and Wide format due to the 3-dimensional nature of our data. The table below describes how these CSVs are structured in the cell-by-cell case.
| Var_Name | Raster.Cell | M1.2018 | M2.2018 | ... | M12.2018 |
|---------------------|-------------|---------|---------|-----|----------|
| Monthly_Temperature | 1 | 23 | 25 | ... | 20 |
| Monthly_Temperature | 2 | 23 | 25 | ... | 20 |
| ... | ... | ... | ... | ... | ... |
| Monthly_Temperature | 100 | 10 | 15 | ... | 11 |
The length of the table equals the number of cells in the RasterStack. Each cell is given a unique identifier stored in the `Raster.Cell` column. The `Var_Name` colums represent the variable of interest. When there are multiple variables, they are row binded together, giving a long table format. The rest of the columns represent the names given to the layers within each RasterStack. Im this case, each column represents a month and year combination. Additional time periods are appended column-wise to the table.
The following table describes the outputted data frame from the monthly temperature Rasters generated earlier.
```{r message=FALSE}
temp.export = grid2csv(rasters = list(temp.rasters[[1]]),
var.names = 'Monthly_Temperature')
```
```{r echo=FALSE}
knitr::kable(head(temp.export))
```
The format of the table changes slightly when subsetting using a vecor object. A new column `sj_join` appears in the table, representing a unique allowing the table to be joined back to the origin `sf` object if needed. When subsetting by point features, the `Raster_Cell` column describes the cell that overlapped with each point feature. When subsettting by line or polygon features, the `Raster_Cell` column describes the cell that overlapped with the centroid of each geometry.
```{r message=FALSE, warning=FALSE}
temp.export.sf = grid2csv(rasters = list(temp.rasters[[1]]),
var.names = 'Monthly_Temperature',
sf.obj = counties)
```
```{r echo=FALSE}
knitr::kable(head(temp.export.sf)[,1:5])
```
The code chunk below demonstrates how to exploit the `sf_join` field to join the table data back to the spatial `sf` object.
```{r message=FALSE, warning=FALSE}
# counties.join = counties %>% tibble::rowid_to_column() %>% dplyr::rename(sf_join = rowid)
#
# counties.join = grid2csv(rasters = list(temp.rasters[[1]]), var.names = 'Monthly_Temperature', sf.obj = counties.join) %>%
# left_join(counties.join) %>%
# st_as_sf()
```
```{r echo=FALSE}
# plot(counties.join['M1.2018'], main = "Average Temperature in January 2018 by County")
```