Update README #3

Merged
merged 1 commit into from
Dec 7, 2023
116 changes: 58 additions & 58 deletions README.md
@@ -5,11 +5,14 @@
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/blackmarbler)](https://cran.r-project.org/package=blackmarbler)
[![R-CMD-check](https://github.com/dime-worldbank/googletraffic/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/worldbank/blackmarbler/actions/workflows/R-CMD-check.yaml)
![downloads](http://cranlogs.r-pkg.org/badges/grand-total/blackmarbler)
[![GitHub Repo stars](https://img.shields.io/github/stars/worldbank/blackmarbler)](https://github.com/worldbank/blackmarbler)
[![activity](https://img.shields.io/github/commit-activity/m/worldbank/blackmarbler)](https://github.com/worldbank/blackmarbler/graphs/commit-activity)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

<!-- badges: end -->

__BlackMarbleR__ is an R package for working with Black Marble data. [Black Marble](https://blackmarble.gsfc.nasa.gov/) is a [NASA Earth Observatory](https://earthobservatory.nasa.gov/) project that provides global nighttime lights data. The package automates the process of downloading all relevant tiles from the NASA LAADS archive to cover a region of interest, converting the raw files (in H5 format) to georeferenced rasters, and mosaicing rasters together when needed.
**BlackMarbleR** is an R package that provides a simple and efficient way to retrieve and extract nighttime lights data from NASA's Black Marble project. [Black Marble](https://blackmarble.gsfc.nasa.gov) is a [NASA Earth Observatory](https://earthobservatory.nasa.gov) project that provides a product suite of daily, monthly and yearly global nighttime lights. This package automates the process of downloading all relevant tiles from the [NASA LAADS archive](https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/5000/VNP46A3/) to cover a region of interest, converting the raw files (in HDF5 format) to georeferenced rasters, and mosaicing rasters together when needed.

* [Installation](#installation)
* [Bearer token](#token)
* [Usage](#usage)
* [Setup](#setup)
@@ -42,7 +45,7 @@ The function requires using a **Bearer Token**; to obtain a token, follow the be
3. Click "See wget Download Command" (button near top, in the middle)
4. After clicking, you will see text that can be used to download data. The "Bearer" token will be a long string in red.

__After logging in, the below will show the bearer token in red instead of `INSERT_DOWNLOAD_TOKEN_HERE`.__ Sometimes, after logging in, the NASA website will redirect to another part of the website. To obtain the bearer token, just navigate to the [NASA LAADS Archive](https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/5000/VNP46A3/) after logging in.
**After logging in, the below will show the bearer token in red instead of `INSERT_DOWNLOAD_TOKEN_HERE`.** Sometimes, after logging in, the NASA website will redirect to another part of the website. To obtain the bearer token, just navigate to the [NASA LAADS Archive](https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/5000/VNP46A3/) after logging in.

<p align="center">
<img src="man/figures/nasa_laads_login.png" alt="NASA LAADS Bearer Token" width="800"/>
@@ -68,7 +71,7 @@ bearer <- "BEARER-TOKEN-HERE"

### ROI
# Define region of interest (roi). The roi must be (1) an sf polygon and (2)
# in the WGS84 (epsg:4326) coordinate reference system. Here, we use the
# in the WGS84 (epsg:4326) coordinate reference system. Here, we use the
# gadm function (geodata package) to load a polygon of Ghana
roi_sf <- gadm(country = "GHA", level=1, path = tempdir()) |> st_as_sf()
```
@@ -132,30 +135,30 @@ r <- bm_raster(roi_sf = roi_sf,
product_id = "VNP46A3",
date = "2021-10-01",
bearer = bearer)

#### Prep data
r <- r |> mask(roi_sf)
r <- r |> mask(roi_sf)

r_df <- rasterToPoints(r, spatial = TRUE) |> as.data.frame()
names(r_df) <- c("value", "x", "y")

## Remove very low values of NTL; can be considered noise
## Remove very low values of NTL; can be considered noise
r_df$value[r_df$value <= 2] <- 0

## Distribution is skewed, so log
r_df$value_adj <- log(r_df$value+1)

##### Map
##### Map
p <- ggplot() +
geom_raster(data = r_df,
aes(x = x, y = y,
geom_raster(data = r_df,
aes(x = x, y = y,
fill = value_adj)) +
scale_fill_gradient2(low = "black",
mid = "yellow",
high = "red",
midpoint = 4.5) +
labs(title = "Nighttime Lights: October 2021") +
coord_quickmap() +
coord_quickmap() +
theme_void() +
theme(plot.title = element_text(face = "bold", hjust = 0.5),
legend.position = "none")
@@ -187,7 +190,7 @@ ntl_df |>
y = "NTL Luminosity",
title = "Ghana Admin Level 1: Annual Average Nighttime Lights") +
theme_minimal() +
theme(strip.text = element_text(face = "bold"))
theme(strip.text = element_text(face = "bold"))
```

<p align="center">
@@ -205,11 +208,11 @@ The below code produces a dataframe of nighttime lights for each date, where ave
dir.create(file.path(getwd(), "bm_files"))
dir.create(file.path(getwd(), "bm_files", "daily"))

# Extract daily-level nighttime lights data for Ghana's first administrative divisions.
# Save a separate dataset for each date in the `bm_files/daily` directory created above.
# The code extracts data from January 1, 2023 to today. Given that daily nighttime lights
# data is produced with roughly a one-week delay, the function will only extract data that exists;
# it will skip extracting data for dates where data has not yet been produced by NASA Black Marble.
# Extract daily-level nighttime lights data for Ghana's first administrative divisions.
# Save a separate dataset for each date in the `bm_files/daily` directory created above.
# The code extracts data from January 1, 2023 to today. Given that daily nighttime lights
# data is produced with roughly a one-week delay, the function will only extract data that exists;
# it will skip extracting data for dates where data has not yet been produced by NASA Black Marble.
bm_extract(roi_sf = roi_sf,
product_id = "VNP46A2",
date = seq.Date(from = ymd("2023-01-01"), to = Sys.Date(), by = 1),
@@ -229,69 +232,66 @@ file.path(getwd(), "bm_files", "daily") |>

### Functions <a name="functions">

The package provides two functions.
The package provides two functions.

* `bm_raster` produces a raster of Black Marble nighttime lights.
* `bm_extract` produces a dataframe of aggregated nighttime lights to a region of interest (e.g., average nighttime lights within US States).
* `bm_raster` produces a raster of Black Marble nighttime lights.
* `bm_extract` produces a dataframe of aggregated nighttime lights to a region of interest (e.g., average nighttime lights within US States).

Both functions take the following arguments:

### Required arguments <a name="args-required">

* __roi_sf:__ Region of interest; sf polygon. Must be in the [WGS 84 (epsg:4326)](https://epsg.io/4326) coordinate reference system. For `bm_extract`, aggregates nighttime lights within each polygon of `roi_sf`.
* **roi_sf:** Region of interest; sf polygon. Must be in the [WGS 84 (epsg:4326)](https://epsg.io/4326) coordinate reference system. For `bm_extract`, aggregates nighttime lights within each polygon of `roi_sf`.

* __product_id:__ One of the following:
* **product_id:** One of the following:

- `"VNP46A1"`: Daily (raw)
- `"VNP46A2"`: Daily (corrected)
- `"VNP46A3"`: Monthly
- `"VNP46A4"`: Annual
* `"VNP46A1"`: Daily (raw)
* `"VNP46A2"`: Daily (corrected)
* `"VNP46A3"`: Monthly
* `"VNP46A4"`: Annual

* __date:__ Date of raster data. Entering one date will produce a raster. Entering multiple dates will produce a raster stack.
* **date:** Date of raster data. Entering one date will produce a raster. Entering multiple dates will produce a raster stack.

- For `product_id`s `"VNP46A1"` and `"VNP46A2"`, a date (e.g., `"2021-10-03"`).
- For `product_id` `"VNP46A3"`, a date or year-month (e.g., `"2021-10-01"`, where the day will be ignored, or `"2021-10"`).
- For `product_id` `"VNP46A4"`, year or date (e.g., `"2021-10-01"`, where the month and day will be ignored, or `2021`).
* For `product_id`s `"VNP46A1"` and `"VNP46A2"`, a date (e.g., `"2021-10-03"`).
* For `product_id` `"VNP46A3"`, a date or year-month (e.g., `"2021-10-01"`, where the day will be ignored, or `"2021-10"`).
* For `product_id` `"VNP46A4"`, year or date (e.g., `"2021-10-01"`, where the month and day will be ignored, or `2021`).

* __bearer:__ NASA bearer token. For instructions on how to create a token, see [here](https://github.com/worldbank/blackmarbler#bearer-token-).
* **bearer:** NASA bearer token. For instructions on how to create a token, see [here](https://github.com/worldbank/blackmarbler#bearer-token-).
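
The date rules above can be illustrated with a small base-R sketch. `normalize_bm_date` is a hypothetical helper written for this illustration, not a function exported by the package, and the way the package parses dates internally may differ.

```r
# Hypothetical helper (not part of blackmarbler): reduce a user-supplied
# date to the granularity that each product_id actually uses.
normalize_bm_date <- function(date, product_id) {
  date <- as.character(date)
  if (product_id %in% c("VNP46A1", "VNP46A2")) {
    # Daily products: the full date matters
    format(as.Date(date), "%Y-%m-%d")
  } else if (product_id == "VNP46A3") {
    # Monthly product: the day is ignored, so keep "YYYY-MM"
    substr(date, 1, 7)
  } else if (product_id == "VNP46A4") {
    # Annual product: month and day are ignored, so keep "YYYY"
    substr(date, 1, 4)
  } else {
    stop("Unknown product_id: ", product_id)
  }
}

normalize_bm_date("2021-10-01", "VNP46A3")
normalize_bm_date(2021, "VNP46A4")
```
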

### Optional arguments <a name="args-optional">

* __variable:__ Variable used to create the raster (default: `NULL`). For information on all variable choices, see [here](https://ladsweb.modaps.eosdis.nasa.gov/api/v2/content/archives/Document%20Archive/Science%20Data%20Product%20Documentation/VIIRS_Black_Marble_UG_v1.2_April_2021.pdf); for `VNP46A1`, see Table 3; for `VNP46A2` see Table 6; for `VNP46A3` and `VNP46A4`, see Table 9. If `NULL`, uses the following default variables:
* **variable:** Variable used to create the raster (default: `NULL`). For information on all variable choices, see [here](https://ladsweb.modaps.eosdis.nasa.gov/api/v2/content/archives/Document%20Archive/Science%20Data%20Product%20Documentation/VIIRS_Black_Marble_UG_v1.2_April_2021.pdf); for `VNP46A1`, see Table 3; for `VNP46A2` see Table 6; for `VNP46A3` and `VNP46A4`, see Table 9. If `NULL`, uses the following default variables:

* For `product_id` `"VNP46A1"`, uses `DNB_At_Sensor_Radiance_500m`.
* For `product_id` `"VNP46A2"`, uses `Gap_Filled_DNB_BRDF-Corrected_NTL`.
* For `product_id`s `"VNP46A3"` and `"VNP46A4"`, uses `NearNadir_Composite_Snow_Free`.

- For `product_id` `"VNP46A1"`, uses `DNB_At_Sensor_Radiance_500m`.
- For `product_id` `"VNP46A2"`, uses `Gap_Filled_DNB_BRDF-Corrected_NTL`.
- For `product_id`s `"VNP46A3"` and `"VNP46A4"`, uses `NearNadir_Composite_Snow_Free`.
* **quality_flag_rm:** Quality flag values to use to set values to `NA`. Each pixel has a quality flag value, where low quality values can be removed. Values are set to `NA` for each value in the `quality_flag_rm` vector. (Default: `c(255)`).

* __quality_flag_rm:__ Quality flag values to use to set values to `NA`. Each pixel has a quality flag value, where low quality values can be removed. Values are set to `NA` for each value in the `quality_flag_rm` vector. (Default: `c(255)`).
* For `VNP46A1` and `VNP46A2` (daily data):
* `0`: High-quality, Persistent nighttime lights
* `1`: High-quality, Ephemeral nighttime Lights
* `2`: Poor-quality, Outlier, potential cloud contamination, or other issues
* `255`: No retrieval, Fill value (masked out on ingestion)

- For `VNP46A1` and `VNP46A2` (daily data):
- `0`: High-quality, Persistent nighttime lights
- `1`: High-quality, Ephemeral nighttime Lights
- `2`: Poor-quality, Outlier, potential cloud contamination, or other issues
- `255`: No retrieval, Fill value (masked out on ingestion)

- For `VNP46A3` and `VNP46A4` (monthly and annual data):
- `0`: Good-quality, The number of observations used for the composite is larger than 3
- `1`: Poor-quality, The number of observations used for the composite is less than or equal to 3
- `2`: Gap filled NTL based on historical data
- `255`: Fill value
* For `VNP46A3` and `VNP46A4` (monthly and annual data):
* `0`: Good-quality, The number of observations used for the composite is larger than 3
* `1`: Poor-quality, The number of observations used for the composite is less than or equal to 3
* `2`: Gap filled NTL based on historical data
* `255`: Fill value

* __output_location_type:__ Where output should be stored (default: `r_memory`). Either:
* **output_location_type:** Where output should be stored (default: `r_memory`). Either:

- `r_memory` where the function will return an output in R
- `file` where the function will export the data as a file. For `bm_raster`, a `.tif` file will be saved; for `bm_extract`, a `.Rds` file will be saved. A file is saved for each date. Consequently, if `date = c(2018, 2019, 2020)`, three datasets will be saved: one for each year. Saving a dataset for each date can facilitate re-running the function later and only downloading data for dates where data have not been downloaded.
* `r_memory` where the function will return an output in R
* `file` where the function will export the data as a file. For `bm_raster`, a `.tif` file will be saved; for `bm_extract`, a `.Rds` file will be saved. A file is saved for each date. Consequently, if `date = c(2018, 2019, 2020)`, three datasets will be saved: one for each year. Saving a dataset for each date can facilitate re-running the function later and only downloading data for dates where data have not been downloaded.
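
To make the `quality_flag_rm` behavior described above concrete, here is a minimal base-R sketch on toy matrices. `mask_by_quality`, `ntl`, and `quality` are illustrative names, not package objects; the package works on rasters, so this only mirrors the masking rule.

```r
# Toy nighttime-lights values and their per-pixel quality flags
ntl     <- matrix(c(5.2, 0.8, 12.1, 3.3), nrow = 2)
quality <- matrix(c(0, 2, 1, 255), nrow = 2)

# Hypothetical helper: set a pixel to NA when its flag is listed in
# quality_flag_rm (the default mirrors the documented c(255))
mask_by_quality <- function(values, flags, quality_flag_rm = c(255)) {
  values[flags %in% quality_flag_rm] <- NA
  values
}

# Default: only fill values are removed
mask_by_quality(ntl, quality)

# Stricter: also drop poor-quality pixels (flag 2)
mask_by_quality(ntl, quality, quality_flag_rm = c(2, 255))
```
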

If `output_location_type = "file"`, the following arguments can be used:

* __file_dir:__ The directory where data should be exported (default: `NULL`, so the working directory will be used)
* __file_prefix:__ Prefix to add to the file to be saved. The file will be saved as the following: `[file_prefix][product_id]_t[date].[tif/Rds]`
* __file_skip_if_exists:__ Whether the function should first check whether the file already exists and, if so, skip downloading or extracting data for that date (default: `TRUE`). If the function is first run with `date = c(2018, 2019, 2020)`, then is later run with `date = c(2018, 2019, 2020, 2021)`, the function will only download/extract data for 2021. Skipping existing files can facilitate re-running the function at a later date to download only more recent data.
* **file_dir:** The directory where data should be exported (default: `NULL`, so the working directory will be used)
* **file_prefix:** Prefix to add to the file to be saved. The file will be saved as the following: `[file_prefix][product_id]_t[date].[tif/Rds]`
* **file_skip_if_exists:** Whether the function should first check whether the file already exists and, if so, skip downloading or extracting data for that date (default: `TRUE`). If the function is first run with `date = c(2018, 2019, 2020)`, then is later run with `date = c(2018, 2019, 2020, 2021)`, the function will only download/extract data for 2021. Skipping existing files can facilitate re-running the function at a later date to download only more recent data.
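
The file naming pattern and the skip-if-exists behavior can be sketched in base R as follows. `bm_file_name` is a hypothetical helper, the `"ntl_"` prefix is made up, and the exact way the package formats the date inside the file name is an assumption here.

```r
# Hypothetical helper following the documented pattern
# [file_prefix][product_id]_t[date].[tif/Rds]
bm_file_name <- function(file_prefix, product_id, date, ext) {
  paste0(file_prefix, product_id, "_t", date, ".", ext)
}

# Simulate a first run that saved annual files for 2018 to 2020
out_dir <- tempfile("bm_files_")
dir.create(out_dir)
first_run <- file.path(out_dir,
                       bm_file_name("ntl_", "VNP46A4", c(2018, 2019, 2020), "Rds"))
invisible(file.create(first_run))

# Later run with one extra year: only dates without a file are processed
dates <- c(2018, 2019, 2020, 2021)
paths <- file.path(out_dir, bm_file_name("ntl_", "VNP46A4", dates, "Rds"))
to_process <- dates[!file.exists(paths)]
to_process
```
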

### Argument for `bm_extract` only <a name="args-extract">

* __aggregation_fun:__ A vector of functions to aggregate data (default: `"mean"`). The `exact_extract` function from the `exactextractr` package is used for aggregations; this parameter is passed to the `fun` argument in `exactextractr::exact_extract`.
* __add_n_pixels:__ Whether to add a variable indicating the number of nighttime light pixels used to compute nighttime lights statistics (e.g., number of pixels used to compute average of nighttime lights). When `TRUE`, it adds three values: `n_non_na_pixels` (the number of non-`NA` pixels used for computing nighttime light statistics); `n_pixels` (the total number of pixels); and `prop_non_na_pixels` (the proportion of pixels that are non-`NA`). (Default: `TRUE`).



* **aggregation_fun:** A vector of functions to aggregate data (default: `"mean"`). The `exact_extract` function from the `exactextractr` package is used for aggregations; this parameter is passed to the `fun` argument in `exactextractr::exact_extract`.
* **add_n_pixels:** Whether to add a variable indicating the number of nighttime light pixels used to compute nighttime lights statistics (e.g., number of pixels used to compute average of nighttime lights). When `TRUE`, it adds three values: `n_non_na_pixels` (the number of non-`NA` pixels used for computing nighttime light statistics); `n_pixels` (the total number of pixels); and `prop_non_na_pixels` (the proportion of pixels that are non-`NA`). (Default: `TRUE`).
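
As a rough illustration of the three pixel-count variables, the base-R sketch below computes them for the pixel values of a single polygon. `pixel_counts` is a hypothetical helper; it treats every pixel equally, whereas `exactextractr::exact_extract` accounts for partial pixel coverage, so the package's numbers may differ.

```r
# Hypothetical helper: summarise the pixel values falling in one polygon
pixel_counts <- function(values) {
  n_pixels        <- length(values)       # all pixels in the polygon
  n_non_na_pixels <- sum(!is.na(values))  # pixels usable for statistics
  data.frame(
    ntl_mean           = mean(values, na.rm = TRUE),
    n_non_na_pixels    = n_non_na_pixels,
    n_pixels           = n_pixels,
    prop_non_na_pixels = n_non_na_pixels / n_pixels
  )
}

# Four pixels, one of which was masked (e.g., by quality_flag_rm)
pixel_counts(c(1, NA, 3, 4))
```

A low `prop_non_na_pixels` signals that the aggregated value rests on few valid pixels and may deserve less weight in later analysis.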