BigelowLab/biooracle

biooracle

Extra R-language tools to supplement the biooracler R package. This package provides R tools to create and read a local data repository.

Requirements for package

From CRAN

Installation

# install.packages("remotes")
remotes::install_github("BigelowLab/biooracle")

Set up a data directory

You can store the path to your chosen data directory. The setting persists between R sessions, so you only need to do this once.

suppressPackageStartupMessages({
  library(biooracle)
  library(dplyr)
})
set_biooracle_root("~/Library/CloudStorage/Dropbox/data/biooracle")

We’ll be creating a new local dataset for the Northwest Atlantic (nwa), which has the bounding box bb = c(xmin = -77, xmax = -42.5, ymin = 36.5, ymax = 56.7).

nwa_path = biooracle_path("nwa") |> make_path()
biooracle_path() |> dir(full.names = TRUE)
## [1] "/Users/ben/Library/CloudStorage/Dropbox/data/biooracle/nwa" 
## [2] "/Users/ben/Library/CloudStorage/Dropbox/data/biooracle/temp"
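
For readers unfamiliar with the pattern, composing a region path under a root directory and creating it on disk looks roughly like this in base R. This is only a sketch of the general idea, run in a temporary directory. It is not the implementation of biooracle_path() or make_path(), and the names root and demo_path are illustrative.

```r
# Illustrative root; the README itself uses the path stored by set_biooracle_root()
root = file.path(tempdir(), "biooracle-demo")

# Compose the region path and create it, including any missing parent directories
demo_path = file.path(root, "nwa")
dir.create(demo_path, recursive = TRUE, showWarnings = FALSE)

# List what now lives under the root
basename(dir(root, full.names = TRUE))
```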

Fetch some data to a temporary directory

We’ll set that aside for right now and fetch some data for that region. Note that the data are downloaded as a NetCDF file into a temporary directory. Here we specify the bounding box with a named vector of the corner coordinates, but we can also provide any object from which a bounding box can be determined using the sf package, such as a polygon, raster or collection of points.

dataset_id = "thetao_ssp119_2020_2100_depthmin"
newfile = fetch_biooracle(dataset_id, 
                          bb = c(xmin = -77, xmax = -42.5, ymin = 36.5, ymax = 56.7))
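
As mentioned above, a bounding box can also be derived from a spatial object. The following is a minimal sketch that assumes the sf package is installed. The three points are made up for illustration, and the exact object types that fetch_biooracle() accepts should be confirmed in ?fetch_biooracle.

```r
library(sf)

# Three made-up points (lon/lat, WGS84) spanning the Northwest Atlantic box
pts = data.frame(lon = c(-77, -60.2, -42.5),
                 lat = c(36.5, 56.7, 45.0)) |>
  sf::st_as_sf(coords = c("lon", "lat"), crs = 4326)

# sf derives the bounding box from the geometry
bb_sf = sf::st_bbox(pts)

# Reorder into the named-vector form used in this README
bb = c(xmin = bb_sf[["xmin"]], xmax = bb_sf[["xmax"]],
       ymin = bb_sf[["ymin"]], ymax = bb_sf[["ymax"]])
bb
```

Note that sf orders a bounding box as xmin, ymin, xmax, ymax, so the explicit reordering matters if a function matches the vector by position rather than by name.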

NOTE that you can make subselections of variables and times to download. See ?fetch_biooracle for the details.

Now we can read the file.

x = stars::read_stars(newfile, quiet = TRUE)
x
## stars object with 3 dimensions and 7 attributes
## attribute(s), summary of first 1e+05 cells:
##                          Min.    1st Qu.    Median      Mean   3rd Qu.
## thetao_ltmax [°C] -0.36475005 1.96057870 2.3428427 2.9704422 3.3483083
## thetao_ltmin [°C] -1.90398343 1.57293375 1.8961157 1.7156003 2.3728029
## thetao_max [°C]    0.21924963 2.17144280 2.7790994 3.7453738 4.5603769
## thetao_mean [°C]  -0.72299476 1.80542342 2.0918568 2.2443568 2.7558258
## thetao_min [°C]   -2.00000000 0.91433330 1.6623332 1.1105045 2.0303362
## thetao_range [°C]  0.22811718 0.47419300 0.7938143 2.6947997 4.0197336
## thetao_sd [°C]     0.03341453 0.09708313 0.1335515 0.2456717 0.4161988
##                         Max.  NA's
## thetao_ltmax [°C] 17.8440917 53161
## thetao_ltmin [°C]  6.0058540 53161
## thetao_max [°C]   21.4847577 53161
## thetao_mean [°C]   6.8363208 53161
## thetao_min [°C]    4.7977722 53161
## thetao_range [°C] 23.8074380 53161
## thetao_sd [°C]     0.7385261 53161
## dimension(s):
##      from  to offset delta  refsys                    values x/y
## x       1 691    -77  0.05      NA                      NULL [x]
## y       1 405  56.75 -0.05      NA                      NULL [y]
## time    1   8     NA    NA POSIXct 2020-01-01,...,2090-01-01
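
The dimension table can be turned back into geographic extents with a little arithmetic: each axis runs from its offset to offset + n * delta. A sketch in base R, using the offset, delta and cell counts printed above (the variable names are ours):

```r
# Values copied from the dimension table above
x_offset = -77;   x_delta =  0.05; nx = 691
y_offset = 56.75; y_delta = -0.05; ny = 405

# Cell edges run from the offset to offset + n * delta
x_range = sort(c(x_offset, x_offset + nx * x_delta))
y_range = sort(c(y_offset, y_offset + ny * y_delta))

x_range  # western and eastern edges, degrees longitude
y_range  # southern and northern edges, degrees latitude

# Number of cells in each time step
nx * ny
```

The resulting extent is close to, but not exactly, the requested box, presumably because the subset is aligned to the 0.05 degree grid.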

Archiving in a local database

We often save the data in a directory structure along with a simple table that catalogs the contents of the directory. The archive_biooracle() function will split up the fetched data and save it in a logical directory structure. We provide the data path, in this case for the Northwest Atlantic (nwa).

archive_biooracle(newfile, path = nwa_path)
## # A tibble: 56 × 5
##    scenario year  z        param  trt  
##    <chr>    <chr> <chr>    <chr>  <chr>
##  1 ssp119   2020  depthmin thetao ltmax
##  2 ssp119   2020  depthmin thetao ltmin
##  3 ssp119   2020  depthmin thetao max  
##  4 ssp119   2020  depthmin thetao mean 
##  5 ssp119   2020  depthmin thetao min  
##  6 ssp119   2020  depthmin thetao range
##  7 ssp119   2020  depthmin thetao sd   
##  8 ssp119   2030  depthmin thetao ltmax
##  9 ssp119   2030  depthmin thetao ltmin
## 10 ssp119   2030  depthmin thetao max  
## # ℹ 46 more rows
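
The 56 rows are consistent with 8 decadal time steps times 7 statistics. The sketch below rebuilds a table of the same shape in base R from the dataset ID. The interpretation of the ID fields (param, scenario, start, end, z) is an assumption read off the catalog columns above, not documented behavior of archive_biooracle().

```r
dataset_id = "thetao_ssp119_2020_2100_depthmin"

# Assumed layout of the ID: param_scenario_start_end_z
parts = strsplit(dataset_id, "_", fixed = TRUE)[[1]]
names(parts) = c("param", "scenario", "start", "end", "z")

# Decadal time steps and the seven statistics seen in the output above
years = seq(2020, 2090, by = 10)
trt = c("ltmax", "ltmin", "max", "mean", "min", "range", "sd")

# One row per (year, statistic) pair; trt varies fastest, as in the catalog
catalog = expand.grid(trt = trt, year = as.character(years),
                      stringsAsFactors = FALSE)
catalog$scenario = parts[["scenario"]]
catalog$z = parts[["z"]]
catalog$param = parts[["param"]]
catalog = catalog[, c("scenario", "year", "z", "param", "trt")]

nrow(catalog)
head(catalog, 3)
```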

Alternatively, it is possible to fetch and archive in one step, and this is likely the most convenient usage.

newfile = fetch_biooracle(dataset_id, 
                          bb = c(xmin = -77, xmax = -42.5, ymin = 36.5, ymax = 56.7),
                          archive = TRUE,
                          data_dir = nwa_path)

Read the database catalog

Once you have established a database of files, you can read the database catalog.

db = read_database(nwa_path) |>
  print()
## # A tibble: 56 × 5
##    scenario year  z        param  trt  
##    <chr>    <chr> <chr>    <chr>  <chr>
##  1 ssp119   2020  depthmin thetao ltmax
##  2 ssp119   2020  depthmin thetao ltmin
##  3 ssp119   2020  depthmin thetao max  
##  4 ssp119   2020  depthmin thetao mean 
##  5 ssp119   2020  depthmin thetao min  
##  6 ssp119   2020  depthmin thetao range
##  7 ssp119   2020  depthmin thetao sd   
##  8 ssp119   2030  depthmin thetao ltmax
##  9 ssp119   2030  depthmin thetao ltmin
## 10 ssp119   2030  depthmin thetao max  
## # ℹ 46 more rows

Read in data from the database

You can use a portion of the database to read in a stars object. Keep in mind that if you are reading multiple variables over multiple decades, then each variable must have the same number of time steps.

x = db |>
  dplyr::mutate(year = as.numeric(year)) |>
  dplyr::filter(year >= 2070) |>
  read_biooracle(path = nwa_path) |>
  print()
## stars object with 3 dimensions and 7 attributes
## attribute(s):
##                      Min.    1st Qu.    Median      Mean   3rd Qu.      Max.
## thetao_ltmax  -0.64297539 1.89211679 2.1213403 4.1633015 4.1325927 31.248983
## thetao_ltmin  -2.00000000 1.80401134 1.8825923 2.4666165 2.7054288 12.447174
## thetao_max     0.46041700 1.94233787 2.4119213 5.0367358 5.3864827 33.908306
## thetao_mean   -1.06481194 1.86144698 1.9767619 3.2299667 3.4484773 18.037924
## thetao_min    -2.00000000 1.41676116 1.8234118 1.7105341 1.9514574  9.975613
## thetao_range   0.04843726 0.12690720 0.6352606 3.3517626 5.0061388 33.088707
## thetao_sd      0.02830037 0.06138093 0.2785121 0.4090969 0.6726204  3.425986
##                 NA's
## thetao_ltmax  287604
## thetao_ltmin  287604
## thetao_max    287604
## thetao_mean   287604
## thetao_min    287604
## thetao_range  287604
## thetao_sd     287604
## dimension(s):
##      from  to offset delta x/y
## x       1 691    -77  0.05 [x]
## y       1 405  56.75 -0.05 [y]
## time    1   3   2070    10
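
The requirement that every variable has the same number of time steps can be checked on a catalog subset before reading. The sketch below uses base R on a stand-in table with the same kind of columns as the catalog. check_steps() is a hypothetical helper written for this example and is not part of biooracle.

```r
# Stand-in for the catalog: 8 decades x 7 statistics
db_demo = expand.grid(
  trt = c("ltmax", "ltmin", "max", "mean", "min", "range", "sd"),
  year = seq(2020, 2090, by = 10),
  stringsAsFactors = FALSE)
db_demo$param = "thetao"

# Hypothetical helper: TRUE when every param/trt pair has the same
# number of distinct years in the selection
check_steps = function(tbl) {
  key = paste(tbl$param, tbl$trt, sep = "_")
  n = vapply(split(tbl$year, key),
             function(y) length(unique(y)), integer(1))
  length(unique(n)) == 1
}

late = db_demo[db_demo$year >= 2070, ]
nrow(late)          # 3 decades x 7 statistics
check_steps(late)

# Dropping one row leaves thetao_sd with only two time steps
ragged = late[!(late$trt == "sd" & late$year == 2090), ]
check_steps(ragged)
```

The filtered selection has three time steps per variable, which matches the time dimension (1 to 3, starting at 2070 in steps of 10) in the output above.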

And of course you can plot.

plot(x['thetao_mean'])
## downsample set to 1

About

R tools to access Bio-Oracle v3
