Merge pull request #9 from ESDS-Leipzig/ee
Adding GEE support
davemlz authored Jan 21, 2024
2 parents bc1ee4d + 07f8995 commit f0bb7a3
Showing 7 changed files with 1,585 additions and 54 deletions.
42 changes: 37 additions & 5 deletions README.md
@@ -2,7 +2,7 @@
<a href="https://github.com/davemlz/cubo"><img src="https://github.com/davemlz/cubo/raw/main/docs/_static/logo.png" alt="cubo"></a>
</p>
<p align="center">
<em>On-demand Earth System Data Cubes (ESDCs) from STAC in Python</em>
<em>On-Demand Earth System Data Cubes (ESDCs) in Python</em>
</p>
<p align="center">
<a href='https://pypi.python.org/pypi/cubo'>
@@ -59,10 +59,12 @@
[SpatioTemporal Asset Catalogs (STAC)](https://stacspec.org/) provide a standardized format that describes
geospatial information. Multiple platforms are using this standard to provide clients with several datasets.
Nice platforms such as [Planetary Computer](https://planetarycomputer.microsoft.com/) use this standard.
Additionally, [Google Earth Engine (GEE)](https://developers.google.com/earth-engine/datasets/)
also provides a gigantic catalogue that users can harness for different tasks in Python.

`cubo` is a Python package that provides users of STAC objects an easy way to create On-demand Earth System Data Cubes (ESDCs). This is perfectly suitable for Machine Learning (ML) / Deep Learning (DL) tasks. You can easily create a lot of ESDCs by just knowing a pair of coordinates and the edge size of the cube in pixels!
`cubo` is a Python package that provides users of STAC and GEE an easy way to create On-Demand Earth System Data Cubes (ESDCs). This is perfectly suitable for Deep Learning (DL) tasks. You can easily create a lot of ESDCs by just knowing a pair of coordinates and the edge size of the cube in pixels!

Check the simple usage of `cubo` here:
Check the simple usage of `cubo` with STAC here:

```python
import cubo
@@ -84,15 +86,39 @@ da = cubo.create(

This chunk of code just created an `xr.DataArray` object given a pair of coordinates, the edge size of the cube (in pixels), and additional information to get the data from STAC (Planetary Computer by default, but you can use another provider!). Note that you can also use the resolution you want (in meters) and the bands that you require.

Now check the simple usage of `cubo` with GEE here:

```python
import cubo
import xarray as xr

da = cubo.create(
    lat=51.079225, # Central latitude of the cube
    lon=10.452173, # Central longitude of the cube
    collection="COPERNICUS/S2_SR_HARMONIZED", # Id of the GEE collection
    bands=["B2","B3","B4"], # Bands to retrieve
    start_date="2016-06-01", # Start date of the cube
    end_date="2017-07-01", # End date of the cube
    edge_size=128, # Edge size of the cube (px)
    resolution=10, # Pixel size of the cube (m)
    gee=True # Use GEE instead of STAC
)
```

This chunk of code is very similar to the STAC-based cubo code. Note that the `collection`
is now the ID of the GEE collection to use, and note that the `gee` argument must be set to
`True`.

## How does it work?

The thing is super easy and simple.

1. You have the coordinates of a point of interest. The cube will be created around these coordinates (i.e., these coordinates will be approximately the spatial center of the cube).
2. Internally, the coordinates are transformed to the projected UTM coordinates [x,y] in meters (i.e., local UTM CRS). They are rounded to the closest pair of coordinates that are divisible by the resolution you requested.
3. The edge size you provide is used to create a Bounding Box (BBox) for the cube in the local UTM CRS given the exact number of pixels (note that the edge size should be a multiple of 2, otherwise it will be rounded; usual edge sizes for ML are 64, 128, 256, 512, etc.).
4. Additional information is used to retrieve the data from the STAC catalogue: start and end dates, name of the collection, endpoint of the catalogue, etc.
5. Then, by using `stackstac` and `pystac_client` the mini cube is retrieved as a `xr.DataArray`.
4. Additional information is used to retrieve the data from the STAC catalogue or from GEE: start and end dates, name of the collection, endpoint of the catalogue (ignored for GEE), etc.
5. Then, by using `stackstac` and `pystac_client` the cube is retrieved as a `xr.DataArray`. In the case of GEE, the cube is retrieved
via `xee`.
6. Success! That's what `cubo` is doing for you, and you just need to provide the coordinates, the edge size, and the additional info to get the cube.
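
Steps 2 and 3 can be sketched in a few lines of plain Python. This is only an illustration, not `cubo`'s internal implementation: the helper names `snap_to_grid` and `cube_bbox` are hypothetical, the input point is assumed to be already projected to UTM meters, and the way an odd edge size is rounded here is an assumption.

```python
def snap_to_grid(value, resolution):
    """Round a projected coordinate to the nearest multiple of the resolution."""
    return round(value / resolution) * resolution


def cube_bbox(x, y, edge_size, resolution):
    """Return (min_x, min_y, max_x, max_y) in meters around a projected point.

    Hypothetical helper: x and y are assumed to be UTM coordinates in meters.
    """
    # Step 2: snap the projected center onto the pixel grid
    center_x = snap_to_grid(x, resolution)
    center_y = snap_to_grid(y, resolution)

    # Step 3: half of the edge, expressed in meters
    # (assumption: half the edge is rounded to a whole number of pixels)
    half = round(edge_size / 2) * resolution

    return (center_x - half, center_y - half, center_x + half, center_y + half)


# Example with made-up UTM coordinates, a 128 px edge, and 10 m pixels
bbox = cube_bbox(500003.2, 5650007.9, edge_size=128, resolution=10)
print(bbox)
```

With these inputs, the box spans exactly `edge_size * resolution` meters on each side, which is what lets the retrieved cube have the requested number of pixels.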

## Installation
@@ -103,6 +129,12 @@ Install the latest version from PyPI:
pip install cubo
```

Install `cubo` with the required GEE dependencies from PyPI:

```
pip install cubo[ee]
```
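
If you are unsure whether the GEE extras are present in your environment, a quick check such as the following can help. This is a minimal sketch: the function name `missing_gee_dependencies` is hypothetical and not part of `cubo`; the module names `ee` and `xee` are the ones imported when `gee=True` is used.

```python
from importlib.util import find_spec


def missing_gee_dependencies(modules=("ee", "xee")):
    """Return the names of the given top-level modules that cannot be found."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in modules if find_spec(name) is None]


missing = missing_gee_dependencies()
if missing:
    print(f"Missing modules: {missing}. Try: pip install cubo[ee]")
else:
    print("GEE dependencies found.")
```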

Upgrade `cubo` by running:

```
160 changes: 121 additions & 39 deletions cubo/cubo.py
@@ -21,6 +21,7 @@ def create(
edge_size: Union[float, int] = 128.0,
resolution: Union[float, int] = 10.0,
stac: str = "https://planetarycomputer.microsoft.com/api/stac/v1",
gee: bool = False,
**kwargs,
) -> xr.DataArray:
"""Creates a data cube from a STAC Catalogue as a :code:`xr.DataArray` object.
@@ -52,6 +53,11 @@ def create(
Pixel size in meters.
stac : str, default = 'https://planetarycomputer.microsoft.com/api/stac/v1'
Endpoint of the STAC Catalogue to use.
gee : bool, default = False
Whether to use Google Earth Engine. This ignores the 'stac' argument.
.. versionadded:: 2024.1.0
kwargs :
Additional keyword arguments passed to :code:`pystac_client.Client.search()`. Ignored when :code:`gee=True`.
@@ -77,51 +83,127 @@ def create(
... resolution=10,
... )
<xarray.DataArray (time: 3, band: 3, x: 32, y: 32)>
Create a Sentinel-2 L2A data cube with an edge size of 128 px from Google Earth Engine:
>>> import cubo
>>> cubo.create(
... lat=51.079225,
... lon=10.452173,
... collection="COPERNICUS/S2_SR_HARMONIZED",
... bands=["B2","B3","B4"],
... start_date="2016-06-01",
... end_date="2017-07-01",
... edge_size=128,
... resolution=10,
... gee=True,
... )
<xarray.DataArray (time: 27, band: 3, x: 128, y: 128)>
"""
    # Get the BBox and EPSG
    bbox_utm, bbox_latlon, utm_coords, epsg = _central_pixel_bbox(
        lat, lon, edge_size, resolution
    )

    # Convert UTM Bbox to a Feature
    bbox_utm = rasterio.features.bounds(bbox_utm)

    # Open the Catalogue
    CATALOG = pystac_client.Client.open(stac)

    # Do a search
    SEARCH = CATALOG.search(
        intersects=bbox_latlon,
        datetime=f"{start_date}/{end_date}",
        collections=[collection],
        **kwargs,
    )

    # Get all items and sign if using Planetary Computer
    items = SEARCH.item_collection()

    if stac == "https://planetarycomputer.microsoft.com/api/stac/v1":
        items = pc.sign(items)

    # Put the bands into list if not a list already
    if not isinstance(bands, list) and bands is not None:
        bands = [bands]

    # Create the cube
    cube = stackstac.stack(
        items,
        assets=bands,
        resolution=resolution,
        bounds=bbox_utm,
        epsg=epsg,
    )

    # Delete attributes
    attributes = ["spec", "crs", "transform", "resolution"]

    for attribute in attributes:
        if attribute in cube.attrs:
            del cube.attrs[attribute]
    # Use Google Earth Engine
    if gee:

        # Try to import ee, otherwise raise an ImportError
        try:
            import xee
            import ee
        except ImportError:
            raise ImportError(
                '"earthengine-api" and "xee" could not be loaded. Please install them, or install "cubo" using "pip install cubo[ee]"'
            )

        # Initialize Google Earth Engine with the high volume endpoint
        ee.Initialize(opt_url='https://earthengine-highvolume.googleapis.com')

        # Get BBox values in latlon
        west = bbox_latlon['coordinates'][0][0][0]
        south = bbox_latlon['coordinates'][0][0][1]
        east = bbox_latlon['coordinates'][0][2][0]
        north = bbox_latlon['coordinates'][0][2][1]

        # Create the BBox geometry in GEE
        BBox = ee.Geometry.BBox(west,south,east,north)

        # If the collection is string then access the Image Collection
        if isinstance(collection,str):
            collection = ee.ImageCollection(collection)

        # Do the filtering: Bounds, time, and bands
        collection = (
            collection
            .filterBounds(BBox)
            .filterDate(start_date,end_date)
            .select(bands)
        )

        # Return the cube via xee
        cube = xr.open_dataset(
            collection,
            engine="ee",
            geometry=BBox,
            scale=resolution,
            crs=f"EPSG:{epsg}",
            chunks=dict()
        )

        # Rename the coords to match stackstac names, also rearrange
        cube = cube.rename(Y="y",X="x").to_array("band").transpose("time","band","y","x")

        # Delete all attributes
        cube.attrs = dict()

        # Get the name of the collection
        collection = collection.get('system:id').getInfo()

        # Override the stac argument using the GEE STAC
        stac = "https://earthengine-stac.storage.googleapis.com/catalog/catalog.json"

    else:

        # Convert UTM Bbox to a Feature
        bbox_utm = rasterio.features.bounds(bbox_utm)

        # Open the Catalogue
        CATALOG = pystac_client.Client.open(stac)

        # Do a search
        SEARCH = CATALOG.search(
            intersects=bbox_latlon,
            datetime=f"{start_date}/{end_date}",
            collections=[collection],
            **kwargs,
        )

        # Get all items and sign if using Planetary Computer
        items = SEARCH.item_collection()

        if stac == "https://planetarycomputer.microsoft.com/api/stac/v1":
            items = pc.sign(items)

        # Put the bands into list if not a list already
        if not isinstance(bands, list) and bands is not None:
            bands = [bands]

        # Create the cube
        cube = stackstac.stack(
            items,
            assets=bands,
            resolution=resolution,
            bounds=bbox_utm,
            epsg=epsg,
        )

        # Delete attributes
        attributes = ["spec", "crs", "transform", "resolution"]

        for attribute in attributes:
            if attribute in cube.attrs:
                del cube.attrs[attribute]

    # New attributes
    cube.attrs = dict(
11 changes: 11 additions & 0 deletions docs/changelog.rst
@@ -1,6 +1,17 @@
Changelog
=========

v2024.1.0
---------

- Added support for Google Earth Engine.
- Added the :code:`gee` argument to :code:`cubo.create()`.

v2023.12.0
----------

- Added preservation via Zenodo.

v2023.7.2
---------

47 changes: 39 additions & 8 deletions docs/index.rst
@@ -15,7 +15,7 @@ Cubo
<a href="https://github.com/davemlz/cubo"><img src="https://github.com/davemlz/cubo/raw/main/docs/_static/logo.png" alt="cubo"></a>
</p>
<p align="center">
<em>On-demand Earth System Data Cubes (ESDCs) from STAC in Python</em>
<em>On-Demand Earth System Data Cubes (ESDCs) in Python</em>
</p>
<p align="center">
<a href='https://pypi.python.org/pypi/cubo'>
@@ -59,13 +59,15 @@ Overview
SpatioTemporal Asset Catalogs (STAC) provide a standardized format that describes
geospatial information. Multiple platforms are using this standard to provide clients
with several datasets. Nice platforms such as Planetary Computer use this standard.
Additionally, Google Earth Engine (GEE) also provides a gigantic catalogue that users can
harness for different tasks in Python.

`cubo` is a Python package that provides users of STAC objects an easy way to create
On-demand Earth System Data Cubes (ESDCs). This is perfectly suitable for Machine Learning (ML) /
`cubo` is a Python package that provides users of STAC and GEE an easy way to create
On-Demand Earth System Data Cubes (ESDCs). This is perfectly suitable for
Deep Learning (DL) tasks. You can easily create a lot of ESDCs by just knowing a pair
of coordinates and the edge size of the cube in pixels!

Check the simple usage of `cubo` here:
Check the simple usage of `cubo` with STAC here:

.. code-block:: python
@@ -96,6 +98,29 @@ coordinates, the edge size of the cube (in pixels), and additional information t
data from STAC (Planetary Computer by default, but you can use another provider!). Note
that you can also use the resolution you want (in meters) and the bands that you require.

Now check the simple usage of `cubo` with GEE here:

.. code-block:: python

   import cubo
   import xarray as xr

   da = cubo.create(
       lat=51.079225, # Central latitude of the cube
       lon=10.452173, # Central longitude of the cube
       collection="COPERNICUS/S2_SR_HARMONIZED", # Id of the GEE collection
       bands=["B2","B3","B4"], # Bands to retrieve
       start_date="2016-06-01", # Start date of the cube
       end_date="2017-07-01", # End date of the cube
       edge_size=128, # Edge size of the cube (px)
       resolution=10, # Pixel size of the cube (m)
       gee=True # Use GEE instead of STAC
   )

This chunk of code is very similar to the STAC-based cubo code. Note that the :code:`collection`
is now the ID of the GEE collection to use, and note that the :code:`gee` argument must be set to
:code:`True`.

How does it work?
-----------------

@@ -110,10 +135,10 @@ that are divisible by the resolution you requested.
local UTM CRS given the exact number of pixels (note that the edge size should be a
multiple of 2, otherwise it will be rounded; usual edge sizes for ML are 64, 128, 256,
512, etc.).
4. Additional information is used to retrieve the data from the STAC catalogue: start
and end dates, name of the collection, endpoint of the catalogue, etc.
5. Then, by using `stackstac` and `pystac_client` the mini cube is retrieved as a
`xr.DataArray`.
4. Additional information is used to retrieve the data from the STAC catalogue or from GEE: start
and end dates, name of the collection, endpoint of the catalogue (ignored for GEE), etc.
5. Then, by using :code:`stackstac` and :code:`pystac_client` the cube is retrieved as a
:code:`xr.DataArray`. In the case of GEE, the cube is retrieved via :code:`xee`.
6. Success! That's what `cubo` is doing for you, and you just need to provide the
coordinates, the edge size, and the additional info to get the cube.

@@ -127,6 +152,12 @@ Install the latest version from PyPI:
   pip install cubo

Install `cubo` with the required GEE dependencies from PyPI:

.. code-block::

   pip install cubo[ee]

Upgrade `cubo` by running:

.. code-block::
3 changes: 2 additions & 1 deletion docs/tutorials.rst
@@ -7,4 +7,5 @@ Tutorials
tutorials/getting_started.ipynb
tutorials/cube_visualization.ipynb
tutorials/using_collections.ipynb
tutorials/visualization_lexcube.ipynb
tutorials/visualization_lexcube.ipynb
tutorials/using_gee.ipynb