Amendments to KDE calculation for rare but possible 0.0 distance value.
Removed mypy as dev dependency, added coverage.
Added tests - >=95% coverage on _utils and geokde.
Added lint, test, and publish GH Actions YAML files.
duncanmartyn committed Mar 25, 2024
1 parent f085da6 commit b4e4e38
Showing 14 changed files with 412 additions and 112 deletions.
27 changes: 27 additions & 0 deletions .github/workflows/lint.yaml
@@ -0,0 +1,27 @@
+---
+name: Lint
+on:
+  push:
+    paths:
+      - geokde/**
+      - .github/**
+      - tests/**
+jobs:
+  lint:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v3
+
+      - name: Set-up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.10'
+
+      - name: Install Poetry and dependencies
+        run: |
+          pip install poetry
+          poetry install
+      - name: Run pre-commit
+        run: poetry run pre-commit run --all-files
29 changes: 29 additions & 0 deletions .github/workflows/publish.yaml
@@ -0,0 +1,29 @@
+---
+name: Publish
+on:
+  release:
+    types: [published]
+permissions:
+  contents: read
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v3
+
+      - name: Set-up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.10'
+      - name: Install Poetry
+        run: pip install poetry
+
+      - name: Set PyPI token
+        run: poetry config pypi-token.pypi ${{ secrets.PYPI_TOKEN }}
+
+      - name: Install dependencies
+        run: poetry install
+
+      - name: Publish to PyPI
+        run: poetry publish --build
62 changes: 62 additions & 0 deletions .github/workflows/test.yaml
@@ -0,0 +1,62 @@
+---
+name: Test
+on:
+  push:
+    paths:
+      - geokde/**
+      - .github/**
+      - tests/**
+jobs:
+  test:
+    name: ${{ matrix.os }} / ${{ matrix.python-version }}
+    runs-on: ${{ matrix.image }}
+    strategy:
+      matrix:
+        os:
+          - Ubuntu
+          - macOS
+          - Windows
+        python-version:
+          - '3.10'
+        include:
+          - os: Ubuntu
+            image: ubuntu-latest
+          - os: Windows
+            image: windows-latest
+          - os: macOS
+            image: macos-latest
+      fail-fast: false
+    defaults:
+      run:
+        shell: bash
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v3
+
+      - name: Set-up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install Poetry
+        run: curl -sL https://install.python-poetry.org | python - -y
+
+      - name: Update PATH for Ubuntu and MacOS
+        if: ${{ matrix.os != 'Windows' }}
+        run: echo "$HOME/.local/bin" >> $GITHUB_PATH
+
+      - name: Update Path for Windows
+        if: ${{ matrix.os == 'Windows' }}
+        run: echo "$APPDATA\Python\Scripts" >> $GITHUB_PATH
+
+      - name: Configure Poetry
+        run: poetry config virtualenvs.in-project true
+
+      - name: Check Poetry lock
+        run: poetry check --lock
+
+      - name: Install dependencies
+        run: poetry install
+
+      - name: Run pytest
+        run: poetry run pytest -s
13 changes: 6 additions & 7 deletions .pre-commit-config.yaml
@@ -6,14 +6,13 @@ repos:
       - id: end-of-file-fixer
       - id: check-yaml
       - id: check-added-large-files
-      - id: check-toml
   - repo: https://github.com/PyCQA/bandit
     rev: 1.7.7
     hooks:
       - id: bandit
-  - repo: https://github.com/pre-commit/mirrors-mypy
-    rev: v0.971
-    hooks:
-      - id: mypy
-        types: [python]
-        args: [--strict]
+  # - repo: https://github.com/pre-commit/mirrors-mypy
+  # rev: v0.971
+  # hooks:
+  # - id: mypy
+  # types: [python]
+  # args: [--strict]
53 changes: 48 additions & 5 deletions README.md
@@ -1,6 +1,49 @@
-## Roadmap
--------
+![pypi version](https://img.shields.io/pypi/v/geokde)
+![pypi downloads](https://img.shields.io/pypi/dm/geokde)
+[![publish](https://github.com/duncanmartyn/geokde/actions/workflows/publish.yaml/badge.svg?branch=main)](https://github.com/duncanmartyn/geokde/actions/workflows/publish.yaml)
+[![test](https://github.com/duncanmartyn/geokde/actions/workflows/test.yaml/badge.svg?branch=main)](https://github.com/duncanmartyn/geokde/actions/workflows/test.yaml)
+[![security: bandit](https://img.shields.io/badge/security-bandit-yellow.svg)](https://github.com/PyCQA/bandit)
+
+# GeoKDE
+Package for geospatial kernel density estimation (KDE).
+
+Written in Python 3.10.11 (though compatible with 3.10.11+), GeoKDE depends on the following:
+- `geopandas`
+- `numpy` (itself a dependency of `geopandas`)
+
+# Examples
+Perform KDE on a GeoJSON of point geometries and write the result to a GeoTIFF raster file with `rasterio`:
+```
+gdf = geopandas.read_file("vector_points.geojson")
+kde_array, array_bounds = geokde.kde(gdf, 1, 0.1)
+transform = rasterio.transform.from_bounds(
+    *array_bounds,
+    kde_array.shape[1],
+    kde_array.shape[0],
+)
+with rasterio.open(
+    fp="raster.tif",
+    mode="w",
+    driver="GTiff",
+    width=kde_array.shape[1],
+    height=kde_array.shape[0],
+    count=1,
+    crs=gdf.crs,
+    transform=transform,
+    dtype=kde_array.dtype,
+    nodata=0.0,
+) as dst:
+    dst.write(kde_array, 1)
+```
+
+# Roadmap
 - Add more kernels.
-- Implement other methods of distance measurement, e.g. haversine, manhattan.
-- Investigate possible alternatives to iterating over points.
-- Enable use of single radius and weight values without filling array of the same length as the points GeoDataFrame/GeoSeries. Results in marginal speed up but the current approach may become an issue with large point datasets.
+- Finish tests - coverage is >=95% for _utils.py and geokde.py as is.
+- Implement other methods of distance measurement, e.g. haversine, Manhattan.
+- Investigate alternatives to iterating over points.
+- Enable use of single radius and weight values without filling array of the same length as the points GeoDataFrame/GeoSeries. Results in marginal speed up but the current approach may become an issue with very large point datasets.
+- Integrate mypy in pre-commit, possibly also linter and formatter though flake8 and black used locally.
+
+# Contributions
+Feel free to raise any issues, especially bugs and feature requests!
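Note on the README example: it unpacks `array_bounds` straight into `rasterio.transform.from_bounds`, which takes `west, south, east, north, width, height`. As a rough illustration of what that transform encodes, the sketch below computes the same six affine coefficients in plain Python. It assumes `array_bounds` is ordered (west, south, east, north), which this diff does not confirm; `bounds_to_affine` and `pixel_to_map` are hypothetical helpers, not part of geokde or rasterio.

```python
def bounds_to_affine(west, south, east, north, width, height):
    """Return (a, b, c, d, e, f) for a north-up raster covering the bounds.

    The coefficients map array indices to map coordinates:
        x = a * col + b * row + c
        y = d * col + e * row + f
    """
    pixel_width = (east - west) / width
    pixel_height = (north - south) / height
    # Rows count downwards from the top edge, hence the negative y step.
    return (pixel_width, 0.0, west, 0.0, -pixel_height, north)


def pixel_to_map(affine, col, row):
    """Map a (col, row) corner index to (x, y) map coordinates."""
    a, b, c, d, e, f = affine
    return (a * col + b * row + c, d * col + e * row + f)


# Illustrative bounds and shape only; real values would come from geokde.kde().
affine = bounds_to_affine(0.0, 0.0, 10.0, 5.0, width=100, height=50)
top_left = pixel_to_map(affine, 0, 0)
bottom_right = pixel_to_map(affine, 100, 50)
```

If the bounds were in a different order, the raster would be written flipped or offset, so it is worth checking the tuple returned by `geokde.kde` against the package's own documentation before relying on the unpacking.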
20 changes: 9 additions & 11 deletions geokde/_kernels.py
@@ -19,8 +19,6 @@
     Specified kernel's density estimation value.
 """

-from math import pi
-
 import numpy as np


@@ -38,7 +36,7 @@ def quartic_raw(
     weight: int | float,
 ) -> float:
     """Raw Quartic kernel."""
-    if distance:
+    if distance < radius:
         value = weight * pow(1 - pow(distance / radius, 2), 2)
     else:
         value = 0.0
@@ -52,8 +50,8 @@ def quartic_scaled(
     weight: int | float,
 ) -> float:
     """Scaled Quartic kernel."""
-    if distance:
-        norm_const = 116 / (5 * pi * pow(radius, 2))
+    if distance < radius:
+        norm_const = 116 / (5 * np.pi * pow(radius, 2))
         value = weight * (norm_const * (15 / 16) * pow(1 - pow(distance / radius, 2), 2))
     else:
         value = 0.0
@@ -67,7 +65,7 @@ def epanechnikov_raw(
     weight: int | float,
 ) -> float:
     """Raw Epanechnikov kernel."""
-    if distance:
+    if distance < radius:
         value = weight * (1 - pow(distance / radius, 2))
     else:
         value = 0.0
@@ -81,8 +79,8 @@ def epanechnikov_scaled(
     weight: int | float,
 ) -> float:
     """Scaled Epanechnikov kernel."""
-    if distance:
-        norm_const = 8 / (3 * pi * pow(radius, 2))
+    if distance < radius:
+        norm_const = 8 / (3 * np.pi * pow(radius, 2))
         value = weight * (norm_const * (3 / 4) * (1 - pow(distance / radius, 2)))
     else:
         value = 0.0
@@ -96,7 +94,7 @@ def triweight_raw(
     weight: int | float,
 ) -> float:
     """Raw triweight kernel."""
-    if distance:
+    if distance < radius:
         value = weight * pow(1 - pow(distance / radius, 2), 3)
     else:
         value = 0.0
@@ -110,8 +108,8 @@ def triweight_scaled(
     weight: int | float,
 ) -> float:
     """Scaled triweight kernel."""
-    if distance:
-        norm_const = 128 / (35 * pi * pow(radius, 2))
+    if distance < radius:
+        norm_const = 128 / (35 * np.pi * pow(radius, 2))
         value = weight * (norm_const * (35 / 32) * pow(1 - pow(distance / radius, 2), 3))
     else:
         value = 0.0
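Note on the kernel condition: the switch from `if distance:` to `if distance < radius:` matters because `0.0` is falsy in Python. Under the old test, a cell centre that coincides exactly with a point (distance `0.0`) fell through to the `else` branch and received `0.0` instead of the kernel's peak value. The old test also relied on the caller first overwriting out-of-range distances with `0.0`, a step this commit removes from `add_point_kde`. The sketch below contrasts the two conditions with plain-Python stand-ins for the raw quartic kernel; `quartic_raw_old` and `quartic_raw_new` are illustrative names, and the vectorised wrapping used by the package is omitted.

```python
def quartic_raw_old(distance, radius, weight):
    """Stand-in for the pre-commit kernel: truthiness test on distance."""
    if distance:  # 0.0 is falsy, so an exact hit is treated as out of range
        return weight * pow(1 - pow(distance / radius, 2), 2)
    return 0.0


def quartic_raw_new(distance, radius, weight):
    """Stand-in for the post-commit kernel: explicit range test."""
    if distance < radius:  # 0.0 now passes and yields the peak value
        return weight * pow(1 - pow(distance / radius, 2), 2)
    return 0.0


# Distances at the point, inside the radius, on the edge, and outside it.
for d in (0.0, 0.5, 1.0, 1.5):
    old = quartic_raw_old(d, 1.0, 2.0)
    new = quartic_raw_new(d, 1.0, 2.0)
    print(f"distance={d}: old={old}, new={new}")
```

The two versions agree inside the radius and differ at exactly `0.0`. For distances beyond the radius, the old stand-in only returns `0.0` if the caller has already masked them, which is why the explicit comparison is the more self-contained check.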
19 changes: 8 additions & 11 deletions geokde/_utils.py
@@ -1,10 +1,8 @@
-from typing import Callable
-
 import geopandas as gpd
 import numpy as np
 import pandas as pd

-from _kernels import (
+from geokde._kernels import (
     epanechnikov_raw,
     epanechnikov_scaled,
     quartic_raw,
@@ -193,18 +191,18 @@ def calculate_kde(
     array : numpy.ndarray
         Array to which KDE values will be added.
     kernel : str
-        Kernel function with which to perform KDE.
+        Kernel with which to perform KDE.
     scale : bool
         Whether to calculate raw or scaled KDE values.
     """
-    kernel_funcs = {
+    kernel_vfuncs = {
         "epanechnikov": epanechnikov_scaled if scale else epanechnikov_raw,
         "quartic": quartic_scaled if scale else quartic_raw,
         "triweight": triweight_scaled if scale else triweight_raw,
     }
     # challenge to vectorise as needs to operate on >1 array element
     for point in points:
-        add_point_kde(*point, array, kernel_funcs[kernel])
+        add_point_kde(*point, array, kernel_vfuncs[kernel])


 def add_point_kde(
@@ -213,7 +211,7 @@ def add_point_kde(
     window: float,
     weight: int | float,
     array: np.ndarray,
-    kernel_func: Callable,
+    kernel_vfunc: np.vectorize,
 ) -> None:
     """Perform KDE for a given point, window, and weight, adding the result to an array.
@@ -229,15 +227,14 @@ def add_point_kde(
         Value with which the KDE value for a point will be weighted.
     array : numpy.ndarray
         Array to which KDE values will be added.
-    kernel_func : typing.Callable
-        Kernel function with which to perform KDE.
+    kernel_vfunc : numpy.vectorize
+        Vectorised kernel function with which to perform KDE.
     """
     minx = round(x - window)
     miny = round(y - window)
     maxx = round(x + window)
     maxy = round(y + window)
     y_idx, x_idx = np.ogrid[miny + .5:maxy + .5, minx + .5:maxx + .5]
     dist_array = np.sqrt(pow(x_idx - x, 2) + pow(y_idx - y, 2))
-    dist_array[dist_array >= window] = 0.0
-    kde_array = kernel_func(dist_array, window, weight)
+    kde_array = kernel_vfunc(dist_array, window, weight)
     array[miny:maxy, minx:maxx] += kde_array
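
Note on `add_point_kde`: the function builds a square window of cell centres around the point, measures each centre's distance to the point, and adds the kernel values into the matching slice of the output array. The following is a loop-based sketch of that idea using nested lists instead of NumPy. It assumes the point and window are already expressed in array (cell) coordinates and that the window lies fully inside the array, neither of which is shown in this diff. `add_point_kde_loops` and this `epanechnikov_raw` are stand-ins for illustration, not the package's functions.

```python
import math


def epanechnikov_raw(distance, radius, weight):
    """Stand-in raw Epanechnikov kernel using the post-commit range test."""
    if distance < radius:
        return weight * (1 - (distance / radius) ** 2)
    return 0.0


def add_point_kde_loops(x, y, window, weight, grid, kernel):
    """Add one point's kernel values to grid (a list of row lists), in place."""
    minx, maxx = round(x - window), round(x + window)
    miny, maxy = round(y - window), round(y + window)
    for row in range(miny, maxy):
        for col in range(minx, maxx):
            # Cell centres sit at index + 0.5, matching the ogrid offsets above.
            dist = math.hypot(col + 0.5 - x, row + 0.5 - y)
            grid[row][col] += kernel(dist, window, weight)


# A point lying exactly on a cell centre produces the 0.0 distance case.
grid = [[0.0] * 6 for _ in range(6)]
add_point_kde_loops(2.5, 2.5, 2, 1.0, grid, epanechnikov_raw)
centre_value = grid[2][2]
neighbour_value = grid[2][3]
```

Here the cell at row 2, column 2 has its centre exactly on the point, so it receives the full weight. With a truthiness check on the distance it would have stayed at zero, which is the case the commit message describes as rare but possible.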