This repository contains a set of Python scripts that perform adaptive geographical masking.
Geographical masking alters the precision or accuracy of geographical data for the purpose of anonymization.
In adaptive geographical masking, the degree of masking is not a fixed value but is adapted to the local density of the risk of re-identification information (RORI). Lower density requires a higher degree of masking, and vice versa.
RORI can be the number of people, the number of households, the number of residential addresses, or a similar count. If RORI is not taken into account during geographical masking, private, sensitive, or confidential information can be linked to the RORI. Such linkage could lead to re-identification.
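The inverse relation between RORI density and masking degree can be illustrated with a small calculation. This is only a sketch and not how the scripts in this repository work (they operate on polygons): it assumes a uniform RORI density and a circular masking area, and the function name `masking_radius` and the example densities are hypothetical.

```python
import math

def masking_radius(density_per_km2, k):
    """Radius (km) of a circle expected to contain k RORI units.

    Assumes RORI units (e.g., households) are spread uniformly,
    which is a simplification made for illustration only.
    """
    if density_per_km2 <= 0:
        raise ValueError("density must be positive")
    # area needed = k / density; area of a circle = pi * r^2
    return math.sqrt(k / (math.pi * density_per_km2))

# Same anonymity target (k = 20 households), two hypothetical densities
dense_radius = masking_radius(2000.0, 20)   # urban area
sparse_radius = masking_radius(50.0, 20)    # rural area
print(dense_radius, sparse_radius)
```

With the same `k`, the sparse area needs the larger radius, which is the behaviour described above.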
-
Original data: a point shapefile with the locations of private, sensitive, or confidential information (e.g., locations of domestic violence events, addresses of patients with a disease).
-
RORI polygons: a polygon shapefile with a RORI attribute (e.g., postcodes with the number of households in each polygon).
-
Streets: a line shapefile that represents the road network of the study area. This is needed only for the Voronoi Masking Method.
Scope: aggregate polygons into new polygons whose RORI attribute values are equal to or greater than a minimum value.
-
AdaptiveElimination: Creates spatial K-anonymized polygons by eliminating irregular polygons; iterates through each set of polygons of the same RORI value starting from the minimum value
-
AdaptiveDissolvingID: Creates spatial K-anonymized polygons by dissolving regular polygons; iterates through each polygon based on its ID attribute
-
AdaptiveDissolvingMin: Creates spatial K-anonymized polygons by dissolving regular polygons; iterates through each set of polygons of the same RORI value starting from the minimum value
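The aggregation idea behind these scripts can be sketched in plain Python, without ArcPy. The sketch below is not the repository's implementation: it starts from the group with the minimum RORI value (as AdaptiveDissolvingMin does) and merges it with a neighbour until every group reaches the minimum value K. The function name `aggregate_to_k`, the input dictionaries, and the rule of merging with the lowest-RORI neighbour are assumptions made for illustration.

```python
def aggregate_to_k(rori, neighbors, k):
    """Greedily merge adjacent polygons until each group's RORI total is >= k.

    rori:      dict polygon_id -> RORI count
    neighbors: dict polygon_id -> set of adjacent polygon_ids (symmetric)
    Returns a dict mapping frozenset(member ids) -> RORI total of the group.
    """
    members = {pid: {pid} for pid in rori}   # every polygon starts as its own group
    total = dict(rori)
    adj = {pid: set(neighbors.get(pid, ())) for pid in rori}

    while True:
        # Groups still below k that have at least one neighbour to merge with
        low = [g for g in total if total[g] < k and adj[g]]
        if not low:
            break
        g = min(low, key=lambda x: (total[x], x))      # start from the minimum value
        # Assumption: merge with the neighbour that has the smallest total
        n = min(adj[g], key=lambda x: (total[x], x))
        members[g] |= members.pop(n)
        total[g] += total.pop(n)
        adj[g] |= adj.pop(n)
        adj[g] -= {g, n}
        for other in adj[g]:                            # re-point neighbours of n to g
            adj[other].discard(n)
            adj[other].add(g)
    return {frozenset(m): total[g] for g, m in members.items()}

# Hypothetical chain of four postcodes A-B-C-D with household counts
rori = {"A": 3, "B": 4, "C": 10, "D": 2}
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
groups = aggregate_to_k(rori, neighbors, k=5)
print(groups)
```

A group that has no neighbours left and is still below K stays as it is in this sketch; how such cases are handled in practice depends on the chosen script.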
-
PointAggregation: original points are displaced to the centroid of their corresponding spatial K-anonymized polygon (SKApoly).
-
RandomPerturbation: original points are randomly displaced (distance + direction) within their corresponding SKApoly.
-
AdaptiveVoronoiMasking: original points are displaced to the closest segment of their corresponding Voronoi polygon that lies within their corresponding SKApoly. Two exceptions apply. If a Voronoi segment lies outside its SKApoly, the point is displaced to the boundary of the SKApoly. If there is only one point within the SKApoly, it is randomly displaced within the SKApoly. Finally, the displaced points are moved to the closest street intersection.
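The geometric operations behind these masking methods can be sketched in plain Python. This is a minimal illustration, not the ArcPy code of the repository: it assumes a SKApoly given as a simple ring of `(x, y)` vertices in projected coordinates, and the helper names, the `max_dist` parameter, and the retry limit are hypothetical. The Voronoi step itself is not shown; only the centroid displacement, the random displacement by distance and direction, and the final snap to the closest street intersection are covered.

```python
import math
import random

def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon [(x, y), ...] (shoelace formula)."""
    a = cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    a *= 0.5
    return cx / (6.0 * a), cy / (6.0 * a)

def point_in_polygon(pt, ring):
    """Ray-casting test: True if pt lies inside the ring."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

def random_perturbation(pt, ring, max_dist, rng, max_tries=1000):
    """Displace pt by a random distance and direction, staying inside the ring."""
    for _ in range(max_tries):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        dist = rng.uniform(0.0, max_dist)
        cand = (pt[0] + dist * math.cos(angle), pt[1] + dist * math.sin(angle))
        if point_in_polygon(cand, ring):
            return cand
    raise RuntimeError("no displaced location found inside the polygon")

def snap_to_nearest(pt, intersections):
    """Return the street intersection closest to pt."""
    return min(intersections, key=lambda q: math.dist(pt, q))

# Hypothetical SKApoly (a 4 x 4 square) and street intersections
skapoly = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
intersections = [(0.0, 0.0), (2.0, 2.0), (4.0, 4.0)]
rng = random.Random(42)  # fixed seed only to make the example repeatable

aggregated = polygon_centroid(skapoly)                         # PointAggregation
perturbed = random_perturbation((1.0, 1.0), skapoly, 2.0, rng)  # RandomPerturbation
snapped = snap_to_nearest(perturbed, intersections)            # last step of AVM
print(aggregated, perturbed, snapped)
```

In real use the seed should not be fixed or published, since a reproducible displacement could be reversed.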
-
The scripts are written in Python and use the ArcPy package: https://pro.arcgis.com/en/pro-app/arcpy/get-started/what-is-arcpy-.htm
-
Data should be in a shapefile format: https://desktop.arcgis.com/en/arcmap/10.3/manage-data/shapefiles/what-is-a-shapefile.htm
-
Data should additionally be copied into a personal geodatabase (.mdb): https://desktop.arcgis.com/en/arcmap/latest/manage-data/administer-file-gdbs/create-personal-geodatabase.htm
-
For Adaptive Voronoi Masking (AVM, script 6), data must be in shapefile format (NOT .mdb).
-
Sample data for testing the scripts are provided (point, polygon, and roads files); the data are located in Saxony, Germany.
-
An ArcGIS toolbox has also been developed for Adaptive Areal Anonymization. The toolbox implements two methods: a) a version of Adaptive Areal Elimination and b) Adaptive Areal Masking. The toolbox can be found here: Adaptive Areal Anonymization Toolbox
Kounadi, O., & Leitner, M. (2016). Adaptive areal elimination (AAE): A transparent way of disclosing protected spatial datasets. Computers, Environment and Urban Systems, 57, 59-67.
Polzin, F. (2020). Adaptive Voronoi Masking: A method to protect confidential discrete spatial data. MSc thesis, GIMA – Geographical Information Management and Applications. University of Utrecht – TU Delft – Wageningen University – University of Twente.
Charleux, L., & Schofield, K. (2020). True spatial k-anonymity: adaptive areal elimination vs. adaptive areal masking. Cartography and Geographic Information Science, 1-13.