Merge pull request #15 from UCA-Datalab/develop
Develop
daniprec authored Apr 7, 2022
2 parents 317e362 + 0c72a0a commit a2a6fb7
Showing 14 changed files with 905 additions and 96 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -2,6 +2,7 @@ notebooks/
plots/
output*/
*.csv
*.ipynb

# Log
log.out
130 changes: 109 additions & 21 deletions README.md
@@ -1,4 +1,67 @@
# NILM: classification VS regression
<!-- README template: https://github.com/othneildrew/Best-README-Template -->

<!-- PROJECT SHIELDS -->
<!--
*** I'm using markdown "reference style" links for readability.
*** Reference links are enclosed in brackets [ ] instead of parentheses ( ).
*** See the bottom of this document for the declaration of the reference variables
*** for contributors-url, forks-url, etc. This is an optional, concise syntax you may use.
*** https://www.markdownguide.org/basic-syntax/#reference-style-links
-->
[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Stargazers][stars-shield]][stars-url]
[![Issues][issues-shield]][issues-url]
[![LinkedIn][linkedin-shield]][linkedin-url]

<!-- PROJECT LOGO -->
<br />
<p align="center">
<a href="https://github.com/UCA-Datalab">
<img src="images/logo.png" alt="Logo" width="400" height="80">
</a>

<h3 align="center">NILM: classification VS regression</h3>
</p>


<!-- TABLE OF CONTENTS -->
<details open="open">
<summary>Table of Contents</summary>
<ol>
<li>
<a href="#about-the-project">About The Project</a>
</li>
<li>
<a href="#getting-started">Getting Started</a>
<ul>
<li><a href="#create-the-environment">Create the Environment</a></li>
</ul>
</li>
<li>
<a href="#datasets">Datasets</a>
<ul>
<li><a href="#uk-dale">UK-DALE</a></li>
</ul>
<ul>
<li><a href="#pecan-street-dataport">Pecan Street Dataport</a></li>
</ul>
<li><a href="#preprocess-the-data">Preprocess the Data</a></li>
</li>
<li>
<a href="#train">Train</a>
<ul>
<li><a href="#reproduce-the-paper">Reproduce the Paper</a></li>
<li><a href="#thresholding-methods">Thresholding Methods</a></li>
</ul>
</li>
<li><a href="#publications">Publications</a></li>
<li><a href="#contact">Contact</a></li>
<li><a href="#acknowledgements">Acknowledgements</a></li>
</ol>
</details>

## About the project

Non-Intrusive Load Monitoring (NILM) aims to predict the status
or consumption of domestic appliances in a household only by knowing
@@ -14,10 +77,10 @@ deep learning state-of-the-art architectures on both the regression and
classification problems, introducing criteria to select the most convenient
thresholding method.

Source: [see publications](#publications)
## Getting started
### Create the Environment

## Set up
### Create the environment using Conda
To create the environment using Conda:

1. Install miniconda

@@ -43,12 +106,10 @@ Source: [see publications](#publications)
conda activate nilm-thresholding
```
## Data
## Datasets
### UK-DALE
#### Download UK-DALE
The UK-DALE dataset is hosted at the following link:
[https://data.ukedc.rl.ac.uk/browse/edc/efficiency/residential
/EnergyConsumption/Domestic/UK-DALE-2017/UK-DALE-FULL-disaggregated](https://data.ukedc.rl.ac.uk/browse/edc/efficiency/residential/EnergyConsumption/Domestic/UK-DALE-2017/UK-DALE-FULL-disaggregated)
@@ -69,7 +130,15 @@ nilm-thresholding
Credit: [Jack Kelly](https://jack-kelly.com/data/)
### Preprocess
### Pecan Street Dataport
We are aiming to include this dataset in a future release. You can check the issue here: [https://github.com/UCA-Datalab/nilm-thresholding/issues/8](https://github.com/UCA-Datalab/nilm-thresholding/issues/8)
Any help and suggestions are welcome!
Credit: [Pecan Street](https://dataport.pecanstreet.org/)
## Preprocess the Data
Once you have downloaded the raw data from any of the sources above,
you must preprocess it.
@@ -106,23 +175,23 @@ If you want to use your own set of parameters, duplicate the aforementioned
configuration file and modify the parameters you want to change (without deleting any
parameter). You can then use that config file with the following command:
```
python nilmth/train.py --path_config <path to your config file>
```
For more information about the script, run:
```
python nilmth/train.py --help
```
Once the models are trained, test them with:
```
python nilmth/test.py --path_config <path to your config file>
```
#### Reproduce paper
### Reproduce the Paper
To reproduce the results shown in [our paper](#publications), activate the
environment and then run:
@@ -136,11 +205,11 @@ models are stored. Then, the script `train.py` will be called, using each
configuration in turn. This will store the model weights, which will be used
again during the test phase:
```
nohup sh test_sequential.sh > log.out &
```
### Thresholding methods
### Thresholding Methods
There are three threshold methods available. Read [our paper](#publications)
to understand how each threshold works.
@@ -151,13 +220,32 @@ to understand how each threshold works.
## Publications
[NILM as a regression versus classification problem:
* [NILM as a regression versus classification problem:
the importance of thresholding](https://www.researchgate.net/project/Non-Intrusive-Load-Monitoring-6)
## Contact information
## Contact
Daniel Precioso - [daniprec](https://github.com/daniprec) - daniel.precioso@uca.es
Project link: [https://github.com/UCA-Datalab/nilm-thresholding](https://github.com/UCA-Datalab/nilm-thresholding)
ResearchGate link: [https://www.researchgate.net/project/NILM-classification-VS-regression](https://www.researchgate.net/project/NILM-classification-VS-regression)
## Acknowledgements
* [UCA DataLab](http://datalab.uca.es/)
* [David Gómez-Ullate](https://www.linkedin.com/in/david-g%C3%B3mez-ullate-oteiza-87a820b/?originalSubdomain=en)
Author: Daniel Precioso, PhD student at Universidad de Cádiz
- Email: daniel.precioso@uca.es
- [Github](https://github.com/daniprec)
- [LinkedIn](https://www.linkedin.com/in/daniel-precioso-garcelan/)
- [ResearchGate](https://www.researchgate.net/profile/Daniel_Precioso_Garcelan)
<!-- MARKDOWN LINKS & IMAGES -->
<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
[contributors-shield]: https://img.shields.io/github/contributors/UCA-Datalab/nilm-thresholding.svg?style=for-the-badge
[contributors-url]: https://github.com/UCA-Datalab/nilm-thresholding/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/UCA-Datalab/nilm-thresholding.svg?style=for-the-badge
[forks-url]: https://github.com/UCA-Datalab/nilm-thresholding/network/members
[stars-shield]: https://img.shields.io/github/stars/UCA-Datalab/nilm-thresholding.svg?style=for-the-badge
[stars-url]: https://github.com/UCA-Datalab/nilm-thresholding/stargazers
[issues-shield]: https://img.shields.io/github/issues/UCA-Datalab/nilm-thresholding.svg?style=for-the-badge
[issues-url]: https://github.com/UCA-Datalab/nilm-thresholding/issues
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
[linkedin-url]: https://www.linkedin.com/in/daniel-precioso-garcelan/
Binary file added images/logo.png
160 changes: 160 additions & 0 deletions nilmth/data/clustering.py
@@ -0,0 +1,160 @@
import itertools
from typing import Optional, Tuple

import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster.hierarchy import cophenet, dendrogram, fcluster, linkage
from scipy.spatial.distance import pdist


class HierarchicalClustering:
    def __init__(
        self, distance: str = "average", n_cluster: int = 2, criterion: str = "maxclust"
    ):
        """This object is able to perform Hierarchical Clustering on a given set of points

        Parameters
        ----------
        distance : str, optional
            Linkage method passed to `scipy.cluster.hierarchy.linkage`,
            by default "average"
        n_cluster : int, optional
            Number of clusters to form, by default 2
        criterion : str, optional
            Criterion used to compute the clusters, by default "maxclust"
        """
        self.distance = distance
        self.n_cluster = n_cluster
        self.criterion = criterion

        # Attributes filled with `perform_clustering`
        self.x = np.empty(0)  # Set of data points
        self.z = np.empty(0)  # The hierarchical clustering encoded as a linkage matrix
        # z[i] will tell us which clusters were merged in the i-th iteration

        # Attributes filled with `plot_dendrogram`
        self.dendrogram = {}
        # A dictionary of data structures computed to render the dendrogram

        # Attributes filled with `compute_thresholds_and_centroids`
        self.thresh = np.empty(0)
        self.centroids = np.empty(0)

    def perform_clustering(
        self, ser: np.ndarray, distance: Optional[str] = None
    ) -> None:
        """Performs the actual clustering, using the linkage function

        The result is stored in `self.z`; nothing is returned.

        Parameters
        ----------
        ser : np.ndarray
            Series of points to group in clusters
        distance : str, optional
            Linkage method, by default None (takes the one from the class)
        """
        self.distance = distance if distance is not None else self.distance
        # The shape of our X matrix must be (n, m)
        # n = samples, m = features
        self.x = np.expand_dims(ser, axis=1)
        self.z = linkage(self.x, method=self.distance)

    @property
    def cophenet(self):
        # Cophenetic correlation coefficient
        c, coph_dists = cophenet(self.z, pdist(self.x))
        return c

    def plot_dendrogram(
        self,
        p: int = 6,
        max_d: Optional[float] = None,
        figsize: Tuple[int, int] = (3, 3),
    ):
        """Plots the dendrogram

        Parameters
        ----------
        p : int, optional
            Number of last merged clusters to show (`truncate_mode="lastp"`),
            by default 6
        max_d : Optional[float], optional
            Distance at which to draw a cut-off line, by default None (no line)
        figsize : Tuple[int, int], optional
            Figure size, by default (3, 3)
        """
        fig, ax = plt.subplots(figsize=figsize)
        self.dendrogram = dendrogram(
            self.z,
            p=p,
            orientation="right",
            truncate_mode="lastp",
            labels=self.x[:, 0],
            ax=ax,
        )
        if max_d is not None:
            ax.axvline(x=max_d, c="k")
        return fig, ax

    @property
    def dendrogram_distance(self):
        # Unique merge distances of the plotted dendrogram, largest first
        return sorted(set(itertools.chain(*self.dendrogram["dcoord"])), reverse=True)

    def plot_dendrogram_distance(self, figsize: Tuple[int, int] = (10, 3)):
        """Plots the dendrogram distances

        Parameters
        ----------
        figsize : Tuple[int, int], optional
            Size of the figure, by default (10, 3)
        """
        # Initialize plots
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=figsize)
        # Dendrogram distance
        ax1.scatter(
            range(2, len(self.dendrogram_distance) + 1), self.dendrogram_distance[:-1]
        )
        ax1.set_ylabel("Distance")
        ax1.set_xlabel("Number of clusters")
        ax1.grid()
        # Dendrogram distance difference (relative drop between consecutive distances)
        diff = np.divide(
            -np.diff(self.dendrogram_distance), self.dendrogram_distance[:-1]
        )
        ax2.scatter(range(3, len(self.dendrogram_distance) + 1), diff[:-1])
        ax2.set_ylabel("Gradient")
        ax2.set_xlabel("Number of clusters")
        ax2.grid()
        return fig, (ax1, ax2)

    def compute_thresholds_and_centroids(
        self,
        n_cluster: Optional[int] = None,
        criterion: Optional[str] = None,
        centroid: str = "median",
    ):
        """Computes the thresholds and centroids of each group

        Parameters
        ----------
        n_cluster : Optional[int], optional
            Number of clusters, by default None (takes the one from the class)
        criterion : Optional[str], optional
            Criterion used to compute the clusters, by default None
            (takes the one from the class)
        centroid : str, optional
            Method to compute the centroids (median or mean), by default "median"
        """
        self.n_cluster = n_cluster if n_cluster is not None else self.n_cluster
        self.criterion = criterion if criterion is not None else self.criterion
        clusters = fcluster(self.z, self.n_cluster, self.criterion)
        # Get centroids
        if centroid == "median":
            fun = np.median
        elif centroid == "mean":
            fun = np.mean
        else:
            # Previously an unknown value left `fun` undefined (NameError below)
            raise ValueError(f"centroid must be 'median' or 'mean', got '{centroid}'")
        self.centroids = np.array(
            sorted([fun(self.x[clusters == (c + 1)]) for c in range(self.n_cluster)])
        )
        # Sort clusters by power
        x_max = sorted(
            [np.max(self.x[clusters == (c + 1)]) for c in range(self.n_cluster)]
        )
        x_min = sorted(
            [np.min(self.x[clusters == (c + 1)]) for c in range(self.n_cluster)]
        )
        # Each threshold is the midpoint between the maximum of one cluster
        # and the minimum of the next one; the first threshold is fixed at 0
        thresh = np.divide(np.array(x_min[1:]) + np.array(x_max[:-1]), 2)
        self.thresh = np.insert(thresh, 0, 0, axis=0)
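
The midpoint rule at the end of `compute_thresholds_and_centroids` can be illustrated without SciPy. The following is only a sketch under stated assumptions: the helper name `thresholds_and_centroids`, the sample power readings and the cluster labels are all hypothetical, the labels are supplied by hand instead of coming from `fcluster`, and the clusters are assumed not to overlap along the power axis.

```python
from statistics import median


def thresholds_and_centroids(values, labels):
    """Return (thresholds, centroids) for 1-D values grouped by cluster label.

    Hypothetical helper: it mirrors the midpoint rule only, not the clustering.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    # Order clusters by their lowest value so neighbours are adjacent in power
    ordered = sorted(groups.values(), key=min)
    centroids = [median(group) for group in ordered]
    # Threshold between two neighbours: midpoint of the lower cluster's
    # maximum and the upper cluster's minimum
    midpoints = [
        (max(low) + min(high)) / 2 for low, high in zip(ordered, ordered[1:])
    ]
    # The first threshold is fixed at 0, as in the class above
    return [0.0] + midpoints, centroids


# Made-up power readings (watts) with hand-assigned cluster labels
power = [0, 1, 2, 50, 52, 55, 200, 210]
labels = [1, 1, 1, 2, 2, 2, 3, 3]
thresholds, centroids = thresholds_and_centroids(power, labels)
print(thresholds)  # [0.0, 26.0, 127.5]
print(centroids)  # [1, 52, 205.0]
```

With three clusters this gives three thresholds: 0 for the lowest state, then one boundary between each pair of neighbouring clusters.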