Add support for drawing histogram with aggregated (buckets + counts) data (#51)
Kami authored Nov 29, 2022
1 parent 589fe49 commit 8bb6e0e
Showing 5 changed files with 118 additions and 13 deletions.
50 changes: 50 additions & 0 deletions README.md
@@ -272,6 +272,56 @@ In [9]: print(plotille.hist(np.random.normal(size=10000)))

![Example hist](https://github.com/tammoippen/plotille/raw/master/imgs/hist.png)

#### Hist (aggregated)

Use this function to draw a histogram when your data is already aggregated, i.e. you don't have access to the raw values, only to the bins and the count for each bin.

This comes in handy when working with APIs such as the [OpenTelemetry Metrics API](https://opentelemetry-python.readthedocs.io/en/latest/api/metrics.html),
where views such as [ExplicitBucketHistogramAggregation](https://opentelemetry-python.readthedocs.io/en/latest/sdk/metrics.view.html#opentelemetry.sdk.metrics.view.ExplicitBucketHistogramAggregation)
only expose aggregated values (the count for each bin / bucket).
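
As a rough sketch of how such aggregated data could be prepared: explicit-bucket histograms usually report N finite boundaries together with N+1 bucket counts, the outermost buckets being open-ended. The helper name `bounds_to_bins` and the variable names below are assumptions made for illustration and are not part of plotille. Note also that the exporting API may treat boundary inclusivity differently from the `[bins[i], bins[i+1])` convention used here.

```python
import math


def bounds_to_bins(explicit_bounds):
    """Hypothetical helper: wrap N finite boundaries into N+2 bin limits.

    The result starts at -inf and ends at +inf, so it describes N+1 bins.
    """
    return [-math.inf, *explicit_bounds, math.inf]


# Illustrative values matching the README example.
explicit_bounds = [10, 50, 100, 200, 300, 500, 800, 1000, 2000, 10000]
bucket_counts = [1945, 0, 0, 0, 0, 0, 10555, 798, 0, 28351, 0]

bins = bounds_to_bins(explicit_bounds)

# hist_aggregated expects one more limit than there are counts.
assert len(bins) == len(bucket_counts) + 1

# With plotille installed, the result could then be drawn with:
#   print(plotille.hist_aggregated(bucket_counts, bins))
```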

```python
In [8]: plotille.hist_aggregated?
Signature:
plotille.hist_aggregated(
counts,
bins,
width=80,
log_scale=False,
linesep='\n',
lc=None,
bg=None,
color_mode='names',
)
Docstring:
Create histogram for aggregated data.

Parameters:
counts: List[int] Counts for each bucket.
bins: List[float] Limits for the bins for the provided counts: limits for
bin `i` are `[bins[i], bins[i+1])`.
Hence, `len(bins) == len(counts) + 1`.
width: int The number of characters for the width (columns).
log_scale: bool Scale the histogram with `log` function.
linesep: str The requested line separator. default: os.linesep
lc: multiple Give the line color.
bg: multiple Give the background color.
color_mode: str Specify color input mode; 'names' (default), 'byte' or 'rgb'
see plotille.color.__docs__
Returns:
str: histogram over `X` from left to right.
In [9]: counts = [1945, 0, 0, 0, 0, 0, 10555, 798, 0, 28351, 0]
In [10]: bins = [float('-inf'), 10, 50, 100, 200, 300, 500, 800, 1000, 2000, 10000, float('+inf')]
In [11]: print(plotille.hist_aggregated(counts, bins))
```
Keep in mind that `bins` must always contain n+1 limits, where n is the number of count values (11 in the example above, so 12 limits).
In this example the first bin is [-inf, 10) with a count of 1945 and the last bin is [10000, +inf) with a count of 0.
![Example hist](https://github.com/tammoippen/plotille/raw/master/imgs/hist_aggregated.png)
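
The relationship between limits and counts can be checked before plotting. The following is a minimal sketch, not part of plotille: `bucket_rows` is a hypothetical helper that validates the lengths and pairs every count with its half-open interval.

```python
def bucket_rows(counts, bins):
    """Hypothetical helper: pair each count with its [lower, upper) limits."""
    if len(bins) != len(counts) + 1:
        raise ValueError(
            'expected {} bin limits for {} counts, got {}'.format(
                len(counts) + 1, len(counts), len(bins)))
    return [(bins[i], bins[i + 1], counts[i]) for i in range(len(counts))]


counts = [1945, 0, 0, 0, 0, 0, 10555, 798, 0, 28351, 0]
bins = [float('-inf'), 10, 50, 100, 200, 300, 500, 800, 1000, 2000, 10000, float('+inf')]

rows = bucket_rows(counts, bins)
for lower, upper, count in rows:
    print('[{}, {}) -> {}'.format(lower, upper, count))
```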
#### Histogram
There is also another more 'usual' histogram function available:
Binary file added imgs/hist_aggregated.png
3 changes: 2 additions & 1 deletion plotille/__init__.py
@@ -27,7 +27,7 @@
from ._cmaps import Colormap, ListedColormap
from ._colors import color, hsl
from ._figure import Figure
from ._graphs import hist, histogram, plot, scatter
from ._graphs import hist, hist_aggregated, histogram, plot, scatter


__all__ = [
@@ -36,6 +36,7 @@
'Colormap',
'Figure',
'hist',
'hist_aggregated',
'histogram',
'hsl',
'ListedColormap',
51 changes: 40 additions & 11 deletions plotille/_graphs.py
@@ -32,24 +32,23 @@
from ._util import hist as compute_hist


def hist(X, bins=40, width=80, log_scale=False, linesep=os.linesep,
lc=None, bg=None, color_mode='names'):
"""Create histogram over `X` from left to right
The values on the left are the center of the bucket, i.e. `(bin[i] + bin[i+1]) / 2`.
The values on the right are the total counts of this bucket.
def hist_aggregated(counts, bins, width=80, log_scale=False, linesep=os.linesep,
lc=None, bg=None, color_mode='names'):
"""
Create histogram for aggregated data.
Parameters:
X: List[float] The items to count over.
bins: int The number of bins to put X entries in (rows).
counts: List[int] Counts for each bucket.
bins: List[float] Limits for the bins for the provided counts: limits for
bin `i` are `[bins[i], bins[i+1])`.
Hence, `len(bins) == len(counts) + 1`.
width: int The number of characters for the width (columns).
log_scale: bool Scale the histogram with `log` function.
linesep: str The requested line separator. default: os.linesep
lc: multiple Give the line color.
bg: multiple Give the background color.
color_mode: str Specify color input mode; 'names' (default), 'byte' or 'rgb'
see plotille.color.__docs__
Returns:
str: histogram over `X` from left to right.
"""
@@ -58,14 +57,18 @@ def _scale(a):
return log(a)
return a

h = counts
b = bins

ipf = InputFormatter()
h, b = compute_hist(X, bins)
h_max = _scale(max(h)) or 1
delta = b[-1] - b[0]

bins_count = len(h)

canvas = [' bucket | {} {}'.format('_' * width, 'Total Counts')]
lasts = ['', '⠂', '⠆', '⠇', '⡇', '⡗', '⡷', '⡿']
for i in range(bins):
for i in range(bins_count):
hight = int(width * 8 * _scale(h[i]) / h_max)
canvas += ['[{}, {}) | {} {}'.format(
ipf.fmt(b[i], delta=delta, chars=8, left=True),
@@ -77,6 +80,32 @@ def _scale(a):
return linesep.join(canvas)


def hist(X, bins=40, width=80, log_scale=False, linesep=os.linesep,
lc=None, bg=None, color_mode='names'):
"""Create histogram over `X` from left to right
The values on the left are the center of the bucket, i.e. `(bin[i] + bin[i+1]) / 2`.
The values on the right are the total counts of this bucket.
Parameters:
X: List[float] The items to count over.
bins: int The number of bins to put X entries in (rows).
width: int The number of characters for the width (columns).
log_scale: bool Scale the histogram with `log` function.
linesep: str The requested line separator. default: os.linesep
lc: multiple Give the line color.
bg: multiple Give the background color.
color_mode: str Specify color input mode; 'names' (default), 'byte' or 'rgb'
see plotille.color.__docs__
Returns:
str: histogram over `X` from left to right.
"""
counts, bins = compute_hist(X, bins)
return hist_aggregated(counts=counts, bins=bins, width=width, log_scale=log_scale,
linesep=linesep, lc=lc, bg=bg, color_mode=color_mode)


def histogram(X, bins=160, width=80, height=40, X_label='X', Y_label='Counts', linesep=os.linesep,
x_min=None, x_max=None, y_min=None, y_max=None,
lc=None, bg=None, color_mode='names'):
27 changes: 26 additions & 1 deletion tests/test_hist.py
@@ -6,7 +6,7 @@
from pendulum import datetime, duration
import pytest

from plotille import hist
from plotille import hist, hist_aggregated

try:
import numpy as np
@@ -30,6 +30,24 @@ def expected_hist(cleandoc):
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾""") # noqa: E501


@pytest.fixture()
def expected_hist_aggregated(cleandoc):
return cleandoc("""
bucket | ________________________________________________________________________________ Total Counts
[-inf , 10) | ⣿⣿⣿⣿⣿⠇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 1945
[10 , 50) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[50 , 100) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[100 , 200) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[200 , 300) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[300 , 500) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[500 , 800) | ⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⡷⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 10555
[800 , 1000) | ⣿⣿⠆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 798
[1000 , 2000) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
[2000 , 10000) | ⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀ 28351
[10000 , inf) | ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ 0
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾""") # noqa: E501


@pytest.mark.skipif(not have_numpy, reason='No numpy installed.')
def test_timehist_numpy(expected_hist):
day = np.timedelta64(1, 'D')
@@ -65,3 +83,10 @@ def test_timehist_orig_dt(expected_hist):
# print()
# print(res)
assert expected_hist == res


def test_hist_aggregated(expected_hist, expected_hist_aggregated):
counts = [1945, 0, 0, 0, 0, 0, 10555, 798, 0, 28351, 0]
bins = [float('-inf'), 10, 50, 100, 200, 300, 500, 800, 1000, 2000, 10000, float('+inf')]
res = hist_aggregated(counts=counts, bins=bins)
assert expected_hist_aggregated == res
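
The bar lengths in the expected output can be reasoned about from the `hight` computation and the `lasts` list in `_graphs.py`. The sketch below is an assumption about how the elided lines combine the two (full `⣿` cells for every 8 units, then one partial cell from `lasts`). It is not the library's code, ignores `log_scale`, colors and padding, and `bar` is a hypothetical name.

```python
# Partial-cell glyphs, copied from the `lasts` list in the diff.
LASTS = ['', '⠂', '⠆', '⠇', '⡇', '⡗', '⡷', '⡿']


def bar(count, max_count, width=80):
    """Hypothetical stand-in for the bar-drawing part of hist_aggregated."""
    # Each character cell is split into 8 steps, so the longest bar has
    # width * 8 steps; every other bar is scaled relative to max_count.
    height = int(width * 8 * count / (max_count or 1))
    return '⣿' * (height // 8) + LASTS[height % 8]


counts = [1945, 0, 0, 0, 0, 0, 10555, 798, 0, 28351, 0]
for count in counts:
    print('{:>6} {}'.format(count, bar(count, max(counts))))
```

Under these assumptions the count 1945 yields five full cells followed by `⠇`, and 798 yields two full cells followed by `⠆`, which is consistent with the corresponding rows of the `expected_hist_aggregated` fixture.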
