Skip to content

Latest commit

 

History

History
50 lines (38 loc) · 1.53 KB

README.md

File metadata and controls

50 lines (38 loc) · 1.53 KB

python-rle

Run-length encoding (wikipedia link) for data analysis in Python. Install with pip install python-rle or conda install -c tnwei python-rle. No dependencies required other than tqdm for visualizing a progress bar.

Usage

Encode any iterable (tuples, lists, pd.Series etc.) with rle.encode.

# rle.encode(iterable) returns (values, counts)
>>> import rle 
>>> rle.encode((10, 10, 10, 20, 20, 20, 30, 30, 30))
([10, 20, 30], [3, 3, 3])

Decode (values, counts) back into a sequence with rle.decode.

>>> rle.decode([10, 20, 30], [3, 3, 3])
[10, 10, 10, 20, 20, 20, 30, 30, 30]

Set progress_bar == True for long sequences :

progress_bar_anim

Motivation

Base R contains a simple rle function that "computes the lengths and values of runs of equal values in a vector", as described by its docstring. I found it useful for calculating streaks in collected data, and is especially wonderful for compiling and summarizing categorical data that describes status over time. Hence this little utility.