release: v2.6.0 (#266)
release: v2.6.0
eonu authored Dec 30, 2024
2 parents a54dcdb + a769620 commit 37ce9f4
Showing 33 changed files with 273 additions and 413 deletions.
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -11,13 +11,13 @@ repos:
pass_filenames: false
# ruff check (w/autofix)
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.1.3 # should match version in pyproject.toml
rev: v0.8.4 # should match version in pyproject.toml
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
# ruff format
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.1.3 # should match version in pyproject.toml
rev: v0.8.4 # should match version in pyproject.toml
hooks:
- id: ruff-format
# # pydoclint - docstring formatting
13 changes: 12 additions & 1 deletion CHANGELOG.md
@@ -388,11 +388,17 @@ Nothing, initial release!

</details>

## [v2.5.0](https://github.com/eonu/sequentia/releases/tag/v2.5.0) - 2024-12-27
## [v2.6.0](https://github.com/eonu/sequentia/releases/tag/v2.6.0) - 2024-12-30

### Bug Fixes

- enable `joblib.Parallel` memory mapping ([#262](https://github.com/eonu/sequentia/issues/262))

### Documentation

- update copyright notice ([#255](https://github.com/eonu/sequentia/issues/255))
- fix `KNNRegressor.window` docstring typo ([#261](https://github.com/eonu/sequentia/issues/261))
- update `README.md` features ([#265](https://github.com/eonu/sequentia/issues/265))

### Features

@@ -402,6 +408,11 @@ Nothing, initial release!
- add `model_selection` sub-package for hyper-parameters ([#257](https://github.com/eonu/sequentia/issues/257))
- add model spec support to `HMMClassifier.__init__` ([#258](https://github.com/eonu/sequentia/issues/258))
- add `HMMClassifier.fit` multiprocessing ([#259](https://github.com/eonu/sequentia/issues/259))
- set `use_c=True` by default for `KNNClassifier`/`KNNRegressor` ([#263](https://github.com/eonu/sequentia/issues/263))

### Styling

- upgrade to `ruff` v0.8.4 and fix type hints ([#264](https://github.com/eonu/sequentia/issues/264))

## [v2.0.2](https://github.com/eonu/sequentia/releases/tag/v2.0.2) - 2024-04-13

12 changes: 8 additions & 4 deletions README.md
@@ -58,6 +58,8 @@ Some examples of how Sequentia can be used on sequence data include:

- **Simplicity and interpretability**: Sequentia offers a limited set of machine learning algorithms, chosen specifically to be more interpretable and easier to configure than more complex alternatives such as recurrent neural networks and transformers, while maintaining a high level of effectiveness.
- **Familiar and user-friendly**: To fit more seamlessly into the workflow of data science practitioners, Sequentia follows the ubiquitous Scikit-Learn API, providing a familiar model development process for many, as well as enabling wider access to the rapidly growing Scikit-Learn ecosystem.
- **Speed**: Some algorithms offered by Sequentia naturally have restrictive runtime scaling, such as k-nearest neighbors. However, our implementation is
optimized to the point of being multiple orders of magnitude faster than similar packages — see the [Benchmarks](#benchmarks) section for more information.

## Build Status

@@ -82,7 +84,7 @@ effective inference algorithm.
- [x] Sakoe–Chiba band global warping constraint
- [x] Dependent and independent feature warping (DTWD/DTWI)
- [x] Custom distance-weighted predictions
- [x] Multi-processed predictions
- [x] Multi-processed prediction

#### [Hidden Markov Models](https://sequentia.readthedocs.io/en/latest/sections/models/hmm/index.html) (via [`hmmlearn`](https://github.com/hmmlearn/hmmlearn))

@@ -99,7 +101,7 @@ based on the provided training sequence data.
- [x] Multivariate real-valued observations (modeled with Gaussian mixture emissions)
- [x] Univariate categorical observations (modeled with discrete emissions)
- [x] Linear, left-right and ergodic topologies
- [x] Multi-processed predictions
- [x] Multi-processed training and prediction

### Scikit-Learn compatibility

@@ -157,7 +159,7 @@ All of the above libraries support multiprocessing, and prediction was performed
<img src="benchmarks/benchmark.svg" width="100%"/>

> **Device information**:
> - Product: ThinkPad T14s (Gen 6)
> - Product: Lenovo ThinkPad T14s (Gen 6)
> - Processor: AMD Ryzen™ AI 7 PRO 360 (8 cores, 16 threads, 2-5GHz)
> - Memory: 64 GB LPDDR5X-7500MHz
> - Solid State Drive: 1 TB SSD M.2 2280 PCIe Gen4 Performance TLC Opal
@@ -175,7 +177,7 @@ pip install sequentia

For optimal performance when using any of the k-NN based models, it is important that the correct `dtaidistance` C libraries are accessible.

Please see the [`dtaidistance` installation guide](https://dtaidistance.readthedocs.io/en/latest/usage/installation.html) for troubleshooting if you run into C compilation issues, or if setting `use_c=True` on k-NN based models results in a warning.
Please see the [`dtaidistance` installation guide](https://dtaidistance.readthedocs.io/en/latest/usage/installation.html) for troubleshooting if you run into C compilation issues, or if using k-NN based models with `use_c=True` results in a warning.

You can use the following to check if the appropriate C libraries are available.

@@ -184,6 +186,8 @@ from dtaidistance import dtw
dtw.try_import_c()
```

If these libraries are unavailable, Sequentia will fall back to using a Python alternative.
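
The availability check above can also be done without triggering a warning. This is a minimal sketch, assuming the compiled DTW extension is importable as `dtaidistance.dtw_cc` (an assumption about the package layout, not something shown in this diff); `c_backend_available` is a hypothetical helper and is not part of Sequentia or `dtaidistance`:

```python
import importlib.util


def c_backend_available(module_name: str = "dtaidistance.dtw_cc") -> bool:
    """Return True if the named module can be located (hypothetical helper).

    The default module name is an assumption about where the compiled
    extension lives; adjust it if the installed package differs.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when the parent package itself is not installed.
        return False


# Boolean flag that reflects whether the C extension was found.
use_c = c_backend_available()
```

The resulting flag could then be passed as `use_c` when constructing `KNNClassifier` or `KNNRegressor`, instead of relying on the new default.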

### Development

Please see the [contribution guidelines](/CONTRIBUTING.md) to see installation instructions for contributing to Sequentia.
34 changes: 18 additions & 16 deletions benchmarks/plot.ipynb
@@ -8,24 +8,13 @@
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"plt.style.use(\"ggplot\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c92bf960-ddb5-409f-bd3c-5bce0a03ccd0",
"metadata": {},
"outputs": [],
"source": [
"from sequentia import"
]
},
{
"cell_type": "code",
"execution_count": 79,
"id": "6649bf2d-7430-401d-8113-f3c1e1cf4779",
"metadata": {},
"outputs": [
@@ -48,23 +37,36 @@
"\n",
"bars = ax.bar(labels, runtimes, width=0.5, color=\"C1\")\n",
"ax.set(xlabel=\"Package\", ylabel=\"Runtime (s)\")\n",
"ax.set_title(\"Univariate DTW-kNN performance (1,500 FSDD train/test sequences, 16 workers)\", fontsize=11)\n",
"ax.set_title(\n",
" (\n",
" \"Univariate DTW-kNN performance \"\n",
" \"(1,500 FSDD train/test sequences, 16 workers)\"\n",
" ),\n",
" fontsize=11,\n",
")\n",
"\n",
"\n",
"def fmt(s: float) -> str:\n",
" \"\"\"Formats the runtime.\"\"\"\n",
" if s < 60:\n",
" return f\"{round(s)}s\"\n",
" m, s = divmod(s, 60)\n",
" return f\"{round(m)}m {round(s)}s\"\n",
"\n",
"\n",
"for bar in bars:\n",
" plt.text(\n",
" bar.get_x() + bar.get_width() / 2, bar.get_height(),\n",
" fmt(bar.get_height()), ha='center', va='bottom', fontsize=9,\n",
" bar.get_x() + bar.get_width() / 2,\n",
" bar.get_height(),\n",
" fmt(bar.get_height()),\n",
" ha=\"center\",\n",
" va=\"bottom\",\n",
" fontsize=9,\n",
" )\n",
"\n",
"for lab in ax.get_xticklabels():\n",
" if lab.get_text() == \"sequentia\":\n",
" lab.set_fontweight('bold')\n",
" if lab.get_text() == \"sequentia\":\n",
" lab.set_fontweight(\"bold\")\n",
"\n",
"plt.tight_layout()\n",
"plt.savefig(\"benchmark.svg\")\n",
6 changes: 2 additions & 4 deletions benchmarks/test_pyts.py
@@ -34,9 +34,7 @@ def prepare(data: SequentialDataset, length: int) -> DataSplit:
return X_pad[:, 0], data.y


def multivariate(
*, train_data: DataSplit, test_data: DataSplit, n_jobs: int
) -> None:
def run(*, train_data: DataSplit, test_data: DataSplit, n_jobs: int) -> None:
"""Fit and predict the classifier."""
# initialize model
clf = KNeighborsClassifier(
@@ -70,7 +68,7 @@ def multivariate(
)

benchmark = timeit.timeit(
"func(train_data=train_data, test_data=test_data, n_jobs=args.n_jobs)",
"run(train_data=train_data, test_data=test_data, n_jobs=args.n_jobs)",
globals=locals(),
number=args.number,
)
4 changes: 2 additions & 2 deletions benchmarks/test_sequentia.py
@@ -21,7 +21,7 @@
random_state: np.random.RandomState = np.random.RandomState(0)


def multivariate(
def run(
*, train_data: SequentialDataset, test_data: SequentialDataset, n_jobs: int
) -> None:
"""Fit and predict the classifier."""
@@ -52,7 +52,7 @@ def multivariate(
train_data, test_data = load_dataset(multivariate=False)

benchmark = timeit.timeit(
"func(train_data=train_data, test_data=test_data, n_jobs=args.n_jobs)",
"run(train_data=train_data, test_data=test_data, n_jobs=args.n_jobs)",
globals=locals(),
number=args.number,
)
6 changes: 2 additions & 4 deletions benchmarks/test_sktime.py
@@ -56,9 +56,7 @@ def prepare(data: SequentialDataset) -> DataSplit:
return X_pd, data.y


def multivariate(
*, train_data: DataSplit, test_data: DataSplit, n_jobs: int
) -> None:
def run(*, train_data: DataSplit, test_data: DataSplit, n_jobs: int) -> None:
"""Fit and predict the classifier."""
# initialize model
clf = KNeighborsTimeSeriesClassifier(
@@ -89,7 +87,7 @@ def multivariate(
train_data, test_data = prepare(train_data), prepare(test_data)

benchmark = timeit.timeit(
"func(train_data=train_data, test_data=test_data, n_jobs=args.n_jobs)",
"run(train_data=train_data, test_data=test_data, n_jobs=args.n_jobs)",
globals=locals(),
number=args.number,
)
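
The three benchmark scripts time a statement string that `timeit` resolves against the mapping passed as `globals`, so the name inside the string has to match a name defined in that mapping. A minimal sketch of the pattern, using a hypothetical `run` stand-in and an explicit namespace dict in place of `locals()`:

```python
import timeit


def run(*, n: int) -> int:
    """Hypothetical stand-in for a benchmarked fit/predict function."""
    return sum(range(n))


# Names that the statement string is allowed to see.
namespace = {"run": run, "n": 1_000}

# "run" and "n" are looked up in `namespace` each time the statement runs.
elapsed = timeit.timeit("run(n=n)", globals=namespace, number=10)

# A statement that refers to a name missing from the mapping fails
# with NameError when it is executed.
try:
    timeit.timeit("func(n=n)", globals=namespace, number=1)
    stale_name_error = ""
except NameError as exc:
    stale_name_error = str(exc)
```

`timeit.timeit` returns the total time in seconds for all `number` executions, so a per-run figure would be `elapsed / 10` here.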
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -23,7 +23,7 @@
project = "sequentia"
copyright = "2019, Sequentia Developers" # noqa: A001
author = "Edwin Onuonga (eonu)"
release = "2.5.0"
release = "2.6.0"

# -- General configuration ---------------------------------------------------

2 changes: 1 addition & 1 deletion make/lint.py
@@ -33,7 +33,7 @@ def check(c: Config) -> None:
def format_(c: Config) -> None:
"""Format Python files."""
commands: list[str] = [
"poetry run ruff --fix .",
"poetry run ruff check --fix .",
"poetry run ruff format .",
]
for command in commands:
17 changes: 8 additions & 9 deletions pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "sequentia"
version = "2.5.0"
version = "2.6.0"
license = "MIT"
authors = ["Edwin Onuonga <ed@eonu.net>"]
maintainers = ["Edwin Onuonga <ed@eonu.net>"]
@@ -86,7 +86,7 @@ tox = "4.11.3"
pre-commit = ">=3"

[tool.poetry.group.lint.dependencies]
ruff = "0.1.3"
ruff = "0.8.4"
pydoclint = "0.3.8"

[tool.poetry.group.docs.dependencies]
@@ -100,8 +100,8 @@ pytest = { version = "^7.4.0" }
pytest-cov = { version = "^4.1.0" }

[tool.ruff]
required-version = "0.1.3"
select = [
required-version = "0.8.4"
lint.select = [
"F", # pyflakes: https://pypi.org/project/pyflakes/
"E", # pycodestyle (error): https://pypi.org/project/pycodestyle/
"W", # pycodestyle (warning): https://pypi.org/project/pycodestyle/
@@ -144,7 +144,7 @@ select = [
"PERF", # perflint: https://pypi.org/project/perflint/
"RUF", # ruff
]
ignore = [
lint.ignore = [
"ANN401", # https://beta.ruff.rs/docs/rules/any-type/
"B905", # https://beta.ruff.rs/docs/rules/zip-without-explicit-strict/
"TD003", # https://beta.ruff.rs/docs/rules/missing-todo-link/
@@ -162,16 +162,15 @@ ignore = [
"C408", # Unnecessary `dict` call (rewrite as a literal)
"D401", # First line of docstring should be in imperative mood
]
ignore-init-module-imports = true # allow unused imports in __init__.py
line-length = 79

[tool.ruff.pydocstyle]
[tool.ruff.lint.pydocstyle]
convention = "numpy"

[tool.ruff.flake8-annotations]
[tool.ruff.lint.flake8-annotations]
allow-star-arg-any = true

[tool.ruff.extend-per-file-ignores]
[tool.ruff.lint.extend-per-file-ignores]
"__init__.py" = ["PLC0414", "F403", "F401", "F405"]
"sequentia/datasets/*.py" = ["B006"]
"sequentia/enums.py" = ["E501"]