Skip to content

RagnarGrootKoerkamp/suffix-array-searching

Repository files navigation

Suffix array searching

This repo contains two crates, one to speed up binary search in sorted integer arrays, and one to speed up suffix array searching. For now, this is a research project. Code is not directly intended to be used as-is as a dependency in other projects, and documentation is sub-par for that purpose.

static-search-tree: 40x faster binary search

This is explained in detail in my blog post.

It provides a data structure that can answer ‘binary search’ queries much faster, at the cost of 6% space overhead: given a list of n sorted integers and a list of queries, it returns for each query the first number that is at least the queried value.

It is on my to-do list, but with low priority, to make this into a nice standalone crate.

To reproduce experiments and plots, first cd static-search-tree, and create the results and plots directories (the scripts expect them to exist).

  1. Run the experiments: cargo run -r --bin bench -- --release.

    See cargo run -r --bin bench -- --help for more options.

  2. To generate plots: python3 ./plot.py. This will error at some point that results-human-release.json isn’t found. To avoid that, comment out the last plot_blog() call.

suffix-array-searching: Searching suffix arrays using static search trees

This is mostly placeholder code and some rough experiments. I may or may not get back to this.