All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
This release only addresses a build error on crates.io. Previously, I had tried to include
a header to render LaTeX, but that proved error-prone. It has been removed in favor of HTML
in the README. I also added a line to Cargo.toml that should build the docs on crates.io
with the `mpi` feature enabled, so that all functions, including the MPI functions, have
documentation generated. Finally, I added documentation to `optimize()`.
No code changes were made from 2.0.0.
This is a complete rewrite in Rust. In addition to changing the language, the following modifications to the algorithm have been made:
- The heart of this algorithm is $O(n^2)$. In order to approximate the null distribution, this $O(n^2)$ operation is performed n times. These permutations are now parallelized, so run time scales down linearly with the number of CPUs until it reaches the time it takes to run a single search through the threshold space. Run time on human data with ~16k genes on 30 CPUs is now less than an hour, and uses less than 3 GB of memory.
- The command line input has been greatly simplified. This additionally represents a significant change to the protocol:
  - Previously, while the documentation described the operation as 'ranking', it would be more appropriate to call it sorting or ordering. Ties were not addressed. In version 2.0.0, we now expect the input to be a ranked list where the first column is the feature identifier and the second column is the rank. It is up to the user to appropriately rank their data. Examples and recommendations are provided in the documentation.
- There are messages printed to stderr when there is more than one set of thresholds with the same minimum p-value and the same intersect size. This occurs due to the nature of the hypergeometric p-value.
- `profiling/` stores runtime and memory usage information from hyperfine and heaptrack, respectively
- GitHub Actions CI has been added to run tests on pushes to `dev` and `main`
- Semantic versioning and GitHub releases have been added
- The package is distributed through crates.io and bioconda
- Docstrings with examples and module level documentation
- tests
- an MPI implementation to parallelize across multiple machines
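
Since 2.0.0 expects pre-ranked input and leaves tie handling to the user, the following is a minimal sketch of one common way to rank with ties (average, or "fractional", ranking). It is not taken from this project's code or documentation: the function name `average_ranks`, the choice that a larger score receives a better (smaller) rank, and the comma-separated output are all assumptions made for illustration.

```rust
/// Hypothetical helper: assign 1-based average ranks to `scores`.
/// Assumption: a larger score gets a smaller (better) rank.
/// Tied scores share the mean of the positions they occupy.
fn average_ranks(scores: &[f64]) -> Vec<f64> {
    // Indices sorted by descending score.
    let mut order: Vec<usize> = (0..scores.len()).collect();
    order.sort_by(|&a, &b| scores[b].total_cmp(&scores[a]));

    let mut ranks = vec![0.0; scores.len()];
    let mut start = 0;
    while start < order.len() {
        // Find the end of the run of scores tied with the one at `start`.
        let mut end = start + 1;
        while end < order.len() && scores[order[end]] == scores[order[start]] {
            end += 1;
        }
        // 1-based positions start+1 ..= end share their mean.
        let mean_rank = (start + 1 + end) as f64 / 2.0;
        for &idx in &order[start..end] {
            ranks[idx] = mean_rank;
        }
        start = end;
    }
    ranks
}

fn main() {
    // Illustrative feature identifiers and scores, not real data.
    let features = ["geneA", "geneB", "geneC", "geneD"];
    let scores = [2.5, 7.0, 2.5, 0.1];

    // First column: feature identifier; second column: rank.
    for (feature, rank) in features.iter().zip(average_ranks(&scores)) {
        println!("{feature},{rank}");
    }
}
```

Here `geneA` and `geneC` are tied, so both receive rank 2.5 rather than an arbitrary 2 and 3. Other tie-handling schemes (minimum rank, dense rank) may suit some data better; the project documentation remains the reference for recommended approaches.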
This version was written by Yiming Kang and is the version that was used to produce the results in "Dual threshold optimization and network inference reveal convergent evidence from TF binding locations and TF perturbation responses".