iirs

IIRS is an Iupac Inverted RepeatS finder written in Rust (rs), ported from IUPACpal, which is the result of this paper.

That is, it is an exact tool for the efficient identification of Inverted Repeats (IRs) in IUPAC-encoded DNA sequences as substrings of a large text, also allowing for potential mismatches and gaps.
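To make the terms concrete, here is a minimal sketch of what "inverted repeat with mismatches and a gap" means for IUPAC symbols. It is not the algorithm used by iirs (which relies on suffix arrays and range minimum queries); it is a naive check for illustration. All names (`iupac_mask`, `complement_mask`, `can_pair`, `arm_mismatches`) are hypothetical, and it assumes that two symbols pair whenever their nucleotide sets allow at least one complementary combination.

```rust
// Bit masks for the four nucleotides. The order A, C, G, T is chosen so
// that complementing (A<->T, C<->G) is a reversal of the four bits.
const A: u8 = 0b0001;
const C: u8 = 0b0010;
const G: u8 = 0b0100;
const T: u8 = 0b1000;

/// Maps an IUPAC nucleotide code to the set of bases it stands for.
fn iupac_mask(code: u8) -> Option<u8> {
    let mask = match code.to_ascii_uppercase() {
        b'A' => A,
        b'C' => C,
        b'G' => G,
        b'T' | b'U' => T,
        b'R' => A | G,
        b'Y' => C | T,
        b'S' => C | G,
        b'W' => A | T,
        b'K' => G | T,
        b'M' => A | C,
        b'B' => C | G | T,
        b'D' => A | G | T,
        b'H' => A | C | T,
        b'V' => A | C | G,
        b'N' => A | C | G | T,
        _ => return None,
    };
    Some(mask)
}

/// Complements every base in the set (A<->T, C<->G).
fn complement_mask(m: u8) -> u8 {
    ((m & A) << 3) | ((m & C) << 1) | ((m & G) >> 1) | ((m & T) >> 3)
}

/// True if some base of `x` is complementary to some base of `y`.
/// Assumption: unknown symbols never pair.
fn can_pair(x: u8, y: u8) -> bool {
    match (iupac_mask(x), iupac_mask(y)) {
        (Some(a), Some(b)) => (a & complement_mask(b)) != 0,
        _ => false,
    }
}

/// Counts mismatches between a left arm starting at `start` and the right
/// arm that follows after `gap` symbols, both of length `arm`.
/// Returns None if the window does not fit in `seq`.
fn arm_mismatches(seq: &[u8], start: usize, arm: usize, gap: usize) -> Option<usize> {
    let end = start.checked_add(arm.checked_mul(2)?)?.checked_add(gap)?;
    if end > seq.len() {
        return None;
    }
    // Position i of the left arm faces position end - 1 - i of the right arm.
    Some((0..arm).filter(|&i| !can_pair(seq[start + i], seq[end - 1 - i])).count())
}
```

For example, in `GAACTTGTTC` the arm `GAAC` is followed by a gap of two symbols and then by its reverse complement `GTTC`, so `arm_mismatches(b"GAACTTGTTC", 0, 4, 2)` reports zero mismatches.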

Compared to the original, this version is faster, platform-independent and modular, which makes it easier to create customized output formats. It requires neither cmake nor sdsl. It uses divsufsort instead of libdivsufsort.

How to use the binary

The command line shares much of the functionality of the original IUPACpal.

Type iirs --help for a full description.

The notable differences are:

Support for multiple sequence names.
ALL_SEQUENCES argument for processing all the sequences in the input file.
Output format.

iirs -f input.fasta -s 't1 t2' -g 5 -F csv
iirs -f input.fasta --seq-names t1 --max-gap 5 --output-format csv
iirs -f input.fasta -s ALL_SEQUENCES -g 5 -m 3 -F csv

Many more practical examples can be found in the justfile.

How to install the binary

You can either build from source:

$ cargo install iirs

Or download the latest binary from releases and extract it somewhere on your $PATH.

Features

By default, range minimum queries use a Sparse Table implementation, and IR centers are processed sequentially. To change this behaviour, enable the features tabulation, parallel, or both. This may result in a significant speed increase:

cargo install iirs --features "parallel tabulation"
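For readers unfamiliar with the default data structure, the following is a minimal sketch of a sparse table for range minimum queries. It is a generic textbook version, not the code in this crate; the type and method names (`SparseTable`, `range_min`) are hypothetical. In suffix-array based tools, such queries are typically run over an LCP array to answer longest-common-extension questions, but how iirs lays this out internally is an assumption here.

```rust
/// Sparse table answering range-minimum queries over a fixed slice.
pub struct SparseTable {
    // table[k][i] holds the minimum of values[i .. i + 2^k].
    table: Vec<Vec<usize>>,
}

impl SparseTable {
    /// Builds the table in O(n log n) time and space.
    pub fn new(values: &[usize]) -> Self {
        let n = values.len();
        let mut table = vec![values.to_vec()];
        let mut k = 1;
        while (1usize << k) <= n {
            let half = 1usize << (k - 1);
            let prev = &table[k - 1];
            // A block of length 2^k is covered by two blocks of length 2^(k-1).
            let row: Vec<usize> = (0..=n - (1usize << k))
                .map(|i| prev[i].min(prev[i + half]))
                .collect();
            table.push(row);
            k += 1;
        }
        Self { table }
    }

    /// Minimum of values[lo..=hi] in O(1).
    /// Panics if lo > hi or hi is out of bounds.
    pub fn range_min(&self, lo: usize, hi: usize) -> usize {
        assert!(lo <= hi && hi < self.table[0].len());
        let len = hi - lo + 1;
        // floor(log2(len)): the largest power of two that fits in the range.
        let k = (usize::BITS - 1 - len.leading_zeros()) as usize;
        let row = &self.table[k];
        // Two overlapping blocks of length 2^k cover the whole range.
        row[lo].min(row[hi + 1 - (1usize << k)])
    }
}
```

The trade-off is the usual one: construction costs O(n log n) memory, in exchange for constant-time queries with no further preprocessing.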

Extra

It can also be used as a library, both in Rust and in Python.

cargo add iirs [--features X]

Or for Python, after cloning the repo (no wheels are published yet):

pip install py-iirs/

Both libraries are minimal and only contain a struct / class SearchParams that does some bounds checking, and a find_irs function.

Testing

cargo test runs the unit tests.
The justfile contains individual tests against sequences. Some use the Linux profiler perf. To see the full list of commands, use just -l.
bench.rs benchmarks against a single file. Use it with just bench after modifying the parameters in bench.rs. To test against different features, add them as arguments: just bench parallel or just bench parallel tabulation.
logs.rs benchmarks against the cpp binary. You will need an IUPACpal binary (which only supports Linux). The binary is expected to be in the bench folder, but that can be changed in logs.rs and validate.py.
Note that just heatmap requires the python libraries listed in bench/requirements.txt.
