Documentation

Marcell Szikszai edited this page Mar 6, 2024 · 3 revisions

Command line

This documentation provides details on the command-line arguments and options available in RNA3DB.

General Usage

$ python -m rna3db [--cpu <cpus>] <command> [<args>]

    --cpu <cpus>: Number of CPUs to use where parallel execution is supported (optional).

Parse

Parse mmCIF files and extract RNAs.

$ python -m rna3db parse <input> <output> [options]

    <input>: Directory containing mmCIF files to parse.
    <output>: Output JSON file.

Options

  • --nmr_resolution <float>
    • Resolution value to assign to NMR structures, which do not report a crystallographic resolution.
    • Default: float('inf')

Filter

Filter a JSON file based on various criteria.

$ python -m rna3db filter <input> <output> [options]

    <input>: Input JSON file.
    <output>: Output JSON file.

Options

  • --single_ratio_cutoff <float>
    • Filter out chains in which a single nucleotide makes up more than this fraction of residues.
    • Default: 0.8
  • --max_unknown_ratio <float>
    • Filter out chains in which more than this fraction of nucleotides are unknown.
    • Default: 0.3
  • --max_resolution <float>
    • Filter out chains whose resolution value is greater than this cutoff.
    • Resolution is given in ångströms (Å).
    • Default: 9.0
  • --min_length <int>
    • Filter out chains shorter than this length.
    • Default: 32
  • --filter_log_path <path>
    • Path to the filter log. The filter log records which filters each sequence triggered.
    • Optional
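
To make the thresholds above concrete, the following is a minimal sketch of how the four criteria could be checked for one chain. It is an illustration only and not RNA3DB's implementation: the function name `filter_hits`, the use of `N` as the symbol for an unknown nucleotide, and the choice to count every symbol (including `N`) in the single-nucleotide check are all assumptions.

```python
from collections import Counter


def filter_hits(sequence, resolution,
                single_ratio_cutoff=0.8, max_unknown_ratio=0.3,
                max_resolution=9.0, min_length=32, unknown="N"):
    """Return the names of the filters a chain would hit (hypothetical helper).

    Defaults mirror the documented option defaults. The comparisons follow the
    wording of the options: "more than" and "over" are strict (>), and
    "shorter than" is strict (<).
    """
    hits = []
    length = len(sequence)

    if length < min_length:
        hits.append("min_length")
    if resolution > max_resolution:
        hits.append("max_resolution")

    if length > 0:
        counts = Counter(sequence)
        # Assumption: unknown nucleotides are written as `unknown` ("N").
        if counts[unknown] / length > max_unknown_ratio:
            hits.append("max_unknown_ratio")
        # Assumption: the most frequent symbol of any kind is considered.
        if max(counts.values()) / length > single_ratio_cutoff:
            hits.append("single_ratio_cutoff")

    return hits


print(filter_hits("ACGU" * 10, 2.5))           # passes every filter
print(filter_hits("A" * 40, 2.5))              # dominated by one nucleotide
print(filter_hits("ACGU" * 5, 12.0))           # too short and too low-resolution
print(filter_hits("ACGU" * 10, float("inf")))  # infinite resolution value
```

Under this sketch, a chain whose resolution is `float('inf')` (the default for `--nmr_resolution`) exceeds any finite `--max_resolution`, so it would be flagged by the resolution filter unless one of the two options is changed. Whether RNA3DB compares the values in exactly this way is not stated here.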

Cluster

Cluster RNAs by sequence and structure similarity.

$ python -m rna3db cluster <input> <output> [options]

    <input>: Input JSON file.
    <output>: Output JSON file.

Options

Split

Split RNA data into training and test sets.

$ python -m rna3db split <input> <output> [options]

    <input>: Input JSON file.
    <output>: Output JSON file.

Options

  • --train_percentage <float>
    • Fraction of the data to assign to the train set (the default of 0.3 corresponds to 30%).
    • Default: 0.3
  • --force_zero_test
    • Force component zero into the test set.
    • Optional
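
Each command above writes a JSON file that the next command can read, which suggests running them in sequence. The sketch below assumes that the order documented on this page (parse, filter, cluster, split) is the intended workflow. The helper names `pipeline_commands` and `run_pipeline` and the intermediate file names are hypothetical and not part of RNA3DB.

```python
import subprocess
import sys
from pathlib import Path


def pipeline_commands(mmcif_dir, workdir, cpus=None):
    """Build the four rna3db invocations, feeding each output JSON to the next stage."""
    workdir = Path(workdir)
    base = [sys.executable, "-m", "rna3db"]
    if cpus is not None:
        # --cpu is a global option, so it goes before the subcommand.
        base += ["--cpu", str(cpus)]

    # Hypothetical intermediate file names, one per stage.
    parsed = str(workdir / "parse.json")
    filtered = str(workdir / "filter.json")
    clustered = str(workdir / "cluster.json")
    split = str(workdir / "split.json")

    return [
        base + ["parse", str(mmcif_dir), parsed],
        base + ["filter", parsed, filtered],
        base + ["cluster", filtered, clustered],
        base + ["split", clustered, split],
    ]


def run_pipeline(mmcif_dir, workdir, cpus=None):
    """Run the stages in order, stopping at the first command that fails."""
    Path(workdir).mkdir(parents=True, exist_ok=True)
    for cmd in pipeline_commands(mmcif_dir, workdir, cpus):
        subprocess.run(cmd, check=True)


# Example (requires rna3db to be installed and a directory of mmCIF files):
# run_pipeline("path/to/mmcif_dir", "rna3db_out", cpus=8)
```

Building the argument lists separately from running them makes it easy to print or inspect the commands before anything is executed. Per-command options such as `--max_resolution` or `--train_percentage` can be appended to the relevant list.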