Documentation
Marcell Szikszai edited this page Mar 6, 2024 · 3 revisions
This documentation provides details on the command-line arguments and options available in RNA3DB.
$ python -m rna3db [--cpu <cpus>] <command> [<args>]
--cpu <cpus>: Number of CPUs to use where possible (optional).
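The invocation pattern above can also be assembled from a script. The following is a minimal sketch, not part of RNA3DB: the helper name `build_rna3db_command` and the example paths are hypothetical. It only builds the argument list and does not run anything, and it places `--cpu` before the subcommand, as in the usage line.

```python
import sys


def build_rna3db_command(command, args, cpus=None):
    """Return an argv list for `python -m rna3db`; nothing is executed here."""
    argv = [sys.executable, "-m", "rna3db"]
    if cpus is not None:
        # --cpu is a top-level option, so it goes before the subcommand.
        argv += ["--cpu", str(cpus)]
    return argv + [command] + [str(a) for a in args]


# Hypothetical paths, for illustration only.
parse_cmd = build_rna3db_command("parse", ["mmcif_dir/", "parse.json"], cpus=4)
print(" ".join(parse_cmd[1:]))  # -m rna3db --cpu 4 parse mmcif_dir/ parse.json
```

The resulting list could then be handed to `subprocess.run(parse_cmd, check=True)` to actually run the command.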
Parse mmCIF files and extract RNAs.
$ python -m rna3db parse <input> <output>
<input>: Directory containing mmCIF files to parse.
<output>: Output JSON file.
- --nmr_resolution <float>
  - Resolution to use for NMR structures.
  - Default: float('inf')
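The default of float('inf') matters whenever a later step compares resolution against a finite cutoff, such as the 9.0 Å default of --max_resolution in the filter command. The sketch below assumes that a chain is kept when its resolution does not exceed the cutoff; that comparison rule, the function name `passes_resolution`, and the value 3.0 are assumptions made for illustration, not RNA3DB's code.

```python
def passes_resolution(resolution, max_resolution=9.0):
    # Assumed rule: keep a chain when its resolution does not exceed the cutoff.
    return resolution <= max_resolution


default_nmr = float("inf")  # documented default for --nmr_resolution
custom_nmr = 3.0            # hypothetical value passed via --nmr_resolution

print(passes_resolution(default_nmr))  # False: infinity exceeds any finite cutoff
print(passes_resolution(custom_nmr))   # True
```

Under this assumption, NMR structures parsed with the default would not pass a finite resolution filter, so a finite --nmr_resolution would be needed to retain them.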
Filter a JSON file based on various criteria.
$ python -m rna3db filter <input> <output> [options]
<input>: Input JSON file.
<output>: Output JSON file.
- --single_ratio_cutoff <float>
  - Filter out chains where a single nucleotide makes up more than this fraction of residues.
  - Default: 0.8
- --max_unknown_ratio <float>
  - Filter out chains with more than this fraction of unknown nucleotides.
  - Default: 0.3
- --max_resolution <float>
  - Filter out chains whose resolution value is above this cutoff.
  - Resolution is given in ångströms (Å).
  - Default: 9.0
- --min_length <int>
  - Filter out chains shorter than this length.
  - Default: 32
- --filter_log_path <path>
  - Path to the filter log. The filter log shows which filters hit each sequence.
  - Optional
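To make the four criteria concrete, here is a minimal sketch of how they might be evaluated for a single chain. It is not RNA3DB's implementation: the function name `failed_filters`, the use of "N" for unknown nucleotides, and the strict comparisons are assumptions based on the option descriptions above.

```python
from collections import Counter


def failed_filters(sequence, resolution, *, single_ratio_cutoff=0.8,
                   max_unknown_ratio=0.3, max_resolution=9.0, min_length=32,
                   unknown="N"):
    """Return the names of the filters a chain would hit (empty list = kept)."""
    hits = []
    n = len(sequence)
    if n < min_length:
        hits.append("min_length")
    if resolution > max_resolution:
        hits.append("max_resolution")
    if n > 0:
        counts = Counter(sequence)
        # Counts of known nucleotides only (assumes `unknown` marks unknowns).
        known = {k: v for k, v in counts.items() if k != unknown}
        if counts[unknown] / n > max_unknown_ratio:
            hits.append("max_unknown_ratio")
        if known and max(known.values()) / n > single_ratio_cutoff:
            hits.append("single_ratio_cutoff")
    return hits


print(failed_filters("ACGU" * 10, 2.5))  # []
print(failed_filters("A" * 40, 2.5))     # ['single_ratio_cutoff']
print(failed_filters("ACGU" * 5, 12.0))  # ['min_length', 'max_resolution']
```

Returning the list of filter names, rather than a single boolean, mirrors the idea of a filter log that records which filters hit each sequence.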
Cluster RNAs by sequence and structure similarity.
$ python -m rna3db cluster <input> <output> [options]
<input>: Input JSON file.
<output>: Output JSON file.
- --tbl_dir <path>
  - Directory containing Infernal .tbl files.
  - Not required when using --only_sequence
- --min_seq_id <float>
  - Minimum Sequence Identity.
  - See: MMseqs2: clustering criteria
  - Default: 0.99
- --min_seq_coverage <float>
  - Minimum Sequence Coverage.
  - See: MMseqs2: clustering criteria
  - Default: 0.99
- --mmseqs_binary_path <path>
  - Path to MMseqs2 binary.
  - Can usually be inferred via $ which mmseqs, but may need to be provided if RNA3DB is unable to find a suitable path.
  - Optional
- --mmseqs_coverage_mode <int>
  - MMseqs Coverage Mode.
  - See: MMseqs2: How to set the right alignment coverage to cluster
  - Default: 1
- --mmseqs_sensitivity <float>
  - MMseqs Sensitivity.
  - See: MMseqs2: Optimizing sensitivity and consumption of resources
  - Default: 7.5
- --mmseqs_alignment_mode <int>
  - MMseqs Alignment Mode.
  - See: MMseqs2: Optimizing sensitivity and consumption of resources
  - Default: 3
- --structural_e_value_cutoff <float>
  - Structural E-Value Cutoff used to build graph edges.
  - Default: 1.0
- --only_sequence
  - Use only sequence information.
  - Mutually exclusive with --only_structure
- --only_structure
  - Use only structure information.
  - Mutually exclusive with --only_sequence
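The constraints among these options (the two mutually exclusive flags, and --tbl_dir being unnecessary with --only_sequence) can be expressed with Python's standard `argparse` module. The sketch below is a hypothetical re-creation of the documented rules, not RNA3DB's actual parser; it covers only a few of the options, and it reads "not required when using --only_sequence" as "required otherwise", which is an assumption.

```python
import argparse


def build_cluster_parser():
    # Hypothetical stand-in parser covering a subset of the documented options.
    parser = argparse.ArgumentParser(prog="cluster-sketch")
    parser.add_argument("input")
    parser.add_argument("output")
    parser.add_argument("--tbl_dir")
    parser.add_argument("--min_seq_id", type=float, default=0.99)
    parser.add_argument("--structural_e_value_cutoff", type=float, default=1.0)
    # argparse rejects command lines that set both flags in this group.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--only_sequence", action="store_true")
    mode.add_argument("--only_structure", action="store_true")
    return parser


def parse_cluster_args(argv):
    parser = build_cluster_parser()
    args = parser.parse_args(argv)
    # Assumed rule: Infernal .tbl files are needed unless only sequence is used.
    if args.tbl_dir is None and not args.only_sequence:
        parser.error("--tbl_dir is required unless --only_sequence is given")
    return args


args = parse_cluster_args(["filter.json", "cluster.json", "--only_sequence"])
print(args.only_sequence, args.tbl_dir)  # True None
```

Passing both --only_sequence and --only_structure, or omitting --tbl_dir without --only_sequence, makes this sketch exit with a usage error.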
Split RNA data into training and test sets.
$ python -m rna3db split <input> <output> [options]
<input>: Input JSON file.
<output>: Output JSON file.
- --train_percentage <float>
  - Percentage of data for the training set.
  - Default: 0.3
- --force_zero_test
  - Force component zero into the test set.
  - Optional
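As an illustration of how these two options could interact, the sketch below greedily assigns components to the training set until a target share is reached, optionally sending component zero to the test set. This is not RNA3DB's algorithm: the greedy strategy, the function name `split_components`, the interpretation of --train_percentage as a fraction of all sequences, and the example sizes are all assumptions.

```python
def split_components(sizes, train_percentage=0.3, force_zero_test=False):
    """Greedily assign component indices to train/test.

    `sizes[i]` is the number of sequences in component i (hypothetical input).
    """
    target = train_percentage * sum(sizes)
    train, test, train_size = [], [], 0
    for idx, size in enumerate(sizes):
        if force_zero_test and idx == 0:
            # Component zero goes to the test set regardless of its size.
            test.append(idx)
        elif train_size + size <= target:
            train.append(idx)
            train_size += size
        else:
            test.append(idx)
    return train, test


sizes = [10, 50, 15, 21, 4]  # made-up component sizes
print(split_components(sizes))                        # ([0, 2, 4], [1, 3])
print(split_components(sizes, force_zero_test=True))  # ([2, 4], [0, 1, 3])
```

Splitting by whole components rather than by individual sequences keeps related sequences on the same side of the split, which is the usual reason to cluster before splitting.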