---
output:
  pdf_document: default
  html_document:
    highlight: tango
  word_document: default
---
# Trek: a program to retrieve context-dependent core substitution rate constants (Homo sapiens, germline) for DNA sequences and entire genomes
<img src="TrekGUI/Trek_logo.png" align="middle" height="180" width="150" margin="0 auto" />
The Trek methodology is described in detail at <http://dx.doi.org/10.1101/024257>. The name of the package reflects the fact that we obtained the **Tr**ansposon **E**xposed **k** rate constants (byr$^{-1}$) using the genomic *treks* of retrotransposon remnants in the human genome. The name additionally salutes the 50$^{th}$ anniversary of the Star Trek franchise.
## Installation
**1.** Install the latest version of the R programming language, or skip to the next step if R is already installed. Step-wise instructions on R installation and upgrade can be found through the following links: [1](http://alexonscience.blogspot.co.uk/2015/10/installing-r-on-linuxunix-machines-from.html), [2](http://alexonscience.blogspot.co.uk/2015/10/upgrade-r-environmentlibraries-mac-osx.html).
**2.** Launch R from the command line and, from within R, install the packages *shiny* (required for the graphical user interface), *doMC*, *foreach* and *itertools* (required for parallel execution of the program).
```{bash, tidy=TRUE, echo=TRUE, eval=FALSE}
$ R
```
```{r, tidy=TRUE, echo=TRUE, eval=FALSE}
> install.packages("shiny")
> install.packages("doMC")
> install.packages("foreach")
> install.packages("itertools")
```
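If some of these packages may already be present, a small check avoids reinstalling them. The sketch below is only an illustration: `missing_pkgs` is a hypothetical helper name, not part of Trek, and the installation itself is guarded by `interactive()` so that nothing is installed when the code is run non-interactively.

```r
# Packages named in step 2 above.
pkgs <- c("shiny", "doMC", "foreach", "itertools")

# Hypothetical helper (not part of Trek): return the packages from `pkgs`
# that cannot be found in the current R library paths.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_pkgs(pkgs)

if (length(to_install) == 0) {
  message("All required packages are already installed.")
} else if (interactive()) {
  # Install only what is missing.
  install.packages(to_install)
} else {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```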
**3.** Download the Trek source code from the [GitHub](https://github.com/aleksahak/Trek) repository. Alternatively, if git is installed, clone the repository from a Linux/Unix/OSX command line:
```{bash, tidy=TRUE, echo=TRUE, eval=FALSE}
$ git clone https://github.com/aleksahak/Trek
```
The downloaded folder has the following content:
- **lib/** - the subfolder containing all the source files,
- **TrekGUI/** - the subfolder containing the graphical user interface,
- **Trek.R** - the interfacing R script used to execute Trek from command line.
**4.** Finally, bit-compile the package by going into the **lib/** subfolder and executing the **bitcompile.R** script from within R.
```{bash, tidy=TRUE, echo=TRUE, eval=FALSE}
$ cd lib/
$ R
```
```{r, tidy=TRUE, echo=TRUE, eval=FALSE}
> source("bitcompile.R")
```
This generates a single file, **Trek.lib**, which encapsulates the main Trek code and all its dependencies. At this stage, the subfolder **lib/** can be safely removed. Before removing it, you might want to copy the **test.fasta** file out of **lib/**, so that you can test the Trek installation.
At this point, the Trek installation folder should contain:
- **Trek.lib** - the bit-compiled Trek program,
- **TrekGUI/** - the subfolder containing the graphical user interface,
- **Trek.R** - the interfacing R script used to execute Trek from command line,
and, if the test sequence file is preserved,
- **test.fasta** - the example DNA sequence fasta file.
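The layout can be confirmed from within R. The following is a minimal sketch, assuming it is run with the Trek installation folder as the working directory; `check_trek_folder` is a hypothetical helper name and is not part of Trek.

```r
# Hypothetical helper (not part of Trek): report which of the expected
# entries are present in a Trek installation folder.
check_trek_folder <- function(dir = ".") {
  expected <- c("Trek.lib", "TrekGUI", "Trek.R", "test.fasta")
  present <- file.exists(file.path(dir, expected))
  names(present) <- expected
  present
}

# TRUE marks an entry that was found; test.fasta is optional.
print(check_trek_folder("."))
```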
## Running Trek from R
Trek can be executed as an R program, either from within R, or from the Linux/Unix/OSX command line through R CMD BATCH or Rscript. The latter two options allow Trek to be called from scripts written in programming languages other than R.
In R, as exemplified in the **Trek.R** interfacing script, one should load the **Trek.lib** bit-compiled file, then execute Trek via the R function *Trek()*. The latter accepts four arguments:
- *FastaFile* - the relative or absolute path to the fasta file to analyse,
- *OutFile* - the relative or absolute path to the output file to be saved,
- *MutRates* - either "sym" (recommended) to use the strand-symmetrised substitution rate parameters, or "nosym" to use the raw parameters,
- *nCPU* - the number of CPUs to use for the calculation; larger values can markedly speed up the mapping process for entire genomes.
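The four arguments can be collected and sanity-checked before the call, as in the sketch below. Only the argument names and the "sym"/"nosym" options come from the description above; `trek_args` and the output file name `test_out.txt` are hypothetical. How **Trek.lib** is loaded is not shown here, since the exact call is given in **Trek.R**; the sketch only runs Trek if the *Trek()* function is already available in the session.

```r
# Hypothetical helper (not part of Trek): validate the four arguments
# and return them as a named list.
trek_args <- function(FastaFile, OutFile, MutRates = "sym", nCPU = 1) {
  MutRates <- match.arg(MutRates, c("sym", "nosym"))
  stopifnot(is.numeric(nCPU), length(nCPU) == 1, nCPU >= 1)
  list(FastaFile = FastaFile, OutFile = OutFile,
       MutRates = MutRates, nCPU = nCPU)
}

# Example values; the output file name is an assumption.
args <- trek_args("test.fasta", "test_out.txt", MutRates = "sym", nCPU = 2)

# Trek() exists only after Trek.lib has been loaded as shown in Trek.R.
if (exists("Trek", mode = "function") && file.exists(args$FastaFile)) {
  do.call(Trek, args)
} else {
  message("Trek() is not loaded or the fasta file is missing; nothing was run.")
}
```

Checking the arguments up front means that a mistyped *MutRates* value stops with an error before any genome-scale calculation starts.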
Alternatively, the **Trek.R** file can be edited to set the desired arguments, and executed from the command line via R CMD BATCH or Rscript.
## Running Trek from a GUI
Trek features a browser-based graphical user interface (GUI), written with Shiny, that can be executed locally on as many CPUs as desired. To launch the GUI, enter the **TrekGUI/** subfolder and double-click the **TrekGUI** file. If the file fails to open a browser, make sure that it has executable permissions:
```{bash, tidy=TRUE, echo=TRUE, eval=FALSE}
$ cd TrekGUI/
$ chmod 700 TrekGUI # May require sudo rights.
```
If double-clicking **TrekGUI** still does not launch your browser after the above step, then, most probably, the Rscript utility of R is not installed at the default */usr/bin/Rscript* path. To correct the TrekGUI setup, first find the installed path of Rscript by typing from the command line:
```{bash, tidy=TRUE, echo=TRUE, eval=FALSE}
$ which Rscript
```
then open the **TrekGUI** executable file in a plain text editor and correct the Rscript path stated on the first line.
Now double-click **TrekGUI**. As soon as the browser opens, the rest is self-explanatory.
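The same comparison can be made from within R. The sketch below assumes that the first line of **TrekGUI** has the form `#!/path/to/Rscript`; `check_launcher` is a hypothetical helper name and is not part of Trek. It only reports the two paths and does not modify the file.

```r
# Hypothetical helper (not part of Trek): compare the interpreter path on the
# first line of a launcher file with the Rscript found on the current PATH.
# Assumes the first line has the form "#!/path/to/Rscript".
check_launcher <- function(launcher = "TrekGUI") {
  if (!file.exists(launcher)) {
    message("Launcher file not found: ", launcher)
    return(invisible(NULL))
  }
  first_line <- readLines(launcher, n = 1, warn = FALSE)
  if (length(first_line) == 0) return(invisible(NULL))
  # Strip the leading "#!" and keep only the path, dropping any flags.
  stated <- sub("^#![[:space:]]*", "", first_line)
  stated <- strsplit(stated, "[[:space:]]+")[[1]][1]
  # Equivalent of `which Rscript`; an empty string means it was not found.
  found <- unname(Sys.which("Rscript"))
  list(stated = stated, found = found, matches = identical(stated, found))
}

result <- check_launcher("TrekGUI")
if (!is.null(result)) print(result)
```

If `matches` is `FALSE` and `found` is not empty, `found` is the path to put on the first line of **TrekGUI**.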
## License
You may redistribute the Trek source code (or its components) and/or modify/translate it under the terms of the GNU General Public License as published by the Free Software Foundation; version 2 of the License (GPL2). The details of GPL2 can be found at <http://www.gnu.org/licenses/gpl-2.0.html>.
Questions and bug reports to Aleksandr B. Sahakyan via [as952 *at* cam *dot* ac *dot* uk].