Ver1.1 #2

Merged
merged 7 commits into from
Sep 14, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -164,3 +164,5 @@ trial*
!trial.py
inputs
docs/book/
MSM/*_test.ipynb
test/
53 changes: 27 additions & 26 deletions README.md
@@ -1,26 +1,27 @@
# PaCS-ToolKit
# PaCS-Toolkit

PaCS-ToolKit enables the execution of PaCS-MD (Parallel Cascade Selection Molecular Dynamic Simulation), a non-bias-enhanced sampling method, across various environments. Additionally, it offers tools for result analysis and visualization.
PaCS-Toolkit enables the execution of PaCS-MD (Parallel Cascade Selection Molecular Dynamics Simulation), a non-bias-enhanced sampling method, across various environments. Additionally, it offers tools for result analysis and visualization.
While PaCS-MD offers a wide range of applications with existing evaluation types, our toolkit also allows for the integration of additional types as needed.

We believe our package will benefit your research.

- [PaCS-ToolKit](#pacs-toolkit)
- [PaCS-Toolkit](#pacs-toolkit)
- [Document](#document)
- [Quick install](#quick-install)
- [Example command](#example-command)
- [Citation](#citation)
- [LICENSE](#license)


## Document
- The documentation of PaCS-ToolKit is [here](https://kitaolab.github.io/PaCS-Toolkit/).
- The documentation of PaCS-Toolkit is [here](https://kitaolab.github.io/PaCS-Toolkit/).

## Quick install

<details><summary> 1. Install by pip </summary>

~~~shell
# Install all feautres of PaCS-ToolKit
# Install all features of PaCS-Toolkit
pip install "pacs[all] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
~~~

@@ -32,10 +33,10 @@ see [document](https://kitaolab.github.io/PaCS-Toolkit/) for more information.
<details><summary> 2. Install by conda and pip </summary>

~~~shell
conda create -n pacs "python>=3.7" -y
conda create -n pacs "python>=3.8" -y
conda activate pacs

# Install all features of PaCS-ToolKit
# Install all features of PaCS-Toolkit
pip install "pacs[all] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
~~~

@@ -44,32 +45,32 @@ see [document](https://kitaolab.github.io/PaCS-Toolkit/) for more information.
</details>


## Example command
```sh
pacs mdrun -t 1 -f input.toml
```
See the help message (`pacs --help`) and the [document](https://kitaolab.github.io/PaCS-Toolkit/) for more information.

## Citation
~~~txt
- PaCS-Toolkit
[1] Ikizawa, S.*, Hori, T.*, Wijana, T.N.*, Kono, H., Bai, Z., Kimizono, T., Lu, W., Tran, D.P., & Kitao, A. PaCS-Toolkit: Optimized software utilities for parallel cascade selection molecular dynamics (PaCS-MD) simulations and subsequent analyses. J. Phys. Chem. B. 128, 15, 3631-3642 (2024). https://doi.org/10.1021/acs.jpcb.4c01271
- [1] PaCS-Toolkit: Ikizawa, S.*, Hori, T.*, Wijana, T.N.*, Kono, H., Bai, Z., Kimizono, T., Lu, W., Tran, D.P., & Kitao, A. PaCS-Toolkit: Optimized software utilities for parallel cascade selection molecular dynamics (PaCS-MD) simulations and subsequent analyses. *J. Phys. Chem. B.*, **128**, 15, 3631-3642 (2024). https://doi.org/10.1021/acs.jpcb.4c01271

- Original PaCS-MD or targeted-PaCS-MD (t-PaCS-MD)
[2] Harada, R., & Kitao, A. Parallel cascade selection molecular dynamics (PaCS-MD) to generate conformational transition pathway. J. Chem. Phys. 139, 035103 (2013). https://doi.org/10.1063/1.4813023
- [2] Original PaCS-MD or targeted-PaCS-MD (t-PaCS-MD): Harada, R., & Kitao, A. Parallel cascade selection molecular dynamics (PaCS-MD) to generate conformational transition pathway. *J. Chem. Phys.* **139**, 035103 (2013). https://doi.org/10.1063/1.4813023

- Dissociation PaCS-MD (dPaCS-MD)
[3] Tran, D. P., Takemura, K., Kuwata, K., & Kitao, A. Protein–Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics. J. Chem. Theory Comput. 14, 404–417 (2018). https://doi.org/10.1021/acs.jctc.7b00504
[4] Tran, D. P., & Kitao, A. Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model. J. Phys. Chem. B , 123, 11, 2469–2478 (2019). https://doi.org/10.1021/acs.jpcb.8b10309
[5] Hata, H., Phuoc Tran, D., Marzouk Sobeh, M., & Kitao, A. Binding free energy of protein/ligand complexes calculated using dissociation Parallel Cascade Selection Molecular Dynamics and Markov state model. Biophysics and Physicobiology, 18, 305–31 (2021). https://doi.org/10.2142/biophysico.bppb-v18.037
- [3] Dissociation PaCS-MD (dPaCS-MD): Tran, D. P., Takemura, K., Kuwata, K., & Kitao, A. Protein–Ligand Dissociation Simulated by Parallel Cascade Selection Molecular Dynamics. *J. Chem. Theory Comput.* **14**, 404–417 (2018). https://doi.org/10.1021/acs.jctc.7b00504

- Application to protein domain motion
[6] Inoue, Y., Ogawa, Y., Kinoshita, M., Terahara, N., Shimada, M., Kodera, N., Ando, T., Namba, K., Kitao, A., Imada, K., & Minamino, T. Structural Insights into the Substrate Specificity Switch Mechanism of the Type III Protein Export Apparatus. Structure, 27 , 965-976 (2019). https://doi.org/10.1016/j.str.2019.03.017
- [4] Dissociation PaCS-MD (dPaCS-MD): Tran, D. P., & Kitao, A. Dissociation Process of a MDM2/p53 Complex Investigated by Parallel Cascade Selection Molecular Dynamics and the Markov State Model. *J. Phys. Chem. B*, **123**, 11, 2469–2478 (2019). https://doi.org/10.1021/acs.jpcb.8b10309

- Association and dissociation PaCS-MD (a/dPaCS-MD)
[7] Tran, D. P., & Kitao, A. Kinetic Selection and Relaxation of the Intrinsically Disordered Region of a Protein upon Binding. J. Chem. Theory Comput. 16, 2835–2845 (2020). https://doi.org/10.1021/acs.jctc.9b01203
- [5] Dissociation PaCS-MD (dPaCS-MD): Hata, H., Phuoc Tran, D., Marzouk Sobeh, M., & Kitao, A. Binding free energy of protein/ligand complexes calculated using dissociation Parallel Cascade Selection Molecular Dynamics and Markov state model. *Biophysics and Physicobiology*, **18**, 305–31 (2021). https://doi.org/10.2142/biophysico.bppb-v18.037

- Edge expansion PaCS-MD (eePaCS-MD)
[8] Takaba, K., Tran, D. P., & Kitao, A. Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins. J. Chem. Phys. 152, 225101 (2020). https://doi.org/10.1063/5.0004654
[9] Takaba, K., Tran, D. P., & Kitao, A. Erratum: "Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins" [J. Chem. Phys. 152, 225101 (2020)]. . J. Chem. Phys. 153, 179902 (2020). https://doi.org/10.1063/5.0032465
- [6] Application to protein domain motion: Inoue, Y., Ogawa, Y., Kinoshita, M., Terahara, N., Shimada, M., Kodera, N., Ando, T., Namba, K., Kitao, A., Imada, K., & Minamino, T. Structural Insights into the Substrate Specificity Switch Mechanism of the Type III Protein Export Apparatus. *Structure*, **27**, 965-976 (2019). https://doi.org/10.1016/j.str.2019.03.017

- rmsdPaCS-MD
[10] Tran, D. P., Taira, Y., Ogawa, T., Misu, R., Miyazawa, Y., & Kitao, A. Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer. Sci Rep 12, 3860 (2022). https://doi.org/10.1038/s41598-022-07792-2
~~~
- [7] Association and dissociation PaCS-MD (a/dPaCS-MD): Tran, D. P., & Kitao, A. Kinetic Selection and Relaxation of the Intrinsically Disordered Region of a Protein upon Binding. *J. Chem. Theory Comput.*, **16**, 2835–2845 (2020). https://doi.org/10.1021/acs.jctc.9b01203

- [8] Edge expansion PaCS-MD (eePaCS-MD): Takaba, K., Tran, D. P., & Kitao, A. Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins. *J. Chem. Phys.* **152**, 225101 (2020). https://doi.org/10.1063/5.0004654

- [9] Edge expansion PaCS-MD (eePaCS-MD): Takaba, K., Tran, D. P., & Kitao, A. Erratum: "Edge expansion parallel cascade selection molecular dynamics simulation for investigating large-amplitude collective motions of proteins" [J. Chem. Phys. 152, 225101 (2020)]. *J. Chem. Phys.* **153**, 179902 (2020). https://doi.org/10.1063/5.0032465

- [10] rmsdPaCS-MD: Tran, D. P., Taira, Y., Ogawa, T., Misu, R., Miyazawa, Y., & Kitao, A. Inhibition of the hexamerization of SARS-CoV-2 endoribonuclease and modeling of RNA structures bound to the hexamer. *Sci Rep* **12**, 3860 (2022). https://doi.org/10.1038/s41598-022-07792-2


## LICENSE
8 changes: 4 additions & 4 deletions docs/src/fit.md
@@ -17,19 +17,19 @@ pacs fit traj mdtraj -tf ./trial001/cycle001/replica001/prd.xtc -top ./inputs/in

#### for single trial
```shell
pacs fit trial mdtraj -t 1 -s ./trial001/cycle001/replica001/prd.pdb -r ./trial001/cycle001/replica001/prd.pdb -ts "protein" -rs "protein" -tf prd.xtc -p 10
pacs fit trial mdtraj -t 1 -top ./trial001/cycle001/replica001/prd.pdb -r ./trial001/cycle001/replica001/prd.pdb -ts "protein" -rs "protein" -tf prd.xtc -p 10
```

### Arguments

#### for single trajectory
```plaintext
usage: pacs fit mdtraj [-h] [-tf] [-top] [-r] [-ts] [-rs] [-p] [-o]
usage: pacs fit traj mdtraj [-h] [-tf] [-top] [-r] [-ts] [-rs] [-p] [-o]
```
- `-tf, --trj_file` (str):
- file name of the trajectory to be fitted (e.g. `-tf prd.xtc`)
- `-top, --topology` (str):
- topology file path for loading trajectory (e.g. `-s trial001/cycle000/replica001/prd.pdb`)
- topology file path for loading trajectory (e.g. `-top trial001/cycle000/replica001/prd.pdb`)
- `-r, --ref_structure` (str):
- reference structure file path for fitting reference (e.g. `-r trial001/cycle000/replica001/prd.pdb`)
- `-ts, --trj_selection` (str):
@@ -53,7 +53,7 @@ usage: pacs fit trial mdtraj [-h] [-t] [-tf] [-top] [-r] [-ts] [-rs] [-p] [-o]
- `-tf, --trj_file` (str):
- file name of the trajectory to be fitted (e.g. `-tf prd.xtc`)
- `-top, --topology` (str):
- topology file path for loading trajectory (e.g. `-s trial001/cycle000/replica001/prd.pdb`)
- topology file path for loading trajectory (e.g. `-top trial001/cycle000/replica001/prd.pdb`)
- `-r, --ref_structure` (str):
- reference structure file path for fitting reference (e.g. `-r trial001/cycle000/replica001/prd.pdb`)
- `-ts, --trj_selection` (str):
29 changes: 16 additions & 13 deletions docs/src/genfeature.md
@@ -1,14 +1,17 @@
# genfeature
- This command is used after executing `pacs mdrun`.
- This command generates data that will be used for MSM analysis.
- This command supports parallel process.


| feature | mdtraj | gmx | cpptraj |
| ------- | ------ | --- | ------- |
| comdist | o | x | x |
| comvec | o | x | x |
| pca | o | x | x |
| tica | o | x | x |
| rmsd | o | x | x |
| xyz | o | x | x |
- This command should be executed after running `pacs mdrun`.
- It generates feature data in `.npy` format, which is convenient for MSM analysis in Python.
- Feature data files (e.g., `t001c002r010.npy`) are stored in the directory specified with the `-od` option.
- Each `.npy` file contains a NumPy array (`np.ndarray`) with the shape listed in the table below.
- This command supports parallel processing.


The table below lists the currently implemented analysis tools and the shape of the output data in each `.npy` file.


| feature | mdtraj | gmx | cpptraj | shape of `.npy` |
| ------- | ------ | --- | ------- | ---------------------- |
| comdist | o | x | x | (n_frames,) |
| comvec | o | x | x | (n_frames, 3) |
| rmsd | o | x | x | (n_frames,) |
| xyz | o | x | x | (n_frames, n_atoms, 3) |
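
As a rough illustration of how the shapes in the table might be checked before MSM analysis, the following sketch loads every `.npy` file in a feature directory and verifies its dimensionality. The helper name `load_features`, the `EXPECTED_NDIM` mapping, and the synthetic files are assumptions made for this example; they are not part of PaCS-Toolkit.

```python
import tempfile
from pathlib import Path

import numpy as np

# Expected number of array dimensions per feature, following the table:
# (n_frames,) -> 1, (n_frames, 3) -> 2, (n_frames, n_atoms, 3) -> 3.
EXPECTED_NDIM = {"comdist": 1, "comvec": 2, "rmsd": 1, "xyz": 3}


def load_features(feature_dir, feature):
    """Load every .npy file in feature_dir and check its dimensionality.

    Hypothetical helper for illustration only.
    """
    arrays = {}
    for path in sorted(Path(feature_dir).glob("*.npy")):
        data = np.load(path)
        if data.ndim != EXPECTED_NDIM[feature]:
            raise ValueError(f"{path.name}: unexpected shape {data.shape}")
        arrays[path.stem] = data
    return arrays


# Synthetic stand-in for real output: two replicas with 5 frames each.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("t001c002r009", "t001c002r010"):
        np.save(Path(tmp) / f"{name}.npy", np.zeros((5, 3)))
    comvec = load_features(tmp, "comvec")

print({name: arr.shape for name, arr in comvec.items()})
```

With real data, the temporary directory would be replaced by the directory passed to `-od`.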
2 changes: 2 additions & 0 deletions docs/src/genfeature/comvec.md
@@ -1,5 +1,7 @@
# comvec
- Center of mass vector
- Calculate the vector between the centers of mass of `s1` and `s2`
- The vector is calculated as `s1` - `s2`
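
The sign convention above can be sketched with plain NumPy. This is a minimal illustration under the assumption that each selection is given as coordinates plus masses; the function names are hypothetical and this is not the toolkit's own implementation.

```python
import numpy as np


def center_of_mass(coords, masses):
    """Mass-weighted mean position; coords is (n_atoms, 3), masses is (n_atoms,)."""
    return np.average(coords, axis=0, weights=masses)


def com_vector(coords1, masses1, coords2, masses2):
    """COM of selection 1 minus COM of selection 2 (the `s1` - `s2` convention)."""
    return center_of_mass(coords1, masses1) - center_of_mass(coords2, masses2)


# Toy selections: s1 has two equal-mass atoms, s2 has a single atom.
s1_coords = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
s1_masses = np.array([1.0, 1.0])
s2_coords = np.array([[0.0, 4.0, 0.0]])
s2_masses = np.array([12.0])

vec = com_vector(s1_coords, s1_masses, s2_coords, s2_masses)
print(vec)  # components are 1, -4, 0
```

Swapping the two selections flips the sign of every component, which matters when the vector is later used as an MSM feature.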

### Example
- The following example generates features about COM vector for MSM analysis
3 changes: 2 additions & 1 deletion docs/src/genfeature/rmsd.md
@@ -1,5 +1,6 @@
# RMSD
- Root Mean Square Deviation
- Calculate the RMSD relative to the structure specified in `ref`
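
For reference, the quantity itself can be sketched in NumPy as follows. This sketch assumes the trajectory has already been superimposed on the reference, so it omits the fitting step entirely; the function name is hypothetical and this is not the toolkit's implementation.

```python
import numpy as np


def rmsd_per_frame(traj_xyz, ref_xyz):
    """RMSD of each frame relative to a reference structure.

    traj_xyz: (n_frames, n_atoms, 3), ref_xyz: (n_atoms, 3).
    No superposition is performed here (assumed done beforehand).
    """
    diff = traj_xyz - ref_xyz  # broadcasts the reference over all frames
    return np.sqrt((diff ** 2).sum(axis=2).mean(axis=1))


# Toy data: frame 0 equals the reference, frame 1 is shifted by (3, 4, 0).
ref = np.zeros((2, 3))
traj = np.stack([ref, ref + np.array([3.0, 4.0, 0.0])])

values = rmsd_per_frame(traj, ref)
print(values)  # 0 for the first frame, 5 for the second
```

The result has one value per frame, i.e. shape `(n_frames,)`.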

### Example
- The following example generates features about RMSD for MSM analysis
@@ -14,7 +15,7 @@ pacs genfeature rmsd mdtraj -t 1 -tf prd.xtc -top ./inputs/input.gro -ref ./inpu

#### mdtraj
```plaintext
usage: pacs genfeature pca mdtraj [-h] [-tf] [-top] [-od] [-p] [-ref] [-ft] [-fr] [-ct] [-cr]
usage: pacs genfeature rmsd mdtraj [-h] [-tf] [-top] [-od] [-p] [-ref] [-ft] [-fr] [-ct] [-cr]
```

- `-t, --trial` (int):
30 changes: 12 additions & 18 deletions docs/src/install.md
@@ -11,23 +11,22 @@
- [2.2. Install by pip locally](#22-install-by-pip-locally)

## Requirements
- [Python](https://www.python.org/) >= 3.7
- PaCS-ToolKit currently supports 3 simulator
- [Gromacs](https://www.gromacs.org/)
- [Amber](https://ambermd.org/index.php)
- [Namd](https://www.ks.uiuc.edu/Research/namd/)
- [Python](https://www.python.org/) >= 3.7 (Python >= 3.8 is recommended because of deeptime)
- PaCS-Toolkit currently supports 3 simulators
- [Gromacs](https://www.gromacs.org/) (tested with >= 2022.2)
- [Amber](https://ambermd.org/index.php) (tested with >= 2023)
- [Namd](https://www.ks.uiuc.edu/Research/namd/) (tested with >= 2021-02-20)

## 1. Install by pip
### 1.1 Install by conda and pip
~~~shell
conda create -n pacsmd "python>=3.7" -y
conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
~~~

- if using whole pacstk function
~~~shell
pip install "pacs[all] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
pip install pyemma
~~~

- elif using "pacs mdrun" and analyzer == "mdtraj"
@@ -41,9 +40,9 @@ pip install "pacs @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
~~~

- elif performing MSM
- python >= 3.8 is recommended because of deeptime
~~~shell
pip install "pacs[msm] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
pip install pyemma
~~~

### 1.2. Install by pip
@@ -63,9 +62,9 @@ pip install "pacs @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
~~~

- elif performing MSM
- python >= 3.8 is recommended because of deeptime
~~~shell
pip install "pacs[msm] @ git+https://github.com/Kitaolab/PaCS-Toolkit.git"
pip install pyemma
~~~


@@ -87,15 +86,14 @@ cd pacsmd-${version}

### 2.1. Install by conda and pip locally
~~~shell
conda create -n pacsmd "python>=3.7" -y
conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
~~~

- if using whole pacstk function
- pyemma does not recommend pip-install
- python >= 3.8 is recommended because of deeptime
~~~shell
pip install -e ".[all]"
conda install -c conda-forge pyemma
~~~

- elif using "pacs mdrun" and analyzer == "mdtraj"
@@ -114,18 +112,15 @@ pip install -e "."
~~~

- elif performing MSM
- pyemma does not recommend pip-install
~~~
pip install -e ".[msm]"
conda install -c conda-forge pyemma
~~~

### 2.2. Install by pip locally
- if using whole pacstk function
- pyemma does not work, conda is recommend
- python >= 3.8 is recommended because of deeptime
~~~shell
pip install -e ".[all]"
pip install pyemma
~~~

- elif using "pacs mdrun" and analyzer == "mdtraj"
@@ -144,8 +139,7 @@ pip install -e "."
~~~

- elif performing MSM
- sometimes pyemma does not work, conda is recommend
- python >= 3.8 is recommended because of deeptime
~~~shell
pip install -e ".[msm]"
pip install pyemma
~~~
44 changes: 29 additions & 15 deletions docs/src/mdrun/inputfile.md
@@ -4,21 +4,22 @@
- input file must be in [toml format](https://toml.io/en/).

*Contents*
- [sample input file](#sample-input-file)
- [basic option](#basic-option)
- [simulator option](#simulator-option)
- [Gromacs](#gromacs)
- [Amber](#amber)
- [NAMD](#namd)
- [analyzer option](#analyzer-option)
- [Target](#target)
- [RMSD](#rmsd)
- [Association](#association)
- [Dissociation](#dissociation)
- [EdgeExpansion](#edgeexpansion)
- [A\_D](#a_d)
- [Template](#template)
- [hidden option (No need to specify)](#hidden-option-no-need-to-specify)
- [Input file](#input-file)
- [sample input file](#sample-input-file)
- [basic option](#basic-option)
- [simulator option](#simulator-option)
- [Gromacs](#gromacs)
- [Amber](#amber)
- [NAMD](#namd)
- [analyzer option](#analyzer-option)
- [Target](#target)
- [RMSD](#rmsd)
- [Association](#association)
- [Dissociation](#dissociation)
- [EdgeExpansion](#edgeexpansion)
- [A\_D](#a_d)
- [Template](#template)
- [hidden option (No need to specify)](#hidden-option-no-need-to-specify)

## sample input file
- please check [here](https://github.com/Kitaolab/PaCS-Toolkit/tree/main/jobscripts)
@@ -94,6 +95,16 @@ rmfile = true # Whether rmfile is executed after trial
- Gromacs index file
- **trajectory_extension: str, required**
- Trajectory file extension. ("." is necessary)
- **nojump: bool, default=false**
- whether to apply the `-pbc nojump` treatment to the selection feature calculation in `analyzer`, snapshot extraction in `exporter`, and rmmol
- **valid only when `analyzer` is also gromacs**
- If `true`, molecules are allowed to leave the simulation box. This avoids errors in MSM caused by molecules jumping across, or being broken by, the periodic box boundary.
- If `false`, molecules are only made whole by `-pbc mol` and can still wrap across the periodic box.
- Note that the output `prd.xtc` files are not processed with these `-pbc` options (only `prd_rmmol.xtc` files are processed).
- This option is recommended when association/dissociation or a_d PaCS-MD is performed with Gromacs as both simulator and analyzer.
- `nojump=true` can produce coordinate values large enough to cause overflow or loss-of-significance problems. This will not happen in most cases, but be careful if your ligand is very small and the simulation box is very large.
- When this option is applied, the analyzer can evaluate the distance even if the ligand moves beyond the simulation box.
- This option is not present in the example inputs in the [sample input repository](https://github.com/Kitaolab/PaCS-Toolkit-example/tree/main) since it was added in version 1.1.0.

</details>
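
To give a feel for what a "nojump" treatment does, the sketch below unwraps a wrapped trajectory of a single point in an orthorhombic box, so that the position keeps growing once the point leaves the box instead of re-entering from the opposite side. This is only an illustration of the general idea, written under the assumption of a fixed rectangular box and per-frame displacements smaller than half a box length; it is not how Gromacs implements `-pbc nojump`, and the function name is hypothetical.

```python
import numpy as np


def remove_jumps(wrapped, box):
    """Undo periodic jumps between consecutive frames.

    wrapped: (n_frames, 3) positions wrapped into the box.
    box: (3,) edge lengths of an orthorhombic box.
    Assumes the true displacement per frame is below half a box length.
    """
    steps = np.diff(wrapped, axis=0)
    # Fold each frame-to-frame step back to its minimum-image equivalent.
    steps -= box * np.round(steps / box)
    return np.vstack([wrapped[:1], wrapped[0] + np.cumsum(steps, axis=0)])


box = np.array([10.0, 10.0, 10.0])
# A point drifting along x and leaving the box after the second frame.
true_path = np.array(
    [[8.0, 5.0, 5.0], [9.5, 5.0, 5.0], [11.0, 5.0, 5.0], [12.5, 5.0, 5.0]]
)
wrapped = np.mod(true_path, box)  # what a wrapped trajectory would store

unwrapped = remove_jumps(wrapped, box)
print(wrapped[:, 0])    # x re-enters the box after the second frame
print(unwrapped[:, 0])  # x keeps increasing past the box edge
```

Because the unwrapped coordinates are no longer bounded by the box, they can grow large over a long run, which is the source of the precision concern noted for `nojump=true`.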

@@ -107,6 +118,7 @@ topology = "/work/topol.top" # Topology file such as top, parm7, psf,
mdconf = "/work/parameter.mdp" # Parameter file such as mdp, mdin, namd, etc.
index_file = "/work/index.ndx" # Gromacs index file
trajectory_extension = ".xtc" # Trajectory file extension. ("." is necessary)
nojump = true # whether to execute nojump treatment only for gmx
```


@@ -527,6 +539,7 @@ user-defined-variable2 = "hoge"

## hidden option (No need to specify)
<details><summary> click here </summary>

- **cmd_gmx: str**
- Gromacs command (ex. gmx, gmx_mpi)
- will be created from `cmd_serial`
@@ -536,4 +549,5 @@ user-defined-variable2 = "hoge"
- **structure_extension: str**
- Structure file extension
- will be created from `structure`

</details>
6 changes: 3 additions & 3 deletions docs/src/quickstart.md
@@ -26,7 +26,7 @@ git clone https://github.com/Kitaolab/PaCS-Toolkit.git
pip install -e ".[mdtraj]"

# Or install by conda and pip
conda create -n pacsmd "python>=3.7" -y
conda create -n pacsmd "python>=3.8" -y
conda activate pacsmd
pip install -e ".[mdtraj]"
```
@@ -140,13 +140,13 @@ $ pacs fit trial mdtraj -t 1 -tf prd_rmmol.xtc -top rmmol_top.pdb -r ref.gro -ts
- So if you want to use other specific CVs, you need to write a code by yourself.

~~~shell
$ pacs genfeature comdist mdtraj -t 1 -tf prd.xtc -top inputs/example_gromacs/input.gro -s1 "residue 1" -s2 "residue 9"
$ pacs genfeature comdist mdtraj -t 1 -tf prd.xtc -top inputs/example_gromacs/input.gro -s1 "residue 1" -s2 "residue 9"
$ ls
comdist-CV/
~~~


## Step7: Building MSM and predicting free energy
- After extracting CVs, various analyses can be performed on them.
- After extracting CVs, various analyses can be performed on them.
- PaCS-MD is especially compatible with analyses using MSM.
