Some additional minor edits for internal consistency around estimators, standard data forms
dotsdl committed Apr 9, 2024
1 parent 081e90c commit b5c3d1d
Showing 1 changed file with 20 additions and 14 deletions.
34 changes: 20 additions & 14 deletions joss_paper/paper.md
@@ -10,8 +10,9 @@ authors:
orcid: 0000-0002-7615-7851
equal-contrib: true
affiliation: 1
- name: David Dotson
- name: David L. Dotson
equal-contrib: true # (This is how you can denote equal contributions between multiple authors)
orcid: 0000-0001-5879-2942
affiliation: 2
- name: Michael R. Shirts
orcid: 0000-0003-3249-1097
@@ -45,10 +46,11 @@ The software spans a wide range of functions,

A distinctive attribute of *alchemlyb* is its streamlined, end-to-end analysis workflow.
This user-friendly workflow facilitates navigation through the entire analysis pipeline,
from the initial data input stage to the final result derivation. This attribute enhances accessibility,
enabling researchers from diverse scientific backgrounds,
and not solely computational chemistry specialists,
to utilize *alchemlyb* effectively.
from the initial data input stage to the final result derivation.
This attribute enhances accessibility,
enabling researchers from diverse scientific backgrounds,
and not solely computational chemistry specialists,
to utilize *alchemlyb* effectively.


# Statement of need
@@ -60,21 +62,23 @@ Other free energies extracted from simulations are useful in solution thermodyna
The *alchemlyb* software processes the raw data from MD simulations using key estimators from statistical mechanics, drastically simplifying the process of extracting crucial thermodynamic insights from molecular simulations.

Various molecular dynamics (MD) engines, including GROMACS [@pronk2013gromacs], AMBER [@case2014ff14sb], GOMC [@cummings2021open], and NAMD [@phillips2020scalable],
offer distinct tools for performing free energy calculations.
offer distinct tools for performing free energy calculations.
However, the diversity in output formats and analysis tools among different MD engines complicates the research process.
Data generated by each engine requires individualized processing and analysis methods, hindering seamless collaboration and comparison of results.
Data generated by each engine requires individualized processing and analysis methods,
hindering seamless collaboration and comparison of results.


THe [alchemical-analysis.py](https://github.com/MobleyLab/alchemical-analysis) tool [@klimovich2015guidelines], which preceeded *alchemlyb*, addressed this problem.
The [alchemical-analysis.py](https://github.com/MobleyLab/alchemical-analysis) tool [@klimovich2015guidelines], which preceded *alchemlyb*, addressed this problem.
Now that [alchemical-analysis.py](https://github.com/MobleyLab/alchemical-analysis) has been deprecated,
*alchemlyb* continues to provide a unified, engine-agnostic analysis workflow.
Unlike its predecessor, *alchemlyb* breaks down components of the workflow into modular tools,
Unlike its predecessor, *alchemlyb* breaks down components of the workflow into modular tools,
allowing users to more easily customize their analysis.
This innovation enables consistent processing of free energy data from diverse MD engines, facilitating streamlined comparison and combination of results.
This innovation enables consistent processing of free energy data from diverse MD engines,
facilitating streamlined comparison and combination of results.

Notably, *alchemlyb*'s robust and user-friendly nature has led to its integration into other automated workflow libraries such as BioSimSpace [@hedges2023suite].
This further enhances its accessibility and usability within broader scientific workflows,
reinforcing its position as a versatile and essential tool in the field of computational chemistry [^1].
reinforcing its position as a versatile and essential tool in the field of computational chemistry [^1].

[^1]: As of 29/12/2023, *alchemlyb* has been downloaded 23,922 times from [conda-forge](https://anaconda.org/conda-forge/alchemlyb/files).

@@ -89,15 +93,17 @@ Overlapping is facilitated by introducing a parameter `lambda` ($\lambda $) that
MD engines simulate the system at these states, generating and accumulating free energy data.

*alchemlyb* offers specific parsers designed to load raw free energy data from various MD engines, converting them into standard `pandas` `DataFrames`.
Two types of free energy data are considered: reduced potential energy differences between lambda states (`u_nk`, $u_{nk}$), which are used for free energy perturbation (FEP) estimators [@zwanzig1954high],
and $dU/d\lambda$ at all lambda states, suitable for thermodynamic integration (TI) estimators [@kirkwood1935statistical].
Two types of free energy data are considered:
Hamiltonian gradients (`dHdl`, $dH/d\lambda$) at all lambda states, suitable for thermodynamic integration (TI) estimators [@kirkwood1935statistical],
and reduced potential energy differences between lambda states (`u_nk`, $u_{nk}$), which are used for free energy perturbation (FEP) estimators [@zwanzig1954high].

In *alchemlyb*, TI [@paliwal2011benchmark] and TI with Gaussian quadrature [@gusev2023active] estimators are implemented in the TI category of estimators.
FEP category estimators include Bennett Acceptance Ratio (BAR) [@bennett1976efficient] and Multistate BAR (MBAR) [@shirts2008statistically].
These estimators assume uncorrelated samples, and *alchemlyb* provides tools for data resampling based on autocorrelation times [@chodera2007use].
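
The two estimator categories can be illustrated with a minimal, standard-library-only sketch. It is not *alchemlyb*'s implementation or API: the function names `ti_trapezoid` and `fep_exponential_average` are hypothetical, the TI variant shown is plain trapezoid-rule integration of the mean gradient at each lambda state, and the FEP variant shown is one-directional exponential averaging (the simplest FEP estimator, not BAR or MBAR). Inputs are assumed to be in reduced units of $k_B T$ and already decorrelated.

```python
import math


def ti_trapezoid(lambdas, mean_dhdl):
    """Integrate the mean dH/dlambda over lambda with the trapezoid rule.

    lambdas   -- increasing lambda values, one per simulated state
    mean_dhdl -- time-averaged dH/dlambda at each of those states
    Returns the free energy difference between the first and last state.
    """
    if len(lambdas) != len(mean_dhdl) or len(lambdas) < 2:
        raise ValueError("need matching sequences with at least two states")
    total = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        total += 0.5 * width * (mean_dhdl[i] + mean_dhdl[i + 1])
    return total


def fep_exponential_average(delta_u):
    """One-directional exponential averaging: dF = -ln <exp(-delta_u)>.

    delta_u -- reduced potential energy differences u_target - u_sampled,
               evaluated on samples drawn from the sampled state
    A log-sum-exp shift keeps the exponentials numerically stable.
    """
    if not delta_u:
        raise ValueError("need at least one sample")
    shift = max(-x for x in delta_u)
    scaled_sum = sum(math.exp(-x - shift) for x in delta_u)
    return -(shift + math.log(scaled_sum / len(delta_u)))


# Hypothetical inputs, for illustration only (not simulation data).
ti_estimate = ti_trapezoid([0.0, 0.5, 1.0], [0.0, 1.0, 2.0])
fep_estimate = fep_exponential_average([1.5, 1.5, 1.5])
```

With a linear integrand the trapezoid rule is exact, and with identical energy differences the exponential average reduces to that difference, which makes both functions easy to sanity-check by hand.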

To evaluate the accuracy of the free energy estimate, *alchemlyb* offers a range of assessment tools.
The error of the TI method is correlated with the average curvature [@pham2011identifying], while the error of FEP estimators depends on the overlap in sampled energy distributions [@pohorille2010good].
The error of the TI method is correlated with the average curvature [@pham2011identifying],
while the error of FEP estimators depends on the overlap in sampled energy distributions [@pohorille2010good].
*alchemlyb* visualizes the smoothness of the integrand for TI estimators and the overlap matrix for FEP estimators.
Additionally, the accumulated samples should be collected from equilibrated simulations,
and *alchemlyb* has tools for plotting the convergence of the free energy estimate as a function of simulation time [@yang2004free] to detect potentially un-equilibrated data.
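
The idea behind such a convergence check can be sketched as follows. This is an illustrative, standard-library-only example rather than *alchemlyb*'s own convergence tooling: `forward_convergence` and `sample_mean` are hypothetical names, and the estimator is passed in as a plain callable so that any of the estimators discussed above could be substituted.

```python
def sample_mean(values):
    """Stand-in estimator: arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def forward_convergence(samples, estimator, num_points=10):
    """Re-run an estimator on growing prefixes of a time-ordered series.

    samples    -- time-ordered observations from one simulation
    estimator  -- callable mapping a list of samples to a single number
    num_points -- number of evenly spaced prefixes to evaluate
    Returns a list of (fraction_of_data_used, estimate) pairs.
    """
    n = len(samples)
    if num_points < 1 or n < num_points:
        raise ValueError("need at least one sample per convergence point")
    curve = []
    for k in range(1, num_points + 1):
        stop = (n * k) // num_points  # end index of the k-th prefix
        curve.append((k / num_points, estimator(samples[:stop])))
    return curve


# Hypothetical drifting series, for illustration only: the estimate keeps
# changing as more data is included, the signature of unequilibrated data.
drifting = [float(i) for i in range(10)]
curve = forward_convergence(drifting, sample_mean, num_points=5)
```

If the estimates level off as the fraction approaches 1, the data are consistent with an equilibrated simulation. A steady drift, as in the toy series above, suggests that early frames should be discarded.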