diff --git a/joss_paper/paper.md b/joss_paper/paper.md
index c9e1df3c..c2e1e86b 100644
--- a/joss_paper/paper.md
+++ b/joss_paper/paper.md
@@ -39,7 +39,7 @@ bibliography: paper.bib
 # Summary
 
 *alchemlyb* is an open-source Python software package for the analysis of alchemical free energy calculations, an integral part of computational chemistry and biology, most notably in the field of drug discovery.
-Its functionality covers contains individual function-based building blocks for all aspects of a full typical free energy analysis workflow, starting with the extraction of raw data from the output of molecular dynamics (MD) packages, moving on to data preprocessing tasks such as decorrelation of time series, using various estimators to derive free energy estimates from simulation samples, and finally providing quality analysis tools for data convergence checking and visualization.
+Its functionality contains individual function-based building blocks for all aspects of a full typical free energy analysis workflow, starting with the extraction of raw data from the output of diverse molecular dynamics (MD) packages, moving on to data preprocessing tasks such as decorrelation of time series, using various estimators to derive free energy estimates from simulation samples, and finally providing quality analysis tools for data convergence checking and visualization.
 *alchemlyb* also contains high-level end-to-end workflows that combine multiple building blocks into a user-friendly analysis pipeline from the initial data input stage to the final result derivation.
 This workflow functionality enhances accessibility by enabling researchers from diverse scientific backgrounds, and not solely computational chemistry specialists, to utilize *alchemlyb* effectively.
 
@@ -50,11 +50,17 @@ In the pharmaceutical sector, computational chemistry techniques are integral fo
 Notably, absolute binding free energy calculations between proteins and ligands or relative binding affinity of ligands to the same protein are routinely employed for this purpose [@merz2010drug].
 The resultant estimates of these free energies are essential for understanding binding affinity throughout various stages of drug discovery, such as hit identification and lead optimization [@merz2010drug].
 Other free energies extracted from simulations are useful in solution thermodynamics, chemical engineering, environmental science, and material science.
-The *alchemlyb* software processes the raw data from MD simulations using key estimators from statistical mechanics, drastically simplifying the process of extracting crucial thermodynamic insights from molecular simulations.
-Various molecular dynamics (MD) engines, including GROMACS [@pronk2013gromacs], AMBER [@case2014ff14sb], GOMC [@cummings2021open], and NAMD [@phillips2020scalable], offer distinct tools for performing free energy calculations.
-However, the diversity in output formats and analysis tools among different MD engines complicates the research process.
-Data generated by each engine requires individualized processing and analysis methods, hindering seamless collaboration and comparison of results.
+Molecular dynamics (MD) packages such as GROMACS [@pronk2013gromacs], AMBER [@case2014ff14sb], NAMD [@phillips2020scalable], and GOMC [@cummings2021open] are used to run free energy simulations, and many of these packages also contain tools for the subsequent processing of simulation data into free energies.
+However, there are no standard output formats, and analysis tools implement different algorithms for the different stages of the free energy data processing pipeline.
+Therefore, it is very difficult to analyze data from different MD packages in a consistent manner.
+Furthermore, the native analysis tools do not always implement current best practices [@klimovich2015guidelines] or are out of date.
+Overall, the coupling between data generation and analysis in most MD packages hinders seamless collaboration and comparison of results across packages.
+
+*alchemlyb* addresses this problem by focusing only on the data analysis with the goal of providing a unified interface for working with free energy data.
+In an initial step, data are read from the native MD package file formats and then organized into a common standard data structure, a pandas DataFrame.
+Additional functions enable subsampling or decorrelation of data and the application of estimators from statistical mechanics to derive free energy quantities.
+Overall, *alchemlyb* implements modular building blocks to simplify the process of extracting crucial thermodynamic insights from molecular simulations in a uniform manner.
 The [alchemical-analysis.py](https://github.com/MobleyLab/alchemical-analysis) tool [@klimovich2015guidelines], which preceeded *alchemlyb*, addressed this problem.