Merge pull request #122 from alchemistry/finalize-0.4.0
Finalize 0.4.0
orbeckst authored Apr 27, 2021
2 parents db503e3 + 83a5505 commit 478392a
Showing 6 changed files with 96 additions and 67 deletions.
9 changes: 4 additions & 5 deletions .travis.yml
@@ -11,6 +11,8 @@ python:
- "3.5"
- "3.6"
- "3.7"
+ - "3.8"
+ - "3.9"

branches:
only:
@@ -20,16 +22,13 @@ branches:
install:
- pip install --upgrade pip setuptools wheel
- pip install --upgrade pytest
- - pip install six
- - pip install codecov
- - pip install pytest-cov
- - pip install pytest-pep8
+ - pip install six codecov pytest-cov pytest-pep8
- pip install --only-binary=numpy numpy # Otherwise this would take ages
- pip install https://github.com/alchemistry/alchemtest/archive/master.zip
- pip install -e .

script:
- - py.test --cov alchemlyb src/alchemlyb/tests
+ - pytest --cov alchemlyb src/alchemlyb/tests

after_success:
- codecov
15 changes: 8 additions & 7 deletions CHANGES
@@ -15,28 +15,29 @@ The rules for this file:
------------------------------------------------------------------------------


- ??/??/2020 wehs7661, dotsdl, xiki-tempula
+ 04/27/2021 wehs7661, dotsdl, xiki-tempula, orbeckst

- * 0.?.?
+ * 0.4.0

Enhancements
- Allow the dhdl from TI estimator to be separated for multiple lambda
- (PR #121).
+ (PR #121).
- Allow the convergence to be plotted. (PR #121)
- Allow automatic sorting and duplication removal during subsampling
- (issue #118, PR #119).
+ (issue #118, PR #119).
- Allow statistical_inefficiency to work on multiindex series. (issue #116,
- PR #117)
+ PR #117)
- Allow the overlap matrix of the MBAR estimator to be plotted. (issue #73,
- PR #107)
+ PR #107)
- Allow the dhdl of the TI estimator to be plotted. (issue #73, PR #110)
- Allow the dF states to be plotted. (issue #73, PR #112)

Deprecations
+ - Last version that is tested against Python 3.5 and 2.7.

Fixes
- removed redundant statistical inefficiency calculation in
- `alchemlyb.preprocessing.subsampling.equilibrium_detection`
+ `alchemlyb.preprocessing.subsampling.equilibrium_detection`


Changes
28 changes: 14 additions & 14 deletions README.rst
@@ -3,16 +3,13 @@ alchemlyb: the simple alchemistry library

|doi| |docs| |build| |cov|

- **Warning**: This library is young. It is **not** API stable. It is a
- nucleation point. By all means use and help improve it, but note that it will
- change with time.
+ **alchemlyb** makes alchemical free energy calculations easier to do
+ by leveraging the full power and flexibility of the PyData stack. It
+ includes:

- **alchemlyb** is an attempt to make alchemical free energy calculations easier
- to do by leveraging the full power and flexibility of the PyData stack. It
- includes:

- 1. Parsers for extracting raw data from output files of common molecular
- dynamics engines such as GROMACS [Abraham2015]_.
+ 1. Parsers for extracting raw data from output files of common
+ molecular dynamics engines such as `GROMACS`_, `AMBER`_, `NAMD`_
+ and `other simulation codes`_.

2. Subsamplers for obtaining uncorrelated samples from timeseries data.

@@ -24,11 +21,6 @@ In particular, it uses internally the excellent `pymbar
<http://pymbar.readthedocs.io/>`_ library for performing MBAR and extracting
independent, equilibrated samples [Chodera2016]_.

- .. [Abraham2015] Abraham, M.J., Murtola, T., Schulz, R., Páll, S., Smith, J.C.,
- Hess, B., and Lindahl, E. (2015). GROMACS: High performance molecular
- simulations through multi-level parallelism from laptops to supercomputers.
- SoftwareX 1–2, 19–25.
.. [Shirts2008] Shirts, M.R., and Chodera, J.D. (2008). Statistically optimal
analysis of samples from multiple equilibrium states. The Journal of Chemical
Physics 129, 124105.
@@ -37,6 +29,14 @@ independent, equilibrated samples [Chodera2016]_.
Equilibration Detection in Molecular Simulations. Journal of Chemical Theory
and Computation 12, 1799–1805.
+ .. _GROMACS: <http://www.gromacs.org>

+ .. _AMBER: http://ambermd.org/

+ .. _NAMD: http://www.ks.uiuc.edu/Research/namd/

+ .. _`other simulation codes`: https://alchemlyb.readthedocs.io/en/latest/parsing.html

.. |doi| image:: https://zenodo.org/badge/68669096.svg
:alt: Zenodo DOI
:scale: 100%
87 changes: 51 additions & 36 deletions docs/api_principles.rst
@@ -1,3 +1,5 @@
+ .. -*- coding: utf-8 -*-
API principles
==============

@@ -14,68 +16,81 @@ These functions are simple in usage and pure in scope, and can be chained togeth
`alchemlyb` seeks to be as boring and simple as possible to enable more complex work.
Its components allow work at all scales, from use on small systems using a single workstation to larger datasets that require distributed computing using libraries such as dask.

+ First and foremost, scientific code must be *correct* and we try to ensure this requirement by following best software engineering practices during development, close to full test coverage of all code in the library, and providing citations to published papers for included algorithms. We use a curated, public data set (`alchemtest`_) for automated testing.

+ .. _alchemtest: https://github.com/alchemistry/alchemtest


Core philosophy
---------------

1. Use functions when possible, classes only when necessary (or for estimators, see (2)).
2. For estimators, mimic the **scikit-learn** API as much as possible.
3. Aim for a consistent interface throughout, e.g. all parsers take similar inputs and yield a common set of outputs.

+ 4. Have all functionality tested.


API components
--------------

- The library is structured as follows, following a similar style to **scikit-learn**::
+ The library is structured as follows, following a similar style to
+ **scikit-learn**::

alchemlyb
- |
- -- parsing
- | |
- | -- gmx
- | |
- | -- amber
- | |
- | -- openmm
- | |
- | -- namd
- | |
- | |__ ...
- |
- -- preprocessing
- | |
- | -- subsampling
- | |
- | |__ ...
- |
- -- estimators
- |
- -- mbar_
- |
- -- ti_
- |
- -- ...
+ ├── parsing
+ │   ├── amber.py
+ │   ├── gmx.py
+ │   ├── gomc.py
+ │   ├── namd.py
+ │   └── ...
+ ├── preprocessing
+ │   ├── subsampling.py
+ │   └── ...
+ ├── estimators
+ │   ├── bar_.py
+ │   ├── mbar_.py
+ │   ├── ti_.py
+ │   └── ...
+ ├── convergence ### NOT IMPLEMENTED
+ │   ├── convergence.py
+ │   └── ...
+ └── visualisation
+     ├── convergence.py
+     ├── dF_state.py
+     ├── mbar_matrix.py
+     ├── ti_dhdl.py
+     └── ...


- The ``parsing`` submodule contains parsers for individual MD engines, since the output files needed to perform alchemical free energy calculations vary widely and are not standardized.
+ The :mod:`~alchemlyb.parsing` submodule contains parsers for individual MD engines, since the output files needed to perform alchemical free energy calculations vary widely and are not standardized.
Each module at the very least provides an `extract_u_nk` function for extracting reduced potentials (needed for MBAR), as well as an `extract_dHdl` function for extracting derivatives required for thermodynamic integration.
Other helper functions may be exposed for additional processing, such as generating an XVG file from an EDR file in the case of GROMACS.
+ All `extract\_*` functions take similar arguments (a file path,
+ parameters such as temperature), and produce standard outputs
+ (:class:`pandas.DataFrame` for reduced potentials, :class:`pandas.Series` for derivatives).

- The `preprocessing` submodule features functions for subsampling timeseries, as may be desired before feeding them to an estimator.
+ The :mod:`~alchemlyb.preprocessing` submodule features functions for subsampling timeseries, as may be desired before feeding them to an estimator.
So far, these are limited to `slicing`, `statistical_inefficiency`, and `equilibrium_detection` functions, many of which make use of subsampling schemes available from :mod:`pymbar`.
These functions are written in such a way that they can be easily composed as parts of complex processing pipelines.

- The `estimators` module features classes *a la* **scikit-learn** that can be initialized with parameters that determine their behavior and then "trained" on a `fit` method.
- So far, `MBAR` has been partially implemented, and because the numerical heavy-lifting is already well-implemented in `pymbar.MBAR`, this class serves to give an interface that will be familiar and consistent with the others.
- Thermodynamic integration is not yet implemented.
+ The :mod:`~alchemlyb.estimators` module features classes *a la* **scikit-learn** that can be initialized with parameters that determine their behavior and then "trained" on a `fit` method.
+ MBAR, BAR, and thermodynamic integration (TI) as the major methods are all implemented.
+ Correct error estimates require the use of time series with independent samples.

+ The :mod:`~alchemlyb.convergence` submodule will feature convenience functions/classes for doing convergence analysis using a given dataset and a chosen estimator, though the form of this is not yet thought-out.
+ However, the `gist a41e5756a58e1775e3e3a915f07bfd37`_ shows an example for how this can be done already in practice.

+ The :mod:`visualization` submodule contains convenience plotting functions as known from, for example, `alchemical-analysis.py`_.

+ All of these components lend themselves well to writing clear and flexible pipelines for processing data needed for alchemical free energy calculations, and furthermore allow for scaling up via libraries like `dask`_ or `joblib`_.

+ .. _`alchemical-analysis.py`: https://github.com/MobleyLab/alchemical-analysis/

- The `convergence` submodule will feature convenience functions/classes for doing convergence analysis using a given dataset and a chosen estimator, though the form of this is not yet thought-out.
- However, the gist shows an example for how this can be done already in practice.
+ .. _dask: https://dask.org/

- All of these components lend themselves well to writing clear and flexible pipelines for processing data needed for alchemical free energy calculations, and furthermore allow for scaling up via libraries like `dask` or `joblib`.
+ .. _joblib: https://joblib.readthedocs.io
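
Reviewer note: the design described in this file (small composable preprocessing functions feeding a scikit-learn-style estimator that is "trained" through ``fit``) can be illustrated with a minimal sketch that uses only the Python standard library. This is not alchemlyb code. ``ToyTI``, the dictionary-based input, and this ``slicing`` helper are hypothetical stand-ins chosen for illustration, and the trapezoidal rule is an assumed integration scheme; the real library works on pandas objects as stated in the text.

```python
from statistics import mean


def slicing(samples, lower=None, upper=None, step=None):
    """Return a sub-range of a time-ordered list of samples (hypothetical helper)."""
    return samples[lower:upper:step]


class ToyTI:
    """Toy thermodynamic-integration estimator with a scikit-learn-like API.

    Hypothetical illustration only; not the alchemlyb implementation.
    """

    def fit(self, dHdl):
        # dHdl maps each lambda value to its list of dH/dlambda samples.
        lambdas = sorted(dHdl)
        means = [mean(dHdl[lam]) for lam in lambdas]
        # Trapezoidal rule over the per-lambda averages (assumed scheme).
        self.delta_f_ = sum(
            0.5 * (means[i] + means[i + 1]) * (lambdas[i + 1] - lambdas[i])
            for i in range(len(lambdas) - 1)
        )
        # Returning self allows chaining, as scikit-learn estimators do.
        return self


raw = {
    0.0: [9.0, 2.0, 2.0, 2.0],
    0.5: [9.0, 4.0, 4.0, 4.0],
    1.0: [9.0, 6.0, 6.0, 6.0],
}
# Drop the first frame of every lambda window, then estimate.
trimmed = {lam: slicing(series, lower=1) for lam, series in raw.items()}
estimator = ToyTI().fit(trimmed)
print(estimator.delta_f_)
```

Because the preprocessing step is a plain function and the estimator returns itself from ``fit``, the two compose without any shared state, which is the property the text relies on for building larger pipelines.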


Development model
3 changes: 0 additions & 3 deletions docs/examples.rst

This file was deleted.

21 changes: 19 additions & 2 deletions setup.py
@@ -17,17 +17,34 @@
description='the simple alchemistry library',
author='David Dotson',
author_email='dotsdl@gmail.com',
+ maintainer='Oliver Beckstein',
+ maintainer_email='orbeckst@gmail.com',
classifiers=[
- 'Development Status :: 3 - Alpha',
+ 'Development Status :: 4 - Beta',
'Intended Audience :: Science/Research',
'License :: OSI Approved :: BSD License',
'Operating System :: POSIX',
'Operating System :: MacOS :: MacOS X',
'Operating System :: Microsoft :: Windows ',
'Programming Language :: Python',
'Programming Language :: Python :: 2',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.9',
'Programming Language :: C',
'Topic :: Scientific/Engineering',
'Topic :: Scientific/Engineering :: Bio-Informatics',
'Topic :: Scientific/Engineering :: Chemistry',
'Topic :: Software Development :: Libraries :: Python Modules',
],
packages=find_packages('src'),
package_dir={'': 'src'},
license='BSD',
long_description=open('README.rst').read(),
tests_require = ['pytest', 'alchemtest'],
- install_requires=['numpy', 'pandas>=0.23.0', 'pymbar>=3.0.5', 'scipy', 'scikit-learn', 'matplotlib']
+ install_requires=['numpy', 'pandas>=0.23.0', 'pymbar>=3.0.5,<4', 'scipy', 'scikit-learn', 'matplotlib']
)
