AlphaFold 3 with MMSeqs2-GPU

This package provides an implementation of the inference pipeline of AlphaFold 3. See below for how to access the model parameters. You may only use AlphaFold 3 model parameters if received directly from Google. Use is subject to these terms of use.

Any publication that discloses findings arising from using this source code, the model parameters or outputs produced by those should cite the Accurate structure prediction of biomolecular interactions with AlphaFold 3 paper.

Please also refer to the Supplementary Information for a detailed description of the method.

AlphaFold 3 is also available at alphafoldserver.com for non-commercial use, though with a more limited set of ligands and covalent modifications.

If you have any questions, please contact the AlphaFold team at alphafold@google.com.

Obtaining Model Parameters

This repository contains all necessary code for AlphaFold 3 inference. To request access to the AlphaFold 3 model parameters, please complete this form. Access will be granted at Google DeepMind’s sole discretion. We will aim to respond to requests within 2–3 business days. You may only use AlphaFold 3 model parameters if received directly from Google. Use is subject to these terms of use.

Installation and Running Your First Prediction

See the installation documentation.

# Before installing, open CMakeLists.txt in an editor (e.g. nano CMakeLists.txt),
# add `set(CMAKE_CXX_FLAGS -lz)` on line 18, then save and close
conda create -n af3 python=3.11 # create af3 env with conda 
conda activate af3 # activate af3 env
conda install -c bioconda hmmer # install hmmer
pip install -r dev-requirements.txt 
pip install . --no-deps 
build_data # build cif database
python run_alphafold_test.py # test 

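# JAX GPU test: start the Python interpreter, then enter the lines below
python
import jax
jax.devices()  # lists the devices JAX can see; GPU devices are expected here
from jax.lib import xla_bridge
xla_bridge.get_backend().platform  # reports the active backend platform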

# store the sequence databases on an SSD
# the download is about 252 GB (630 GB after decompression)
python fetch_databases.py --download_destination=<your_database_path>


# bash environment settings (~/.bashrc)
export XLA_FLAGS="--xla_gpu_enable_triton_gemm=false" 
export XLA_PYTHON_CLIENT_PREALLOCATE=true
export XLA_CLIENT_MEM_FRACTION=0.95

For MMseqs2-GPU installation, see MMseqs2_user_guide.md.
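Before starting the database download, it can help to confirm that the target disk has room for the decompressed data and that the XLA variables listed above are set in the current shell. The following is a minimal sketch using only the Python standard library; the helper names `free_gb` and `missing_env` and the `db_path` value are hypothetical and not part of this repository.

```python
import os
import shutil

# Size quoted above for the databases after decompression.
REQUIRED_GB = 630

# Environment variables from the ~/.bashrc snippet above.
EXPECTED_ENV = {
    "XLA_FLAGS": "--xla_gpu_enable_triton_gemm=false",
    "XLA_PYTHON_CLIENT_PREALLOCATE": "true",
    "XLA_CLIENT_MEM_FRACTION": "0.95",
}


def free_gb(path: str) -> float:
    """Return free space in GB (10^9 bytes) on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 1e9


def missing_env(env=None) -> list:
    """Return names of expected variables that are unset or set to another value."""
    env = os.environ if env is None else env
    return [name for name, value in EXPECTED_ENV.items() if env.get(name) != value]


if __name__ == "__main__":
    db_path = "."  # hypothetical: replace with <your_database_path>
    print(f"free space: {free_gb(db_path):.0f} GB (about {REQUIRED_GB} GB needed)")
    print("variables to review:", missing_env() or "none")
```

The environment check compares values exactly, so an `XLA_FLAGS` value that carries additional flags will be reported for review rather than accepted silently.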

Once you have installed AlphaFold 3, you can test your setup using e.g. the following input JSON file named alphafold_input.json:

{
  "name": "2PV7",
  "sequences": [
    {
      "protein": {
        "id": ["A", "B"],
        "sequence": "GMRESYANENQFGFKTINSDIHKIVIVGGYGKLGGLFARYLRASGYPISILDREDWAVAESILANADVVIVSVPINLTLETIERLKPYLTENMLLADLTSVKREPLAKMLEVHTGAVLGLHPMFGADIASMAKQVVVRCDGRFPERYEWLLEQIQIWGAKIYQTNATEHDHNMTYIQALRHFSTFANGLHLSKQPINLANLLALSSPIYRLELAMIGRLFAQDAELYADIIMDKSENLAVIETLKQTYDEALTFFENNDRQGFIDAFHKVRDWFGDYSEQFLKESRQLLQQANDLKQG"
      }
    }
  ],
  "modelSeeds": [1],
  "dialect": "alphafold3",
  "version": 1
}
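An input file with this layout can also be generated programmatically. The sketch below mirrors only the fields shown in the example above; the function name `make_input` and its validation rules are assumptions for illustration, and the sequence is a shortened prefix of the one above to keep the example brief.

```python
import json


def make_input(name, sequence, chain_ids, seeds=(1,)):
    """Build a dict in the layout shown above for a single protein entity."""
    if not sequence.isalpha() or not sequence.isupper():
        raise ValueError("sequence must be uppercase one-letter residue codes")
    if len(set(chain_ids)) != len(chain_ids):
        raise ValueError("chain ids must be unique")
    return {
        "name": name,
        "sequences": [
            {"protein": {"id": list(chain_ids), "sequence": sequence}}
        ],
        "modelSeeds": list(seeds),
        "dialect": "alphafold3",
        "version": 1,
    }


if __name__ == "__main__":
    # Two copies (chains A and B) of the same shortened sequence.
    job = make_input("demo", "GMRESYANENQF", ["A", "B"])
    print(json.dumps(job, indent=2))
```

Listing two ids for one sequence, as in the example above, describes two chains that share that sequence. Redirect the printed JSON to a file in the input directory to use it.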

You can then run AlphaFold 3 using the following command:

python run_alphafold.py --buckets '256,512,768,1024' --flash_attention_implementation xla --mmseqs2_use_gpu=true --input_dir examples --output_dir examples/output

You can pass various flags to run_alphafold.py; to list them all, run python run_alphafold.py --help. Two fundamental flags control which parts of AlphaFold 3 will run:

  • --run_data_pipeline (defaults to true): whether to run the data pipeline, i.e. genetic and template search. This part is CPU-only and time-consuming, and can be run on a machine without a GPU.
  • --run_inference (defaults to true): whether to run the inference. This part requires a GPU.
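
Because the two flags are independent, a run can be split into a search stage and an inference stage on different machines. The following sketch only assembles the two command lines; the helper `af3_command` and the directory name `stage1_json` are hypothetical, and where the first stage writes its JSON output is not stated here, so check the output documentation before wiring the stages together.

```python
import shlex


def af3_command(input_dir, output_dir, *, data_pipeline=True, inference=True):
    """Return an argv list for run_alphafold.py with both stage flags set explicitly."""
    return [
        "python",
        "run_alphafold.py",
        f"--input_dir={input_dir}",
        f"--output_dir={output_dir}",
        f"--run_data_pipeline={str(data_pipeline).lower()}",
        f"--run_inference={str(inference).lower()}",
    ]


if __name__ == "__main__":
    # Stage 1: genetic and template search only.
    print(shlex.join(af3_command("examples", "examples/output", inference=False)))
    # Stage 2: inference only. "stage1_json" is a hypothetical directory that
    # holds the JSON produced by stage 1.
    print(shlex.join(af3_command("stage1_json", "examples/output", data_pipeline=False)))
```

Building the command as a list keeps it safe to pass to `subprocess.run` without shell quoting; `shlex.join` is used here only to print it.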

AlphaFold 3 Input

See the input documentation.

To generate an input JSON file, use the script:

python scripts/generate_input.py

AlphaFold 3 Output

See the output documentation.

Performance

See the performance documentation.

Known Issues

Known issues are documented in the known issues documentation.

Please create an issue if it is not already listed in Known Issues or in the issues tracker.

Citing This Work

Any publication that discloses findings arising from using this source code, the model parameters or outputs produced by those should cite:

@article{Abramson2024,
  author  = {Abramson, Josh and Adler, Jonas and Dunger, Jack and Evans, Richard and Green, Tim and Pritzel, Alexander and Ronneberger, Olaf and Willmore, Lindsay and Ballard, Andrew J. and Bambrick, Joshua and Bodenstein, Sebastian W. and Evans, David A. and Hung, Chia-Chun and O’Neill, Michael and Reiman, David and Tunyasuvunakool, Kathryn and Wu, Zachary and Žemgulytė, Akvilė and Arvaniti, Eirini and Beattie, Charles and Bertolli, Ottavia and Bridgland, Alex and Cherepanov, Alexey and Congreve, Miles and Cowen-Rivers, Alexander I. and Cowie, Andrew and Figurnov, Michael and Fuchs, Fabian B. and Gladman, Hannah and Jain, Rishub and Khan, Yousuf A. and Low, Caroline M. R. and Perlin, Kuba and Potapenko, Anna and Savy, Pascal and Singh, Sukhdeep and Stecula, Adrian and Thillaisundaram, Ashok and Tong, Catherine and Yakneen, Sergei and Zhong, Ellen D. and Zielinski, Michal and Žídek, Augustin and Bapst, Victor and Kohli, Pushmeet and Jaderberg, Max and Hassabis, Demis and Jumper, John M.},
  journal = {Nature},
  title   = {Accurate structure prediction of biomolecular interactions with AlphaFold 3},
  year    = {2024},
  volume  = {630},
  number  = {8016},
  pages   = {493--500},
  doi     = {10.1038/s41586-024-07487-w}
}

Acknowledgements

AlphaFold 3's release was made possible by the invaluable contributions of the following people:

Andrew Cowie, Bella Hansen, Charlie Beattie, Chris Jones, Grace Margand, Jacob Kelly, James Spencer, Josh Abramson, Kathryn Tunyasuvunakool, Kuba Perlin, Lindsay Willmore, Max Bileschi, Molly Beck, Oleg Kovalevskiy, Sebastian Bodenstein, Sukhdeep Singh, Tim Green, Toby Sargeant, Uchechi Okereke, Yotam Doron, and Augustin Žídek (engineering lead).

We also extend our gratitude to our collaborators at Google and Isomorphic Labs.

AlphaFold 3 uses the following separate libraries and packages:

We thank all their contributors and maintainers!

Get in Touch

If you have any questions not covered in this overview, please contact the AlphaFold team at alphafold@google.com.

We would love to hear your feedback and understand how AlphaFold 3 has been useful in your research. Share your stories with us at alphafold@google.com.

Licence and Disclaimer

This is not an officially supported Google product.

Copyright 2024 DeepMind Technologies Limited.

AlphaFold 3 Source Code and Model Parameters

The AlphaFold 3 source code is licensed under the Creative Commons Attribution-Non-Commercial ShareAlike International License, Version 4.0 (CC-BY-NC-SA 4.0) (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at https://github.com/google-deepmind/alphafold3/blob/main/LICENSE.

The AlphaFold 3 model parameters are made available under the AlphaFold 3 Model Parameters Terms of Use (the "Terms"); you may not use these except in compliance with the Terms. You may obtain a copy of the Terms at https://github.com/google-deepmind/alphafold3/blob/main/WEIGHTS_TERMS_OF_USE.md.

Unless required by applicable law, AlphaFold 3 and its output are distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. You are solely responsible for determining the appropriateness of using AlphaFold 3, or using or distributing its source code or output, and assume any and all risks associated with such use or distribution and your exercise of rights and obligations under the relevant terms. Output are predictions with varying levels of confidence and should be interpreted carefully. Use discretion before relying on, publishing, downloading or otherwise using the AlphaFold 3 Assets.

AlphaFold 3 and its output are for theoretical modeling only. They are not intended, validated, or approved for clinical use. You should not use the AlphaFold 3 or its output for clinical purposes or rely on them for medical or other professional advice. Any content regarding those topics is provided for informational purposes only and is not a substitute for advice from a qualified professional. See the relevant terms for the specific language governing permissions and limitations under the terms.

Third-party Software

Use of the third-party software, libraries or code referred to in the Acknowledgements section above may be governed by separate terms and conditions or license provisions. Your use of the third-party software, libraries or code is subject to any such terms and you should check that you can comply with any applicable restrictions or terms and conditions before use.

Mirrored and Reference Databases

The following databases have been: (1) mirrored by Google DeepMind; and (2) in part, included with the inference code package for testing purposes, and are available with reference to the following:
