Easy and accurate protein structure prediction using ColabFold

Oct 17, 2024

Nature Protocols (2024)

Since its public release in 2021, AlphaFold2 (AF2) has made it common practice to investigate biological questions by using predicted protein structures of single monomers or full complexes. ColabFold-AF2 is an open-source Jupyter Notebook inside Google Colaboratory and a command-line tool that makes it easy to use AF2 while exposing its advanced options. ColabFold-AF2 shortens the turnaround times of experiments through its optimized usage of AF2’s models. In this protocol, we guide the reader through ColabFold best practices by using three scenarios: (i) monomer prediction, (ii) complex prediction and (iii) conformation sampling. The first two scenarios cover classic static structure prediction and are demonstrated on the human glycosylphosphatidylinositol transamidase protein. The third scenario demonstrates an alternative use case of the AF2 models by predicting two conformations of the human alanine serine transporter 2. Users without computational expertise can run the protocol via Google Colaboratory, and advanced users can run it in a command-line environment. In Google Colaboratory, each procedure takes <2 h to run. The data and code for this protocol are available at https://protocol.colabfold.com.
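As a rough illustration of how the monomer and complex scenarios differ at the input level, the sketch below writes FASTA files for a command-line run. It assumes the ColabFold convention of joining the chains of a complex with `:` inside a single FASTA record, and the basic `colabfold_batch <input> <output_dir>` invocation. The helper names and the toy sequences are hypothetical placeholders and are not the proteins used in this protocol. The command is only assembled, not executed.

```python
from pathlib import Path
import tempfile


def write_colabfold_fasta(path, name, chains):
    """Write one FASTA record; several chains are joined with ':' (assumed
    ColabFold convention for marking chain breaks in a complex)."""
    cleaned = ["".join(seq.split()).upper() for seq in chains]
    if not cleaned or any(not seq for seq in cleaned):
        raise ValueError("every chain needs a non-empty sequence")
    record = f">{name}\n{':'.join(cleaned)}\n"
    Path(path).write_text(record)
    return record


def colabfold_command(fasta_path, out_dir):
    """Return the argument list for a basic run (not executed here)."""
    return ["colabfold_batch", str(fasta_path), str(out_dir)]


# Toy placeholder sequences, NOT the sequences from the protocol.
with tempfile.TemporaryDirectory() as tmp:
    monomer = write_colabfold_fasta(
        Path(tmp) / "monomer.fasta", "toy_monomer", ["MKTAYIAKQR"]
    )
    complex_ = write_colabfold_fasta(
        Path(tmp) / "complex.fasta", "toy_complex", ["MKTAYIAKQR", "GSHMLEDPVA"]
    )
    cmd = colabfold_command("complex.fasta", "predictions")

print(monomer, complex_, " ".join(cmd), sep="\n")
```

On a machine with a local ColabFold installation, the returned list could be passed to `subprocess.run`. The advanced options mentioned in the abstract would be appended as extra arguments taken from the tool's own help output.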

We present an outline of how to use ColabFold to perform structure prediction of monomers, complexes and alternative conformations and guidance on interpreting the results through appropriate confidence metrics and visualizations.
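One confidence metric commonly inspected for AF2-style predictions is the per-residue pLDDT, which AF2 writes into the B-factor column of its output PDB files. The following is a minimal sketch, under that assumption, of how the per-residue values could be summarized. The function names, the synthetic three-residue input and the cutoff of 70 (a widely used rule of thumb for low-confidence regions) are illustrative choices and are not taken from the protocol.

```python
def ca_plddt(pdb_text):
    """Read per-residue pLDDT from the B-factor field (columns 61-66)
    of C-alpha ATOM records in PDB-formatted text."""
    scores = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def summarize_plddt(scores, cutoff=70.0):
    """Return (mean pLDDT, fraction of residues below the cutoff)."""
    if not scores:
        raise ValueError("no C-alpha atoms found")
    mean = sum(scores) / len(scores)
    low_fraction = sum(1 for s in scores if s < cutoff) / len(scores)
    return mean, low_fraction


def _toy_atom(serial, atom, resseq, plddt):
    """Build a synthetic, fixed-width ATOM line for demonstration only."""
    return (
        f"ATOM  {serial:5d} {atom:<4s} ALA A{resseq:4d}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}"
    )


# Synthetic input: one backbone N atom (ignored) and three CA atoms.
toy_pdb = "\n".join(
    [
        _toy_atom(1, " N  ", 1, 90.0),
        _toy_atom(2, " CA ", 1, 90.0),
        _toy_atom(3, " CA ", 2, 80.0),
        _toy_atom(4, " CA ", 3, 40.0),
    ]
)

scores = ca_plddt(toy_pdb)
mean_plddt, low_fraction = summarize_plddt(scores)
print(f"mean pLDDT = {mean_plddt:.1f}, fraction below 70 = {low_fraction:.2f}")
```

For a real prediction, `toy_pdb` would be replaced by the text of one of the output PDB files. Only C-alpha atoms are counted so that each residue contributes once.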

Integrating MMseqs2’s quick homology search, ColabFold enables accelerated structure prediction compared with AlphaFold2 at similar accuracy, while exposing many advanced parameters. ColabFold can be accessed through a Google Colaboratory notebook for beginners and a command-line interface for advanced users.

All sequences used in this protocol can be found in Equipment and in the PDB.

ColabFold is available at https://github.com/sokrypton/ColabFold and https://colabfold.com. The localcolabfold installer is available at https://github.com/YoshitakaMo/localcolabfold. Colab prediction notebooks based on ColabFold-AF2 v1.5.3 and local prediction scripts are available at https://github.com/steineggerlab/colabfold-protocol, which also includes all the input and output files.

M.S. acknowledges the support by the National Research Foundation of Korea, grants 2020M3-A9G7-103933, 2021-R1C1-C102065, 2021-M3A9-I4021220 and RS-2024-00396026; the Samsung DS research fund; the Creative-Pioneering Researchers Program; and the AI-Bio Research Grant through Seoul National University. M.M. acknowledges support by the National Research Foundation of Korea (grant RS-2023-00250470). Y.M. acknowledges support from Platform Project for Supporting Drug Discovery and Life Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research (BINDS)) from AMED under grant number JP23ama121027. S.O. was supported by the National Institutes of Health (NIH) DP5OD026389 and the National Science Foundation (NSF) MCB2032259.

These authors contributed equally: Gyuri Kim, Sewon Lee, Eli Levy Karin.

Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea

Gyuri Kim, Hyunbin Kim & Martin Steinegger

School of Biological Sciences, Seoul National University, Seoul, South Korea

Sewon Lee, Martin Steinegger & Milot Mirdita

ELKMO, Copenhagen, Denmark

Eli Levy Karin

Department of Biotechnology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan

Yoshitaka Moriwaki

Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Tokyo, Japan

Yoshitaka Moriwaki

Department of Computational Drug Discovery and Design, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan

Yoshitaka Moriwaki

Massachusetts Institute of Technology, Cambridge, MA, USA

Sergey Ovchinnikov

Artificial Intelligence Institute, Seoul National University, Seoul, South Korea

Martin Steinegger

Institute of Molecular Biology and Genetics, Seoul National University, Seoul, South Korea

Martin Steinegger

G.K., S.L., E.L.K. and M.S. developed the protocol. Y.M., S.O., M.S. and M.M. developed the ColabFold software and notebooks. G.K., S.L. and H.K. performed predictions and visualized the data. S.O., M.S. and M.M. supervised the monomer and complex prediction procedures. E.L.K., Y.M., M.S. and M.M. supervised the conformation prediction procedure. G.K., S.L. and E.L.K. analyzed the results and wrote the paper, with contributions from all authors.

Correspondence to Sergey Ovchinnikov, Martin Steinegger or Milot Mirdita.

The authors declare no competing interests.

Nature Protocols thanks Jianyi Yang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Key references using this protocol

Mirdita, M. et al. Nat. Methods 19, 679–682 (2022): https://doi.org/10.1038/s41592-022-01488-1

Lee, S. et al. Cold Spring Harb. Perspect. Biol. 16, a041465 (2024): https://doi.org/10.1101/cshperspect.a041465

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Kim, G., Lee, S., Levy Karin, E. et al. Easy and accurate protein structure prediction using ColabFold. Nat Protoc (2024). https://doi.org/10.1038/s41596-024-01060-5

Received: 21 November 2023

Accepted: 07 August 2024

Published: 14 October 2024

DOI: https://doi.org/10.1038/s41596-024-01060-5
