Alphafold
AlphaFold is an artificial intelligence program developed by DeepMind, a subsidiary of Alphabet, which performs predictions of protein structure.
Available modules
Alphafold2
is available as modules on Midway3 that you can check via module avail alphafold
.
module avail alphafold
---------------------- /software/modulefiles----------------------------------
alphafold/2.0.0(default) alphafold/2.2.0 alphafold/2.3.2
The AlphaFold source code and running scripts (e.g. run_alphafold.py
) can be found at the Alphafold GitHub.
The training data sets for different versions of Alphafold are accessible under /software/alphafold-data/
, /software/alphafold-data-2.2/
and /software/alphafold-data-2.3/
.
Example job script
Typically, Alphafold2 uses OpenMM, a GPU-accelerated molecular simulation package, to relax the candidate protein. OpenMM requires the CUDA toolkit to run on a GPU node.
If you want to run on a CPU-only node without the relaxation run for the candidate protein, you can run the python script run_alphafold.py
with --use_gpu_relax=false
.
The following example job script illustrates how to use the alphafold/2.3.2
module on a GPU node with 2 GPUs and up to 16 CPU cores for multithreading on Midway3.
#!/bin/bash
#SBATCH --job-name=alphafold2
#SBATCH --account=[your-accountname]
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --time=04:00:00
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16
#SBATCH --gres=gpu:2
#SBATCH --constraint=v100
#SBATCH --mem=64G
module load alphafold/2.3.2 cuda/11.3
cd $SLURM_SUBMIT_DIR
echo "GPUs available: GPU ID $CUDA_VISIBLE_DEVICES"
echo "CPU cores: $SLURM_CPUS_PER_TASK"
DOWNLOAD_DATA_DIR=/software/alphafold-data-2.3
python run_alphafold.py \
--data_dir=$DOWNLOAD_DATA_DIR \
--uniref90_database_path=$DOWNLOAD_DATA_DIR/uniref90/uniref90.fasta \
--mgnify_database_path=$DOWNLOAD_DATA_DIR/mgnify/mgy_clusters_2022_05.fa \
--bfd_database_path=$DOWNLOAD_DATA_DIR/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--uniref30_database_path=$DOWNLOAD_DATA_DIR/uniref30/UniRef30_2021_03 \
--pdb70_database_path=$DOWNLOAD_DATA_DIR/pdb70/pdb70 \
--template_mmcif_dir=$DOWNLOAD_DATA_DIR/pdb_mmcif/mmcif_files \
--obsolete_pdbs_path=$DOWNLOAD_DATA_DIR/pdb_mmcif/obsolete.dat \
--model_preset=monomer \
--max_template_date=2022-1-1 \
--db_preset=full_dbs \
--use_gpu_relax=true \
--output_dir=out_alphafold_2.1.1_multi-monomer \
--fasta_paths=T1083.fasta,T1084.fasta