SI2011 track3 Alanine Dipeptide aMD simulation

From NBCRwiki

Jump to: navigation, search

Back to Track 3

Contents

Introduction

In this short tutorial, we will use N-acetyl-N '-methyl-alaninamide, often referred as alanine dipeptide, as an example system to introduce accelerated molecular dynamics (aMD). aMD is an enhanced sampling method that improves the conformational space sampling by reducing energy barriers separating different states of a system. The method modifies the potential energy landscape by raising energy wells that are below a certain threshold level, while leaving those above this level unaffected. As a result, barriers separating adjacent energy basins are reduced, allowing the system to sample conformational space that cannot be easily accessed in a classical MD (cMD) simulation.

In this tutorial, we will use the NAMD implementation of aMD

Loading the alanine dipeptide system in VMD

All the tutorial files can be downloaded as ALA.zip

First, we need to identify the directory where all the CADD workflow files are stored. By default, this should be under your home directory, a folder named CADDworkflows/. If you cannot find the directory, let an instructor know. Once you enter the CADDworkflows/ directory, go to MDAnlaysis/AMDNBCR_0.2_input/, unzip the downloaded file. Note that the existing files in that directory are not complete and you should download the zip file from the above link and unzip it in the AMDNBCR_0.2_input/ directory.

Once you have all the files ready, start the VMD program and load the topology file wtdia.psf and the coordinate file wtdia.pdb file. From the VMD Main menu, choose 'Graphics' and then 'Representations'. Type 'segname DIA' in the 'Representations' window---this will hide the water box, leaving only the alanine dipeptide in display. Change the 'drawing method' to CPK, and from the main menu, choose 'Mouse'-->'Label'-->'Atom', then click on the following five atoms in the backbone of the alanine dipeptide: CA, NL, CLP, CRP and NR. You should see these atoms labeled as shown in the figure below.

ALA1.png

These atoms are involved in the alanine dipeptide backbone dihedrals, Phi and Psi. To measure the two dihedrals, choose 'Mouse'-->'Label'-->'Dihedral' from the main menu, then click on the following four atoms in sequence: CLP-->NL-->CA-->CRP. The highlighted dihedral (Phi) should show a value of -82.49. Similarly, click on the next four atoms in sequence: NL-->CA-->CRP-->NR, and you'll see another dihedral (Psi) highlighted, which has a value of 78.28.

The Phi-Psi dihedrals of alanine dipeptide are canonical representations of protein backbone conformations, and are often used in the parameterization of MD force fields. Despite the small system size, sampling of the Phi-Psi dihedrals is non-trivial, especially in the gas phase. Here, we will use aMD to enhance the sampling of these two dihedral angles.

Editing the NAMD input file

In the AMDNBCR_0.2_input/ directory, find the file 'control-cMD.conf' and open it using any text editor. This is the configuration file for a cMD simulation of the alanine dipeptide system, which is used as a control of the aMD results. You should be familiar with the format of this file by this point--for instance,

 structure          wtdia.psf
 coordinates        wtdia.pdb

specifies the topology file and the coordinate file for the simulation, and

  run 50000

specifies the number of steps the cMD simulation will run.

CMDconf.png

Now open the sample-aMD.conf using your text editor. Most of the file is identical with the control-cMD.conf we just examined. At the bottom of this file, however, you should see a few extra lines:

 # AMD Settings
 accelMD             on
 accelMDdihe         on
 accelMDE            14.
 accelMDalpha        5.5
 accelMDOutFreq      500

These are aMD-specific settings. Briefly, the 'accelMD' parameter turns on the aMD calculation, and the 'accelMDdihe' parameter specifies the aMD mode---here we are only applying the boost potential to the dihedral degree of freedom in the system. Other options include 'total boost' and 'dual boost' (see the above link for more info). The parameter 'accelMDE' specifies the threshold energy of the system, which is set to 14 kcal/mol; while the 'accelMDalpha' parameter helps determine the shape of the new potential surface---small accelMDalpha values will result in more flattened energy landscapes. Finally, the 'accelMDOutFreq' parameter controls the frequency of aMDoutput. It is recommended that you keep this parameter the same as the 'dcdfreq' parameter value.

Running aMD through the CADD workflow

Now we are ready to run the aMD simulation using the CADD pipeline. If you haven't already, start the pipeline program, and go to 'MD analysis'-->'Load workflow'-->'AMDNBCR', as shown in the figure below:

AMDstart.png

You should find four input options in the loaded AMDNBCR workflow, as shown here:

AMDready.png

The four input options control the aMD simulation performed through the CADD pipeline: 'Config File Browser' allows you to specify the NAMD configuration file (control-cMD.conf or sample-aMD.conf), and 'Input Directory Browser' lets you select the directory where the rest of the input files are stored. Like any regular MD simulation, an aMD simulation requires the topology and coordinate files of a system, as well as a force field parameter file (par_all27_prot_lipid.prm). In this case, because we're starting from the result of a previous simulation, where we minimized and pre-equilibrated the system, we also have three additional files: mineq-wtdia-01.coor, mineq-wtdia-01.vel and mineq-wtdia-01.xsc. All of these files should be put in the directory specified at 'Input Directory Browser'. Finally, 'NumProcessors' is the number of CPUs that you will use to run the simulation--given the number of workshop participants, we suggest that you use no more than 8 processors for your job.

Again, please make sure that you have downloaded the .zip file and put all the unzipped files in the AMDNBCR_0.2_input/ directory. Once you have the files ready, click on the green thunder label on the CADD pipeline menu bar:

AMDrun.png

You should find that the 'AMDws' label in the middle of the workflow window turns red:

AMDrunning.png

This indicates that the aMD workflow is running. The simulation should finish within a few minutes, and once it's done, the 'AMDws' label will change back to its original color.

Examining the output files

Once the aMD simulation is finished, you can change the config file you specified at 'Input Directory Browser' to 'control-cMD.conf', and repeat the above process to run this simulation. You can also modify the parameters in these files as you wish, and repeat the simulations multiple times.

Once you finished running all the simulations, you may examine the output files in the directory specified in the previous section. If you used the default settings, the output files should be found at your home directory, under CADDworkflows-->MDAnlaysis-->AMDNBCR_0.2_output. If you performed multiple runs, the output files from each run will be put in individual subdirectories:

AMDout.png

The namd.log file in each subdirectory contains the log information from each run. For an aMD simulation, it also has aMD-specific output. Open this file in a text editor, and find lines that start with 'ACCELERATED MD'. For instance:

AMDlog.png

This line contains the values of various energy components, which should be identical with the values reported following the 'ACCELERATED MD' line. The key information here is the boost potential at the current timestep following 'dV'. This value can be extracted from the log file and then used to reconstruct the ensemble average introduced earlier in class. If you're using a mac or running linux/unix, the following command will extract the dV values:

 awk '/ACCELERATED MD:/ {print $6}' namd.log > namd.dV.dat

The sample-aMD.dcd file is the trajectory of the aMD simulation, which can be loaded in VMD for visual examination. Additionally, we can calculate the Phi-Psi dihedrals from the trajectory. From your VMD main menu, choose 'Extensions'-->'Tk console'. Type or copy/paste the following script:

set outfile [open TEMP-dihe.dat w]     ;# put the dihedral values in a file named 'TEMP-dihe.dat'
label add Dihedrals {0/4} {0/6} {0/8} {0/14}    
label add Dihedrals {0/6} {0/8} {0/14} {0/16}   

The label command is the command-line version of the Mouse-->label option we used previously. The above two commands will select the two dihedrals, Phi and Psi, by specifying the four atoms involved in each dihedral. For instance, {0/4} specifies the first atom in the first dihedral: 0 is the molecule ID and 4 is the index of the atom CLP. For more info on the 'label' command, see the VMD manual

set num [molinfo top get numframes]   ;# this is the total number of frames in the trajectory
# now we start looping through each frame
for {set i 0} {$i < $num} {incr i} {
  animate goto $i    ;# goto frame i
  set dlist [label list Dihedrals]    ;# obtain the values of all dihedrals selected earlier
  set record0 [lindex $dlist 0]     ;# get the value of the first dihedral
  set record1 [lindex $dlist 1]     ;# get the value of the second dihedral
  puts $outfile "[lindex $record0 4] \t [lindex $record1 4]"    ;# write to the output file
}
close $outfile    ;# finish writing

With the namd.dV.dat file and the TEMP-dihe.dat file, we can construct the phi-psi free energy profile for the alanine dipeptide system. The basic idea is to construct a 2D histogram of the phi-psi dihedrals and then count the weighted probability that the system visits each 'bin' in the histogram. Due to the limited simulation time and restricted access to data analysis programs, we will not construct the free energy profiles here. Instead, we will use previously computed results to showcase the effect of aMD.

Pre-calculated free energy profiles

The free energy profile on the left is obtained from a 250-ns cMD simulation of the gas-phase alanine dipeptide, and the profiles on the right are obtained from 250-ns aMD simulations of the gas-phase alanine dipeptide. Note the enhanced sampling of the Phi-Psi space in aMD simulations.

Overall, 44-84% of the phi-psi space is sampled aMD, whereas only 39% of the phi-psi space is sampled in cMD. The sampling is enhanced as more aggressive aMD parameters are used (from upper left to lower right in the aMD graph). However, the statistical noise associated with the results also increases. The optimum set of aMD parameters is chosen based on the balance between the enhanced sampling performance and the statistical noise of the results. More details on the analysis and the results of solvated alanine dipeptide

CMDresult.png

AMDphipsi.png

Back to Track 3

Personal tools