dnaMD.DNA class

Summary

DNA.get_parameters(parameter, bp[, ...])

To get the parameters over all frame for the given range of base pair/steps

DNA.set_base_pair_parameters(filename, bp[, ...])

To read and store basepairs parameters (shear, stretch, stagger, buckle, propeller and opening) from an input file.

DNA.set_helical_radius(filename, bp[, ...])

To read and set local helical radius of both strand

DNA.set_base_step_parameters(filename, bp_step)

To read and store base-step (Shift, Slide, Rise, Tilt, Roll and Twist) and helical base-step (X-disp, Y-disp, h-Rise, Inclination, Tip and h-Twist) parameters from an input file

DNA.get_mean_error(bp, parameter[, ...])

To calculate average and error of the given parameter for the given set of base-pairs/steps

DNA.set_helical_axis(filename[, step_range, ...])

To read and set local helical-axis positions from an input file.

DNA.time_vs_parameter(parameter, bp[, ...])

To get the parameter of either a specfic base-pair/step or a DNA segment as a function of time.

DNA.parameter_distribution(parameter, bp[, ...])

To get the parameter distribution of either a specific base-pair/step or a DNA segment over the MD simulation.

DNA.set_major_minor_groove(filename, bp_step)

To read and store Major and Minor grooves from an input file.

DNA.set_backbone_dihedrals(filename, bp[, ...])

To read and store backbone dihedrals (alpha, beta, gamma, delta, epsilon and zeta) and chi dihedral of both strands from an input file.

DNA.generate_smooth_axis([step_range, step, ...])

To determine the global helical axis by smoothing local axis using spline interpolation.

DNA.calculate_curvature_tangent([...])

To calculate curvatures and tangent vectors along the helical axis.

DNA.calculate_angle_bw_tangents(base_step[, ...])

To calculate angle (Radian) between two tangent vectors of global helical axis.

DNA.calculate_2D_angles_bw_tangents(paxis, ...)

To calculate angles (Radian) in 2D plane between two tangent vectors of global helical axis.

DNA.write_haxis_pdb([filename, step_range, ...])

To write trajectory of helcial-axis as a PDB format file.

Documentation

class DNA(num_bp, filename=None, startBP=1)[source]

DNA class stores all data obtained from the input files.

To initialize this class:

dna = DNA(60)       # 60 is the number of basepairs

This class also contains several methods (functions) that are discussed in following sections.

The data inside the class is organized as a dictionary in following architecture:

         |------- bp  ---- BP number      ----  Parameter array (1D/2D)
data-----|
         |------- bps ---- BP Step Number ----  Parameter array (1D/2D)
Parameters:
  • num_bp (int) – Number of base-pairs in the DNA.

  • filename (str) – Name of HDF5 file

  • startBP (int) – Number ID of first basepair.

num_bp

Number of base-pairs in the DNA.

Type:

int

num_step

Number of base-steps in the DNA.

Type:

int

filename

Name of HDF5 file

Type:

str

startBP

Number ID of first basepair.

Type:

int

smooth_axis

If axis is smoothed and global axis is already determined.

Type:

bool

data

Dictionary of data. All data are stored in this dictionary.

Type:

dictionary

time

Trajectory time.

Type:

ndarray

mask

Boolean array indicating which frame of trajectory should be masked.

Type:

ndarray

h5

h5py.File object to read the data directly from file.

Type:

object

_openFile()[source]

Open the HDF5 file for reading/writing.

This methad is called during the initialization of this class.

_set_mask(mask)[source]

Set mask array in both class and hdf5 file

_set_time(time)[source]

Set time in both class and hdf5 file

calculate_2D_angles_bw_tangents(paxis, base_step, masked=True)[source]

To calculate angles (Radian) in 2D plane between two tangent vectors of global helical axis.

Principal axis and respective 2D planes

Principal Axis

2D planes

X-axis

XY and XZ planes

Y-axis

XY and YZ planes

Z-axis

XZ and YZ planes

Parameters:
  • paxis (str) – Principal axis parallel to the DNA axis. For example, IF DNA helical axis is parallel or aligned with Z-axis, paxis should be 'Z'.

  • base_step (1D list) – List of two base-steps for which angle will be calculated. For example: base_step = [5, 50] angle between tangent vector 5th and 50th base-steps will be calculated.

  • masked (bool) – Default=False. To skip specific frames/snapshots. DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() to skip those frames where 3D fitting curve was not successful within the given criteria.

Returns:

  • angle-one (1D array) – Array of calculated angle in first plane of length is equal to number of frames. When masked is applied, length of this array can be smaller than total number of frames.

  • angle-two (1D array) – Array of calculated angle in second plane of length is equal to number of frames. When masked is applied, length of this array can be smaller than total number of frames.

calculate_angle_bw_tangents(base_step, cumulative=False, masked=False)[source]

To calculate angle (Radian) between two tangent vectors of global helical axis.

Parameters:
  • base_step (1D list) –

    List of two base-steps for which angle will be calculated. For example: base_step = [5, 50] either of following can be calculated.

    1. angle between tangent vector 5th and 50th base-steps.

    2. summation over 44 angles that are formed between adjacent tangent vectors of 5-50 bp DNA segment.

    See below two choose between these two types.

  • cumulative (bool)) – Default: False: If it is false, first type of angle is calculated otherwise second type of angle is calculated as explained in the above example of option base_step.

  • masked (bool) – Default=False. To skip specific frames/snapshots. DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() to skip those frames where 3D fitting curve was not successful within the given criteria.

Returns:

angle – Array of calculated angle of length is equal to number of frames. When masked is applied, length of this array can be smaller than total number of frames.

Return type:

1D array

calculate_curvature_tangent(step_range=False, step=None, store_tangent=False)[source]

To calculate curvatures and tangent vectors along the helical axis.

The curvature and tangent vectors are calculated using Frenet-Serret formula. The calculated values are stored in DNA.data dictionary and also in HDF5 file.

Parameters:
  • step_range (bool) –

    • step_range = True : Calculate curvature and tangent vectors for the given range of base-steps

    • step_range = False: Calculate curvature and tangent vectors for entire DNA. If smoothed helical-axis of any base-step will be found to be not available, error will be raised.

  • step (list) –

    List containing lower and higher limit of base-steps range.
    • This option only works with step_range=True.

    • This list should not contain more than two number.

    • First number should be less than second number.

    Example for base-step 4 to 15:

    step = [4,15]         # step_range = True

  • store_tangent (bool) –

    • store_tangent = True : The calculated tangent vectors will be stored for later use.

    • store_tangent = False: The calculated tangent vectors will be discarded.

    In case of HDF5 file, calculated tangents will be stored in this file and it will not add cost to memory. However, without HDF5 file, storing tangents in DNA.data will be expansive for memory.

generate_smooth_axis(step_range=False, step=None, smooth=500.0, spline=3, fill_point=6, cut_off_angle=20)[source]

To determine the global helical axis by smoothing local axis using spline interpolation.

Note

A 3D curve is fitted on local helical axis that are calculated using do_x3dna tool. Sometimes in few frames, fitting may not be accurate and produces artifact. To record these frames, DNA.mask array containing boolean values are generated. If value is True, fitting might not be correct and vice versa. This array could be used in later analysis to skip/mask the frames containing inaccurate axis.

Warning

This function requires SciPy package.

Parameters:
  • step_range (bool) –

    • step_range = True : Make axis smooth for the given range of base-steps

    • step_range = False: Make axis smooth for for entire DNA. If original helical-axis of any base-step

      will be found to be not available, error will be raised.

  • step (list) –

    List containing lower and higher limit of base-steps range.
    • This option only works with step_range=True.

    • This list should not contain more than two number.

    • First number should be less than second number.

    Example for base-step 4 to 15:

    step = [4,15]         # step_range = True

  • smooth (float) –

    A smoothing condition. For more details, see about s = None, which is passed into scipy.interpolate.splprep() method.

    Warning

    • Lower value may lead to an artifact of local sharp kink in the smoothed axis.

    • Higher value may lead to the calculation of wrong helical axis.

  • spline (int) –

    Degree of spline. For more details, see about k = 3, which is passed into scipy.interpolate.splprep() method.

  • fill_point (int) – Number of intrapolated points between two adjacent helical-axis coordinates.

  • cut_off_angle (float) –

    Cut-off bending angle to define sharp kink in fitted curve. If angle in fitted curve is larger than this cut-off, refitting will be performed after deleting few of the original helical axis positions. If after this deletions, bending angle will not reduce below cut-off angle, value of smooth will be increased by 100 and entire cycle of fitting-refitting will be performed. When, value of smooth increases to more than 10000 during this fitting-refitting cycles, fitting process will be stopped with a warning message.

    To record the frames with bad fitting, True value will be stored in DNA.mask array for respective frame.

get_mean_error(bp, parameter, err_type='std', bp_range=True, merge_bp=1, merge_method='mean', masked=False, tool='gmx analyze')[source]

To calculate average and error of the given parameter for the given set of base-pairs/steps

Warning

To calculate errors by using error = 'acf' or error = 'block', GROMACS tool g_analyze or gmx analyze should be present in $PATH.

Parameters:
  • bp (1D list or array) –

    base-pairs to analyze Example:

    bp = [6]                                # bp_range = False
    bp = [4,15]                             # bp_range = True
    bp = range(4,15)                        # bp_range = False
    bp = np.arange(4,15)                    # bp_range = False
    bp = [2,5,6,7,9,12,18]                  # bp_range = False
    

  • parameter (str) – Name of a parameter. For details about accepted keywords, see parameter in the method DNA.get_parameters().

  • error (str) –

    Method of error estimation. Currently accepted method as follows:

    • error = 'std' : Standard Deviation

    • error = 'acf' : Standard error using autocorrelation time (requires: g_analyze or gmx analyze)

    • error = 'block' : Standard error using block averaging method (requires: g_analyze or gmx analyze)

  • bp_range (bool) – Default=True: As shown above, if True, bp is taken as a range otherwise list or numpy array.

  • merge_bp (int) – Number of base-pairs or steps to merge for creating the small DNA segments

  • merge_method (str) –

    Method to calculate the parameter of a DNA segment from local parameters of all base-pairs/steps that are between the range given through bp. Currently accepted keywords are as follows:

    • merge_method = mean: Average of local parameters

    • merge_method = sum: Sum of local parameters

  • masked (bool) – Default=False. To skip specific frames/snapshots. DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() to skip those frames where 3D fitting curve was not successful within the given criteria

  • tool (str) – Gromacs tool g_analyze or gmx analyze or gmx_mpi analyze etc. It will be used to calculate autocorrelation time or block averaging error. It should be present in $PATH

Returns:

  • basepairs/steps (1D array) – Number of base pair-steps. If merge_bp>1, middle number will be returned.

  • values (1D array) – Average values of the parameter

  • errors (1D array) – Error values for corresponding average values

get_parameters(parameter, bp, bp_range=True, masked=False)[source]

To get the parameters over all frame for the given range of base pair/steps

Parameters:
  • parameter (str) –

    Input parameter name. Parameter from following list are accepted:

    • shear

    • stretch

    • stagger

    • buckle

    • propeller

    • opening

    • shift

    • slide

    • rise

    • tilt

    • roll

    • twist

    • x-disp

    • y-disp

    • h-Rise

    • inclination

    • tip

    • h-Twist

    • helical X-axis

    • helical Y-axis

    • helical z-axis

    • helical X-axis smooth

    • helical Y-axis smooth

    • helical z-axis smooth

    • curvature

    • tangent

    • radius S-1

    • radius S-2

    • major Groove

    • major Groove Refined

    • minor Groove

    • minor Groove Refined

    • alpha S1

    • beta S1

    • gamma S1

    • delta S1

    • epsilon S1

    • zeta S1

    • chi S1

    • alpha S2

    • beta S2

    • gamma S2

    • delta S2

    • epsilon S2

    • zeta S2

    • chi S2

  • bp (1D list) –

    List of base-pairs to analyze Example:

    bp = [6]                                # bp_range = False
    bp = [4,15]                             # bp_range = True
    bp = range(4,15)                        # bp_range = False
    bp = np.arange(4,15)                    # bp_range = False
    bp = [2,5,6,7,9,12,18]                  # bp_range = False
    

  • bp_range (bool) – Default=True: As shown above, if True, bp is taken as a range otherwise list or numpy array.

  • masked (bool) – Default=False: To skip specific frames/snapshots. dnaMD.DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() method to skip those frames where 3D fitting curve was not successful within the given critera.

Returns:

parametersparameters[bp][nframe] (2D list): where bp is number of base pairs/steps and nframe is total number of frames in the trajectory.

Return type:

2D list

parameter_distribution(parameter, bp, bins=30, merge=False, merge_method='mean', masked=False)[source]

To get the parameter distribution of either a specific base-pair/step or a DNA segment over the MD simulation.

Parameters:
  • parameter (str) – Name of a base-pair or base-step or helical parameter For details about accepted keywords, see parameter in the method DNA.get_parameters().

  • bp (1D list or array) –

    base-pairs to analyze Example:

    bp = [6]                                  # merge = False
    bp = [4,15]                               # merge = True
    

  • int (bins) – Number of bins to calculate histogram

  • merge (bool) – Default=False: As shown above, if True, bp should a list of range otherwise a list of single value. If bp = True, the parameter for the respective DNA segment could be merged or calculated by merge_method.

  • merge_method (str) –

    Method to calculate the parameter of a DNA segment from local parameters of all base-pairs/steps that are between the range given through bp. Currently accepted keywords are as follows:

    • merge_method = mean: Average of local parameters

    • merge_method = sum: Sum of local parameters

  • masked (bool) – Default=False. To skip specific frames/snapshots. DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() to skip those frames where 3D fitting curve was not successful within the given criteria.

Returns:

  • values (1D array) – Array containing parameter values

  • density (1D array) – Array containing density for respective parameter values

set_backbone_dihedrals(filename, bp, parameters='all', bp_range=True)[source]

To read and store backbone dihedrals (alpha, beta, gamma, delta, epsilon and zeta) and chi dihedral of both strands from an input file.

Note

  • alpha: O3’(i-1)-P-O5’-C5’

  • beta: P-O5’-C5’-C4’

  • gamma: O5’-C5’-C4’-C3’

  • delta: C5’-C4’-C3’-O3’

  • epsilon: C4’-C3’-O3’-P(i+1)

  • zeta: C3’-O3’-P(i+1)-O5’(i+1)

  • chi for pyrimidines(Y): O4’-C1’-N1-C2

  • chi for purines(R): O4’-C1’-N9-C4

Parameters:
  • filename (str) – Input file, which is generated from do_x3dna. e.g. BackBoneCHiDihedrals_g.dat

  • bp (1D list or array) –

    base-pairs to analyze Example:

    bp = [6]                                # bp_range = False
    bp = [4,15]                             # bp_range = True
    bp = range(4,15)                        # bp_range = False
    bp = np.arange(4,15)                    # bp_range = False
    bp = [2,5,6,7,9,12,18]                  # bp_range = False
    

  • parameters (str or 1D list) –

    Either All to extract all angles or list of angles as follows:

    • alpha S1

    • beta S1

    • gamma S1

    • delta S1

    • epsilon S1

    • zeta S1

    • chi S1

    • alpha S2

    • beta S2

    • gamma S2

    • delta S2

    • epsilon S2

    • zeta S2

    • chi S2

  • bp_range (bool) – Default=True: As shown above, if True, bp is taken as a range otherwise list or numpy array.

set_base_pair_parameters(filename, bp, parameters='all', bp_range=True)[source]

To read and store basepairs parameters (shear, stretch, stagger, buckle, propeller and opening) from an input file.

Parameters:
  • filename (str) – Input file, which is generated from do_x3dna. e.g. L-BP_g.dat

  • bp (1D list or array) –

    base-pairs to analyze Example:

    bp = [6]                                # bp_range = False
    bp = [4,15]                             # bp_range = True
    bp = range(4,15)                        # bp_range = False
    bp = np.arange(4,15)                    # bp_range = False
    bp = [2,5,6,7,9,12,18]                  # bp_range = False
    

  • parameters (1D list) –

    List of base-pairs parameters as follows:

    • shear

    • stretch

    • stagger

    • buckle4

    • propeller

    • opening6

    By default all six parameters will be extracted from the file.

  • bp_range (bool) – Default=True: As shown above, if True, bp is taken as a range otherwise list or numpy array.

set_base_step_parameters(filename, bp_step, parameters='all', step_range=True, helical=False)[source]

To read and store base-step (Shift, Slide, Rise, Tilt, Roll and Twist) and helical base-step (X-disp, Y-disp, h-Rise, Inclination, Tip and h-Twist) parameters from an input file

Parameters:
  • filename (str) – Input file, which is generated from do_x3dna. e.g. L-BPS_g.dat or L-BPH_g.dat.

  • bp_step (1D list or array) –

    base-steps to analyze Example:

    bp_step = [6]                                # step_range = False
    bp_step = [4,15]                             # step_range = True
    bp_step = range(4,15)                        # step_range = False
    bp_step = np.arange(4,15)                    # step_range = False
    bp_step = [2,5,6,7,9,12,18]                  # step_range = False
    

  • parameters (str or 1D list) –

    Either All to extract all parameter available in input file or list of either base-step or helical base-step parameter as follows:

    If helical = False:

    • shift

    • slide

    • rise

    • tilt

    • roll

    • twist

    If helical = True:

    • x-disp

    • y-disp

    • h-rise

    • inclination

    • tip

    • h-twist

  • step_range (bool) – Dfault=True: As shown above, if True, bp_step is taken as a range otherwise list or numpy array.

  • helical (bool) – If True, parameters in input file will be considered as helical base-steps parameters If False, parameters will be considered as base-steps parameters.

set_helical_axis(filename, step_range=False, step=None)[source]

To read and set local helical-axis positions from an input file.

Parameters:
  • filename (str) – Input file, which is generated from do_x3dna. e.g. HelAxis_g.dat

  • step_range (bool) –

    • step_range = True : read axis coordinates of base-steps for the given range of base-steps

    • step_range = False: read axis coordinates of all base-steps

  • step (list) –

    List containing lower and higher limit of base-steps range.
    • This option only works with step_range=True.

    • This list should not contain more than two number.

    • First number should be less than second number.

    Example for base-step 4 to 15:

    step = [4,15]         # step_range = True

set_helical_radius(filename, bp, atomname='P', full=False, bp_range=True)[source]

To read and set local helical radius of both strand

Parameters:
  • filename (str) – Input file, which is generated from do_x3dna. e.g. HelixRad_g.dat

  • bp (1D list or array) –

    base-pairs to analyze Example:

    bp = [6]                                # bp_range = False
    bp = [4,15]                             # bp_range = True
    bp = range(4,15)                        # bp_range = False
    bp = np.arange(4,15)                    # bp_range = False
    bp = [2,5,6,7,9,12,18]                  # bp_range = False
    

  • atomname (str) – Atom name to consider for the DNA helix (accepted keywords: P, O4*, O4', C1* and C1)

  • full (bool) – To calculate full helical radius. Overrides atomname option and uses atom P, and 1 A is added to the radius calculated by 3DNA package

  • bp_range (bool) – Default=True: As shown above, if True, bp is taken as a range otherwise list or numpy array.

set_major_minor_groove(filename, bp_step, parameters='all', step_range=True)[source]

To read and store Major and Minor grooves from an input file.

  • Minor groove : direct P-P distance

  • Minor Groove Refined : refined P-P distance which take into account the directions of the sugar-phosphate backbones

  • Major groove : direct P-P distance

  • Major Groove Refined : refined P-P distance which take into account the directions of the sugar-phosphate backbones

Warning

  • The major and minor grooves (direct P-P) cannot be calculated for first and last two base-steps

  • The major and minor grooves (refined P-P) cannot be calculated for first and last three base-steps

Parameters:
  • filename (str) – Input file, which is generated from do_x3dna. e.g. MGroove_g.dat

  • bp_step (1D list or array) –

    base-steps to analyze Example:

    bp_step = [6]                                # step_range = False
    bp_step = [4,15]                             # step_range = True
    bp_step = range(4,15)                        # step_range = False
    bp_step = np.arange(4,15)                    # step_range = False
    bp_step = [2,5,6,7,9,12,18]                  # step_range = False
    

  • parameters (1D list) –

    List of groove parameter names as follows:
    • minor groove

    • minor groove refined

    • major groove

    • major groove refined

    By default all four groove parameters will be extracted from the file.

  • step_range (bool) – Default=True: As shown above, if True, bp_step is taken as a range otherwise list or numpy array.

time_vs_parameter(parameter, bp, merge=False, merge_method='mean', masked=False)[source]

To get the parameter of either a specfic base-pair/step or a DNA segment as a function of time.

Parameters:
  • parameter (str) – Name of a base-pair or base-step or helical parameter. For details about accepted keywords, see parameter in the method DNA.get_parameters().

  • bp (1D list or array) –

    base-pairs to analyze Example:

    bp = [6]                                  # merge = False
    bp = [4,15]                               # merge = True
    

  • merge (bool) – Default=False. As shown above, if True, bp should a list of range otherwise a list of single value. If bp = True, the parameter for the respective DNA segment could be merged or calculated by merge_method.

  • merge_method (str) –

    Method to calculate the parameter of a DNA segment from local parameters of all base-pairs/steps that are between the range given through bp. Currently accepted keywords are as follows:

    • merge_method = mean: Average of local parameters

    • merge_method = sum: Sum of local parameters

  • masked (bool) – Default=False. To skip specific frames/snapshots. DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() to skip those frames where 3D fitting curve was not successful within the given criteria.

Returns:

  • time (1D array) – Array containing time of each frame from trajectory

  • value (1D array) – Array containing parameter values for each frame from trajectory

write_haxis_pdb(filename='helical_axis.pdb', step_range=False, step=None, write_smooth_axis=True, write_orig_axis=False, write_curv=False, scale_curv=1)[source]

To write trajectory of helcial-axis as a PDB format file.

Both local helical axis and global (smoothed) axis can be written to PDB file. For global axis, curvature could be written in B-factor field of PDB file.

Parameters:
  • filename (str) – Name of the output PDB format file.

  • step_range (bool) –

    • step_range = True : Make axis smooth for the given range of base-steps

    • step_range = False: Make axis smooth for entire DNA. If original helical-axis of any base-step will

      be found to be not available, error will be raised.

  • step (list) –

    List containing lower and higher limit of base-steps range.
    • This option only works with step_range=True.

    • This list should not contain more than two number.

    • First number should be less than second number.

    Example for base-step 4 to 15:

    step = [4,15]         # step_range = True

  • write_smooth_axis (bool) – Write coordinates of smoothed helical axis as chain A.

  • write_orig_axis (bool) – Write coordinates of original helical axis (output from do_x3dna) as chain B.

  • write_curv (bool) – Write curvature of smoothed helical axis in B-factor column of PDB file.

  • scale_curv (int) – Scaling of curvature. curvature * scale_curv is written in B-factor column of PDB file.