# dnaMD.DNA class¶

## Summary¶

 DNA.get_parameters(parameter, bp[, …]) To get the parameters over all frame for the given range of base pair/steps DNA.set_base_pair_parameters(filename, bp[, …]) To read and store basepairs parameters (shear, stretch, stagger, buckle, propeller and opening) from an input file. DNA.set_helical_radius(filename, bp[, …]) To read and set local helical radius of both strand DNA.set_base_step_parameters(filename, bp_step) To read and store base-step (Shift, Slide, Rise, Tilt, Roll and Twist) and helical base-step (X-disp, Y-disp, h-Rise, Inclination, Tip and h-Twist) parameters from an input file DNA.get_mean_error(bp, parameter[, …]) To calculate average and error of the given parameter for the given set of base-pairs/steps DNA.set_helical_axis(filename[, step_range, …]) To read and set local helical-axis positions from an input file. DNA.time_vs_parameter(parameter, bp[, …]) To get the parameter of either a specfic base-pair/step or a DNA segment as a function of time. DNA.parameter_distribution(parameter, bp[, …]) To get the parameter distribution of either a specific base-pair/step or a DNA segment over the MD simulation. DNA.set_major_minor_groove(filename, bp_step) To read and store Major and Minor grooves from an input file. DNA.set_backbone_dihedrals(filename, bp[, …]) To read and store backbone dihedrals (alpha, beta, gamma, delta, epsilon and zeta) and chi dihedral of both strands from an input file. DNA.generate_smooth_axis([step_range, step, …]) To determine the global helical axis by smoothing local axis using spline interpolation. DNA.calculate_curvature_tangent([…]) To calculate curvatures and tangent vectors along the helical axis. DNA.calculate_angle_bw_tangents(base_step[, …]) To calculate angle (Radian) between two tangent vectors of global helical axis. DNA.calculate_2D_angles_bw_tangents(paxis, …) To calculate angles (Radian) in 2D plane between two tangent vectors of global helical axis. DNA.write_haxis_pdb([filename, step_range, …]) To write trajectory of helcial-axis as a PDB format file.

## Documentation¶

class DNA(num_bp, filename=None, startBP=1)[source]

DNA class stores all data obtained from the input files.

To initialize this class:

dna = DNA(60)       # 60 is the number of basepairs


This class also contains several methods (functions) that are discussed in following sections.

The data inside the class is organized as a dictionary in following architecture:

         |------- bp  ---- BP number      ----  Parameter array (1D/2D)
data-----|
|------- bps ---- BP Step Number ----  Parameter array (1D/2D)

Parameters: num_bp (int) – Number of base-pairs in the DNA. filename (str) – Name of HDF5 file startBP (int) – Number ID of first basepair.
num_bp

int – Number of base-pairs in the DNA.

num_step

int – Number of base-steps in the DNA.

filename

str – Name of HDF5 file

startBP

int – Number ID of first basepair.

smooth_axis

bool – If axis is smoothed and global axis is already determined.

data

dictionary – Dictionary of data. All data are stored in this dictionary.

time

ndarray – Trajectory time.

mask

ndarray – Boolean array indicating which frame of trajectory should be masked.

h5

object – h5py.File object to read the data directly from file.

_openFile()[source]

Open the HDF5 file for reading/writing.

This methad is called during the initialization of this class.

_set_mask(mask)[source]

Set mask array in both class and hdf5 file

_set_time(time)[source]

Set time in both class and hdf5 file

calculate_2D_angles_bw_tangents(paxis, base_step, masked=True)[source]

To calculate angles (Radian) in 2D plane between two tangent vectors of global helical axis.

Principal axis and respective 2D planes
Principal Axis 2D planes
X-axis XY and XZ planes
Y-axis XY and YZ planes
Z-axis XZ and YZ planes
Parameters: paxis (str) – Principal axis parallel to the DNA axis. For example, IF DNA helical axis is parallel or aligned with Z-axis, paxis should be 'Z'. base_step (1D list) – List of two base-steps for which angle will be calculated. For example: base_step = [5, 50] angle between tangent vector 5th and 50th base-steps will be calculated. masked (bool) – Default=False. To skip specific frames/snapshots. DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() to skip those frames where 3D fitting curve was not successful within the given criteria. angle-one (1D array) – Array of calculated angle in first plane of length is equal to number of frames. When masked is applied, length of this array can be smaller than total number of frames. angle-two (1D array) – Array of calculated angle in second plane of length is equal to number of frames. When masked is applied, length of this array can be smaller than total number of frames.
calculate_angle_bw_tangents(base_step, cumulative=False, masked=False)[source]

To calculate angle (Radian) between two tangent vectors of global helical axis.

Parameters: base_step (1D list) – List of two base-steps for which angle will be calculated. For example: base_step = [5, 50] either of following can be calculated. angle between tangent vector 5th and 50th base-steps. summation over 44 angles that are formed between adjacent tangent vectors of 5-50 bp DNA segment. See below two choose between these two types. cumulative (bool)) – Default: False: If it is false, first type of angle is calculated otherwise second type of angle is calculated as explained in the above example of option base_step. masked (bool) – Default=False. To skip specific frames/snapshots. DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() to skip those frames where 3D fitting curve was not successful within the given criteria. angle – Array of calculated angle of length is equal to number of frames. When masked is applied, length of this array can be smaller than total number of frames. 1D array
calculate_curvature_tangent(step_range=False, step=None, store_tangent=False)[source]

To calculate curvatures and tangent vectors along the helical axis.

The curvature and tangent vectors are calculated using Frenet-Serret formula. The calculated values are stored in DNA.data dictionary and also in HDF5 file.

Parameters: step_range (bool) – step_range = True : Calculate curvature and tangent vectors for the given range of base-steps step_range = False: Calculate curvature and tangent vectors for entire DNA. If smoothed helical-axis of any base-step will be found to be not available, error will be raised. step (list) – List containing lower and higher limit of base-steps range. This option only works with step_range=True. This list should not contain more than two number. First number should be less than second number. Example for base-step 4 to 15: step = [4,15]         # step_range = True store_tangent (bool) – store_tangent = True : The calculated tangent vectors will be stored for later use. store_tangent = False: The calculated tangent vectors will be discarded. In case of HDF5 file, calculated tangents will be stored in this file and it will not add cost to memory. However, without HDF5 file, storing tangents in DNA.data will be expansive for memory.
generate_smooth_axis(step_range=False, step=None, smooth=500.0, spline=3, fill_point=6, cut_off_angle=20)[source]

To determine the global helical axis by smoothing local axis using spline interpolation.

Note

A 3D curve is fitted on local helical axis that are calculated using do_x3dna tool. Sometimes in few frames, fitting may not be accurate and produces artifact. To record these frames, DNA.mask array containing boolean values are generated. If value is True, fitting might not be correct and vice versa. This array could be used in later analysis to skip/mask the frames containing inaccurate axis.

Warning

This function requires SciPy package.

Parameters: step_range (bool) – step_range = True : Make axis smooth for the given range of base-steps step_range = False: Make axis smooth for for entire DNA. If original helical-axis of any base-step will be found to be not available, error will be raised. step (list) – List containing lower and higher limit of base-steps range. This option only works with step_range=True. This list should not contain more than two number. First number should be less than second number. Example for base-step 4 to 15: step = [4,15]         # step_range = True smooth (float) – A smoothing condition. For more details, see about s = None, which is passed into scipy.interpolate.splprep() method. Warning Lower value may lead to an artifact of local sharp kink in the smoothed axis. Higher value may lead to the calculation of wrong helical axis. spline (int) – Degree of spline. For more details, see about k = 3, which is passed into scipy.interpolate.splprep() method. fill_point (int) – Number of intrapolated points between two adjacent helical-axis coordinates. cut_off_angle (float) – Cut-off bending angle to define sharp kink in fitted curve. If angle in fitted curve is larger than this cut-off, refitting will be performed after deleting few of the original helical axis positions. If after this deletions, bending angle will not reduce below cut-off angle, value of smooth will be increased by 100 and entire cycle of fitting-refitting will be performed. When, value of smooth increases to more than 10000 during this fitting-refitting cycles, fitting process will be stopped with a warning message. To record the frames with bad fitting, True value will be stored in DNA.mask array for respective frame.
get_mean_error(bp, parameter, err_type='std', bp_range=True, merge_bp=1, merge_method='mean', masked=False, tool='gmx analyze')[source]

To calculate average and error of the given parameter for the given set of base-pairs/steps

Warning

To calculate errors by using error = 'acf' or error = 'block', GROMACS tool g_analyze or gmx analyze should be present in $PATH. Parameters: bp (1D list or array) – base-pairs to analyze Example: bp = [6] # bp_range = False bp = [4,15] # bp_range = True bp = range(4,15) # bp_range = False bp = np.arange(4,15) # bp_range = False bp = [2,5,6,7,9,12,18] # bp_range = False  parameter (str) – Name of a parameter. For details about accepted keywords, see parameter in the method DNA.get_parameters(). error (str) – Method of error estimation. Currently accepted method as follows: error = 'std' : Standard Deviation error = 'acf' : Standard error using autocorrelation time (requires: g_analyze or gmx analyze) error = 'block' : Standard error using block averaging method (requires: g_analyze or gmx analyze) bp_range (bool) – Default=True: As shown above, if True, bp is taken as a range otherwise list or numpy array. merge_bp (int) – Number of base-pairs or steps to merge for creating the small DNA segments merge_method (str) – Method to calculate the parameter of a DNA segment from local parameters of all base-pairs/steps that are between the range given through bp. Currently accepted keywords are as follows: merge_method = mean: Average of local parameters merge_method = sum: Sum of local parameters masked (bool) – Default=False. To skip specific frames/snapshots. DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() to skip those frames where 3D fitting curve was not successful within the given criteria tool (str) – Gromacs tool g_analyze or gmx analyze or gmx_mpi analyze etc. It will be used to calculate autocorrelation time or block averaging error. It should be present in $PATH basepairs/steps (1D array) – Number of base pair-steps. If merge_bp>1, middle number will be returned. values (1D array) – Average values of the parameter errors (1D array) – Error values for corresponding average values
get_parameters(parameter, bp, bp_range=True, masked=False)[source]

To get the parameters over all frame for the given range of base pair/steps

Parameters: parameter (str) – Input parameter name. Parameter from following list are accepted: shear stretch stagger buckle propeller opening shift slide rise tilt roll twist x-disp y-disp h-Rise inclination tip h-Twist helical X-axis helical Y-axis helical z-axis helical X-axis smooth helical Y-axis smooth helical z-axis smooth curvature tangent radius S-1 radius S-2 major Groove major Groove Refined minor Groove minor Groove Refined alpha S1 beta S1 gamma S1 delta S1 epsilon S1 zeta S1 chi S1 alpha S2 beta S2 gamma S2 delta S2 epsilon S2 zeta S2 chi S2 bp (1D list) – List of base-pairs to analyze Example: bp = [6] # bp_range = False bp = [4,15] # bp_range = True bp = range(4,15) # bp_range = False bp = np.arange(4,15) # bp_range = False bp = [2,5,6,7,9,12,18] # bp_range = False  bp_range (bool) – Default=True: As shown above, if True, bp is taken as a range otherwise list or numpy array. masked (bool) – Default=False: To skip specific frames/snapshots. dnaMD.DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() method to skip those frames where 3D fitting curve was not successful within the given critera. parameters – parameters[bp][nframe] (2D list): where bp is number of base pairs/steps and nframe is total number of frames in the trajectory. 2D list
parameter_distribution(parameter, bp, bins=30, merge=False, merge_method='mean', masked=False)[source]

To get the parameter distribution of either a specific base-pair/step or a DNA segment over the MD simulation.

Parameters: parameter (str) – Name of a base-pair or base-step or helical parameter For details about accepted keywords, see parameter in the method DNA.get_parameters(). bp (1D list or array) – base-pairs to analyze Example: bp = [6] # merge = False bp = [4,15] # merge = True  int (bins) – Number of bins to calculate histogram merge (bool) – Default=False: As shown above, if True, bp should a list of range otherwise a list of single value. If bp = True, the parameter for the respective DNA segment could be merged or calculated by merge_method. merge_method (str) – Method to calculate the parameter of a DNA segment from local parameters of all base-pairs/steps that are between the range given through bp. Currently accepted keywords are as follows: merge_method = mean: Average of local parameters merge_method = sum: Sum of local parameters masked (bool) – Default=False. To skip specific frames/snapshots. DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() to skip those frames where 3D fitting curve was not successful within the given criteria. values (1D array) – Array containing parameter values density (1D array) – Array containing density for respective parameter values
set_backbone_dihedrals(filename, bp, parameters='all', bp_range=True)[source]

To read and store backbone dihedrals (alpha, beta, gamma, delta, epsilon and zeta) and chi dihedral of both strands from an input file.

Note

• alpha: O3’(i-1)-P-O5’-C5’
• beta: P-O5’-C5’-C4’
• gamma: O5’-C5’-C4’-C3’
• delta: C5’-C4’-C3’-O3’
• epsilon: C4’-C3’-O3’-P(i+1)
• zeta: C3’-O3’-P(i+1)-O5’(i+1)
• chi for pyrimidines(Y): O4’-C1’-N1-C2
• chi for purines(R): O4’-C1’-N9-C4
Parameters: filename (str) – Input file, which is generated from do_x3dna. e.g. BackBoneCHiDihedrals_g.dat bp (1D list or array) – base-pairs to analyze Example: bp = [6] # bp_range = False bp = [4,15] # bp_range = True bp = range(4,15) # bp_range = False bp = np.arange(4,15) # bp_range = False bp = [2,5,6,7,9,12,18] # bp_range = False  parameters (str or 1D list) – Either All to extract all angles or list of angles as follows: alpha S1 beta S1 gamma S1 delta S1 epsilon S1 zeta S1 chi S1 alpha S2 beta S2 gamma S2 delta S2 epsilon S2 zeta S2 chi S2 bp_range (bool) – Default=True: As shown above, if True, bp is taken as a range otherwise list or numpy array.
set_base_pair_parameters(filename, bp, parameters='all', bp_range=True)[source]

To read and store basepairs parameters (shear, stretch, stagger, buckle, propeller and opening) from an input file.

Parameters: filename (str) – Input file, which is generated from do_x3dna. e.g. L-BP_g.dat bp (1D list or array) – base-pairs to analyze Example: bp = [6] # bp_range = False bp = [4,15] # bp_range = True bp = range(4,15) # bp_range = False bp = np.arange(4,15) # bp_range = False bp = [2,5,6,7,9,12,18] # bp_range = False  parameters (1D list) – List of base-pairs parameters as follows: shear stretch stagger buckle4 propeller opening6 By default all six parameters will be extracted from the file. bp_range (bool) – Default=True: As shown above, if True, bp is taken as a range otherwise list or numpy array.
set_base_step_parameters(filename, bp_step, parameters='all', step_range=True, helical=False)[source]

To read and store base-step (Shift, Slide, Rise, Tilt, Roll and Twist) and helical base-step (X-disp, Y-disp, h-Rise, Inclination, Tip and h-Twist) parameters from an input file

Parameters: filename (str) – Input file, which is generated from do_x3dna. e.g. L-BPS_g.dat or L-BPH_g.dat. bp_step (1D list or array) – base-steps to analyze Example: bp_step = [6] # step_range = False bp_step = [4,15] # step_range = True bp_step = range(4,15) # step_range = False bp_step = np.arange(4,15) # step_range = False bp_step = [2,5,6,7,9,12,18] # step_range = False  parameters (str or 1D list) – Either All to extract all parameter available in input file or list of either base-step or helical base-step parameter as follows: If helical = False: shift slide rise tilt roll twist If helical = True: x-disp y-disp h-rise inclination tip h-twist step_range (bool) – Dfault=True: As shown above, if True, bp_step is taken as a range otherwise list or numpy array. helical (bool) – If True, parameters in input file will be considered as helical base-steps parameters If False, parameters will be considered as base-steps parameters.
set_helical_axis(filename, step_range=False, step=None)[source]

To read and set local helical-axis positions from an input file.

Parameters: filename (str) – Input file, which is generated from do_x3dna. e.g. HelAxis_g.dat step_range (bool) – step_range = True : read axis coordinates of base-steps for the given range of base-steps step_range = False: read axis coordinates of all base-steps step (list) – List containing lower and higher limit of base-steps range. This option only works with step_range=True. This list should not contain more than two number. First number should be less than second number. Example for base-step 4 to 15: step = [4,15]         # step_range = True
set_helical_radius(filename, bp, atomname='P', full=False, bp_range=True)[source]

To read and set local helical radius of both strand

Parameters: filename (str) – Input file, which is generated from do_x3dna. e.g. HelixRad_g.dat bp (1D list or array) – base-pairs to analyze Example: bp = [6] # bp_range = False bp = [4,15] # bp_range = True bp = range(4,15) # bp_range = False bp = np.arange(4,15) # bp_range = False bp = [2,5,6,7,9,12,18] # bp_range = False  atomname (str) – Atom name to consider for the DNA helix (accepted keywords: P, O4*, O4', C1* and C1) full (bool) – To calculate full helical radius. Overrides atomname option and uses atom P, and 1 A is added to the radius calculated by 3DNA package bp_range (bool) – Default=True: As shown above, if True, bp is taken as a range otherwise list or numpy array.
set_major_minor_groove(filename, bp_step, parameters='all', step_range=True)[source]

To read and store Major and Minor grooves from an input file.

• Minor groove : direct P-P distance
• Minor Groove Refined : refined P-P distance which take into account the directions of the sugar-phosphate backbones
• Major groove : direct P-P distance
• Major Groove Refined : refined P-P distance which take into account the directions of the sugar-phosphate backbones

Warning

• The major and minor grooves (direct P-P) cannot be calculated for first and last two base-steps
• The major and minor grooves (refined P-P) cannot be calculated for first and last three base-steps
Parameters: filename (str) – Input file, which is generated from do_x3dna. e.g. MGroove_g.dat bp_step (1D list or array) – base-steps to analyze Example: bp_step = [6] # step_range = False bp_step = [4,15] # step_range = True bp_step = range(4,15) # step_range = False bp_step = np.arange(4,15) # step_range = False bp_step = [2,5,6,7,9,12,18] # step_range = False  parameters (1D list) – List of groove parameter names as follows: minor groove minor groove refined major groove major groove refined By default all four groove parameters will be extracted from the file. step_range (bool) – Default=True: As shown above, if True, bp_step is taken as a range otherwise list or numpy array.
time_vs_parameter(parameter, bp, merge=False, merge_method='mean', masked=False)[source]

To get the parameter of either a specfic base-pair/step or a DNA segment as a function of time.

Parameters: parameter (str) – Name of a base-pair or base-step or helical parameter. For details about accepted keywords, see parameter in the method DNA.get_parameters(). bp (1D list or array) – base-pairs to analyze Example: bp = [6] # merge = False bp = [4,15] # merge = True  merge (bool) – Default=False. As shown above, if True, bp should a list of range otherwise a list of single value. If bp = True, the parameter for the respective DNA segment could be merged or calculated by merge_method. merge_method (str) – Method to calculate the parameter of a DNA segment from local parameters of all base-pairs/steps that are between the range given through bp. Currently accepted keywords are as follows: merge_method = mean: Average of local parameters merge_method = sum: Sum of local parameters masked (bool) – Default=False. To skip specific frames/snapshots. DNA.mask array should be set to use this functionality. This array contains boolean (either True or False) value for each frame to mask the frames. Presently, mask array is automatically generated during DNA.generate_smooth_axis() to skip those frames where 3D fitting curve was not successful within the given criteria. time (1D array) – Array containing time of each frame from trajectory value (1D array) – Array containing parameter values for each frame from trajectory
write_haxis_pdb(filename='helical_axis.pdb', step_range=False, step=None, write_smooth_axis=True, write_orig_axis=False, write_curv=False, scale_curv=1)[source]

To write trajectory of helcial-axis as a PDB format file.

Both local helical axis and global (smoothed) axis can be written to PDB file. For global axis, curvature could be written in B-factor field of PDB file.

Parameters: filename (str) – Name of the output PDB format file. step_range (bool) – step_range = True : Make axis smooth for the given range of base-steps step_range = False: Make axis smooth for entire DNA. If original helical-axis of any base-step will be found to be not available, error will be raised. step (list) – List containing lower and higher limit of base-steps range. This option only works with step_range=True. This list should not contain more than two number. First number should be less than second number. Example for base-step 4 to 15: step = [4,15]         # step_range = True write_smooth_axis (bool) – Write coordinates of smoothed helical axis as chain A. write_orig_axis (bool) – Write coordinates of original helical axis (output from do_x3dna) as chain B. write_curv (bool) – Write curvature of smoothed helical axis in B-factor column of PDB file. scale_curv (int) – Scaling of curvature. curvature * scale_curv is written in B-factor column of PDB file.