residue_movement#

Create and Analyze Pairwise Stacking Fingerprints (PSFs)

This module contains the functions to create and analyze Pairwise Stacking Fingerprints. It allows the user to generate Polar Scatterplots and Heatmaps of one residue movement relative to the other.

This pipeline takes a pdb file with multiple frames and two residues. It then calculates the r, rho, and theta values at each frame as defined in the Bottaro paper (https://doi.org/10.1093/nar/gku972), and exports these values to a .csv file.

Functions

calc_center_3pts(a, b, c)

Finds the average x, y, z position of three (x, y, z) Vectors.

calculate_bottaro_values_for_frame(...)

Calculates the r, rho, and theta values as expressed in the Bottaro paper.

collect_atom_locations_by_frame(trj, ...)

Creates a list of all atom locations for a particular atom and residue number per frame.

correct_theta_sign(rho, y_axis, theta)

Corrects the sign of an angle theta with the x-axis within a plane defined by a given y-axis.

create_base_from_coords_list(frame, ...)

Combines C2, C4, C6 positions with midpoint positions for a given frame.

write_bottaro_to_csv([pdb, outcsv, ...])

Write the r, rho, and theta values used to make a Pairwise Stacking Fingerprint (PSF) from a trajectory PDB to a CSV.

write_psf_data([pdb, outcsv, pers_res, ...])

Alias for write_bottaro_to_csv().

Classes

Base(c2_coords, c4_coords, c6_coords, ...)

Represents a nucleotide base with x, y, z coordinates for C2, C4, and C6 atoms, and their average position at a single frame.

collect_atom_locations_by_frame(trj, residue, atom)[source]#

Creates a list of all atom locations for a particular atom and residue number per frame.

Curates a list coords_by_frame where coords_by_frame[i] is the (x, y, z) positions of a provided atom in a residue residue at the ith frame.

Parameters:
trjmd.Trajectory

Trajectory to analyze.

residueint

The 0-indexed residue number of the residue where atom_id is found (PDB Column 5).

atomstr

The name of the atom to get coordinates for (PDB Column 2).

Returns:
coords_by_framelist

List of (x, y, z) coordinates of atom_id in residue_num for each frame.

See also

Base

Python Class that represents a nucleotide base

calculate_bottaro_values_for_frame

Calculates the r, rho, and theta values as expressed in the Bottaro paper.

create_base_from_coords_list

Combines C2, C4, C6 positions with midpoint positions for a given frame

Notes

residue_num must be 0-indexed to match how mdtraj.Trajectory indexes residues.

Examples

>>> import stacker as st
>>> filtered_traj = st.filter_traj('testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                              'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                              atoms = {'C2','C4','C6'})
>>> >>> st.collect_atom_locations_by_frame(filtered_traj, residue = 3, atom = "C2")
[(58.794, 59.636, 49.695), (59.185005, 58.797, 50.137), (59.379, 58.553005, 49.853), 
(58.76, 59.068, 49.681), (59.003, 59.054, 49.878002), (59.049, 58.967, 50.051), 
(59.219, 58.476006, 49.948), (58.948, 58.588005, 50.085), (58.922, 58.747, 49.766003), 
(59.124, 58.916, 49.978004)]
calc_center_3pts(a, b, c)[source]#

Finds the average x, y, z position of three (x, y, z) Vectors.

Takes in three Vectors generated using the pdb.xyz method and finds their center. Works with three points that make up a triangle (like those within a 6-member ring).

Parameters:
aVector

x, y, z coordinates of the first point.

bVector

x, y, z coordinates of the second point.

cVector

x, y, z coordinates of the third point.

Returns:
midpointVector

One Vector with (x, y, z) coordinates at the center of the three input Vectors.

See also

Base

Python Class that represents a nucleotide base

calculate_bottaro_values_for_frame

Calculates the r, rho, and theta values as expressed in the Bottaro paper.

create_base_from_coords_list

Combines C2, C4, C6 positions with midpoint positions for a given frame

Examples

>>> import stacker as st
>>> print(st.calc_center_3pts(st.Vector(0,0,0), st.Vector(1,1,1), st.Vector(2,2,2)))
[ 1.0
  1.0
  1.0 ]
class Base(c2_coords, c4_coords, c6_coords, midpoint_coords)[source]#

Bases: object

Represents a nucleotide base with x, y, z coordinates for C2, C4, and C6 atoms, and their average position at a single frame.

This class defines a data type ‘Base’ that consists of coordinates for the atoms C2, C4, and C6, as well as the midpoint of these atoms for a single residue at a single frame.

Attributes:
c2_coordsVector

(x, y, z) coordinates for the atom C2.

c4_coordsVector

(x, y, z) coordinates for the atom C4.

c6_coordsVector

(x, y, z) coordinates for the atom C6.

midpoint_coordsVector

(x, y, z) coordinates representing the midpoint of C2, C4, and C6.

__init__(c2_coords, c4_coords, c6_coords, midpoint_coords)[source]#

Initialize a Base instance.

Parameters:
c2_coordstuple

(x, y, z) coordinates for C2.

c4_coordstuple

(x, y, z) coordinates for C4.

c6_coordstuple

(x, y, z) coordinates for C6.

midpoint_coordstuple

(x, y, z) coordinates for the midpoint.

create_base_from_coords_list(frame, C2_coords, C4_coords, C6_coords, midpoint_coords)[source]#

Combines C2, C4, C6 positions with midpoint positions for a given frame.

Takes a frame (0-indexed) and outputs a Base instance with the x, y, z locations of C2, C4, C6, and midpoint of the same residue at that frame.

Parameters:
frameint

Frame number (0-indexed).

C2_coordslist

List of (x, y, z) coordinates of C2 atom in some residue for each frame.

C4_coordslist

List of (x, y, z) coordinates of C4 atom in some residue for each frame.

C6_coordslist

List of (x, y, z) coordinates of C6 atom in some residue for each frame.

midpoint_coordslist

List of (x, y, z) coordinates of the midpoint of residue for each frame.

Returns:
Base

An instance of Base with coordinates at the specified frame.

See also

calc_center_3pts

Finds the average x, y, z position of three (x, y, z) Vectors

Notes

frame is 0-indexed because this is how mdtraj handles mdtraj.Trajectory

correct_theta_sign(rho, y_axis, theta)[source]#

Corrects the sign of an angle theta with the x-axis within a plane defined by a given y-axis.

When calculating the angle theta between two vectors in 3D space, once the vectors move >180 degrees apart, the angle becomes the shortest path. To have 360 degrees of freedom, we calculate theta within the plane by checking if it faces the same direction as a vector y, and correcting otherwise.

Parameters:
rhoVector

The vector compared to the x-axis to form theta.

y_axisVector

Directional vector to define a plane with x-axis; orthogonal to x-axis.

thetafloat

The calculated angle of rho with x-axis to be corrected.

Returns:
float

Theta as calculated on the plane.

calculate_bottaro_values_for_frame(perspective_base_coords, viewed_midpoint)[source]#

Calculates the r, rho, and theta values as expressed in the Bottaro paper.

Calculates the r, rho, and theta values between two nucleotides in a single frame as presented in Figure 1 of Bottaro et. al (https://doi.org/10.1093/nar/gku972).

Parameters:
perspective_base_coordsBase

List of the x, y, z coords of C2, C4, C6 and their midpoint for the perspective residue in a single frame.

viewed_midpointVector

x, y, z position of the midpoint of the viewed nucleotide at the same frame.

Returns:
list

A list containing 3 floats from Bottaro; structure: [r_dist, rho_dist, theta].

References

[1] Sandro Bottaro, Francesco Di Palma, Giovanni Bussi, The role of nucleobase interactions in RNA structure and dynamics, Nucleic Acids Research, Volume 42, Issue 21, 1 December 2014, Pages 13306–13314, https://doi.org/10.1093/nar/gku972

write_bottaro_to_csv(pdb='', outcsv='', pers_res=-1, view_res=-1, res1_atoms={'C2', 'C4', 'C6'}, res2_atoms={'C2', 'C4', 'C6'}, index=1)[source]#

Write the r, rho, and theta values used to make a Pairwise Stacking Fingerprint (PSF) from a trajectory PDB to a CSV.

Calculates the r, rho, and theta values as described in Bottaro et al. from a perspective nucleotide residue to a viewed nucleotide residue per frame. Writes the results to a CSV file.

Parameters:
pdbstr

Filename of PDB containing information for ONLY two residues (perspective and viewed nucleotide) at each frame.

outcsvstr

Filename of CSV file to write to.

pers_resint, default = -1

Residue index of the perspective residue whose plane to project onto (0-/1-index changed by index variable, default 1-indexed). If -1, a 2-residue PDB is assumed and perspective id is the first res_id.

view_resint, default = -1

Residue index of the viewed residue whose midpoint to project to perspective residue plane (0-/1-index changed by index variable, default 1-indexed). If -1, a 2-residue PDB is assumed and viewed id is the second res_id.

res1_atomsset, default = {“C2”, “C4”, “C6”}

Set of the atom names (e.g., “C2”, “C4”, “C6”) to use from residue 1 to find center of geometry for perspective nucleotide.

res2_atomsset, default = {“C2”, “C4”, “C6”}

Set of the atom names (e.g., “C2”, “C4”, “C6”) to use from residue 2 to find center of geometry for viewed nucleotide.

indexint, default = 1

Index of the residues. 1-indexed (default) means residue ids start at 1. cpptraj uses 1-indexed residues. mdtraj PDB outputs will be 0-indexed.

See also

filter_traj_to_pdb

convert a trajectory and topology file into a single filtered PDB file to input here

visualize_two_residue_movement_scatterplot

Visualize Output CSV data as a PSF scatterplot from this data

visualize_two_residue_movement_heatmap

Visualize Output CSV data as a PSF heatmap from this data

References

[1] Sandro Bottaro, Francesco Di Palma, Giovanni Bussi, The role of nucleobase interactions in RNA structure and dynamics, Nucleic Acids Research, Volume 42, Issue 21, 1 December 2014, Pages 13306–13314, https://doi.org/10.1093/nar/gku972

Examples

>>> import stacker as st
>>> trajectory_file = 'testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd'
>>> topology_file = 'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop'
>>> pdb_filename = 'testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb'
>>> output_csv_name = "testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv"
>>> perspective_residue = 426 # 1-indexed
>>> viewed_residue = 427 # 1-indexed
>>> st.filter_traj_to_pdb(trj_file=trajectory_file, top_file=topology_file, pdb=pdb_filename,
...                        residues={perspective_residue,viewed_residue}, atoms={"C2", "C4", "C6"})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
WARNING: Output file atom, residue, and chain indices are zero-indexed
Filtered trajectory written to:  testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb
>>> st.write_bottaro_to_csv(pdb_filename, output_csv_name, pers_res=perspective_residue, view_res=viewed_residue)
Output values written to testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv
>>> print("".join(open(output_csv_name).readlines()[:10]))
frame,r_dist,rho_dist,theta
0,7.5253415,6.5321836,204.02934901525177
1,6.884639,6.0513134,199.40647902703924
2,7.301847,6.151191,205.1906453260924
3,6.5815425,5.461494,199.5421877249345
4,7.0760417,5.3919506,204.0150121540755
5,7.2589145,6.3483577,201.32674968542617
6,7.4929285,6.414151,205.92967194025135
7,7.1484976,6.035165,202.88441276229827
8,7.344863,5.541237,217.30043061558888
write_psf_data(pdb='', outcsv='', pers_res=-1, view_res=-1, res1_atoms={'C2', 'C4', 'C6'}, res2_atoms={'C2', 'C4', 'C6'}, index=1)[source]#

Alias for write_bottaro_to_csv().

Write the r, rho, and theta values used to make a Pairwise Stacking Fingerprint (PSF) from a trajectory PDB to a CSV.

Calculates the r, rho, and theta values as described in Bottaro et al. from a perspective nucleotide residue to a viewed nucleotide residue per frame. Writes the results to a CSV file.

Parameters:
pdbstr

Filename of PDB containing information for ONLY two residues (perspective and viewed nucleotide) at each frame.

outcsvstr

Filename of CSV file to write to.

pers_resint, default = -1

Residue index of the perspective residue whose plane to project onto (0-/1-index changed by index variable, default 1-indexed). If -1, a 2-residue PDB is assumed and perspective id is the first res_id.

view_resint, default = -1

Residue index of the viewed residue whose midpoint to project to perspective residue plane (0-/1-index changed by index variable, default 1-indexed). If -1, a 2-residue PDB is assumed and viewed id is the second res_id.

res1_atomsset, default = {“C2”, “C4”, “C6”}

Set of the atom names (e.g., “C2”, “C4”, “C6”) to use from residue 1 to find center of geometry for perspective nucleotide.

res2_atomsset, default = {“C2”, “C4”, “C6”}

Set of the atom names (e.g., “C2”, “C4”, “C6”) to use from residue 2 to find center of geometry for viewed nucleotide.

indexint, default = 1

Index of the residues. 1-indexed (default) means residue ids start at 1. cpptraj uses 1-indexed residues. mdtraj PDB outputs will be 0-indexed.

See also

filter_traj_to_pdb

convert a trajectory and topology file into a single filtered PDB file to input here

visualize_two_residue_movement_scatterplot

Visualize Output CSV data as a PSF scatterplot from this data

visualize_two_residue_movement_heatmap

Visualize Output CSV data as a PSF heatmap from this data

References

[1] Sandro Bottaro, Francesco Di Palma, Giovanni Bussi, The role of nucleobase interactions in RNA structure and dynamics, Nucleic Acids Research, Volume 42, Issue 21, 1 December 2014, Pages 13306–13314, https://doi.org/10.1093/nar/gku972

Examples

>>> import stacker as st
>>> trajectory_file = 'testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd'
>>> topology_file = 'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop'
>>> pdb_filename = 'testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb'
>>> output_csv_name = "testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv"
>>> perspective_residue = 426 # 1-indexed
>>> viewed_residue = 427 # 1-indexed
>>> st.filter_traj_to_pdb(trj_file=trajectory_file, top_file=topology_file, pdb=pdb_filename,
...                        residues={perspective_residue,viewed_residue}, atoms={"C2", "C4", "C6"})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
WARNING: Output file atom, residue, and chain indices are zero-indexed
Filtered trajectory written to:  testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb
>>> st.write_bottaro_to_csv(pdb_filename, output_csv_name, pers_res=perspective_residue, view_res=viewed_residue)
Output values written to testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv
>>> print("".join(open(output_csv_name).readlines()[:10]))
frame,r_dist,rho_dist,theta
0,7.5253415,6.5321836,204.02934901525177
1,6.884639,6.0513134,199.40647902703924
2,7.301847,6.151191,205.1906453260924
3,6.5815425,5.461494,199.5421877249345
4,7.0760417,5.3919506,204.0150121540755
5,7.2589145,6.3483577,201.32674968542617
6,7.4929285,6.414151,205.92967194025135
7,7.1484976,6.035165,202.88441276229827
8,7.344863,5.541237,217.30043061558888