stacker package

stacker package#

Submodules#

stacker.file_manipulation module#

Filter Trajectory Files

This module contains functions for manipulating trajectory files. This includes filtering a trajectory to desired atoms, converting trajectory filetype, and outputting Python trajectories to other filetypes (eg. prmtop, mdcrd, pdb)

filter_traj(trj_file, top_file, residues={}, atoms={})[source]#

Filters an input trajectory to only the specified atoms and residues

Filteres an input trajectory that contains all of the atoms in a topology to only the desired atoms at the desired residues (eg. the atoms necessary to find the center of geometry of a residue). If residues or atoms are empty, all residues or atoms are included respectively.

Parameters:

trj_filestr: filepath of the trajectory
top_filestr: filepath of the topology of the molecule
residuesset or str: 1-indexed residue numbers of residues to keep in the trajectory. Accepts smart-indexed str representing a list of residues (e.g ‘1-5,6,39-48’). If Empty, include all residues.
atomsset: atomnames to keep in the trajectory. If Empty, include all atoms.

Returns:

filtered_trajectorymdtraj.Trajectory: a trajectory object representing the filtered structure across all frames

See also

filter_traj_to_pdb: Filters an input trajectory to only the specified atoms and residues and outputs to pdb
mdtraj.Trajectory: The Trajectory object in mdtraj package

Notes

Inputed trajectory should have 1-indexed Residue Indices, Outputed trajectory object will be 0-indexed.

Examples

>>> import stacker as st
>>> filtered_traj = st.filter_traj('stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                             'stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                             residues = {426,427}, 
...                             atoms = {'C2','C4','C6'})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
>>> table, bonds = filtered_traj.topology.to_dataframe()
>>> print(table)
serial name element  resSeq resName  chainID segmentID
0   None   C6       C     425       G        0          
1   None   C2       C     425       G        0          
2   None   C4       C     425       G        0          
3   None   C6       C     426       C        0          
4   None   C4       C     426       C        0          
5   None   C2       C     426       C        0       

Residue Number support SmartIndexing

>>> import stacker as st
>>> filtered_traj = st.filter_traj('../testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                             '../testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                              residues = '1-16,25,50-57', 
...                              atoms = {'C2','C4','C6'})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
>>> filtered_traj
<mdtraj.Trajectory with 10 frames, 75 atoms, 25 residues, without unitcells at 0x1156c3ed0>

filter_traj_to_pdb(trj_file, top_file, pdb, residues={}, atoms={})[source]#

Filters an input trajectory to only the specified atoms and residues and outputs to pdb

Filteres an input trajectory that contains all of the atoms in a trajectory to only the desired atoms at the desired residues (eg. the atoms necessary to find the center of geometry of a residue) and writes the output to a specified pdb file. If residues or atomnames are empty, all residues or atoms are included respectively.

Parameters:

trj_filestr: path to file of the concatenated trajectory. Should be resampled to the 1 in 50 frames sampled trajectories for each replicate.
top_filestr: path to file of the topology of the molecule
pdbstr: path to the output pdb file
residuesset or str: 1-indexed residue numbers of residues to keep in the trajectory. Accepts smart-indexed str representing a list of residues (e.g ‘1-5,6,39-48’) If Empty, include all residues.
atomnamesset: atomnames to keep in the trajectory

Returns:

None

See also

filter_traj: Filters an input trajectory to only the specified atoms and residues

Notes

Inputed trajectory should have 1-indexed Residue Indices, Outputed trajectory object will be 0-indexed.

file_convert(trj_file, top_file, outfile)[source]#

Converts a trajectory input file to a new output type.

The output file type is determined by the outfile extension. Uses mdtraj.save() commands to convert trajectory files to various file types such as mdtraj.save_mdcrd(), mdtraj.save_pdb(), mdtraj.save_xyz(), etc.

Parameters:

trj_filestr: Path to the file of the concatenated trajectory (eg. .mdcrd file).
top_filestr: Path to the file of the topology of the molecule (.prmtop file).
outfilestr: Output filename (include .mdcrd, .pdb, etc.).

Returns:

None

See also

mdtraj.load: Load trajectory files
mdtraj.save: Save md.Trajectory to file
mdtraj.load_xyz: Load a .xyz trajectory file

Notes

Output filetype determined from file extension of output_file parameter.

Examples

>>> import stacker as st
>>> import mdtraj as md
>>> st.file_convert('stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                 'stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                 'stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.xyz')
WARNING: Output file atom, residue, and chain indices are zero-indexed
Trajectory written to: stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.xyz
>>> md.load_xyz('stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.xyz', 
...             top='stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop')
<mdtraj.Trajectory with 10 frames, 12089 atoms, 494 residues, without unitcells at 0x10bb75cd0>

class SmartIndexingAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None, deprecated=False)[source]#

Bases: Action

Custom argparse action to handle smart indexing of frame numbers.

Parses a comma-separated list of frame numbers with optional ranges (e.g., ‘1-20, 34, 25, 50-100’) and generates a list of individual frame numbers. Modifies the namespace by setting the attribute specified by the ‘dest’ parameter to the list of individual frame numbers.

Parameters:

parser: argparse.ArgumentParser: The argparse parser object.
namespace: argparse.Namespace: The argparse namespace.
values: str: The input string containing frame numbers and ranges.
option_string: str, default=None: The option string.

Attributes:

deststr: The attribute name in the namespace where the parsed list will be stored.

Methods

__call__(parser, namespace, values, option_string=None)

Parses the provided string of values into a sorted list of integers and sets it as an attribute in the namespace.

Examples

>>> parser = argparse.ArgumentParser()
>>> parser.add_argument("-fl", "--frame_list", metavar="FRAME_LIST", help="Smart-indexed list of 1-indexed Frame Numbers within trajectory to analyze", required=False, action=SmartIndexingAction)
>>> args = parser.parse_args(["-fl", "1-20,34,25,50-100"])
>>> print(args.frame_list)
[1, 2, ..., 20, 34, 25, 50, 51, ..., 100]

static parse_smart_index(value)[source]#

Checks that an inputted variable is a list that can be smart indexed and indexes it if necessary.

Parameters:

value{str, set, list}: The input string containing ranges (e.g., “1-5,10,15-20”) or a set of integers.

Returns:

set: A set of integers parsed from the input.

Raises:

ValueError: If the input is not a string or set.

Examples

>>> import stacker as st
>>> st.SmartIndexingAction.parse_smart_index('1-5,8,12-17')
{1, 2, 3, 4, 5, 8, 12, 13, 14, 15, 16, 17}

__init__(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None, deprecated=False)#

_get_args()#

_get_kwargs()#

format_usage()#

stacker.kmeans module#

Run K Means Clustering and Principal Component Analysis

This module contains functions to run K Means Clustering on SSF results and visualize the clusters with barplots, silhouette analysis, and PCA scatterplots.

read_and_preprocess_data((file1, file2, ...))[source]#

Reads and preprocesses SSF data for K Means analysis per dataset.

Reads SSF data from txt files for each dataset, decompresses the data, and attaches each Trajectory to its frame-wise SSF results. The values are flattened SSF lists, so rather than a 3200 frames x 127 res x 127 res, it’s a 3200 frames x 16129 res-res pairs. For example, a 2-residue, 2-frame SSF of

[ [[1, 2], [3, 4]],

[[5, 6], [7, 8]] ]

is flattened to:

[[1, 2, 3, 4], [5, 6, 7, 8]]

Parameters:

file1, file2, …list of str: List of filenams to read and preprocess. Outputted from -s ssf -d output.txt. Should be in the format {datapath}/{traj_name}.txt.gz

Returns:

data_arraysdict: Dictionary where keys are dataset names and values are the processed data arrays.

See also

create_kmeans_input: Stacks SSF data into a single 2D Numpy array.

Examples

>>> import stacker as st
>>> dataset_names = ['testing/5JUP_N2_tGGG_aCCU_+1GCU.txt.gz', 'testing/5JUP_N2_tGGG_aCCU_+1CGU.txt.gz']  # 3200 frames, SSFs of 127 x 127 residues
>>> data_arrays = st.read_and_preprocess_data(dataset_names)
>>> print(data_arrays['dataset1'].shape)
(3200, 16129)

create_kmeans_input(data_arrays)[source]#

Blinds SSF Data (removes trajectory labels) for input to K Means

Stacks SSF data into a single 2D numpy array from all frames of all trajectories without labels for each frame. Used for input to KMeans Clustering

Parameters:

data_arraysdict: Output of read_and_preprocess_data(). Dictionary where keys are dataset names and values are the processed data arrays.

Returns:

blinded_datanp.typing.ArrayLike: A 2D numpy array containing all frames stacked together.

See also

read_and_preprocess_data: Reads and preprocesses data for each dataset

Examples

>>> import stacker as st
>>> data_arrays = {
...     'dataset1': np.random.rand(3200, 16129),
...     'dataset2': np.random.rand(3200, 16129)
... }
>>> kmeans_input = st.create_kmeans_input(data_arrays)
>>> print(kmeans_input.shape)
(6400, 16129)

run_kmeans(data_arrays, n_clusters, max_iter=1000, n_init=20, random_state=1, outdir='')[source]#

Performs KMeans clustering on blinded SSF data and saves the results.

This function applies the KMeans clustering algorithm to the provided blinded SSF data, assigns each frame to a cluster, and counts the number of frames in each cluster for each dataset. The results are printed and saved to a file.

Parameters:

data_arraysdict: Output of read_and_preprocess_data(). Dictionary where keys are dataset names and values are the processed data arrays.
n_clustersint: The number of clusters to form
max_iterint, default=1000: Maximum number of iterations of the k-means algorithm for a single run.
n_initint, default=20: Number of times the k-means algorithm will be run with different centroid seeds.
random_stateint, default=1: Determines random number generation for centroid initialization.
outdirstr, default=’’: Directory to save the clustering results. If empty, just prints to standard output.

Returns:

np.ndarray: The labels of the clusters for each frame.

See also

create_kmeans_input: blinds SSF Data for input to K Means
read_and_preprocess_data: reads and preprocesses SSF data for K Means analysis per dataset

Examples

>>> import stacker as st
>>> data_arrays = {
...     'dataset1': np.random.rand(3200, 16129),
...     'dataset2': np.random.rand(3200, 16129)
... }
>>> blinded_data = st.create_kmeans_input(data_arrays)
>>> st.run_kmeans(blinded_data, n_clusters=4)
Reading data: dataset1
Reading data: dataset2
(6400, 16129)
{'dataset1': array([800, 800, 800, 800]), 'dataset2': array([800, 800, 800, 800])}
Dataset: dataset1
    Cluster 1: 800 matrices
    Cluster 2: 800 matrices
    Cluster 3: 800 matrices
    Cluster 4: 800 matrices
Dataset: dataset2
    Cluster 1: 800 matrices
    Cluster 2: 800 matrices
    Cluster 3: 800 matrices
    Cluster 4: 800 matrices

plot_cluster_trj_data(cluster_file, outfile, x_labels_map=None)[source]#

Plots the output of run_kmeans() to a PNG file.

Creates a grouped bar plot of the number of frames from each trajectory in each cluster following KMeans clustering. Writes the plot output to a PNG file.

Parameters:

input_filestr: Path to clustering results written by run_kmeans()
outfilestr: Filepath where the plot PNG file will be saved.
x_labels_mapdict, optional: Dictionary to remap x labels. Keys are original labels and values are new labels.

Returns:

None

Examples

This will read the clustering results from ‘clustering_results.txt’, create a bar plot, and save it as ‘kmeans_plot.cluster_4.png’ in the specified output directory.

>>> import stacker as st
>>> st.plot_cluster_trj_data('clustering_results.txt', "../testing/kmeans_plot.png", {'5JUP_N2_tGGG_aCCU_+1CGU_data': 'tGGG_aCCU_+1CGU', '5JUP_N2_tGGG_aCCU_+1GCU_data': 'tGGG_aCCU_+1GCU'})

plot_silhouette(n_clusters, blind_data, outdir='')[source]#

Creates Silhouette plots to determine the best number of clusters

Parameters:

n_clustersint, default = 0: The number of clusters to form.
blind_datanp.typing.ArrayLike: A 2D numpy array containing all frames stacked together. Output of create_kmeans_input()
outfilestr: Filepath where the plot PNG file will be saved.

plot_pca(blinded_data, dataset_names, coloring='dataset', outdir='', cluster_labels=None, new_dataset_names=None)[source]#

Creates PCA Plot to compare systems in 2D

Creates a PCA plot that can be colored by the KMeans clustering result or by dataset. Compares SSFs similarly to K Means.

Parameters:

blinded_datanp.ndarray: A 2D numpy array containing all frames stacked together. Output of create_kmeans_input()
dataset_nameslist of str: List of filenames to read and preprocess. Outputted from stacker -s ssf -d output.txt.gz. Should be in the format {datapath}/{traj_name}.txt.gz
coloring{‘dataset’, ‘kmeans’, ‘facet’}: Method to color the points on the scatterplot. Options: - dataset: Plot all points on the same scatterplot and color by dataset of origin. - kmeans: Plot all points on the same scatterplot and color by KMeans Cluster with n_clusters - facet: Same as dataset but plot each dataset on a different coordinate grid.
outdirstr, default=’’: Directory to save the clustering results.
cluster_labelsnp.ndarray, optional: The labels of the clusters for each frame, output from run_kmeans. Used if coloring = “kmeans” to color points by cluster
new_dataset_namesdict, optional: Dictionary to remap dataset names. Keys are original filenames in dataset_names and values are shortened names.

Returns:

None

See also

create_kmeans_input: blinds SSF Data for input to K Means
read_and_preprocess_data: reads and preprocesses SSF data for K Means analysis per dataset
sklearn.decomposition.PCA: Runs PCA

stacker.pairwise_distance module#

Create and Analyze System Stacking Fingerprints (SSFs)

This module contains the functions to create and analyze System Stacking Fingerprints. It allows the user to generate pairwise distance data across a trajectory and compare these matrices across different trajectories.

calculate_residue_distance(trj, res1, res2, res1_atoms=('C2', 'C4', 'C6'), res2_atoms=('C2', 'C4', 'C6'), frame=1)[source]#

Calculates the vector between two residues with x, y, z units in Angstroms.

Calculates the distance between the center of two residues. The center is defined by the average x, y, z position of three passed atoms for each residue (typically every other carbon on the 6-carbon ring of the nucleotide base).

Parameters:

trjmd.Trajectory: Single frame trajectory.
res1int: 1-indexed residue number of the first residue (PDB Column 5).
res2int: 1-indexed residue number of the second residue (PDB Column 5).
res1_atomstuple, default=(“C2”, “C4”, “C6”): Atom names whose positions are averaged to find the center of residue 1.
res2_atomstuple, default=(“C2”, “C4”, “C6”): Atom names whose positions are averaged to find the center of residue 2.
frameint, default=1: 1-indexed frame number of trajectory to calculate the distance.

Returns:

distance_res12Vector: Vector from the center of geometry of residue 1 to residue 2.

See also

get_residue_distance_for_frame: Calculates pairwise distances between all residues in a given frame.

Examples

>>> import stacker as st
>>> filtered_traj = st.filter_traj('testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                              'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                              residues = {426,427}, 
...                              atoms = {'C2','C4','C6'})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
>>> distance_vec = st.calculate_residue_distance(
...     trajectory=filtered_traj, 
...     res1_num=426, 
...     res2_num=427, 
...     res1_atoms=("C2", "C4", "C6"), 
...     res2_atoms=("C2", "C4", "C6"), 
...     frame=1
... )
>>> distance_vec.magnitude()
7.5253396

get_residue_distance_for_frame(trj, frame, res1_atoms=('C2', 'C4', 'C6'), res2_atoms=('C2', 'C4', 'C6'), write_output=True)[source]#

Calculates System Stacking Fingerprint (SSF) between all residues in a given frame.

Calculates the distances between all pairs of residues in a given frame of a trajectory. Outputs as a square matrix with all residues on each side. This is the data behind a System Stacking Fingerprint (SSF)

Parameters:

trjmd.Trajectory: Trajectory to analyze (must have a topology).
frameint: 1-indexed frame to analyze.
res1_atomstuple, default=(“C2”, “C4”, “C6”): Atom names whose positions are averaged to find the center of residue 1.
res2_atomstuple, default=(“C2”, “C4”, “C6”): Atom names whose positions are averaged to find the center of residue 2.
write_outputbool, default=True: If True, displays a loading screen to standard output.

Returns:

pairwise_distancesnp.typing.ArrayLike: Matrix where position (i, j) represents the distance from residue i to residue j.

See also

get_residue_distance_for_trajectory: Calculates System Stacking Fingerprints (SSFs) for all residues across all frames of a trajectory
filter_traj: Filters an input trajectory to only the specified atoms and residues
mdtraj.load: Load a trajectory+topology file

Examples

>>> import stacker as st
>>> filtered_traj = st.filter_traj('stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                             'stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                             residues = '2-5,13-16,23-31,46-51,65-76,88-104,122-141,164-175,184-198,288-289,401-415,420-430', 
...                             atoms = {'C2','C4','C6'})
>>> ssf = st.get_residue_distance_for_frame(filtered_traj, frame = 2, write_output = False)
>>> ssf.shape
(127, 127)

get_residue_distance_for_trajectory(trj, frames={}, res1_atoms=('C2', 'C4', 'C6'), res2_atoms=('C2', 'C4', 'C6'), threads=1, write_output=True)[source]#

Calculates System Stacking Fingerprints (SSFs) for all residues across all frames of a trajectory.

Calculates the distances between all pairs of residues for all frames of a trajectory. Outputs as a square matrix with all residues on each side. This is the data behind a System Stacking Fingerprint (SSF)

Parameters:

trjmd.Trajectory: Trajectory to analyze (must have a topology). Output of st.filter_traj() or mdtraj.load()
framesarray_like or str: list of frame indices to analyze (1-indexed). Accepts smart-indexed str representing a list of frames (e.g ‘1-5,6,39-48’) If empty, uses all frames.
res1_atomstuple, default=(“C2”, “C4”, “C6”): Atom names whose positions are averaged to find the center of residue 1.
res2_atomstuple, default=(“C2”, “C4”, “C6”): Atom names whose positions are averaged to find the center of residue 2.
threadsint, default=1: Number of threads to use for parallel processing.
write_outputbool, default=True: If True, displays a loading screen to standard output. Do not use if threads > 1.

Returns:

ssf_per_framearray_like: List where pairwise_distances[f] is the output of get_residue_distance_for_frame(trajectory, f, res1_atoms, res2_atoms).

See also

system_stacking_fingerprints: Alias for this function
get_residue_distance_for_frame: Calculates System Stacking Fingerprint (SSF) between all residues in a given frame.
filter_traj: Filters an input trajectory to only the specified atoms and residues
mdtraj.load: Load a trajectory+topology file
display_arrays_as_video: Displays this data as an SSF.

Examples

>>> import stacker as st
>>> filtered_traj = st.filter_traj('stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                             'stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                             residues = '2-5,13-16,23-31,46-51,65-76,88-104,122-141,164-175,184-198,288-289,401-415,420-430', 
...                             atoms = {'C2','C4','C6'})
>>> ssfs = st.get_residue_distance_for_trajectory(filtered_traj, frames = '1-3', write_output = False)
>>> ssfs.shape
(3, 127, 127)

system_stacking_fingerprints(trj, frames={}, res1_atoms=('C2', 'C4', 'C6'), res2_atoms=('C2', 'C4', 'C6'), threads=1, write_output=True)[source]#

Alias for get_residue_distance_for_trajectory().

Calculates System Stacking Fingerprints (SSFs) for all residues across all frames of a trajectory.

Calculates the distances between all pairs of residues for all frames of a trajectory. Outputs as a square matrix with all residues on each side. This is the data behind a System Stacking Fingerprint (SSF)

Parameters:

trjmd.Trajectory: Trajectory to analyze (must have a topology). Output of st.filter_traj() or mdtraj.load()
framesarray_like or str: list of frame indices to analyze (1-indexed). Accepts smart-indexed str representing a list of frames (e.g ‘1-5,6,39-48’) If empty, uses all frames.
res1_atomstuple, default=(“C2”, “C4”, “C6”): Atom names whose positions are averaged to find the center of residue 1.
res2_atomstuple, default=(“C2”, “C4”, “C6”): Atom names whose positions are averaged to find the center of residue 2.
threadsint, default=1: Number of threads to use for parallel processing.
write_outputbool, default=True: If True, displays a loading screen to standard output. Do not use if threads > 1.

Returns:

ssf_per_framearray_like: List where pairwise_distances[f] is the output of get_residue_distance_for_frame(trajectory, f, res1_atoms, res2_atoms).

See also

system_stacking_fingerprints: Alias for this function
get_residue_distance_for_frame: Calculates System Stacking Fingerprint (SSF) between all residues in a given frame.
filter_traj: Filters an input trajectory to only the specified atoms and residues
mdtraj.load: Load a trajectory+topology file
display_arrays_as_video: Displays this data as an SSF.

Examples

>>> import stacker as st
>>> filtered_traj = st.filter_traj('stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                             'stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                             residues = '2-5,13-16,23-31,46-51,65-76,88-104,122-141,164-175,184-198,288-289,401-415,420-430', 
...                             atoms = {'C2','C4','C6'})
>>> ssfs = st.get_residue_distance_for_trajectory(filtered_traj, frames = '1-3', write_output = False)
>>> ssfs.shape
(3, 127, 127)

get_frame_average(ssfs)[source]#

Calculates an average System Stacking Fingerprint (SSF) across multiple SSFs

Used to calculate an average SSF across multiple frames of a trajectory. Can average the result of get_residue_distance_for_trajectory

Parameters:

ssfsnumpy.typing.ArrayLike: List or array of 2D NumPy arrays representing a pairwise distance matrix of an MD structure. All 2D NumPy arrays must be of the same dimenstions. Output of get_residue_distance_for_trajectory()

Returns:

avg_framenumpy.typing.ArrayLike: A single 2D NumPy array representing a pairwise distance matrix where each position i,j is the average distance from residue i to j across all matrices in frames.

See also

get_residue_distance_for_trajectory: Calculates System Stacking Fingerprints (SSFs) for all residues across all frames of a trajectory
system_stacking_fingerprints: Alias for get_residue_distance_for_trajectory

Examples

>>> import stacker as st
>>> filtered_traj = st.filter_traj('stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                             'stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                             residues = '2-5,13-16,23-31,46-51,65-76,88-104,122-141,164-175,184-198,288-289,401-415,420-430', 
...                             atoms = {'C2','C4','C6'})
>>> ssfs = st.get_residue_distance_for_trajectory(filtered_traj, frames = '1-3', write_output = False)
>>> avg_ssf = st.get_frame_average(ssfs)
>>> avg_ssf.shape
[FILL]

get_top_stacking(trj, matrix, csv='', n_events=5, include_adjacent=False)[source]#

Returns top stacking residue pairs for a given System Stacking Fingerprint (SSF)

Given a trajectory and a SSF made from get_residue_distance_for_frame() or get_frame_average() prints the residue pairings with the strongest stacking events (ie. the residue pairings with center of geometry distance closest to 3.5Å).

Parameters:

trjmd.Trajectory: trajectory used to get the stacking fingerprint
matrixtyping.ArrayLike: Single-frame SSF created by get_residue_distance_for_frame() or get_frame_average()
csvstr, default = ‘’,: output filename of the tab-separated txt file to write data to. If empty, data printed to standard output
n_eventsint, default = 5: maximum number of stacking events to display, if -1 display all residue pairings
include_adjacentbool, default = False: True if adjacent residues should be included in the printed output

See also

get_residue_distance_for_frame: Calculates System Stacking Fingerprint (SSF) between all residues in a given frame.
get_residue_distance_for_trajectory: Calculates System Stacking Fingerprints (SSFs) for all residues across all frames of a trajectory
system_stacking_fingerprints: Alias for get_residue_distance_for_trajectory

Examples

We can calculate the stacking events for a single frame:

>>> import stacker as st
>>> filtered_traj = st.filter_traj('stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                             'stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                             atoms = {'C2','C4','C6'})
>>> ssf = st.get_residue_distance_for_frame(filtered_traj, frame = 2, write_output = False)
>>> ssf.shape
(252,252)
>>> st.get_top_stacking(filtered_traj, ssf)
Row     Column  Value
197     195     3.50
420     413     3.51
94      127     3.51
93      130     3.53
117     108     3.38

Or we can get most residue pairs that had the most stacking across many frames of a trajectory:

>>> import stacker as st
>>> filtered_traj = st.filter_traj('stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                             'stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                             atoms = {'C2','C4','C6'})
>>> ssfs = st.get_residue_distance_for_trajectory(filtered_traj, frames = '1-3', write_output = False)
>>> avg_ssf = st.get_frame_average(ssfs)
>>> avg_ssf.shape
(252, 252)
>>> st.get_top_stacking(filtered_traj, ssf)
Row     Column  Value
130     93      3.56
108     117     3.44
195     197     3.61
127     94      3.65
47      167     3.66

increment_residue(residue)[source]#

Increments residue ID by 1

Useful when converting from mdtraj 0-index residue naming to 1-indexed

Parameters:

residuestr: The residue id given by trajectory.topology.residue(i)

Returns:

incremented_idstr: The residue id with the sequence number increased by 1

Examples

>>> increment_residue('G43')
'G44'

load_ssfs(file)[source]#

Loads a list of SSFs created by stacker -s ssf -d OUTFILE

Loads a list of SSFs where each element is an SSF from a given frame. This is OUTFILE when running stacker -s ssf -d OUTFILE and quickly provides saved SSF data rather than recalculating SSFs for a given trajectory.

Parameters:

filestr: outfile from stacker -s ssf -d OUTFILE. Must be .txt or .txt.gz.

Returns:

ssfsnumpy.typing.ArrayLike: List or array of 2D NumPy arrays representing a pairwise distance matrix of an MD structure. All 2D NumPy arrays must be of the same dimenstions. Output of get_residue_distance_for_trajectory()

See also

system_stacking_fingerprints: calculates ssfs rather than loading it like this function.
get_residue_distance_for_trajectory: Alias for system_stacking_fingerprints()
get_frame_average: calculates average SSF for a trajectory. matrix parameter is the output of this

exception MultiFrameTraj[source]#

Bases: Exception

A multi-frame trajectory is passed to a one-frame function

Raised if a multi-frame trajectory is passed to a function that only works on one trajectory (eg. calculate_residue_distance_vector())

__init__(*args, **kwargs)#

add_note(object, /)#: Exception.add_note(note) – add a note to the exception

args#

with_traceback(object, /)#: Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

stacker.residue_movement module#

Create and Analyze Pairwise Stacking Fingerprints (PSFs)

This module contains the functions to create and analyze Pairwise Stacking Fingerprints. It allows the user to generate Polar Scatterplots and Heatmaps of one residue movement relative to the other.

This pipeline takes a pdb file with multiple frames and two residues. It then calculates the r, rho, and theta values at each frame as defined in the Bottaro paper (https://doi.org/10.1093/nar/gku972), and exports these values to a .csv file.

collect_atom_locations_by_frame(trj, residue, atom)[source]#

Creates a list of all atom locations for a particular atom and residue number per frame.

Curates a list coords_by_frame where coords_by_frame[i] is the (x, y, z) positions of a provided atom in a residue residue at the ith frame.

Parameters:

trjmd.Trajectory: Trajectory to analyze.
residueint: The 0-indexed residue number of the residue where atom_id is found (PDB Column 5).
atomstr: The name of the atom to get coordinates for (PDB Column 2).

Returns:

coords_by_framelist: List of (x, y, z) coordinates of atom_id in residue_num for each frame.

See also

Base: Python Class that represents a nucleotide base
calculate_bottaro_values_for_frame: Calculates the r, rho, and theta values as expressed in the Bottaro paper.
create_base_from_coords_list: Combines C2, C4, C6 positions with midpoint positions for a given frame

Notes

residue_num must be 0-indexed to match how mdtraj.Trajectory indexes residues.

Examples

>>> import stacker as st
>>> filtered_traj = st.filter_traj('testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd', 
...                              'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop', 
...                              atoms = {'C2','C4','C6'})
>>> >>> st.collect_atom_locations_by_frame(filtered_traj, residue = 3, atom = "C2")
[(58.794, 59.636, 49.695), (59.185005, 58.797, 50.137), (59.379, 58.553005, 49.853), 
(58.76, 59.068, 49.681), (59.003, 59.054, 49.878002), (59.049, 58.967, 50.051), 
(59.219, 58.476006, 49.948), (58.948, 58.588005, 50.085), (58.922, 58.747, 49.766003), 
(59.124, 58.916, 49.978004)]

calc_center_3pts(a, b, c)[source]#

Finds the average x, y, z position of three (x, y, z) Vectors.

Takes in three Vectors generated using the pdb.xyz method and finds their center. Works with three points that make up a triangle (like those within a 6-member ring).

Parameters:

aVector: x, y, z coordinates of the first point.
bVector: x, y, z coordinates of the second point.
cVector: x, y, z coordinates of the third point.

Returns:

midpointVector: One Vector with (x, y, z) coordinates at the center of the three input Vectors.

See also

Base: Python Class that represents a nucleotide base
calculate_bottaro_values_for_frame: Calculates the r, rho, and theta values as expressed in the Bottaro paper.
create_base_from_coords_list: Combines C2, C4, C6 positions with midpoint positions for a given frame

Examples

>>> import stacker as st
>>> print(st.calc_center_3pts(st.Vector(0,0,0), st.Vector(1,1,1), st.Vector(2,2,2)))
[ 1.0
  1.0
  1.0 ]

class Base(c2_coords, c4_coords, c6_coords, midpoint_coords)[source]#

Bases: object

Represents a nucleotide base with x, y, z coordinates for C2, C4, and C6 atoms, and their average position at a single frame.

This class defines a data type ‘Base’ that consists of coordinates for the atoms C2, C4, and C6, as well as the midpoint of these atoms for a single residue at a single frame.

Attributes:

c2_coordsVector: (x, y, z) coordinates for the atom C2.
c4_coordsVector: (x, y, z) coordinates for the atom C4.
c6_coordsVector: (x, y, z) coordinates for the atom C6.
midpoint_coordsVector: (x, y, z) coordinates representing the midpoint of C2, C4, and C6.

__init__(c2_coords, c4_coords, c6_coords, midpoint_coords)[source]#

Initialize a Base instance.

Parameters:

c2_coordstuple: (x, y, z) coordinates for C2.
c4_coordstuple: (x, y, z) coordinates for C4.
c6_coordstuple: (x, y, z) coordinates for C6.
midpoint_coordstuple: (x, y, z) coordinates for the midpoint.

create_base_from_coords_list(frame, C2_coords, C4_coords, C6_coords, midpoint_coords)[source]#

Combines C2, C4, C6 positions with midpoint positions for a given frame.

Takes a frame (0-indexed) and outputs a Base instance with the x, y, z locations of C2, C4, C6, and midpoint of the same residue at that frame.

Parameters:

frameint: Frame number (0-indexed).
C2_coordslist: List of (x, y, z) coordinates of C2 atom in some residue for each frame.
C4_coordslist: List of (x, y, z) coordinates of C4 atom in some residue for each frame.
C6_coordslist: List of (x, y, z) coordinates of C6 atom in some residue for each frame.
midpoint_coordslist: List of (x, y, z) coordinates of the midpoint of residue for each frame.

Returns:

Base: An instance of Base with coordinates at the specified frame.

See also

calc_center_3pts: Finds the average x, y, z position of three (x, y, z) Vectors

Notes

frame is 0-indexed because this is how mdtraj handles mdtraj.Trajectory

correct_theta_sign(rho, y_axis, theta)[source]#

Corrects the sign of an angle theta with the x-axis within a plane defined by a given y-axis.

When calculating the angle theta between two vectors in 3D space, once the vectors move >180 degrees apart, the angle becomes the shortest path. To have 360 degrees of freedom, we calculate theta within the plane by checking if it faces the same direction as a vector y, and correcting otherwise.

Parameters:

rhoVector: The vector compared to the x-axis to form theta.
y_axisVector: Directional vector to define a plane with x-axis; orthogonal to x-axis.
thetafloat: The calculated angle of rho with x-axis to be corrected.

Returns:

float: Theta as calculated on the plane.

calculate_bottaro_values_for_frame(perspective_base_coords, viewed_midpoint)[source]#

Calculates the r, rho, and theta values as expressed in the Bottaro paper.

Calculates the r, rho, and theta values between two nucleotides in a single frame as presented in Figure 1 of Bottaro et. al (https://doi.org/10.1093/nar/gku972).

Parameters:

perspective_base_coordsBase: List of the x, y, z coords of C2, C4, C6 and their midpoint for the perspective residue in a single frame.
viewed_midpointVector: x, y, z position of the midpoint of the viewed nucleotide at the same frame.

Returns:

list: A list containing 3 floats from Bottaro; structure: [r_dist, rho_dist, theta].

References

[1] Sandro Bottaro, Francesco Di Palma, Giovanni Bussi, The role of nucleobase interactions in RNA structure and dynamics, Nucleic Acids Research, Volume 42, Issue 21, 1 December 2014, Pages 13306–13314, https://doi.org/10.1093/nar/gku972

write_bottaro_to_csv(pdb='', outcsv='', pers_res=-1, view_res=-1, res1_atoms={'C2', 'C4', 'C6'}, res2_atoms={'C2', 'C4', 'C6'}, index=1)[source]#

Write the r, rho, and theta values used to make a Pairwise Stacking Fingerprint (PSF) from a trajectory PDB to a CSV.

Calculates the r, rho, and theta values as described in Bottaro et al. from a perspective nucleotide residue to a viewed nucleotide residue per frame. Writes the results to a CSV file.

Parameters:

pdbstr: Filename of PDB containing information for ONLY two residues (perspective and viewed nucleotide) at each frame.
outcsvstr: Filename of CSV file to write to.
pers_resint, default = -1: Residue index of the perspective residue whose plane to project onto (0-/1-index changed by index variable, default 1-indexed). If -1, a 2-residue PDB is assumed and perspective id is the first res_id.
view_resint, default = -1: Residue index of the viewed residue whose midpoint to project to perspective residue plane (0-/1-index changed by index variable, default 1-indexed). If -1, a 2-residue PDB is assumed and viewed id is the second res_id.
res1_atomsset, default = {“C2”, “C4”, “C6”}: Set of the atom names (e.g., “C2”, “C4”, “C6”) to use from residue 1 to find center of geometry for perspective nucleotide.
res2_atomsset, default = {“C2”, “C4”, “C6”}: Set of the atom names (e.g., “C2”, “C4”, “C6”) to use from residue 2 to find center of geometry for viewed nucleotide.
indexint, default = 1: Index of the residues. 1-indexed (default) means residue ids start at 1. cpptraj uses 1-indexed residues. mdtraj PDB outputs will be 0-indexed.

See also

filter_traj_to_pdb: convert a trajectory and topology file into a single filtered PDB file to input here
visualize_two_residue_movement_scatterplot: Visualize Output CSV data as a PSF scatterplot from this data
visualize_two_residue_movement_heatmap: Visualize Output CSV data as a PSF heatmap from this data

References

[1] Sandro Bottaro, Francesco Di Palma, Giovanni Bussi, The role of nucleobase interactions in RNA structure and dynamics, Nucleic Acids Research, Volume 42, Issue 21, 1 December 2014, Pages 13306–13314, https://doi.org/10.1093/nar/gku972

Examples

>>> import stacker as st
>>> trajectory_file = 'testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd'
>>> topology_file = 'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop'
>>> pdb_filename = 'testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb'
>>> output_csv_name = "testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv"
>>> perspective_residue = 426 # 1-indexed
>>> viewed_residue = 427 # 1-indexed
>>> st.filter_traj_to_pdb(trj_file=trajectory_file, top_file=topology_file, pdb=pdb_filename,
...                        residues={perspective_residue,viewed_residue}, atoms={"C2", "C4", "C6"})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
WARNING: Output file atom, residue, and chain indices are zero-indexed
Filtered trajectory written to:  testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb
>>> st.write_bottaro_to_csv(pdb_filename, output_csv_name, pers_res=perspective_residue, view_res=viewed_residue)
Output values written to testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv
>>> print("".join(open(output_csv_name).readlines()[:10]))
frame,r_dist,rho_dist,theta
0,7.5253415,6.5321836,204.02934901525177
1,6.884639,6.0513134,199.40647902703924
2,7.301847,6.151191,205.1906453260924
3,6.5815425,5.461494,199.5421877249345
4,7.0760417,5.3919506,204.0150121540755
5,7.2589145,6.3483577,201.32674968542617
6,7.4929285,6.414151,205.92967194025135
7,7.1484976,6.035165,202.88441276229827
8,7.344863,5.541237,217.30043061558888

write_psf_data(pdb='', outcsv='', pers_res=-1, view_res=-1, res1_atoms={'C2', 'C4', 'C6'}, res2_atoms={'C2', 'C4', 'C6'}, index=1)[source]#

Alias for write_bottaro_to_csv().

Write the r, rho, and theta values used to make a Pairwise Stacking Fingerprint (PSF) from a trajectory PDB to a CSV.

Calculates the r, rho, and theta values as described in Bottaro et al. from a perspective nucleotide residue to a viewed nucleotide residue per frame. Writes the results to a CSV file.

Parameters:

pdbstr: Filename of PDB containing information for ONLY two residues (perspective and viewed nucleotide) at each frame.
outcsvstr: Filename of CSV file to write to.
pers_resint, default = -1: Residue index of the perspective residue whose plane to project onto (0-/1-index changed by index variable, default 1-indexed). If -1, a 2-residue PDB is assumed and perspective id is the first res_id.
view_resint, default = -1: Residue index of the viewed residue whose midpoint to project to perspective residue plane (0-/1-index changed by index variable, default 1-indexed). If -1, a 2-residue PDB is assumed and viewed id is the second res_id.
res1_atomsset, default = {“C2”, “C4”, “C6”}: Set of the atom names (e.g., “C2”, “C4”, “C6”) to use from residue 1 to find center of geometry for perspective nucleotide.
res2_atomsset, default = {“C2”, “C4”, “C6”}: Set of the atom names (e.g., “C2”, “C4”, “C6”) to use from residue 2 to find center of geometry for viewed nucleotide.
indexint, default = 1: Index of the residues. 1-indexed (default) means residue ids start at 1. cpptraj uses 1-indexed residues. mdtraj PDB outputs will be 0-indexed.

See also

filter_traj_to_pdb: convert a trajectory and topology file into a single filtered PDB file to input here
visualize_two_residue_movement_scatterplot: Visualize Output CSV data as a PSF scatterplot from this data
visualize_two_residue_movement_heatmap: Visualize Output CSV data as a PSF heatmap from this data

References

[1] Sandro Bottaro, Francesco Di Palma, Giovanni Bussi, The role of nucleobase interactions in RNA structure and dynamics, Nucleic Acids Research, Volume 42, Issue 21, 1 December 2014, Pages 13306–13314, https://doi.org/10.1093/nar/gku972

Examples

>>> import stacker as st
>>> trajectory_file = 'testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd'
>>> topology_file = 'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop'
>>> pdb_filename = 'testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb'
>>> output_csv_name = "testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv"
>>> perspective_residue = 426 # 1-indexed
>>> viewed_residue = 427 # 1-indexed
>>> st.filter_traj_to_pdb(trj_file=trajectory_file, top_file=topology_file, pdb=pdb_filename,
...                        residues={perspective_residue,viewed_residue}, atoms={"C2", "C4", "C6"})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
WARNING: Output file atom, residue, and chain indices are zero-indexed
Filtered trajectory written to:  testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb
>>> st.write_bottaro_to_csv(pdb_filename, output_csv_name, pers_res=perspective_residue, view_res=viewed_residue)
Output values written to testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv
>>> print("".join(open(output_csv_name).readlines()[:10]))
frame,r_dist,rho_dist,theta
0,7.5253415,6.5321836,204.02934901525177
1,6.884639,6.0513134,199.40647902703924
2,7.301847,6.151191,205.1906453260924
3,6.5815425,5.461494,199.5421877249345
4,7.0760417,5.3919506,204.0150121540755
5,7.2589145,6.3483577,201.32674968542617
6,7.4929285,6.414151,205.92967194025135
7,7.1484976,6.035165,202.88441276229827
8,7.344863,5.541237,217.30043061558888

stacker.vector module#

Vector class used for SSF and PSF Calculation

This module defines the Vector datatype that’s used to store position (x,y,z) information for the atoms and residues in an MD Simulation. Documentation is provided for reference.

class Vector(x, y, z)[source]#

Bases: object

Represents a 3D vector with x, y, and z components.

This class defines a data type ‘Vector’ that represents a 3D vector with x, y, and z components, assuming the vector originates at the origin (0,0,0). Expands the numpy.array object.

Attributes:

xfloat: The x-component of the vector.
yfloat: The y-component of the vector.
zfloat: The z-component of the vector.

Methods

`calculate_cross_product`(b)	Calculate the cross product of two vectors.
`calculate_projection`(b)	Calculate the projection of this vector onto vector b.
`magnitude`()	Calculate the magnitude (length) of the vector.
`scale`(a)	Scale the vector by a scalar a.

__init__(x, y, z)[source]#

Initialize a Vector instance.

Parameters:

xfloat: The x-component of the vector.
yfloat: The y-component of the vector.
zfloat: The z-component of the vector.

calculate_cross_product(b)[source]#

Calculate the cross product of two vectors.

Calculates the cross product of two vectors, which is the unit vector that is perpendicular to both vectors.

Parameters:

bVector: The second vector to calculate the cross product with.

Returns:

Vector: The resulting vector from the cross product.

magnitude()[source]#

Calculate the magnitude (length) of the vector.

Returns:

float: The magnitude (length) of the vector.

calculate_projection(b)[source]#

Calculate the projection of this vector onto vector b.

Parameters:

bVector: The vector to project onto.

Returns:

Vector: The resulting vector from the projection.

scale(a)[source]#

Scale the vector by a scalar a.

Parameters:

afloat: The scalar to scale the vector by.

Returns:

Vector: The scaled vector.

stacker.visualization module#

Visualize the SSFs, PSFs, and other analyses

This module includes the functions used to visualize plots, including SSFs and PSFs. The data inputs to these plot functions are provided by the other modules.

create_parent_directories(outfile_prefix)[source]#

Creates necessary parent directories to write an outfile given a prefix

Parameters:

outfile_prefixstr: The filepath of the output file, including the path where the file will be saved.

Examples

>>> create_parent_directories('/path/to/output/file.txt')
# This will create the directories '/path/to/output' if they do not exist.
>>> create_parent_directories('output/file.txt')
# This will create the directory 'output' if it does not exist.

create_axis_labels(res_indices, tick_distance=10)[source]#

Designates the axis labels to use in the SSF plot.

Helper function for visualizing SSFs. Returns the x-axis tick positions and labels to use on the ticks based on the residues used in a specific SSF analysis. Meant to be used when many disjoint sets of residue indices are used. Ticks will be present every tick_distance residues in a collection of adjacent residues, and a tick will exist at both ends of any consecutive residue sequence.

Parameters:

res_indiceslist: The list of residue indices used in the pairwise analysis. Parameter residue_desired passed to filter_traj()
tick_distanceint, default = 10: Distance between ticks in blocks of consecutive residues.

Returns:

tick_locationslist: List of tick positions (0-based) to place labels on in the axes.
tick_labelslist: List of labels to place at the adjacent tick locations.

See also

filter_traj: Filters an input trajectory to desired residues

Examples

Residues 0-12,98-100 were used. The SSF will label 0,10,12,98,100, provided in the second returned list. The first returned list gives the positions on the axes to place each label.

>>> create_axis_labels([0,1,2,3,4,5,6,7,8,9,10,11,12,98,99,100])
[0, 10, 12, 13, 15], [0, 10, 12, 98, 100]

>>> create_axis_labels([94,95,96,97,98,99,100,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428])
[0, 6, 7, 17, 27], [94, 100, 408, 418, 428]

display_arrays_as_video(ssfs, res_indices, seconds_per_frame=10, tick_distance=10, outfile_prefix='', scale_limits=(0, 7), outfile='', scale_style='bellcurve', xy_line=True, **kwargs)[source]#

Displays SSF data to output or writes SSF as a PNG

Visualizes the data for an SSF for a trajectory or a single frame. Takes an SSF array outputted from get_residue_distance_for_frame, get_residue_distance_for_trajectory, or system_stacking_fingerprints and treats them as frames in a video, filling in a grid at position i, j by the value at i, j in the array.

Parameters:

ssfsarray_like

List or array of 2D NumPy arrays representing SSFs, output of system_stacking_fingerprints

res_indiceslist or str

The list of residue indices used in the pairwise analysis. Parameter residue_desired passed to filter_traj() Accepts smart-indexed str representing a list of residues (e.g ‘1-5,6,39-48’)

seconds_per_frameint, default = 10

Number of seconds to display each matrix for.

tick_distanceint, default = 10

Distance between ticks in blocks of consecutive residues.

outfilestr

Image output filepath to write a single SSF to. Format inferred from file extension. png, pdf, ps, eps, and svg supported.

outfile_prefixstr

Prefix for image filepath to write multiple frames to. Format inferred from file extension. png, pdf, ps, eps, and svg supported.

scale_limitstuple, default = (0, 7)

Limits of the color scale.

scale_style{‘bellcurve’, ‘gradient’}, default = ‘bellcurve’

Style of color scale.

xy_linebool, default = True

Draw x = y line to separate matrix halves.

**kwargsdict, optional

Additional keyword arguments to customize the plot:

fontsizeint, default = 10
Font size for all text elements.
fig_widthfloat, default = 8
Width of the figure in inches.
fig_heightfloat, default = 8
Height of the figure in inches.
title_fontsizeint, default = fontsize
Font size for the title.
legend_fontsizeint, default = fontsize
Font size for the legend.
cb_fontsizeint, default = fontsize
Font size for the colorbar tick labels.
xaxis_fontsizeint, default = fontsize
Font size for the x-axis tick labels.
yaxis_fontsizeint, default = fontsize
Font size for the y-axis tick labels.

Returns:

None: Displays video of NumPy arrays.

See also

create_axis_labels: Designates the axis labels to use in the SSF plot.
get_residue_distance_for_frame: Calculates System Stacking Fingerprint (SSF) between all residues in a given frame.
get_residue_distance_for_trajectory: get SSF data for all frames of a trajectory
system_stacking_fingerprints: Alias for this get_residue_distance_for_trajectory
display_ssfs: Alias for this function.

Examples

>>> import stacker as st
>>> import mdtraj as md
>>> trajectory_file = 'stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd'
>>> topology_file = 'stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop'
>>> trj = md.load(trajectory_file, top = topology_file)
>>> residue_selection_query = 'resi 90 to 215'
>>> frames_to_include = [1,2,3,4,5]
>>> trj_sub = trj.atom_slice(trj.top.select(residue_selection_query))
>>> resSeqs = [res.resSeq for res in trj_sub.topology.residues]
>>> frames = st.get_residue_distance_for_trajectory(trj_sub, frames_to_include, threads = 5)
Frame 2 done.
Frame 3 done.
Frame 1 done.
Frame 5 done.
Frame 4 done.
>>> st.display_arrays_as_video([st.get_frame_average(frames)], resSeqs, seconds_per_frame=10)
# Displays SSF for each frame of this trajectory to standard output

display_ssfs(numpy_arrays, res_indices, seconds_per_frame=10, tick_distance=10, outfile_prefix='', scale_limits=(0, 7), outfile='', scale_style='bellcurve', xy_line=True, **kwargs)[source]#

Alias for display_arrays_as_video().

display_arrays_as_video(: ssfs, res_indices, seconds_per_frame = 10, tick_distance = 10, outfile_prefix = ‘’, scale_limits = (0,7), outfile = ‘’, scale_style = ‘bellcurve’, xy_line = True, **kwargs

)

Displays SSF data to output or writes SSF as a PNG

Visualizes the data for an SSF for a trajectory or a single frame. Takes an SSF array outputted from get_residue_distance_for_frame, get_residue_distance_for_trajectory, or system_stacking_fingerprints and treats them as frames in a video, filling in a grid at position i, j by the value at i, j in the array.

Parameters:

ssfsarray_like

List or array of 2D NumPy arrays representing SSFs, output of system_stacking_fingerprints

res_indiceslist or str

The list of residue indices used in the pairwise analysis. Parameter residue_desired passed to filter_traj() Accepts smart-indexed str representing a list of residues (e.g ‘1-5,6,39-48’)

seconds_per_frameint, default = 10

Number of seconds to display each matrix for.

tick_distanceint, default = 10

Distance between ticks in blocks of consecutive residues.

outfilestr

Image output filepath to write a single SSF to. Format inferred from file extension. png, pdf, ps, eps, and svg supported.

outfile_prefixstr

Prefix for image filepath to write multiple frames to. Format inferred from file extension. png, pdf, ps, eps, and svg supported.

scale_limitstuple, default = (0, 7)

Limits of the color scale.

scale_style{‘bellcurve’, ‘gradient’}, default = ‘bellcurve’

Style of color scale.

xy_linebool, default = True

Draw x = y line to separate matrix halves.

**kwargsdict, optional

Additional keyword arguments to customize the plot:

fontsizeint, default = 10
Font size for all text elements.
fig_widthfloat, default = 8
Width of the figure in inches.
fig_heightfloat, default = 8
Height of the figure in inches.
title_fontsizeint, default = fontsize
Font size for the title.
legend_fontsizeint, default = fontsize
Font size for the legend.
cb_fontsizeint, default = fontsize
Font size for the colorbar tick labels.
xaxis_fontsizeint, default = fontsize
Font size for the x-axis tick labels.
yaxis_fontsizeint, default = fontsize
Font size for the y-axis tick labels.

Returns:

None: Displays video of NumPy arrays.

See also

create_axis_labels: Designates the axis labels to use in the SSF plot.
get_residue_distance_for_frame: Calculates System Stacking Fingerprint (SSF) between all residues in a given frame.
get_residue_distance_for_trajectory: get SSF data for all frames of a trajectory
system_stacking_fingerprints: Alias for this get_residue_distance_for_trajectory
display_ssfs: Alias for this function.

Examples

>>> import stacker as st
>>> import mdtraj as md
>>> trajectory_file = 'stacker/testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd'
>>> topology_file = 'stacker/testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop'
>>> trj = md.load(trajectory_file, top = topology_file)
>>> residue_selection_query = 'resi 90 to 215'
>>> frames_to_include = [1,2,3,4,5]
>>> trj_sub = trj.atom_slice(trj.top.select(residue_selection_query))
>>> resSeqs = [res.resSeq for res in trj_sub.topology.residues]
>>> frames = st.get_residue_distance_for_trajectory(trj_sub, frames_to_include, threads = 5)
Frame 2 done.
Frame 3 done.
Frame 1 done.
Frame 5 done.
Frame 4 done.
>>> st.display_arrays_as_video([st.get_frame_average(frames)], resSeqs, seconds_per_frame=10)
# Displays SSF for each frame of this trajectory to standard output

set_polar_grid(**kwargs)[source]#

Set up axes for PSF.

Creates polar plot background for two-residue movement comparison with theta 0 to 360, a radial maximum of 15 Angstroms, and a visualization of the perspective residue at the center.

Parameters:

**kwargsdict, optional

Additional keyword arguments to customize the plot:

fontsizeint, default = 10
Font size for all text elements.
fig_widthfloat, default = 5
Width of the figure in inches.
fig_heightfloat, default = 5
Height of the figure in inches.
title_fontsizeint, default = fontsize
Font size for the title.
legend_fontsizeint, default = fontsize
Font size for the legend.
cb_fontsizeint, default = fontsize
Font size for the colorbar labels.
xaxis_fontsizeint, default = fontsize
Font size for the x-axis labels.
yaxis_fontsizeint, default = fontsize
Font size for the y-axis labels.

Returns:

axmatplotlib.projections.polar.PolarAxes: Axis object for the created polar plot.

visualize_two_residue_movement_scatterplot(csv, plot_outfile='', frame_list={}, **kwargs)[source]#

Creates scatterplot of two-residue movement relative to each other.

Takes the data created in residue_movement and visualizes it as a polar coordinate scatterplot similar to the Figure D link in Proposal Feature 4.

Parameters:

csvstr

Filepath to csv file containing data on the movement of two residues relative to each other (r, rho, and theta values). Created in residue_movement.

plot_outfilestr

Filepath of the image file to write to. Format inferred from file extension. png, pdf, ps, eps, and svg supported.

frame_listset, default = {}

Set of frames to use in csv, if empty use all frames.

**kwargsdict, optional

Additional keyword arguments to customize the plot:

fontsizeint, default = 10
Font size for all text elements.
fig_widthfloat, default = 5
Width of the figure in inches.
fig_heightfloat, default = 5
Height of the figure in inches.
xaxis_fontsizeint, default = fontsize
Font size for the theta-axis labels.
yaxis_fontsizeint, default = fontsize
Font size for the rho-axis labels.

Returns:

None

See also

write_bottaro_to_csv: Creates CSV file that is inputted here

Examples

>>> import stacker as st
>>> trajectory_file = 'testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd'
>>> topology_file = 'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop'
>>> pdb_filename = 'testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb'
>>> output_csv_name = "testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv"
>>> perspective_residue = 426 # 1-indexed
>>> viewed_residue = 427 # 1-indexed
>>> st.filter_traj_to_pdb(trj_file=trajectory_file, top_file=topology_file, pdb=pdb_filename,
...                        residues={perspective_residue,viewed_residue}, atoms={"C2", "C4", "C6"})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
WARNING: Output file atom, residue, and chain indices are zero-indexed
Filtered trajectory written to:  testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb
>>> st.write_bottaro_to_csv(pdb_filename, output_csv_name, perspective_residue_num=perspective_residue, viewed_residue_num=viewed_residue)
Output values written to testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv
>>> st.visualize_two_residue_movement_scatterplot('testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv', 
...                                                 plot_outfile='testing/script_tests/visualization/tUAG_aCUA_+1GCU_GC_plot_10frames_scatter.png')

visualize_two_residue_movement_heatmap(csv, plot_outfile='', frame_list={}, **kwargs)[source]#

Creates heatmap of two-residue movement relative to each other.

2D shaded contour plot of the density of points in the visualize_two_residue_movement_scatterplot() scatterplot.

Parameters:

csvstr

Filepath to csv file containing data on the movement of two residues relative to each other (r, rho, and theta values). Created in residue_movement.

plot_outfilestr

Filepath of the image file to write to. Format inferred from file extension. png, pdf, ps, eps, and svg supported.

frame_listset, default = {}

Set of frames to use in csv, if empty use all frames.

**kwargsdict, optional

Additional keyword arguments to customize the plot:

fontsizeint, default = 10
Font size for all text elements.
fig_widthfloat, default = 5
Width of the figure in inches.
fig_heightfloat, default = 5
Height of the figure in inches.
xaxis_fontsizeint, default = fontsize
Font size for the theta-axis labels.
yaxis_fontsizeint, default = fontsize
Font size for the rho-axis labels.
cb_fontsizeint, default = 10
Font size for the colorbar tick labels.

Returns:

None

See also

write_bottaro_to_csv: Creates CSV file that is inputted here

Examples

>>> import stacker as st
>>> trajectory_file = 'testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd'
>>> topology_file = 'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop'
>>> pdb_filename = 'testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb'
>>> output_csv_name = "testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv"
>>> perspective_residue = 426 # 1-indexed
>>> viewed_residue = 427 # 1-indexed
>>> st.filter_traj_to_pdb(trj_file=trajectory_file, top_file=topology_file, pdb=pdb_filename,
...                        residues={perspective_residue,viewed_residue}, atoms={"C2", "C4", "C6"})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
WARNING: Output file atom, residue, and chain indices are zero-indexed
Filtered trajectory written to:  testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb
>>> st.write_bottaro_to_csv(pdb_filename, output_csv_name, pers_res=perspective_residue, view_res=viewed_residue)
Output values written to testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv
>>> st.visualize_two_residue_movement_heatmap('testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv', 
...                                                 plot_outfile='testing/script_tests/visualization/tUAG_aCUA_+1GCU_GC_plot_10frames_heat.png')

display_psf_scatter(csv, plot_outfile='', frame_list={}, **kwargs)[source]#

Alias for visualize_two_residue_movement_scatterplot().

Creates scatterplot of two-residue movement relative to each other.

Takes the data created in residue_movement and visualizes it as a polar coordinate scatterplot similar to the Figure D link in Proposal Feature 4.

Parameters:

csvstr

Filepath to csv file containing data on the movement of two residues relative to each other (r, rho, and theta values). Created in residue_movement.

plot_outfilestr

Filepath of the image file to write to. Format inferred from file extension. png, pdf, ps, eps, and svg supported.

frame_listset, default = {}

Set of frames to use in csv, if empty use all frames.

**kwargsdict, optional

Additional keyword arguments to customize the plot:

fontsizeint, default = 10
Font size for all text elements.
fig_widthfloat, default = 5
Width of the figure in inches.
fig_heightfloat, default = 5
Height of the figure in inches.
xaxis_fontsizeint, default = fontsize
Font size for the theta-axis labels.
yaxis_fontsizeint, default = fontsize
Font size for the rho-axis labels.

Returns:

None

See also

write_bottaro_to_csv: Creates CSV file that is inputted here

Examples

>>> import stacker as st
>>> trajectory_file = 'testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd'
>>> topology_file = 'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop'
>>> pdb_filename = 'testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb'
>>> output_csv_name = "testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv"
>>> perspective_residue = 426 # 1-indexed
>>> viewed_residue = 427 # 1-indexed
>>> st.filter_traj_to_pdb(trj_file=trajectory_file, top_file=topology_file, pdb=pdb_filename,
...                        residues={perspective_residue,viewed_residue}, atoms={"C2", "C4", "C6"})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
WARNING: Output file atom, residue, and chain indices are zero-indexed
Filtered trajectory written to:  testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb
>>> st.write_bottaro_to_csv(pdb_filename, output_csv_name, perspective_residue_num=perspective_residue, viewed_residue_num=viewed_residue)
Output values written to testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv
>>> st.visualize_two_residue_movement_scatterplot('testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv', 
...                                                 plot_outfile='testing/script_tests/visualization/tUAG_aCUA_+1GCU_GC_plot_10frames_scatter.png')

display_psf_heatmap(csv, plot_outfile='', frame_list={}, **kwargs)[source]#

Alias for visualize_two_residue_movement_heatmap().

Creates heatmap of two-residue movement relative to each other.

2D shaded contour plot of the density of points in the visualize_two_residue_movement_scatterplot() scatterplot.

Parameters:

csvstr

Filepath to csv file containing data on the movement of two residues relative to each other (r, rho, and theta values). Created in residue_movement.

plot_outfilestr

Filepath of the image file to write to. Format inferred from file extension. png, pdf, ps, eps, and svg supported.

frame_listset, default = {}

Set of frames to use in csv, if empty use all frames.

**kwargsdict, optional

Additional keyword arguments to customize the plot:

fontsizeint, default = 10
Font size for all text elements.
fig_widthfloat, default = 5
Width of the figure in inches.
fig_heightfloat, default = 5
Height of the figure in inches.
xaxis_fontsizeint, default = fontsize
Font size for the theta-axis labels.
yaxis_fontsizeint, default = fontsize
Font size for the rho-axis labels.
cb_fontsizeint, default = 10
Font size for the colorbar tick labels.

Returns:

None

See also

write_bottaro_to_csv: Creates CSV file that is inputted here

Examples

>>> import stacker as st
>>> trajectory_file = 'testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd'
>>> topology_file = 'testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop'
>>> pdb_filename = 'testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb'
>>> output_csv_name = "testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv"
>>> perspective_residue = 426 # 1-indexed
>>> viewed_residue = 427 # 1-indexed
>>> st.filter_traj_to_pdb(trj_file=trajectory_file, top_file=topology_file, pdb=pdb_filename,
...                        residues={perspective_residue,viewed_residue}, atoms={"C2", "C4", "C6"})
WARNING: Residue Indices are expected to be 1-indexed
Reading trajectory...
Reading topology...
Filtering trajectory...
WARNING: Output filtered traj atom, residue, and chain indices are zero-indexed
WARNING: Output file atom, residue, and chain indices are zero-indexed
Filtered trajectory written to:  testing/script_tests/residue_movement/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb
>>> st.write_bottaro_to_csv(pdb_filename, output_csv_name, pers_res=perspective_residue, view_res=viewed_residue)
Output values written to testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv
>>> st.visualize_two_residue_movement_heatmap('testing/script_tests/residue_movement/tUAG_aCUA_+1GCU_GC_plot.csv', 
...                                                 plot_outfile='testing/script_tests/visualization/tUAG_aCUA_+1GCU_GC_plot_10frames_heat.png')

exception NoResidues[source]#

Bases: Exception

Raised if user tries to make SSF with a trajectory of <1 residue

__init__(*args, **kwargs)#

add_note(object, /)#: Exception.add_note(note) – add a note to the exception

args#

with_traceback(object, /)#: Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

Module contents#

The Python Functions that control Command Line Options

This module contains the Python routines called when StACKER is run in the command line.

run_python_command()[source]#

Reads the user’s passed in command line and runs the command

Reads the command line input, runs the associated command with the added flags.

convert_to_python_command()[source]#

Converts a parsed command to use to the correct subroutine and runs the routine

Converts the specified script to a python command and runs it with the associated inputs based on the flags.

filter_traj_routine()[source]#

Executes the routine for filtering an input trajectory file and converting it to a PDB file.

This function uses provided command-line arguments to call filter_traj_to_pdb() with specified inputs, such as the trajectory file, topology file, output file location, residues to retain, and atom names to retain.

Raises:

ResEmpty: If the –residues argument is not provided.
AtomEmpty: If the –atom_names argument is not provided.

See also

filter_traj_to_pdb: The core function that performs the trajectory filtering and PDB generation.

Notes

The args.residues argument should be a list of residue indices to retain.
The args.atom_names argument should be a comma-separated string of atom names to retain.
Output directories are created automatically if they do not exist.

Examples

Command-line usage with sample arguments:

$ stacker 
    -s filter_traj 
    -trj testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd 
    -top testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop 
    -o testing/command_line_tests/filter/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb 
    -r 426,427 -a C2,C4,C6

bottaro_routine()[source]#

Executes the residue movement routine to generate Bottaro values for a trajectory.

This routine calculates the r, rho, and theta values for each frame of a PDB trajectory between two specified residues and stores them in a CSV file. Optionally, it visualizes the movement using heatmaps or scatter plots.

Parameters:

None: The routine relies on global args, which must be set via command-line arguments or equivalent argument parsing.

Raises:

AtomEmpty: If required atom names or residue indices are not provided.

See also

filter_traj_to_pdb: Filters trajectory and generates an intermediate PDB.
write_bottaro_to_csv: Writes residue movement calculations to a CSV.
visualize_two_residue_movement_heatmap: Creates a heatmap for residue movement.
visualize_two_residue_movement_scatterplot: Creates a scatter plot for residue movement.

Notes

args.trajectory and args.topology are required for generating an intermediate PDB if one is not already provided.
Supports heatmap and scatter plot visualizations for residue movement.
Removes intermediate files (pdb and csv) if –no_inter flag is specified.

Examples

Command-line usage:

$ stacker -s bottaro 
    -trj testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd 
    -top testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop 
    -pdb testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd.pdb 
    -o testing/command_line_tests/bottaro/tUAG_aCUA_+1GCU_GC_plot.csv 
    -p 426 
    -v 427 
    -pa C2,C4,C6 
    -va C2,C4,C6 
    -pt scatter

$ stacker -s bottaro 
    -pdb testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat_mdcrd_3200frames.pdb 
    -o testing/command_line_tests/bottaro/tUAG_aCUA_+1GCU_GC_plot_3200frames.csv 
    -p 426 
    -v 427 
    -pa C2,C4,C6 
    -va C2,C4,C6 
    -pt heat

res_distance_routine()[source]#

Calculates the distance between the centers of mass of two specified residues.

This routine filters the trajectory based on the specified residues and atom names, and calculates the distances for either a single frame or multiple frames if bootstrapping is enabled.

Parameters:

None: The routine relies on global args for all input values.

Raises:

ResEmpty: If fewer or more than two residues are specified, or if no residues are provided.
AtomEmpty: If no atom names are provided.

See also

filter_traj: Filters the trajectory for the specified residues and atom names.
calculate_residue_distance: Calculates the distance between residues for a given frame.

Notes

The args.residues argument must contain exactly two residues.
The args.atom_names argument is required to specify the atoms involved in the calculation.
If the –bootstrap argument is provided, distances are calculated across a specified number of randomly sampled frames.
If no bootstrap is performed, the distance is calculated for a single frame.

Examples

Command-line usage:

$ stacker -s res_distance 
    -trj testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd 
    -top testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop 
    -f 2 
    --residues 426,427 
    --atom_names C2,C4,C6

system_routine()[source]#

Runs the System Stacking Fingerprint (SSF) routine to generate a single SSF for a specified frame or range of frames.

This routine processes a trajectory to create SSFs based on inter-residue distances and outputs either visualizations or raw data for further analysis. The user can provide input fingerprints or generate them from trajectory files.

Parameters:

None: The routine relies on global args for all input values.

Raises:

FileNotFoundError: If the specified trajectory or topology file does not exist.
ValueError: If the provided input files have incompatible dimensions or unexpected formatting.

See also

filter_traj: Filters the trajectory for the specified residues.
get_residue_distance_for_trajectory: Computes inter-residue distances for a trajectory.
combine_frames: Combines two SSFs into a single array for comparison.
display_arrays_as_video: Visualizes SSF arrays and saves them as video files.

Notes

If args.input is provided, the routine will use pre-calculated fingerprint data.
If args.residues is specified, only those residues are considered in the calculation.
SSFs can be created for individual frames, a list of frames, or all frames in the trajectory.
If args.input_B is provided, two SSFs are combined for analysis.

Examples

Command-line usage:

$ stacker -s system 
    -trj testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd 
    -top testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop 
    -r 90-215 
    -fl 1-2 

$ stacker -s ssf 
    -trj testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd 
    -top testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop 
    -r 90-215 
    -fl 1-2 
    -g 10 
    -o testing/command_line_tests/pairwise/5JUP_N2_tUAG_aCUA_+1GCU_nowat_pairwise_avg_1to2.png 
    -d testing/command_line_tests/pairwise/5JUP_N2_tUAG_aCUA_+1GCU_data_1to2.txt.gz

combine_frames(frames_A, frames_B)[source]#

Combines two 2D numpy arrays (frames_A and frames_B) into a single array.

This function takes two 2D numpy arrays of the same shape and combines them into a new array. The upper triangular part (excluding the diagonal) of the resulting array is filled with the corresponding elements from frames_A, while the lower triangular part (including the diagonal) is filled with the corresponding elements from frames_B.

Parameters:

frames_Anumpy.ndarray: A 2D numpy array.
frames_Bnumpy.ndarray: A 2D numpy array of the same shape as frames_A.

Returns:

numpy.ndarray: A new 2D numpy array with combined elements from frames_A and frames_B.

Examples

>>> import stacker as st
>>> import numpy as np
>>> frames_A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> frames_B = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])
>>> st.combine_frames(frames_A, frames_B)
array([[9., 2., 3.],
       [6., 5., 6.],

[3., 2., 1.]])

stack_events_routine()[source]#

Identifies residue pairs with the most π-stacking interactions.

This routine analyzes a molecular dynamics trajectory and identifies residue pairs with the smallest center of geometry (COG) distances, typically indicating strong π-stacking. It outputs the top n_events residue pairings based on user input.

Parameters:

None: This function relies on global args for input values.

Raises:

FileNotFoundError: If the trajectory or topology files are not found.
ValueError: If input parameters are invalid or incompatible with the data.

See also

filter_traj: Filters trajectory data based on residue selection.
get_top_stacking: Extracts top stacking events from a trajectory.

Examples

Command-line usage:

$ stacker -s stack_events 
    -trj testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd 
    -top testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop 
    -r 90-215 
    -f 1 
    -n 5

$ stacker -s stack_events 
    -trj testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd 
    -top testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop 
    -r 90-100 
    -fl 1-10 
    -n 5

$ stacker -s stack_events 
    -trj testing/first10_5JUP_N2_tUAG_aCUA_+1GCU_nowat.mdcrd 
    -top testing/5JUP_N2_tUAG_aCUA_+1GCU_nowat.prmtop 
    -r 90-215 
    -n 5 
    -i testing/command_line_tests/pairwise/5JUP_N2_tUAG_aCUA_+1GCU_data_1to2.txt

compare_routine()[source]#

Compares pi-stacking events between two trajectories.

This routine analyzes two trajectory-derived stacking event files and identifies residue pairs with the largest change in average distances between the two systems. It processes the data, merges it, calculates discrepancies, and outputs the sorted results.

Parameters:

None: This function relies on global args for input values.

Raises:

FileNotFoundError: If one or both input files are not found.
ValueError: If the input files have incompatible formats or lack the required headers.

See also

find_row_with_header: Identifies the header row in a data file.
preprocess_df: Prepares the input dataframes for comparison.
pd.merge: Merges the two dataframes for discrepancy calculations.

Notes

The input files must contain tab-separated values with the header Row Column Value.
Residue pairings are sorted alphabetically before comparison to ensure consistent results.

Examples

Command-line usage:

$ stacker -s compare 
    -A /home66/esakkas/STACKER/SCRIPTS/slurmLogs_fingerprint/out_fingerprint_2418986 
    -B /home66/esakkas/STACKER/SCRIPTS/slurmLogs_fingerprint/out_fingerprint_2418997 
    -SA _tUAG_aCUA_+1GCU 
    -SB _tUAG_aCUA_+1CGU

exception InvalidRoutine[source]#

Bases: Exception

Specified command line routine was invalid

This is raised when the stacker -s ROUTINE was given an invalid ROUTINE.

Examples

Command Line:

$ stacker -s blah

__init__(*args, **kwargs)#

add_note(object, /)#: Exception.add_note(note) – add a note to the exception

args#

with_traceback(object, /)#: Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception ResEmpty[source]#

Bases: Exception

No Residues to subset to were provided

This is raised when a routine is given an invalid number of residues to subset to.

See also

fiter_traj_routine
bottaro_routine

Examples

Command Line:

$ stacker -s filter_traj -r -a C2,C4,C6

__init__(*args, **kwargs)#

add_note(object, /)#: Exception.add_note(note) – add a note to the exception

args#

with_traceback(object, /)#: Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception AtomEmpty[source]#

Bases: Exception

No Atomnames to subset to were provided

This is raised when a routine is given an invalid number of atomnames to subset to.

See also

fiter_traj_routine
bottaro_routine

Examples

Command Line:

$ stacker -s filter_traj -a -r 426,427

__init__(*args, **kwargs)#

add_note(object, /)#: Exception.add_note(note) – add a note to the exception

args#

with_traceback(object, /)#: Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

exception FrameEmpty[source]#

Bases: Exception

No Frames present in trajectory

__init__(*args, **kwargs)#

add_note(object, /)#: Exception.add_note(note) – add a note to the exception

args#

with_traceback(object, /)#: Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

block_printing()[source]#

Disable printing to standard output

References

[1] https://stackoverflow.com/a/8391735

enable_printing()[source]#

Enable printing to standard output

References

[1] https://stackoverflow.com/a/8391735

stacker package

Contents

stacker package#

Submodules#

stacker.file_manipulation module#

stacker.kmeans module#

stacker.pairwise_distance module#

stacker.residue_movement module#

stacker.vector module#

stacker.visualization module#

Module contents#