Datasets

MolDynPlot includes several dataset classes that build on Dataset with additions specific to molecular dynamics simulation data.

CorrDataset

class moldynplot.dataset.CorrDataset.CorrDataset(verbose=1, debug=0, **kwargs)

Bases: moldynplot.myplotspec.Dataset.Dataset

Represents correlations between different datasets.

classmethod get_cache_key(*args, **kwargs)

Generates tuple of arguments to be used as key for dataset cache.

static get_cache_message(cache_key)

Generates message to be used when reloading previously-loaded dataset.

Parameters:cache_key (tuple) – key with which dataset object is stored in dataset cache
Returns:str – message to be used when reloading previously-loaded dataset
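The two cache methods above suggest a pattern in which a tuple of arguments identifies a dataset that has already been loaded. The following is a minimal sketch of that pattern, not MolDynPlot's implementation; ToyDataset, load_dataset, and the contents of the key are hypothetical.

```python
class ToyDataset(object):
    """Hypothetical stand-in for a Dataset subclass."""

    def __init__(self, infile, **kwargs):
        self.infile = infile

    @classmethod
    def get_cache_key(cls, infile, *args, **kwargs):
        # Assumption: the class and the input path are enough to identify a dataset
        return (cls, infile)

    @staticmethod
    def get_cache_message(cache_key):
        return "Previously loaded from '{0}'".format(cache_key[1])


def load_dataset(cls, dataset_cache, **kwargs):
    """Returns the cached dataset if present; otherwise constructs and stores one."""
    cache_key = cls.get_cache_key(**kwargs)
    if cache_key in dataset_cache:
        print(cls.get_cache_message(cache_key))
        return dataset_cache[cache_key]
    dataset_cache[cache_key] = cls(**kwargs)
    return dataset_cache[cache_key]


dataset_cache = {}
first = load_dataset(ToyDataset, dataset_cache, infile="corr.dat")
second = load_dataset(ToyDataset, dataset_cache, infile="corr.dat")
```

The second call finds the key in the cache and returns the same object rather than constructing a new one.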

HSQCDataset

class moldynplot.dataset.HSQCDataset.HSQCDataset(hoffset=0, noffset=0, outfile=None, interactive=False, **kwargs)

Bases: moldynplot.myplotspec.Dataset.Dataset

Represents two-dimensional NMR data.

hsqc_df

DataFrame – DataFrame whose two-dimensional index corresponds to hydrogen and nitrogen chemical shift in ppm and whose columns correspond to intensity

Parameters:
  • infile{s} (list) – Path(s) to input file(s); may contain environment variables and wildcards
  • hoffset (float, optional) – Offset added to 1H dimension
  • noffset (float, optional) – Offset added to 15N dimension
  • outfile (str, optional) – Path to output file; may contain environment variables
  • interactive (bool) – Provide IPython prompt after reading and processing data
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
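As an illustration of the hsqc_df layout and the two offsets, the sketch below builds a small DataFrame with a two-level index and shifts both dimensions. The level names "1H" and "15N", the column name "intensity", and the helper apply_offsets are assumptions made for this example and may not match the names used by HSQCDataset.

```python
import pandas as pd


def apply_offsets(hsqc_df, hoffset=0.0, noffset=0.0):
    """Returns a copy of hsqc_df with both chemical shift dimensions shifted."""
    shifted = hsqc_df.copy()
    h = shifted.index.get_level_values("1H") + hoffset
    n = shifted.index.get_level_values("15N") + noffset
    shifted.index = pd.MultiIndex.from_arrays([h, n], names=["1H", "15N"])
    return shifted


# Toy spectrum: four points on a 2 x 2 grid of chemical shifts (ppm)
index = pd.MultiIndex.from_product(
    [[8.0, 8.1], [120.0, 121.0]], names=["1H", "15N"])
hsqc_df = pd.DataFrame({"intensity": [1.0, 2.0, 3.0, 4.0]}, index=index)

shifted_df = apply_offsets(hsqc_df, hoffset=0.05, noffset=-0.5)
```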
static construct_argparser(parser_or_subparsers=None, **kwargs)

Adds arguments to an existing argument parser, constructs a subparser, or constructs a new parser

Parameters:
  • parser_or_subparsers (ArgumentParser, _SubParsersAction, optional) – If ArgumentParser, existing parser to which arguments will be added; if _SubParsersAction, collection of subparsers to which a new argument parser will be added; if None, a new argument parser will be generated
  • kwargs (dict) – Additional keyword arguments
Returns:

ArgumentParser – Argument parser or subparser
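The three cases handled by construct_argparser can be sketched with the standard argparse module. This is an illustrative reimplementation rather than the library's code; the subcommand name "hsqc" and the single -infiles option are assumptions chosen for the example.

```python
import argparse


def construct_argparser(parser_or_subparsers=None, **kwargs):
    """Sketch of the three documented cases for parser_or_subparsers."""
    if isinstance(parser_or_subparsers, argparse.ArgumentParser):
        # Existing parser: arguments are added to it directly
        parser = parser_or_subparsers
    elif isinstance(parser_or_subparsers, argparse._SubParsersAction):
        # Collection of subparsers: a new parser is added to the collection
        parser = parser_or_subparsers.add_parser("hsqc", **kwargs)
    else:
        # None: a new parser is generated
        parser = argparse.ArgumentParser(**kwargs)
    parser.add_argument("-infiles", nargs="+", metavar="INFILE")
    return parser


top = argparse.ArgumentParser()
subparsers = top.add_subparsers(dest="command")
construct_argparser(subparsers)
args = top.parse_args(["hsqc", "-infiles", "a.dat", "b.dat"])
```

Accepting either a parser or a collection of subparsers lets the same method serve a standalone script and a larger tool with one subcommand per dataset class.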

read(**kwargs)

Reads HSQC data from one or more infiles into a DataFrame.

Parameters:
  • infile{s} (str) – Path(s) to input file(s); may contain environment variables and wildcards
  • dataframe_kw (dict) – Keyword arguments passed to DataFrame (hdf5 only)
  • read_csv_kw (dict) – Keyword arguments passed to read_csv (text only)
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
Returns:

DataFrame – DataFrame

MDGXDataset

class moldynplot.dataset.MDGXDataset.MDGXDataset(infile, selections=None, **kwargs)

Bases: moldynplot.myplotspec.Dataset.Dataset

Represents MDGX force field parameterization data.

classmethod get_cache_key(infile, selections=None, *args, **kwargs)

Generates tuple of arguments to be used as key for dataset cache.

SAXSDataset

class moldynplot.dataset.SAXSDataset.SAXSDataset(infile, address=None, dataset_cache=None, **kwargs)

Bases: moldynplot.myplotspec.Dataset.Dataset

Represents Small-Angle X-ray Scattering Data.

Initializes dataset.

Parameters:
  • infile (str) – Path to input file, may contain environment variables
  • address (str) – Address within hdf5 file from which to load dataset (hdf5 only)
  • slice (slice) – Slice to load from hdf5 dataset (hdf5 only)
  • dataframe_kw (dict) – Keyword arguments passed to pandas.DataFrame(...) (hdf5 only)
  • read_csv_kw (dict) – Keyword arguments passed to pandas.read_csv(...) (text only)
  • verbose (int) – Level of verbose output
  • debug (int) – Level of debug output
  • kwargs (dict) – Additional keyword arguments
scale(scale, **kwargs)

Scales SAXS intensity, either by a constant or to match the intensity of a target dataset.

Parameters:
  • scale (float, str) – If float, proportion by which to scale intensity; if str, path to input file to which intensity will be scaled, may contain environment variables
  • curve_fit_kw (dict) – Keyword arguments passed to scipy.optimize.curve_fit (scale to match target dataset only)
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
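The two scaling modes can be sketched as follows. This is a simplified illustration, not the library's implementation: the target is assumed to be already loaded as an array on the same q grid (the documented method accepts a path instead), and the fitted model is assumed to be a single multiplicative constant. The names scale_intensity and linear_scale are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit


def linear_scale(x, a):
    """Model fitted when matching a target: intensity times one constant."""
    return a * x


def scale_intensity(intensity, scale, curve_fit_kw=None):
    """Scales by a constant, or fits a constant to match a target array."""
    intensity = np.asarray(intensity, dtype=float)
    if isinstance(scale, (int, float)):
        return intensity * scale
    target = np.asarray(scale, dtype=float)
    kw = {"p0": [1.0]}
    kw.update(curve_fit_kw or {})
    popt, _ = curve_fit(linear_scale, intensity, target, **kw)
    return intensity * popt[0]


q = np.linspace(0.01, 0.3, 50)
simulated = np.exp(-(q * 15.0) ** 2 / 3.0)   # toy Guinier-like curve
target = 2.5 * simulated                      # toy target intensity
scaled = scale_intensity(simulated, target)
```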

SAXSDiffDataset

class moldynplot.dataset.SAXSDataset.SAXSDiffDataset(dataset_cache=None, **kwargs)

Bases: moldynplot.dataset.SAXSDataset.SAXSDataset

Represents Small Angle X-ray Scattering difference data.

SAXSExperimentDataset

class moldynplot.dataset.SAXSDataset.SAXSExperimentDataset(scale=False, **kwargs)

Bases: moldynplot.dataset.SAXSDataset.SAXSDataset

Represents Small Angle X-ray Scattering experimental data.

Parameters:
  • infile (str) – Path to input file, may contain environment variables
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments

SequenceDataset

Processes data that is a function of amino acid sequence


Command-line interface

Optional arguments

Argument Description
-h, --help show this help message and exit

Subcommands

Argument Description
sequence Process standard data
chemical_shift Process NMR chemical shift data
relax Process NMR relaxation data
ired Process NMR relaxation data calculated from MD simulation using the iRED method as implemented in cpptraj

sequence subcommand

Process standard data

Optional arguments
Argument Description
-h, --help show this help message and exit
-v, --verbose enable verbose output, may be specified more than once
-q, --quiet disable verbose output
-d, --debug enable debug output, may be specified more than once
-I, --interactive enable interactive ipython terminal after loading and processing data
Input
Argument Description
-indexfile  INDEXFILE text file from which to load residue names; should list amino acids in the form ‘XAA:#’ separated by whitespace; if omitted will be taken from rows of first infile; may contain environment variables
-infiles  INFILE [INFILE ...] file(s) from which to load data; may be text or hdf5; may contain environment variables and wildcards
Output
Argument Description
-outfile  OUTFILE text or hdf5 file to which processed DataFrame will be output; may contain environment variables
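The option table above can be mirrored with argparse to see how the flags combine. The parser below is reconstructed from the table for illustration only; the defaults and the use of a counting action for -v and -d are assumptions, and the name of the script that exposes these subcommands is not given in this section.

```python
import argparse

parser = argparse.ArgumentParser(description="Process standard data")
# "may be specified more than once" is modeled here with action="count"
parser.add_argument("-v", "--verbose", action="count", default=0)
parser.add_argument("-q", "--quiet", action="store_true")
parser.add_argument("-d", "--debug", action="count", default=0)
parser.add_argument("-I", "--interactive", action="store_true")
parser.add_argument("-indexfile", metavar="INDEXFILE")
parser.add_argument("-infiles", nargs="+", metavar="INFILE")
parser.add_argument("-outfile", metavar="OUTFILE")

args = parser.parse_args(
    ["-v", "-v", "-infiles", "r1.dat", "r2.dat", "-outfile", "out.h5"])
```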

chemical_shift subcommand

Process NMR chemical shift data

Optional arguments
Argument Description
-h, --help show this help message and exit
-v, --verbose enable verbose output, may be specified more than once
-q, --quiet disable verbose output
-d, --debug enable debug output, may be specified more than once
-I, --interactive enable interactive ipython terminal after loading and processing data
Input
Argument Description
-delays  DELAY [DELAY ...] delays for each infile, if infiles represent a series; number of delays must match number of infiles
-indexfile  INDEXFILE text file from which to load residue names; should list amino acids in the form ‘XAA:#’ separated by whitespace; if omitted will be taken from rows of first infile; may contain environment variables
Action
Argument Description
-relax  [CALC_RELAX] Calculate relaxation rates and standard errors; may additionally specify kind of relaxation being measured (e.g. r1, r2)
Input
Argument Description
-infiles  INFILE [INFILE ...] file(s) from which to load data; may be text or hdf5; may contain environment variables and wildcards
Output
Argument Description
-outfile  OUTFILE text or hdf5 file to which processed DataFrame will be output; may contain environment variables

relax subcommand

Process NMR relaxation data

Optional arguments
Argument Description
-h, --help show this help message and exit
-v, --verbose enable verbose output, may be specified more than once
-q, --quiet disable verbose output
-d, --debug enable debug output, may be specified more than once
-I, --interactive enable interactive ipython terminal after loading and processing data
Input
Argument Description
-indexfile  INDEXFILE text file from which to load residue names; should list amino acids in the form ‘XAA:#’ separated by whitespace; if omitted will be taken from rows of first infile; may contain environment variables
-infiles  INFILE [INFILE ...] file(s) from which to load data; may be text or hdf5; may contain environment variables and wildcards
Output
Argument Description
-outfile  OUTFILE text or hdf5 file to which processed DataFrame will be output; may contain environment variables

ired subcommand

Process NMR relaxation data calculated from MD simulation using the iRED method as implemented in cpptraj

Optional arguments
Argument Description
-h, --help show this help message and exit
-v, --verbose enable verbose output, may be specified more than once
-q, --quiet disable verbose output
-d, --debug enable debug output, may be specified more than once
-I, --interactive enable interactive ipython terminal after loading and processing data
Input
Argument Description
-infiles  INFILE [INFILE ...] File(s) from which to load data; may be text or hdf5; if text, may be pandas-formatted DataFrames, or may be cpptraj-formatted iRED output; may contain environment variables and wildcards
-indexfile  INDEXFILE text file from which to load residue names; should list amino acids in the form ‘XAA:#’ separated by whitespace; if omitted will be taken from rows of first infile; may contain environment variables
Output
Argument Description
-outfile  OUTFILE text or hdf5 file to which processed DataFrame will be output; may contain environment variables

SequenceDataset

class moldynplot.dataset.SequenceDataset.SequenceDataset(calc_pdist=False, outfile=None, interactive=False, **kwargs)

Bases: moldynplot.myplotspec.Dataset.Dataset

Represents data that is a function of amino acid sequence.

sequence_df

DataFrame – DataFrame whose index corresponds to amino acid residue number in the form XAA:# and whose columns are a series of quantities specific to each residue. Standard errors of these quantities may be represented by adjacent columns with ‘ se’ appended to their names.

               r1     r1 se        r2     r2 se  ...
residue
GLN:2    2.451434  0.003734  5.041334  0.024776  ...
TYR:3    2.443613  0.004040  5.138383  0.025376  ...
LYS:4    2.511626  0.004341  5.589428  0.026236  ...
...      ...       ...       ...       ...       ...
Parameters:
  • infile{s} (list) – Path(s) to input file(s); may contain environment variables and wildcards
  • use_indexes (list) – Residue indexes to select from DataFrame, once DataFrame has already been loaded
  • calc_pdist (bool) – Calculate probability distribution using calc_pdist()
  • dataset_cache (dict) – Cache of previously-loaded Datasets
  • interactive (bool) – Provide IPython prompt after reading and processing data
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
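The sequence_df layout shown above can be reproduced with pandas. In the sketch below, the pairing of each quantity with its ' se' column follows the description of sequence_df; the helper select_residue_numbers and its reading of use_indexes as residue numbers are assumptions made for illustration.

```python
import pandas as pd

sequence_df = pd.DataFrame(
    {"r1": [2.451434, 2.443613, 2.511626],
     "r1 se": [0.003734, 0.004040, 0.004341],
     "r2": [5.041334, 5.138383, 5.589428],
     "r2 se": [0.024776, 0.025376, 0.026236]},
    index=pd.Index(["GLN:2", "TYR:3", "LYS:4"], name="residue"))

# Pair each quantity with its standard error column, if one is present
value_columns = [c for c in sequence_df.columns if not c.endswith(" se")]
se_columns = {c: c + " se" for c in value_columns
              if c + " se" in sequence_df.columns}


def select_residue_numbers(df, use_indexes):
    """Keeps rows whose residue number (the part after ':') is in use_indexes."""
    numbers = [int(name.split(":")[1]) for name in df.index]
    mask = [number in use_indexes for number in numbers]
    return df.loc[mask]


selected = select_residue_numbers(sequence_df, [2, 4])
```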
static construct_argparser(parser_or_subparsers=None, **kwargs)

Adds arguments to an existing argument parser, constructs a subparser, or constructs a new parser

Parameters:
  • parser_or_subparsers (ArgumentParser, _SubParsersAction, optional) – If ArgumentParser, existing parser to which arguments will be added; if _SubParsersAction, collection of subparsers to which a new argument parser will be added; if None, a new argument parser will be generated
  • kwargs (dict) – Additional keyword arguments
Returns:

ArgumentParser – Argument parser or subparser

classmethod get_cache_key(**kwargs)

Generates key for dataset cache.

See SequenceDataset for argument details.

Returns:tuple – Cache key; contains arguments sufficient to reconstruct dataset
read(**kwargs)

Reads sequence from one or more infiles into a DataFrame.

Extends Dataset with option to read in residue indexes.

calc_pdist(**kwargs)

Calculates probability distribution across sequence.

Parameters:
  • df (DataFrame) – DataFrame; probability distribution will be calculated for each column using rows as data points
  • pdist_kw (dict) – Keyword arguments used to configure probability distribution calculation
  • pdist_kw[columns] (list) – Columns for which to calculate probability distribution
  • pdist_kw[mode] (ndarray, dict) – Method of calculating probability distribution; eventually will support ‘hist’ for histogram and ‘kde’ for kernel density estimate, though presently only kde is implemented
  • pdist_kw[grid] (ndarray, dict, optional) – Grid on which to calculate kernel density estimate; may be a single ndarray that will be applied to all columns or a dictionary whose keys are column names and values are ndarrays corresponding to the grid for each column; for any column for which grid is not specified, a grid of 1000 points between the minimum value minus three times the standard deviation and the maximum value plus three times the standard deviation will be used
  • pdist_kw[bandwidth] (float, dict, str, optional) – Bandwidth to use for kernel density estimates; may be a single float that will be applied to all columns or a dictionary whose keys are column names and values are floats corresponding to the bandwidth for each column; for any column for which bandwidth is not specified, the standard deviation will be used; alternatively may be ‘se’, in which case the standard error of each value will be used
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
Returns:

dict – Dictionary whose keys are columns in df and values are DataFrames whose indexes are the grid for that column and contain a single column ‘probability’ containing the normalized probability at each grid point
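The default grid and bandwidth described above can be illustrated with a small Gaussian kernel density estimate written directly in NumPy. This is a sketch under stated assumptions, not the library's code: the population standard deviation (ddof=0) is used, "normalized" is taken to mean that the probabilities sum to one over the grid, and the ‘se’ bandwidth option is omitted. The function name calc_pdist_kde is hypothetical.

```python
import numpy as np
import pandas as pd


def calc_pdist_kde(df, columns=None, grid=None, bandwidth=None):
    """Gaussian KDE per column with the documented default grid and bandwidth."""
    pdist = {}
    for column in (df.columns if columns is None else columns):
        values = df[column].dropna().values.astype(float)
        sd = values.std()
        # Per-column grid from a dict, a shared grid, or the documented default
        col_grid = grid.get(column) if isinstance(grid, dict) else grid
        if col_grid is None:
            col_grid = np.linspace(
                values.min() - 3 * sd, values.max() + 3 * sd, 1000)
        col_grid = np.asarray(col_grid, dtype=float)
        # Per-column bandwidth from a dict, a shared float, or the default
        col_bw = (bandwidth.get(column) if isinstance(bandwidth, dict)
                  else bandwidth)
        if col_bw is None:
            col_bw = sd
        # Sum of Gaussians centered on each value, evaluated on the grid
        z = (col_grid[:, np.newaxis] - values[np.newaxis, :]) / col_bw
        density = np.exp(-0.5 * z ** 2).sum(axis=1)
        probability = density / density.sum()
        pdist[column] = pd.DataFrame(
            {"probability": probability},
            index=pd.Index(col_grid, name=column))
    return pdist


df = pd.DataFrame({"r1": [2.45, 2.44, 2.51, 2.48],
                   "r2": [5.04, 5.14, 5.59, 5.30]})
pdist = calc_pdist_kde(df)
```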

ChemicalShiftDataset

class moldynplot.dataset.SequenceDataset.ChemicalShiftDataset(delays=None, calc_relax=False, calc_pdist=False, outfile=None, interactive=False, **kwargs)

Bases: moldynplot.dataset.SequenceDataset.SequenceDataset

Represents NMR peak list data.

Parameters:
  • infile{s} (list) – Path(s) to input file(s); may contain environment variables and wildcards
  • delays (list) – Delays corresponding to series of infiles; used to name columns of merged sequence DataFrame
  • use_indexes (list) – Residue indexes to select from DataFrame, once DataFrame has already been loaded
  • calc_pdist (bool) – Calculate probability distribution
  • pdist_kw (dict) – Keyword arguments used to configure probability distribution calculation
  • dataset_cache (dict) – Cache of previously-loaded Datasets
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
static construct_argparser(parser_or_subparsers=None, **kwargs)

Adds arguments to an existing argument parser, constructs a subparser, or constructs a new parser

Parameters:
  • parser_or_subparsers (ArgumentParser, _SubParsersAction, optional) – If ArgumentParser, existing parser to which arguments will be added; if _SubParsersAction, collection of subparsers to which a new argument parser will be added; if None, a new argument parser will be generated
  • kwargs (dict) – Additional keyword arguments
Returns:

ArgumentParser – Argument parser or subparser

read(**kwargs)

Reads sequence from one or more infiles into a DataFrame.

Extends Dataset with option to read in residue indexes.

calc_relax(**kwargs)

Calculates relaxation rates.

Parameters:
  • df (DataFrame) – DataFrame; probability distribution will be calculated for each column using rows as data points
  • relax_kw (dict) – Keyword arguments used to configure relaxation rate calculation
  • relax_kw[kind] (str) – Kind of relaxation rate being calculated; will be used to name column
  • relax_kw[intensity_method] (str) – Metric to use for peak intensity; may be ‘height’ (default) or ‘volume’
  • relax_kw[error_method] (str) – Metric to use for error calculation; may be ‘rmse’ for root-mean-square error (default) or ‘mae’ for mean absolute error
  • relax_kw[n_synth_datasets] (int) – Number of synthetic datasets to use for error calculation
Returns:

DataFrame – Sequence DataFrame with additional columns for relaxation rate and standard error
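This section does not state how calc_relax fits rates or how the synthetic datasets are generated. The sketch below therefore rests on two assumptions: peak intensity decays mono-exponentially with delay, and the standard error is estimated by refitting synthetic datasets built from the fitted curve plus Gaussian noise whose width is the root-mean-square error of the fit. The names decay and fit_relaxation are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit


def decay(t, i0, rate):
    """Assumed model: mono-exponential decay of peak intensity."""
    return i0 * np.exp(-rate * t)


def fit_relaxation(delays, intensities, n_synth_datasets=100, seed=0):
    """Returns the fitted rate and a Monte Carlo estimate of its standard error."""
    delays = np.asarray(delays, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    p0 = [intensities.max(), 1.0 / delays.mean()]
    popt, _ = curve_fit(decay, delays, intensities, p0=p0)
    fitted = decay(delays, *popt)
    rmse = np.sqrt(np.mean((intensities - fitted) ** 2))
    rng = np.random.RandomState(seed)
    synth_rates = []
    for _ in range(n_synth_datasets):
        # Synthetic dataset: fitted curve plus noise scaled by the fit error
        synth = fitted + rng.normal(0.0, rmse, size=fitted.size)
        synth_popt, _ = curve_fit(decay, delays, synth, p0=popt)
        synth_rates.append(synth_popt[1])
    return popt[1], float(np.std(synth_rates, ddof=1))


# Toy peak intensities for one residue, with a true rate of 2.0
delays = np.array([0.1, 0.2, 0.4, 0.8, 1.2])
noise = np.array([1.01, 0.99, 1.01, 0.99, 1.00])
intensities = 100.0 * np.exp(-2.0 * delays) * noise
rate, rate_se = fit_relaxation(delays, intensities)
```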

RelaxDataset

class moldynplot.dataset.SequenceDataset.RelaxDataset(calc_pdist=False, outfile=None, interactive=False, **kwargs)

Bases: moldynplot.dataset.SequenceDataset.SequenceDataset

Represents NMR relaxation data as a function of residue number.

Parameters:
  • infile{s} (list) – Path(s) to input file(s); may contain environment variables and wildcards
  • use_indexes (list) – Residue indexes to select from DataFrame, once DataFrame has already been loaded
  • calc_pdist (bool) – Calculate probability distribution
  • pdist_kw (dict) – Keyword arguments used to configure probability distribution calculation
  • dataset_cache (dict) – Cache of previously-loaded Datasets
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
static construct_argparser(parser_or_subparsers=None, **kwargs)

Adds arguments to an existing argument parser, constructs a subparser, or constructs a new parser

Parameters:
  • parser_or_subparsers (ArgumentParser, _SubParsersAction, optional) – If ArgumentParser, existing parser to which arguments will be added; if _SubParsersAction, collection of subparsers to which a new argument parser will be added; if None, a new argument parser will be generated
  • kwargs (dict) – Additional keyword arguments
Returns:

ArgumentParser – Argument parser or subparser

write_for_relax(outfile, **kwargs)

Writes sequence DataFrame in format readable by relax.

IREDDataset

class moldynplot.dataset.SequenceDataset.IREDDataset(calc_pdist=False, outfile=None, interactive=False, **kwargs)

Bases: moldynplot.dataset.SequenceDataset.RelaxDataset

Represents iRED NMR relaxation data as a function of residue number.

Parameters:
  • infile{s} (list) – Path(s) to input file(s); may contain environment variables and wildcards
  • use_indexes (list) – Residue indexes to select from DataFrame, once DataFrame has already been loaded
  • calc_pdist (bool) – Calculate probability distribution
  • pdist_kw (dict) – Keyword arguments used to configure probability distribution calculation
  • dataset_cache (dict) – Cache of previously-loaded Datasets
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
static construct_argparser(parser_or_subparsers=None, **kwargs)

Adds arguments to an existing argument parser, constructs a subparser, or constructs a new parser

Parameters:
  • parser_or_subparsers (ArgumentParser, _SubParsersAction, optional) – If ArgumentParser, existing parser to which arguments will be added; if _SubParsersAction, collection of subparsers to which a new argument parser will be added; if None, a new argument parser will be generated
  • kwargs (dict) – Additional keyword arguments
Returns:

ArgumentParser – Argument parser or subparser

static average_independent(relax_dfs=None, order_dfs=None, **kwargs)

Calculates the average and standard error of a set of independent iRED datasets.

Parameters:
  • relax_dfs (list) – DataFrames containing data from relax infiles
  • order_dfs (list) – DataFrames containing data from order infiles
  • kwargs (dict) – Additional keyword arguments
Returns:

df (DataFrame) – Averaged DataFrame including relax and order
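Averaging independent datasets amounts to taking, for each residue and quantity, the mean across DataFrames and the sample standard deviation divided by the square root of the number of DataFrames. The sketch below shows that calculation for a single list of DataFrames sharing an index and columns; it does not reproduce the separate handling of relax and order inputs, and the helper name average_dfs is hypothetical.

```python
import pandas as pd


def average_dfs(dfs):
    """Mean and standard error across DataFrames with shared index and columns."""
    stacked = pd.concat(dfs)
    grouped = stacked.groupby(level=0, sort=False)
    mean = grouped.mean()
    # Standard error of the mean: sample standard deviation / sqrt(n samples)
    se = grouped.std() / (len(dfs) ** 0.5)
    se.columns = [c + " se" for c in se.columns]
    # Place each ' se' column next to the quantity it describes
    ordered = [name for c in mean.columns for name in (c, c + " se")]
    return pd.concat([mean, se], axis=1)[ordered]


# Toy data from two hypothetical independent simulations
residues = pd.Index(["GLN:2", "TYR:3"], name="residue")
run_a = pd.DataFrame({"r1": [2.0, 1.0], "s2": [0.8, 0.9]}, index=residues)
run_b = pd.DataFrame({"r1": [4.0, 1.0], "s2": [0.8, 0.7]}, index=residues)
averaged = average_dfs([run_a, run_b])
```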

read(**kwargs)

Reads iRED sequence data from one or more infiles into a DataFrame.

infiles may contain relaxation data, order parameters, or both. If more than one infile is provided, the resulting DataFrame will contain their average, and the standard error will be calculated assuming the infiles represent independent samples.

After generating the DataFrame from infiles, the index may be set by loading a list of residue names and numbers in the form XAA:# from indexfile. This is useful when loading data from files that do not specify residue names.

Parameters:
  • infile{s} (list) – Path(s) to input file(s); may contain environment variables and wildcards
  • dataframe_kw (dict) – Keyword arguments passed to DataFrame (hdf5 only)
  • read_csv_kw (dict) – Keyword arguments passed to read_csv (text only)
  • indexfile (str) – Path to index file; may contain environment variables
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
Returns:

df (DataFrame) – iRED sequence DataFrame
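An index file of the kind described above is a whitespace-separated list of XAA:# entries. A minimal sketch of parsing one and attaching the names to unnamed values follows; the validation pattern, the helper parse_index, and the toy values are assumptions for illustration.

```python
import re


def parse_index(text):
    """Splits whitespace-separated 'XAA:#' entries and checks their form."""
    entries = text.split()
    for entry in entries:
        # Assumption: a residue name, a colon, then an integer residue number
        if re.match(r"^\w+:-?\d+$", entry) is None:
            raise ValueError("Unexpected index entry: {0}".format(entry))
    return entries


residues = parse_index("GLN:2 TYR:3\nLYS:4\n")

# Toy values read from a file that does not specify residue names
values = [0.85, 0.87, 0.82]
indexed = dict(zip(residues, values))
```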

TimeSeriesDataset

Processes data that is a function of time


Command-line interface

Optional arguments

Argument Description
-h, --help show this help message and exit

Subcommands

Argument Description
timeseries Process standard data
ired Process NMR relaxation data calculated from MD simulation using the iRED method as implemented in cpptraj
pre Process NMR paramagnetic relaxation enhancement data

timeseries subcommand

Process standard data

Optional arguments
Argument Description
-h, --help show this help message and exit
-v, --verbose enable verbose output, may be specified more than once
-q, --quiet disable verbose output
-d, --debug enable debug output, may be specified more than once
-I, --interactive enable interactive ipython terminal after loading and processing data
Action
Argument Description
-dt  DT time between frames
-toffset  TOFFSET offset to add to index (time or frame number)
-downsample  DOWNSAMPLE factor by which to downsample data
--pdist calculate probability distribution over timeseries
Input
Argument Description
-infiles  INFILE [INFILE ...] file(s) from which to load data; may be text or hdf5; may contain environment variables and wildcards
Output
Argument Description
-outfile  OUTFILE text or hdf5 file to which processed DataFrame will be output; may contain environment variables

ired subcommand

Process NMR relaxation data calculated from MD simulation using the iRED method as implemented in cpptraj

Optional arguments
Argument Description
-h, --help show this help message and exit
-v, --verbose enable verbose output, may be specified more than once
-q, --quiet disable verbose output
-d, --debug enable debug output, may be specified more than once
-I, --interactive enable interactive ipython terminal after loading and processing data
Action
Argument Description
--mean Calculate mean and standard error over timeseries
-dt  DT time between frames
-toffset  TOFFSET offset to add to index (time or frame number)
-downsample  DOWNSAMPLE factor by which to downsample data
--pdist calculate probability distribution over timeseries
Input
Argument Description
-infiles  INFILE [INFILE ...] File(s) from which to load data; may be text or hdf5; if text, may be pandas-formatted DataFrames, or may be cpptraj-formatted iRED output; may contain environment variables and wildcards
-indexfile  INDEXFILE text file from which to load residue names; should list amino acids in the form ‘XAA:#’ separated by whitespace; if omitted will be taken from rows of first infile; may contain environment variables
Output
Argument Description
-outfile  OUTFILE text or hdf5 file to which processed DataFrame will be output; may contain environment variables

pre subcommand

Process NMR paramagnetic relaxation enhancement data

Optional arguments
Argument Description
-h, --help show this help message and exit
-v, --verbose enable verbose output, may be specified more than once
-q, --quiet disable verbose output
-d, --debug enable debug output, may be specified more than once
-I, --interactive enable interactive ipython terminal after loading and processing data
Action
Argument Description
-dt  DT time between frames
-toffset  TOFFSET offset to add to index (time or frame number)
-downsample  DOWNSAMPLE factor by which to downsample data
--pdist calculate probability distribution over timeseries
Input
Argument Description
-infiles  INFILE [INFILE ...] file(s) from which to load data; may be text or hdf5; may contain environment variables and wildcards
Output
Argument Description
-outfile  OUTFILE text or hdf5 file to which processed DataFrame will be output; may contain environment variables

TimeSeriesDataset

class moldynplot.dataset.TimeSeriesDataset.TimeSeriesDataset(dt=None, toffset=None, downsample=None, calc_pdist=False, outfile=None, interactive=False, **kwargs)

Bases: moldynplot.myplotspec.Dataset.Dataset

Represents data as a function of time.

timeseries_df

DataFrame – DataFrame whose index corresponds to time as represented by frame number or chemical time and whose columns are a series of quantities as a function of time.

Parameters:
  • infile{s} (list) – Path(s) to input file(s); may contain environment variables and wildcards
  • dt (float) – Time interval between points; units unspecified
  • toffset (float) – Time offset to be added to all points (i.e. time of first point)
  • downsample (int) – Interval by which to downsample points
  • downsample_mode (str) – Method of downsampling; may be ‘mean’ or ‘mode’
  • calc_pdist (bool) – Calculate probability distribution
  • pdist_key (str) – Column of which to calculate probability distribution
  • kde_kw (dict) – Keyword arguments passed to sklearn.neighbors.KernelDensity; key argument is ‘bandwidth’
  • grid (ndarray) – Grid on which to calculate probability distribution
  • interactive (bool) – Provide IPython prompt after reading and processing data
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
Todo:
  • Calculate pdist using histogram
  • Verbose pdist
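The dt and toffset parameters convert a frame-number index into time. The sketch below assumes the index is first multiplied by dt and then shifted by toffset; the order of operations, the helper set_time_index, and the index name "time" are assumptions for this example.

```python
import numpy as np
import pandas as pd


def set_time_index(df, dt=None, toffset=None):
    """Returns a copy of df whose index is scaled by dt and shifted by toffset."""
    out = df.copy()
    index = np.asarray(out.index, dtype=float)
    if dt is not None:
        index = index * dt
    if toffset is not None:
        index = index + toffset
    out.index = pd.Index(index, name="time")
    return out


# Toy time series indexed by frame number
timeseries_df = pd.DataFrame(
    {"rmsd": [1.0, 1.2, 1.1, 1.4, 1.3]},
    index=pd.Index(np.arange(5), name="frame"))
timed_df = set_time_index(timeseries_df, dt=0.5, toffset=10.0)
```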
static construct_argparser(parser_or_subparsers=None, **kwargs)

Adds arguments to an existing argument parser, constructs a subparser, or constructs a new parser

Parameters:
  • parser_or_subparsers (ArgumentParser, _SubParsersAction, optional) – If ArgumentParser, existing parser to which arguments will be added; if _SubParsersAction, collection of subparsers to which a new argument parser will be added; if None, a new argument parser will be generated
  • kwargs (dict) – Additional keyword arguments
Returns:

ArgumentParser – Argument parser or subparser

downsample(downsample, downsample_mode=u'mean', **kwargs)

Downsamples time series.

Parameters:
  • downsample (int) – Interval by which to downsample points
  • downsample_mode (str) – Method of downsampling; may be ‘mean’ or ‘mode’
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
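The two downsampling modes can be sketched with a pandas groupby over consecutive blocks of rows. The treatment of trailing points that do not fill a block, the labeling of each block by the mean of its index values, and the helper name downsample_df are assumptions; the library may handle these details differently.

```python
import numpy as np
import pandas as pd


def downsample_df(df, downsample, downsample_mode="mean"):
    """Reduces each block of `downsample` consecutive rows to a single row."""
    # Assumption: trailing points that do not fill a complete block are dropped
    n_blocks = len(df) // downsample
    trimmed = df.iloc[:n_blocks * downsample]
    block = np.arange(len(trimmed)) // downsample
    grouped = trimmed.groupby(block)
    if downsample_mode == "mean":
        reduced = grouped.mean()
    elif downsample_mode == "mode":
        # Most frequent value in each block; ties resolve to the smallest value
        reduced = grouped.agg(lambda s: s.mode().iloc[0])
    else:
        raise ValueError("downsample_mode must be 'mean' or 'mode'")
    # Assumption: each block is labeled with the mean of its index values
    block_index = np.asarray(trimmed.index, dtype=float)
    block_index = block_index.reshape(n_blocks, downsample).mean(axis=1)
    reduced.index = pd.Index(block_index, name=df.index.name)
    return reduced


df = pd.DataFrame({"x": [1.0, 3.0, 5.0, 7.0, 9.0],
                   "contact": [1, 1, 0, 0, 1]})
mean_df = downsample_df(df, 2, "mean")
mode_df = downsample_df(df, 2, "mode")
```

Averaging suits continuous quantities, while the mode suits discrete ones such as a contact that is either formed or not.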
calc_pdist(**kwargs)

Calculates probability distribution of time series.

Parameters:
  • pdist_kw (dict) – Keyword arguments used to configure probability distribution calculation
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
timeseries_to_sequence(**kwargs)

Calculates the mean and standard error over a timeseries.

Parameters:
  • {timeseries}d{ata}f{rame} (DataFrame) – Timeseries over which to calculate mean and standard error; if omitted looks for timeseries_df
  • block_kw (dict) – Keyword arguments passed to fpblockaverager.FPBlockAverager
  • block_kw[all_factors] (bool) – Use all factors by which the dataset is divisible rather than only factors of two
  • block_kw[min_n_blocks] (int) – Minimum number of blocks after transformation
  • block_kw[max_cut] (float) – Maximum proportion of dataset to omit in transformation
  • block_kw[fit_exp] (bool) – Fit exponential curve
  • block_kw[fit_sig] (bool) – Fit sigmoid curve
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
Returns:

DataFrame – Sequence dataframe including mean and standard error for each column in timeseries_df
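The idea behind block averaging can be shown with a simplified NumPy sketch: the series is cut into blocks of increasing length, and at each length the standard error is estimated from the spread of the block means. This is not FPBlockAverager; it uses block lengths that are powers of two only, drops trailing points that do not fill a block, and omits the exponential and sigmoid fits. The function name block_standard_error is hypothetical.

```python
import numpy as np


def block_standard_error(values, min_n_blocks=2):
    """Returns (block_length, n_blocks, standard_error) for each block length."""
    values = np.asarray(values, dtype=float)
    results = []
    block_length = 1
    while values.size // block_length >= min_n_blocks:
        n_blocks = values.size // block_length
        # Trailing points that do not fill a complete block are omitted
        blocks = values[:n_blocks * block_length]
        blocks = blocks.reshape(n_blocks, block_length)
        block_means = blocks.mean(axis=1)
        se = block_means.std(ddof=1) / np.sqrt(n_blocks)
        results.append((block_length, n_blocks, se))
        block_length *= 2
    return results


# Toy series that alternates between two values
results = block_standard_error([1, 3, 1, 3, 1, 3, 1, 3])
```

For correlated data the estimate typically grows with block length before leveling off, which is why a curve is often fitted to these values.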

IREDTimeSeriesDataset

class moldynplot.dataset.TimeSeriesDataset.IREDTimeSeriesDataset(dt=None, toffset=None, downsample=None, calc_pdist=False, calc_mean=False, outfile=None, interactive=False, **kwargs)

Bases: moldynplot.dataset.TimeSeriesDataset.TimeSeriesDataset, moldynplot.dataset.SequenceDataset.IREDDataset

Represents iRED NMR relaxation data as a function of time and residue number.

Parameters:
  • infile{s} (list) – Path(s) to input file(s); may contain environment variables and wildcards
  • dt (float) – Time interval between points; units unspecified
  • toffset (float) – Time offset to be added to all points (i.e. time of first point)
  • downsample (int) – Interval by which to downsample points
  • downsample_mode (str) – Method of downsampling; may be ‘mean’ or ‘mode’
  • calc_pdist (bool) – Calculate probability distribution
  • interactive (bool) – Provide IPython prompt after reading and processing data
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
static construct_argparser(parser_or_subparsers=None, **kwargs)

Adds arguments to an existing argument parser, constructs a subparser, or constructs a new parser

Parameters:
  • parser_or_subparsers (ArgumentParser, _SubParsersAction, optional) – If ArgumentParser, existing parser to which arguments will be added; if _SubParsersAction, collection of subparsers to which a new argument parser will be added; if None, a new argument parser will be generated
  • kwargs (dict) – Additional keyword arguments
Returns:

ArgumentParser – Argument parser or subparser

static concatenate_timeseries(timeseries_dfs=None, relax_dfs=None, order_dfs=None, **kwargs)

Concatenates a series of iRED datasets.

Parameters:
  • relax_dfs (list) – DataFrames containing data from relax infiles
  • order_dfs (list) – DataFrames containing data from order infiles
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
Returns:

df (DataFrame) – Averaged DataFrame including relax and order

read(**kwargs)

Reads iRED time series data from one or more infiles into a DataFrame.

NatConTimeSeriesDataset

class moldynplot.dataset.TimeSeriesDataset.NatConTimeSeriesDataset(downsample=None, calc_pdist=True, **kwargs)

Bases: moldynplot.dataset.TimeSeriesDataset.TimeSeriesDataset

Manages native contact datasets.

Parameters:
  • infile (str) – Path to input file, may contain environment variables
  • usecols (list) – Columns to select from DataFrame, once dataframe has already been loaded
  • dt (float) – Time interval between points; units unspecified
  • toffset (float) – Time offset to be added to all points (i.e. time of first point)
  • cutoff (float) – Minimum distance within which a contact is considered to be formed
  • downsample (int) – Interval by which to downsample points using mode
  • pdist (bool) – Calculate probability distribution
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
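As a rough illustration of the cutoff parameter, the sketch below converts a toy array of distances into formed and unformed contacts. It assumes that the input holds one distance per contact per frame and that a contact counts as formed when its distance is strictly below the cutoff; neither detail is confirmed by this section.

```python
import numpy as np

# Toy distances: rows are frames, columns are two hypothetical contacts
distances = np.array([[3.8, 6.1],
                      [4.2, 5.9],
                      [4.3, 4.4],
                      [5.6, 4.1]])
cutoff = 4.5

# 1 where the contact is formed, 0 where it is not
contacts = (distances < cutoff).astype(int)

# Fraction of contacts formed in each frame
fraction_formed = contacts.mean(axis=1)
```

Because the result is discrete, downsampling by mode keeps each point a valid 0 or 1, whereas a mean would produce intermediate values.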

PRETimeSeriesDataset

class moldynplot.dataset.TimeSeriesDataset.PRETimeSeriesDataset(dt=None, toffset=None, downsample=None, calc_pdist=False, outfile=None, interactive=False, **kwargs)

Bases: moldynplot.dataset.TimeSeriesDataset.TimeSeriesDataset, moldynplot.dataset.SequenceDataset.RelaxDataset

Represents paramagnetic relaxation enhancement data as a function of time and residue number.

Parameters:
  • infile{s} (list) – Path(s) to input file(s); may contain environment variables and wildcards
  • dt (float) – Time interval between points; units unspecified
  • toffset (float) – Time offset to be added to all points (i.e. time of first point)
  • downsample (int) – Interval by which to downsample points
  • downsample_mode (str) – Method of downsampling; may be ‘mean’ or ‘mode’
  • calc_pdist (bool) – Calculate probability distribution
  • pdist_key (str) – Column of which to calculate probability distribution
  • kde_kw (dict) – Keyword arguments passed to sklearn.neighbors.KernelDensity; key argument is ‘bandwidth’
  • grid (ndarray) – Grid on which to calculate probability distribution
  • interactive (bool) – Provide IPython prompt after reading and processing data
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments
static construct_argparser(parser_or_subparsers=None, **kwargs)

Adds arguments to an existing argument parser, constructs a subparser, or constructs a new parser

Parameters:
  • parser_or_subparsers (ArgumentParser, _SubParsersAction, optional) – If ArgumentParser, existing parser to which arguments will be added; if _SubParsersAction, collection of subparsers to which a new argument parser will be added; if None, a new argument parser will be generated
  • kwargs (dict) – Additional keyword arguments
Returns:

ArgumentParser – Argument parser or subparser

SAXSTimeSeriesDataset

class moldynplot.dataset.TimeSeriesDataset.SAXSTimeSeriesDataset(infile, address=u'saxs', downsample=None, calc_mean=False, calc_error=True, error_method=u'std', scale=False, **kwargs)

Bases: moldynplot.dataset.TimeSeriesDataset.TimeSeriesDataset, moldynplot.dataset.SAXSDataset.SAXSDataset

Manages Small Angle X-ray Scattering time series datasets.

Parameters:
  • infile (str) – Path to input file, may contain environment variables
  • usecols (list) – Columns to select from DataFrame, once dataframe has already been loaded
  • dt (float) – Time interval between points; units unspecified
  • toffset (float) – Time offset to be added to all points (i.e. time of first point)
  • downsample (int) – Interval by which to downsample points
  • verbose (int) – Level of verbose output
  • kwargs (dict) – Additional keyword arguments

H5Dataset

class moldynplot.dataset.H5Dataset(**kwargs)

Bases: object

Class for managing hdf5 datasets

Warning

Will be reimplemented or removed eventually

Parameters:
  • infiles (list) – List of infiles
  • infile (str) – Alternatively, single infile
load(infiles, **kwargs)

Loads data from h5 files.

Parameters:infiles (list) – infiles