ParallelAnalysisBase

class mdcraft.analysis.base.ParallelAnalysisBase(trajectory: ReaderBase, verbose: bool = False, **kwargs)

Bases: SerialAnalysisBase

A multithreaded analysis base object.

Parameters:
trajectory : MDAnalysis.coordinates.base.ReaderBase

Simulation trajectory.

verbose : bool, default: True

Determines whether detailed progress is shown.

**kwargs

Additional keyword arguments to pass to MDAnalysis.analysis.base.AnalysisBase.

Methods

get_supported_backends

Tuple with backends supported by the core library for a given class.

run

Performs the calculation in parallel.

save

Saves results to a binary or archive file in NumPy format.

classmethod get_supported_backends()

Returns a tuple of the backends that the core library supports for a given class. You can pass any one of these values as backend=... to the run() method, or pass a custom object that has an apply method (see the documentation for run()):

  • ‘serial’: no parallelization

  • ‘multiprocessing’: parallelization using multiprocessing.Pool

  • ‘dask’: parallelization using dask.delayed.compute(). Requires installation of mdanalysis[dask]

If you want to add your own backend to an existing class, pass a backends.BackendBase subclass (see its documentation to learn how to implement it properly), and specify unsupported_backend=True.

Returns:
tuple

Names of the built-in backends that can be used in run(backend=...).

Added in version 2.8.0.

property parallelizable

Boolean flag indicating whether a given class can be parallelized with the split-apply-combine procedure, that is, whether _single_frame() can be safely distributed to multiple workers and the results then combined with a proper _conclude() call. If set to False, no backends except for serial are supported.

Note

If you want to check the parallelizability of the whole class without explicitly creating an instance, see _analysis_algorithm_is_parallelizable. Note that setting it to another value will break things if the algorithm behind the analysis is not trivially parallelizable.

Returns:
bool

Whether a given AnalysisBase subclass instance is parallelizable with split-apply-combine.

Added in version 2.8.0.
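The split-apply-combine idea behind this property can be pictured with a minimal, self-contained sketch. It does not use the library: the random array stands in for a trajectory, the thread pool stands in for a backend, and the names single_frame, analyze_block, and conclude are hypothetical helpers that only mirror the roles of _single_frame() and _conclude().

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

# Hypothetical stand-in for a trajectory: 12 frames of 5 particles in 3D.
rng = np.random.default_rng(0)
frames = rng.random((12, 5, 3))


def single_frame(positions):
    """Per-frame work: squared distance of each particle from the origin."""
    return (positions ** 2).sum(axis=1)


def analyze_block(block):
    """Apply step: process one block serially, returning a partial sum and a frame count."""
    partial = sum(single_frame(frame) for frame in block)
    return partial, len(block)


def conclude(partials):
    """Combine step: merge the partial sums into a mean over all frames."""
    total = sum(partial for partial, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# Split step: three blocks of frames, one per worker.
blocks = np.array_split(frames, 3)
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(analyze_block, blocks))

parallel_mean = conclude(partials)
serial_mean = conclude([analyze_block(frames)])
```

Because each frame contributes independently to the sum, the combined result matches the serial one. An analysis whose per-frame step depends on the previous frame's result would not satisfy this, which is the situation the flag is meant to guard against.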

run(start: int = None, stop: int = None, step: int = None, frames: slice | ndarray[int] = None, verbose: bool = None, *, n_jobs: int = None, module: str = 'multiprocessing', method: str = None, block: bool = True, **kwargs) → ParallelAnalysisBase

Performs the calculation in parallel.

Parameters:
start : int, optional

Starting frame for analysis.

stop : int, optional

Ending frame for analysis.

step : int, optional

Number of frames to skip between each analyzed frame.

frames : slice or array-like, optional

Index or logical array of the desired trajectory frames.

verbose : bool, optional

Determines whether detailed progress is shown.

n_jobs : int, keyword-only, optional

Number of workers. If not specified, it is automatically set to either the minimum number of workers required to fully analyze the trajectory or the maximum number of CPU threads available.

module : str, keyword-only, default: "multiprocessing"

Parallelization module to use for analysis.

Valid values: "dask", "joblib", and "multiprocessing".

method : str, keyword-only, optional

Specifies which Dask scheduler, Joblib backend, or multiprocessing start method is used.

block : bool, keyword-only, default: True

Determines whether the trajectory is split into smaller blocks, each processed serially, with the blocks running in parallel with one another. This “split–apply–combine” approach is generally faster since the trajectory attributes do not have to be packaged for each analysis run. Only available for module="dask".

**kwargs

Additional keyword arguments to pass to dask.compute(), joblib.Parallel, or multiprocessing.pool.Pool, depending on the value of module.
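As a rough illustration of the frame-selection and n_jobs parameters above, the sketch below uses plain NumPy indexing on a hypothetical 10-frame trajectory. The default_n_jobs helper is an assumption about how the automatic worker count might be chosen (the smaller of the number of selected frames and the available CPU threads); it is not taken from the library's source.

```python
import os

import numpy as np

n_frames = 10  # hypothetical trajectory length
all_frames = np.arange(n_frames)

# start, stop, and step act like an ordinary Python slice over frame indices.
by_slice = all_frames[slice(2, 9, 3)]

# frames given as an index array picks out exactly those frames.
by_index = all_frames[np.array([0, 4, 7])]

# frames given as a logical (boolean) array keeps frames where the mask is True.
mask = all_frames % 2 == 0
by_mask = all_frames[mask]


def default_n_jobs(n_selected):
    """Assumed heuristic: never more workers than frames or CPU threads."""
    return min(n_selected, os.cpu_count() or 1)


n_jobs = default_n_jobs(len(by_slice))
```

Here by_slice holds frames 2, 5, and 8, so under the assumed heuristic at most three workers would be started regardless of how many CPU threads are available.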

Returns:
self : ParallelAnalysisBase

Parallel analysis base object.
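The effect of the block option of run() on how work is handed out can be sketched as follows. This is only an illustration under assumed details: the frame selection and worker count are made up, and np.array_split is used as one plausible way to form near-equal contiguous blocks, not necessarily what the library does internally.

```python
import numpy as np

frame_indices = np.arange(0, 100, 2)  # hypothetical selection of 50 frames
n_jobs = 4                            # hypothetical worker count

# Without blocking (as pictured here): every frame is its own task, so the
# trajectory information has to be packaged once per frame.
per_frame_tasks = [[i] for i in frame_indices]

# With blocking: one contiguous block per worker, each processed serially,
# so the packaging cost is paid once per block.
blocks = np.array_split(frame_indices, n_jobs)
block_sizes = [len(b) for b in blocks]
```

In this sketch, 50 tasks shrink to 4, which is where the reduced packaging overhead comes from.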

save(file: str | TextIO, archive: bool = True, compress: bool = True, **kwargs) → None

Saves results to a binary or archive file in NumPy format.

Parameters:
file : str or file

Filename or file-like object where the data will be saved. If file is a str, the .npy or .npz extension will be appended automatically if not already present.

archive : bool, default: True

Determines whether the results are saved to a single archive file. If True, the data is stored in a .npz file. Otherwise, the data is saved to multiple .npy files.

compress : bool, default: True

Determines whether the .npz file is compressed. Has no effect when archive=False.

**kwargs

Additional keyword arguments to pass to numpy.save(), numpy.savez(), or numpy.savez_compressed(), depending on the values of archive and compress.
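The interplay of archive and compress can be sketched with the three NumPy functions named above. The save_results helper, the results dictionary, and the "<file>_<name>.npy" naming used for archive=False are all assumptions made for illustration; the library's actual file naming may differ.

```python
import os
import tempfile

import numpy as np

# Hypothetical analysis results.
results = {"times": np.arange(5.0), "rdf": np.linspace(0.0, 1.0, 5)}


def save_results(file, results, archive=True, compress=True):
    """Sketch of the dispatch between numpy.save, numpy.savez, and numpy.savez_compressed."""
    if archive:
        # Single .npz archive; NumPy appends the extension if it is missing.
        saver = np.savez_compressed if compress else np.savez
        saver(file, **results)
    else:
        # One .npy file per array (naming scheme assumed for this sketch).
        for name, data in results.items():
            np.save(f"{file}_{name}", data)


with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "results")

    save_results(base, results)  # writes results.npz
    with np.load(base + ".npz") as data:
        loaded = {key: data[key] for key in data.files}

    save_results(base, results, archive=False)  # writes one .npy per array
    npy_files = sorted(f for f in os.listdir(tmp) if f.endswith(".npy"))
```

Loading the archive back with numpy.load returns the arrays under the same names they were saved with.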