correlation

mdcraft.lib.correlation.correlation(x: np.ndarray[float_t | complex_t], y: np.ndarray[float_t | complex_t] | None = None, /, axis: int | None = None, *, average: bool = False, fft: bool = True, symmetrize: bool = False, vector: bool = False) np.ndarray[float_t | complex_t][source]

Evaluates the autocorrelation function (ACF) \(\mathrm{R_\mathbf{XX}}(\tau)\) or cross-correlation function (CCF) \(\mathrm{R_\mathbf{XY}}(\tau)\) of time series \(\mathbf{X}(t)\) and \(\mathbf{Y}(t)\).

Using a naive sliding window technique, the ACF for a data set \(\mathbf{X}(t)\) can be computed using

\[\mathrm{R}_{\mathbf{XX}}(\tau) =\langle\mathbf{X}(t_0+\tau)\cdot\mathbf{X}^*(t_0)\rangle =\dfrac{1}{N_\tau}\sum_{j=1}^{N_\tau} \mathbf{X}(t_j+\tau)\cdot\mathbf{X}^*(t_j)\]

where \(\tau\) is the time lag, \(t_j\) is an arbitrary reference time, \(N_\tau\) is the number of possible reference times, and the asterisk (\(^*\)) denotes the complex conjugate.

Similarly, the CCF for data sets \(\mathbf{X}(t)\) and \(\mathbf{Y}(t)\) can be computed using

\[\mathrm{R}_{\mathbf{XY}}(\tau) =\langle\mathbf{X}(t_0+\tau)\cdot\mathbf{Y}^*(t_0)\rangle =\dfrac{1}{N_\tau}\sum_{j=1}^{N_\tau} \mathbf{X}(t_j+\tau)\cdot\mathbf{Y}^*(t_j)\]

To minimize statistical noise, the ACF/CCF is calculated for and averaged over all possible reference times \(t_j\). As such, this approach has a time complexity of \(\mathcal{O}(N^2)\), making it infeasible for large data sets.

Alternatively, the ACF/CCF can be efficiently computed using the fast convolution algorithm (FCA) [1] [2], which leverages the Wiener–Khinchin theorem. FCA uses fast Fourier transforms (FFT) and has a time complexity of \(\mathcal{O}(N\log{N})\).

Using FCA, the ACF for a data set \(\mathbf{X}(t)\) is computed using

\[\begin{split}\begin{gather*} \hat{\mathbf{X}}(\omega)=\mathcal{F}[\mathbf{X}(t)]\\ \mathrm{R}_{\mathbf{XX}}(\tau)=\mathcal{F}^{-1} [\hat{\mathbf{X}}(\omega)\hat{\mathbf{X}}^*(\omega)] \end{gather*}\end{split}\]

and the CCF for data sets \(\mathbf{X}(t)\) and \(\mathbf{Y}(t)\) is computed using

\[\mathrm{R}_{\mathbf{XY}}(\tau)=\mathcal{F}^{-1}[\mathcal{F} [\mathbf{X}(t)]\cdot\mathcal{F}[\mathbf{Y}(t)]]\]
Parameters:
xnumpy.ndarray, positional-only

Time evolution of scalar or \(d\)-dimensional data \(\mathbf{X}(t)\) for \(N\) entities over \(N_\mathrm{b}\) blocks of \(N_t\) times each.

Shape: \((N_t[,d])\), \((N_t,N[,d])\), \((N_\mathrm{b},N_t[,d])\), or \((N_\mathrm{b},N_t,N[,d])\).

ynumpy.ndarray, positional-only, optional

Time evolution of scalar or \(d\)-dimensional data \(\mathbf{Y}(t)\) for another \(N\) entities over \(N_\mathrm{b}\) blocks of \(N_t\) times each. If provided, the CCF for x and y is evaluated. Otherwise, the ACF for x is evaluated.

Shape: Same as x.

axisint, optional

Axis along which time evolves. If not specified, the axis is determined automatically using the shape of x.

averagebool, keyword-only, default: True

Specifies whether to average the ACF/CCF over all entities. Only available if x and y contain information for multiple entities.

fftbool, keyword-only, default: True

Specifies whether to use fast Fourier transforms (FFT) to evaluate the ACF/CCF.

symmetrizebool, keyword-only, default: False

Specifies whether to double the ACF or to combine the negative and positive time lags for the CCF.

vectorbool, keyword-only, default: False

Specifies whether x and y contain vectors. If True, the ACF/CCF is summed over the last axis.

Returns:
corrnumpy.ndarray

ACF \(\mathrm{R}_{\mathbf{XX}}(\tau)\) or CCF \(\mathrm{R}_{\mathbf{XY}}(\tau)\).

Shape:

For ACF, the shape is that of x but with the following modifications:

  • If average=True, the axis corresponding to the \(N\) entities is no longer present.

  • If vector=True, the last axis is no longer present.

For CCF, the shape is that of x but with the following modifications:

  • If average=True, the axis corresponding to the \(N\) entities is no longer present.

  • If symmetrize=False, the axis corresponding to the \(N_t\) times now has a length of \(2N_t-1\) to accomodate negative and positive time lags.

  • If vector=True, the last axis is no longer present.

References

[1]

Kneller, G. R.; Keiner, V.; Kneller, M.; Schiller, M. NMOLDYN: A Program Package for a Neutron Scattering Oriented Analysis of Molecular Dynamics Simulations. Computer Physics Communications 1995, 91 (1–3), 191–214. https://doi.org/10.1016/0010-4655(95)00048-K.

[2]

Calandrini, V.; Pellegrini, E.; Calligari, P.; Hinsen, K.; Kneller, G. R. NMoldyn - Interfacing Spectroscopic Experiments, Molecular Dynamics Simulations and Models for Time Correlation Functions. JDN 2011, 12, 201–232. https://doi.org/10.1051/sfn/201112010.