Title: | Change Point Detection |
---|---|
Description: | Time series analysis of network connectivity. Detects and visualizes change points between networks. Methods included in the package are discussed in depth in Baek, C., Gates, K. M., Leinwand, B., Pipiras, V. (2021) "Two sample tests for high-dimensional auto-covariances" <doi:10.1016/j.csda.2020.107067> and Baek, C., Gampe, M., Leinwand B., Lindquist K., Hopfinger J. and Gates K. (2023) “Detecting functional connectivity changes in fMRI data” <doi:10.1007/s11336-023-09908-7>. |
Authors: | Changryong Baek [aut, cre], Mattew Gampe [aut], Kathleen M. Gates [aut], Seok-Oh Jeong [aut], Vladas Pipiras [aut] |
Maintainer: | Changryong Baek <[email protected]> |
License: | Unlimited |
Version: | 0.3.0 |
Built: | 2025-03-04 03:31:54 UTC |
Source: | https://github.com/crbaek/detectr |
This dataset contains a simulated multivariate time series with two change points, at time points 150 and 300. The dimension of the data is T=450 and p=20.
changesim
An object of class matrix (inherits from array) with 450 rows and 20 columns.
This function uses a PCA-based method to find breaks. Multiple breaks are found by binary segmentation.
detectBinary( Y, Del, L, q = "fixed", alpha = 0.05, nboot = 199, n.cl, bsize = "log", bootTF = TRUE, scaleTF = TRUE, diagTF = TRUE, plotTF = TRUE )
Y |
data: Y = length*dim |
Del |
Delta away from the boundary restriction |
L |
the number of factors |
q |
method for calculating the long-run variance of the test statistic. Default is "fixed" = length^(1/3). Adaptive selection is available via "andrews", or the user can specify the length directly |
alpha |
significance level of the test |
nboot |
the number of bootstrap samples used for the p-value. Default is 199. |
n.cl |
number of cores in parallel computing. The default is (machine cores - 1) |
bsize |
block size for the block wild bootstrap. Default is "log", which uses log(length); "sqrt" uses sqrt(length); "adaptive" determines the block size using the data-dependent selection of Andrews |
bootTF |
whether the threshold is calculated from the bootstrap (TRUE) or from the asymptotic distribution (FALSE) |
scaleTF |
whether to scale each series to unit variance |
diagTF |
include diagonal term of covariance matrix or not |
plotTF |
Draw plot to see test statistic and threshold |
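To give a feel for the tuning values described in the arguments above, the following sketch evaluates the "fixed" bandwidth rule and the "log" and "sqrt" block sizes for a series of length 450, the length of changesim. Whether the package rounds these values, and in which direction, is not stated here, so the raw values are shown. The variable names are illustrative only.

```r
# Illustrative arithmetic only; variable names are hypothetical and any
# rounding applied inside the package is not shown here.
len <- 450                      # series length T

q_fixed    <- len^(1/3)         # "fixed" bandwidth rule: length^(1/3)
bsize_log  <- log(len)          # bsize = "log"
bsize_sqrt <- sqrt(len)         # bsize = "sqrt"

round(c(q_fixed = q_fixed, bsize_log = bsize_log, bsize_sqrt = bsize_sqrt), 2)
```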
tstathist The complete history of test statistics.
Brhist The sequence of breakpoints found from binary splitting
L The number of factors used in the procedure
q The estimated vectorized autocovariance on each regime.
crit The critical value to identify change point
bsize The block size of the bootstrap
diagTF If TRUE, the diagonal entry of covariance matrix is used in detecting connectivity changes.
bootTF If TRUE, bootstrap is used to find critical value
scaleTF If TRUE, the multivariate signal is studentized to have zero mean and unit variance.
out3= detectBinary(changesim, L=2, n.cl=1)
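The binary segmentation idea can be sketched in base R on a univariate series. This is not the package's implementation: the CUSUM mean-shift statistic, the fixed threshold, and the helper names cusum_stat and binary_segment are assumptions chosen for illustration, whereas detectBinary uses a PCA-based statistic with a bootstrap or asymptotic threshold. The Del argument here mimics the boundary restriction.

```r
# Generic binary segmentation with a CUSUM statistic for a mean shift.
# ASSUMPTION: statistic, threshold, and function names are illustrative only;
# they are not taken from the detectr package.
cusum_stat <- function(x) {
  n <- length(x)
  k <- seq_len(n - 1)
  s <- cumsum(x)[k]
  abs(s - k / n * sum(x)) / (sd(x) * sqrt(n))
}

binary_segment <- function(x, Del = 20, thresh = 2, offset = 0) {
  n <- length(x)
  if (n < 2 * Del) return(integer(0))     # segment too short to split
  stat <- cusum_stat(x)
  cand <- Del:(n - Del)                   # stay Del away from both boundaries
  k <- cand[which.max(stat[cand])]
  if (stat[k] < thresh) return(integer(0))
  # Split at k and search both halves recursively
  c(binary_segment(x[1:k], Del, thresh, offset),
    offset + k,
    binary_segment(x[(k + 1):n], Del, thresh, offset + k))
}

set.seed(1)
x <- c(rnorm(150, 0), rnorm(150, 3), rnorm(150, 0))  # true breaks: 150, 300
br <- binary_segment(x)
br
```

Each detected break splits the series in two, and the same search is repeated on each half until no statistic exceeds the threshold.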
This function implements the Dynamic Connectivity Regression (DCR) algorithm proposed by Cribben et al. (2012) to locate change points.
detectGlasso( Y, Del, p, lambda = "bic", nboot = 100, n.cl, bound = c(0.001, 1), gridTF = FALSE, plotTF = TRUE )
Y |
Input data of dimension length*dim (T times d) |
Del |
Delta away from the boundary restriction |
p |
Geometric(p) distribution controls the block length of the stationary bootstrap. The mean block length is 1/p |
lambda |
penalty parameter selection. "bic" selects lambda by the BIC criterion; alternatively, the user can supply the penalty value directly |
nboot |
the number of bootstrap samples used for the p-value. Default is 100. |
n.cl |
number of cores in parallel computing. The default is (machine cores - 1) |
bound |
search bounds for lambda under the "bic" rule. Default is c(0.001, 1) |
gridTF |
if TRUE, the minimum BIC is found by grid search. Default is FALSE |
plotTF |
Draw plot to see test statistic |
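The role of p can be illustrated with a minimal stationary bootstrap resampler in base R. This is a sketch under stated assumptions, not the package's code: the helper name stationary_boot_index is hypothetical, blocks wrap around the end of the series, and block lengths are drawn as rgeom(1, p) + 1 so that they take values 1, 2, ... with mean 1/p.

```r
# Stationary bootstrap indices with geometric block lengths.
# ASSUMPTION: helper name and circular wrapping are illustrative choices.
stationary_boot_index <- function(n, p) {
  idx <- integer(0)
  while (length(idx) < n) {
    start <- sample.int(n, 1)            # random block start
    len <- rgeom(1, p) + 1               # block length with mean 1/p
    block <- (start - 1 + 0:(len - 1)) %% n + 1   # wrap around the end
    idx <- c(idx, block)
  }
  idx[seq_len(n)]                        # trim to the original length
}

set.seed(1)
Y <- matrix(rnorm(450 * 3), nrow = 450)  # toy data: 450 time points, 3 series
idx <- stationary_boot_index(nrow(Y), p = 0.2)
Yboot <- Y[idx, , drop = FALSE]          # resample whole rows to keep
                                         # cross-sectional dependence
dim(Yboot)
```

With p = 0.2 the blocks are five observations long on average; a smaller p gives longer blocks and preserves more of the serial dependence.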
A list with components
br The estimated breakpoints including boundary (0, T)
brhist The sequence of breakpoints found from binary splitting
diffhist The history of BIC reduction on each step
W The estimated vectorized autocovariance on each regime.
WI The estimated vectorized precision matrix on each regime.
lambda The penalty parameter estimated on each regime.
pvalhist The empirical p-values on each binary splitting.
fitzero Detailed output at first stage. Useful in producing plot.
out1= detectGlasso(changesim, p=.2, n.cl=1)
Change point detection using a max-type statistic as in Jeong et al. (2016)
detectMaxChange( Y, m = c(30, 40, 50), margin = 30, thre.localfdr = 0.2, design.mat = NULL, plotTF = TRUE, n.cl )
Y |
Input data matrix |
m |
window sizes |
margin |
margin |
thre.localfdr |
threshold for local fdr |
design.mat |
design matrix for analyzing task data |
plotTF |
Draw plot to see test statistic and threshold |
n.cl |
number of clusters for parallel computing |
CLX Test statistic corresponding to window size arranged in column
CLXLocalFDR The Local FDR calculated for each time point
br The final estimated break points
out2= detectMaxChange(changesim, m=c(30, 35, 40, 45, 50), n.cl=1)
Change point detection using PCA and a sliding-window method
detectSliding( Y, wd = 40, L, Del, q = "fixed", alpha = 0.05, nboot = 199, n.cl, bsize = "log", bootTF = TRUE, scaleTF = TRUE, diagTF = TRUE, plotTF = TRUE )
Y |
data: Y = length*dim |
wd |
window size for sliding averages |
L |
the number of factors |
Del |
Delta away from the boundary restriction |
q |
method for calculating the long-run variance of the test statistic. Default is "fixed" = length^(1/3); "andrews" implements data-adaptive selection; or the user can specify the length directly |
alpha |
significance level of the test |
nboot |
the number of bootstrap samples used for the p-value. Default is 199. |
n.cl |
number of cores in parallel computing. The default is (machine cores - 1) |
bsize |
block size for the block wild bootstrap. Default is "log", which uses log(length); "sqrt" uses sqrt(length); "adaptive" determines the block size using the data-dependent selection of Andrews |
bootTF |
whether the threshold is calculated from the bootstrap (TRUE) or from the asymptotic distribution (FALSE) |
scaleTF |
whether to scale each series to unit variance |
diagTF |
include diagonal term of covariance matrix or not |
plotTF |
Draw plot to see test statistic and threshold |
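A sliding-window comparison can be sketched generically as follows. This is not the statistic computed by detectSliding: the function name sliding_cov_diff and the Frobenius-norm contrast of sample covariances are assumptions used only to show how a window of width wd moves along the series and compares the data on either side of each time point.

```r
# Generic sliding-window contrast between adjacent windows of width wd.
# ASSUMPTION: the covariance/Frobenius contrast and the function name are
# illustrative; detectSliding uses a PCA-based test statistic instead.
sliding_cov_diff <- function(Y, wd = 40) {
  n <- nrow(Y)
  centers <- wd:(n - wd)                 # points with a full window each side
  stat <- vapply(centers, function(t) {
    left  <- cov(Y[(t - wd + 1):t, , drop = FALSE])
    right <- cov(Y[(t + 1):(t + wd), , drop = FALSE])
    sqrt(sum((left - right)^2))          # Frobenius norm of the difference
  }, numeric(1))
  data.frame(time = centers, stat = stat)
}

set.seed(2)
z1 <- rnorm(100); z2 <- rnorm(100)
Y <- rbind(cbind(z1,  z1, z1) + 0.3 * matrix(rnorm(300), 100, 3),
           cbind(z2, -z2, z2) + 0.3 * matrix(rnorm(300), 100, 3))
res <- sliding_cov_diff(Y, wd = 40)
head(res)
```

Time points near a change in connectivity tend to produce large contrasts, so plotting res$stat against res$time gives a picture comparable in spirit to the plot drawn when plotTF = TRUE.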
sW The test statistic
L The number of factors used in the procedure
q The estimated vectorized autocovariance on each regime.
crit The critical value to identify change point
bsize The block size of the bootstrap
diagTF If TRUE, the diagonal entry of covariance matrix is used in detecting connectivity changes.
bootTF If TRUE, bootstrap is used to find critical value
scaleTF If TRUE, the multivariate signal is studentized to have zero mean and unit variance.
out4 = detectSliding(changesim, wd=40, L=2, n.cl=1)
Defining variables and functions used in the internal functions
Id
preprocess(file = NULL, header = NULL, sep = NULL, signal = NULL, noise = NULL, butterfreq = NULL, model = NULL)
file |
a data matrix or file name with columns as variables and rows as observations across time. |
header |
logical for whether or not there is a header in the data file. |
sep |
The delimiter used in the data files. "" indicates space-delimited, "\t" indicates tab-delimited, "," indicates comma-delimited. Only necessary when reading data in from a physical directory. |
signal |
(optional) a character vector containing the names of variables that contain signal i.e., which variables to use to detect change point. The default (NULL) indicates all variables except those in 'noise' argument are considered signal. Example: signal = c("dDMN4", "vDMN5", "vDMN1", |
noise |
(optional) a character vector containing the names of variables that contain noise. The signal variables will be regressed on these variables and residuals used in change point detection. The default (NULL) indicates there are no noise variables. Example: noise = c("White.Matter1", "CSF1") |
butterfreq |
(optional) bandpass filter frequency ranges. Example: c(.04,.4) |
model |
(optional) syntax indicating which variables belong to which networks for a user-specified first pass of data reduction. If there is no header, the naming convention follows "V#". Notation should follow lavaan syntax style. |
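The noise-regression step described for the signal and noise arguments can be sketched in base R. This is an illustration under assumptions rather than the internals of preprocess: the simulated data frame is hypothetical, the column names are borrowed from the examples in the argument descriptions, and an ordinary least-squares fit with an intercept is assumed.

```r
# Regress signal variables on noise variables and keep the residuals.
# ASSUMPTION: simulated data and OLS with intercept; not the package's code.
set.seed(42)
n <- 200
dat <- data.frame(White.Matter1 = rnorm(n), CSF1 = rnorm(n))
dat$dDMN4 <-  0.5 * dat$White.Matter1 + rnorm(n)
dat$vDMN5 <- -0.3 * dat$CSF1 + rnorm(n)

noise  <- c("White.Matter1", "CSF1")
signal <- setdiff(names(dat), noise)     # default: everything not in 'noise'

# A matrix response fits one regression per signal variable
fit <- lm(as.matrix(dat[, signal]) ~ as.matrix(dat[, noise]))
cleaned <- residuals(fit)                # n x length(signal) matrix

dim(cleaned)
```

The residual matrix is what would then be passed on to change point detection, since it is uncorrelated with the noise regressors by construction.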
This function uses the Dynamic Connectivity Regression (DCR) algorithm proposed by Cribben et al. (2012) to test the equality of connectivity in two fMRI signals.
testGlasso( subY1, subY2, p, lambda = "bic", nboot = 100, n.cl, bound = c(0.001, 1), gridTF = FALSE )
subY1 |
a sample of size length*dim |
subY2 |
a sample of size length*dim |
p |
Geometric(p) distribution controls the block length of the stationary bootstrap. The mean block length is 1/p |
lambda |
penalty parameter selection. "bic" selects lambda by the BIC criterion; alternatively, the user can supply the penalty value directly. |
nboot |
the number of bootstrap samples used for the p-value. Default is 100. |
n.cl |
number of cores in parallel computing. The default is (machine cores - 1) |
bound |
search bounds for lambda under the "bic" rule. Default is c(0.001, 1) |
gridTF |
Utilize a grid search to optimize hyperparameters |
pval The empirical p-value for testing the equality of connectivity structure
rho The sequence of penalty parameter based on the combined sample, subY1 and subY2.
fit0 Output of glasso for combined sample
fit1 Output of glasso for subY1
fit2 Output of glasso for subY2
test1= testGlasso(testsim$X, testsim$Y, n.cl=1)
This function produces three test results based on the max-type block bootstrap (BMB), long-run variance block bootstrapping with a lagged-window estimator (LVBWR), and the sum-type block bootstrap (BSUM). See Baek et al. (2019) for details.
testMax(subY1, subY2, diagTF = TRUE, nboot, q = "fixed", n.cl)
subY1 |
a sample of size length*dim |
subY2 |
a sample of size length*dim |
diagTF |
include diagonal term of covariance matrix or not |
nboot |
number of bootstrap samples. Default is 2000 |
q |
method for calculating the long-run variance of the test statistic. Default is "fixed" = length^(1/3); "andrews" implements data-adaptive selection; or the user can specify the length directly |
n.cl |
number of cores in parallel computing. The default is (machine cores - 1) |
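How the tuning parameter q enters a lag-window long-run variance estimate can be sketched for a univariate series. The Bartlett weights and the function name lrv_bartlett are assumptions for illustration; the kernel actually used by testMax is not stated here. The default bandwidth below follows the "fixed" rule, taking the floor of length^(1/3) as one possible rounding.

```r
# Lag-window long-run variance estimate with Bartlett weights.
# ASSUMPTION: kernel choice, rounding of q, and function name are illustrative.
lrv_bartlett <- function(x, q = floor(length(x)^(1/3))) {
  stopifnot(q >= 1, q < length(x))
  n  <- length(x)
  xc <- x - mean(x)
  # Sample autocovariances at lags 0, 1, ..., q
  gam <- sapply(0:q, function(h) sum(xc[1:(n - h)] * xc[(1 + h):n]) / n)
  w <- 1 - (1:q) / (q + 1)               # Bartlett weights
  gam[1] + 2 * sum(w * gam[-1])
}

set.seed(3)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 150))  # toy AR(1) series
v <- lrv_bartlett(x)
v
```

A larger q includes more autocovariances and reduces bias for strongly dependent series, at the cost of a noisier estimate.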
tstat Test statistic for testing the equality of connectivity structure
pval The p-value for testing the equality of connectivity structure
q The tuning parameter used in calculating long-run variance
test2 = testMax(testsim$X, testsim$Y, n.cl=1)
This function performs a PCA-based test of the equality of connectivity in two fMRI signals.
testPCA(subY1, subY2, L = 2, nlag, diagTF = TRUE)
subY1 |
a sample of size length*dim |
subY2 |
a sample of size length*dim |
L |
the number of factors |
nlag |
the number of ACF lags to be used in the test. Default is nlag = floor(N^(1/3)) |
diagTF |
include diagonal term of covariance matrix or not |
tstat Test statistic
pval Returns the p-value
df The degrees of freedom in the PCA-based test
L The number of factors used in the test
diagTF If true, the diagonal entry of covariance matrix is used in testing
test3 = testPCA(testsim$X, testsim$Y, L=2)
This dataset contains simulated multivariate time series with two different autocovariances. It is a list with two elements, X and Y. Each multivariate time series has dimension T=150 and p=20.
testsim
An object of class list of length 2.