The package involves estimation and inference for locally-stationary time series using the tools and methods from [1,2,3]. The estimation procedure is nonparametric and based on the sieve methods as in [4]. The inference procedure is based on a bootstrapping procedure as in [5]. In addition, the package also provides functions to generate Daubechies and Coiflet wavelet by Cascade algorithm and to process data visualization.
library(ggplot2)
library(Sie2nts)
library(plotly)
The package contains various orthogonal basis functions on
[0,1].bs.gene()
function aims at generating the value of
k-th basis function. bs.plot()
function gives the plot of
basis functions. The Option type for all functions in this package is
used for choosing basis. “Legen” for Legendre polynomials, “Cheby” for
first kind Chebyshev polynomials, “tri” for trigonometric polynomials,
“cos” for cosine polynomials, “sin” for sine polynomials, “Cspli” for
splines means Class of splines functions. “db1” to “db20” for
Daubechies1 wavelet basis to Daubechies20 wavelet basis and “cf1” to
“cf5” for Coiflet1 wavelet basis to Coiflet5 wavelet basis. Brief
definitions are given for understanding.
The Legendre polynomial of degree n can be obtained using Rodrigue’s formula \[ P_{n}(x) = \frac{1}{2^nn!} \frac{d^n}{dx^n}(x^2 -1)^n, -1 \le x\le1. \]
In this package, we use the normalized Legendre polynomial
\[ P^*_{n}(x) = \begin{cases} 1, \quad &n=0;\\ \sqrt{2n+1}P_n(2x-1), \quad &n>0. \end{cases} \tag{1} \]
# The first parameter "Legen" represents the type of basis and the second parameter 5 determines the 5-th basis function of Legendre basis.300 indicates the number of values got from k-th order of Legendre basis function, the default value is 500. There are some others options when other type of basis is chosen, more details in the manual description.
bs.gene("Legen", 5, 300)
#The first parameter "Legen" represents the type of basis and the second parameter 5 determines the 5-th basis function of Legendre basis. There are some others options when other type of basis is chosen, more details in the manual description.
bs.plot("Legen", 5, title = "Legen basis")
Chebyshev polynomials kind 1 is used in this package and it can be obtain by recurrence relation
\[ \begin{aligned} T_0(x) & = 1 \\ T_1(x) & = x \\ T_{n+1}(x) & = 2xT_n(x) - T_{n-1}(x) \end{aligned} \]
We get the normalized score by Riemann integral for each basis to normalization.
bs.gene("Cheby", 5)
bs.plot("Cheby", 5, title = "Chebyshev basis")
These 3 polynomials are defined in [4]. We give a brief summary. Let \(\text{TriPol}(J_n)\) denote the space of trigonometric polynomials on [0, 1] of degree \(J_n\) or less; we also give the normalize score, that is,
\[ \text{TriPol}(J_n) = \Big\{a_0 + \sum_{k=1}^{J_n}[a_k\cos(2k\pi x) + b_k\sin(2k\pi x)], x \in[0,1]: a_k = b_k = \sqrt{2} \in \mathcal{R}, a_0 = 1 \Big\} \]
bs.gene("tri", 3)
bs.plot("tri", 3, title = "trigonometric basis") # the number of lines in the plot is 1 + 2*(3-1) = 5
Let \(CosPol(J_n)\) denote the space of cosine polynomials on [0, 1] of degree \(J_n\) or less; we also give the normalize score, that is,
\[ \text{CosPol}(J_n) = \Big\{a_0 + \sum_{k=1}^{J_n}a_k\cos(k\pi x), x \in[0,1]: a_k = \sqrt{2}, a_0 = 1 \Big\} \]
bs.gene("cos", 5)
bs.plot("cos", 5, title = "cosine basis")
Let \(SinPol(J_n)\) denote the space of sine polynomials on [0, 1] of degree \(J_n\) or less; we also give the normalize score, that is,
\[ \text{SinPol}(J_n) = \Big\{\sum_{k=1}^{J_n}a_k\sin(k\pi x), x \in[0,1]: a_k = \sqrt{2} \Big\} \]
bs.gene("sin", 5)
bs.plot("sin", 5, title = "sine basis")
Univariate splines is defined in [4], we also use the function
splineDesign()
in the R base library splines
to generate Splines functions. We get the normalized score by Riemann
integral for each basis to normalization. Let \(J_n\) knots satisfy (2.17) in [4] page
5571. \(Spl(r, J_n)\) denote the space
of splines of order r (or of degree m ≡ r − 1) with \(J_n\).Since,
\[ \text{Spl}(r, J_n) = \Bigg\{\sum_{k=0}^{r-1}a_kx^k + \sum_{j=1}^{J_n} b_j[\max\{x-t_j, 0 \}]^{r-1},x \in[0,1]: a_k,b_j \in \mathcal R \Bigg\} \]
In this option, the first input “c” is knots plus 2 that represent 0 and 1. “or” indicates the order of splines, so the number of basis is number of knots + 2 - 2 plus the number of order. When functions automatically choose the number of basis for splines, the number is not less than the order of spline.
bs.gene("Cspli", 3, 500, c=5, or =3) # c-2+or
bs.plot("Cspli", 3, or = 3, "Spline basis")
This package also provides function generate_wv_table()
to generate scaling function of Daubechies and Coiflets wavelet by
Cascade Algorithm. Users can input the low-pass filter coefficients to
generate. With more number of iterations function will return more
accurate wavelet table. However, the number of points should carefully
determine. For a given \(J_n\) and
\(J_0\), we will consider the following
periodized wavelets on [0, 1], for \(N \in
\mathbb{N}\),
\[ \Bigg\{ \varphi_{J_nk}(x), 0 \le k \le 2^{J_n} -1 \Bigg\} \] Where \(\varphi(x)\) is the scaling wavelet function satisfies the recursion equation \(\varphi(x) = \sqrt{2} \sum_{k=0}^{2N-1} h_k\varphi(2x -k)\) supported on \([0, 2N -1]\) and \(h_0, ..., h_{2N-1}\) are the constant filter coefficients satisfying for \(l = 0,1,...,N-1\)
\[ \sum_{k=2l}^{2N-1+2l} h_kh_{k-2l} = \begin{cases} 1, \quad &l=0,\\ 0, \quad &l \ne0. \end{cases} \tag{1} \] When users input \(k\) as parameter for number of basis for Daubechies and Coiflet wavelet. The return will have \(2^k\) basis. \(k=1,2,3\) are recommend to use. Higher \(k\) is time consuming.
# This example generate Daubechies 2 wavelet scaling function
coef = c(0.48296291314469025, 0.836516303737469, 0.22414386804185735, -0.12940952255092145)
tb = generate_wv_table(coef, 31680, 5)
plot(tb)
bs.gene("db10", 3) # it returns whole table of Daubechies10 wavelet basis with 2^3 number of basis
bs.plot("db10", 3, "Daubechies10 wavelet basis")
## Warning: package 'stringr' was built under R version 4.2.2
bs.gene("cf1", 3) # it returns whole table of Coiflet1 wavelet basis with 2^3 number of basis
bs.plot("cf1", 3, "Coiflet1 wavelet basis")
We consider the AR(1) and AR(2) models to check our results. This
user’s guide contains 2 functions for generating 2 different types of
time series named generate_AR1()
,
generate_AR2()
. For each function, the input n indicates
the size of the sample and v is the variance of the sample. Parameter v
is for adjusting purpose. The output is one vector contain n length time
series basis on 2 different models respectively.
generate_AR1()
simulates the time series, \[
x_i = 0.2 +0.6\cos(2\pi\frac{i}{n})x_{i-1} + \epsilon_i
\] where \(\epsilon_i\) are
i.i.d Gaussian with variance \(v\) for
\(i = 1, ..., n\)
generate_AR2()
simulates the time series, \[
x_i = 0.6\sin(2\pi\frac{i}{n})x_{i-1} + 0.4\cos(2\pi\frac{i}{n})x_{i-2}
+ \epsilon_i
\] where \(\epsilon_i\) are
i.i.d Gaussian with variance \(v\) for
\(i = 1, ..., n\)
This package contains 4 functions for estimation.
fix.fit()
and fix.pacf()
focus on users
specific and it requires that users select the number of basis, the lag
for time series manually for getting AR approximation coefficients and
estimation of PACF. sie.auto.fit()
and
sie.auto.pacf()
will get estimation of AR approximation
coefficients and estimation of PACF automatically basis on different
rules. This package provides 3 rules.The first choice is “LOOCV” uses
Leave-One-Out-Cross-Validation. The second choice is “CV” which uses the
Cross-Validation method, it takes \(3*log_2(n)\) size as validation set where
\(n\) is the number of total
observations. The third choice is “Elbow”. This method similar as
“LOOCV”, however, it need to set the threshold manually. The function
will choose the smallest tuning parameters once the value of LOOCV is
less than threshold and get the estimation. Users can use
fix.plot()
(for fix.fit()
),
sie.auto.plot()
(for sie.auto.fit()
) and
sie.plot.pacf()
to present the plot of results of
estimation. In addition, sie.plot.pacf()
can plot the PACF
with 2 dimensions or 3 dimensions.
The section 3 in [2] shows the process of estimation of AR approximation coefficients. Given locally time series \(\big\{ x_i \big\}\) and denoted nonparametric sieve estimators of \(\big\{\phi_j(\cdot)\big\}\). It can be approximated via a finite and diverging term basis expansion, given \(\big\{\alpha_k(t)\big\}\) some pre-chosen basis function on [0, 1]. we have that,
\[ \phi_j(\frac{i}{n}) = \sum_{k=1}^{c} a_{jk}\alpha_{k}(\frac{i}{n}) + O(c^{-d}), 0 \le j \le b, i > b \]
Moreover, for i > b, write
\[ x_i = \sum_{j = 0}^{b} \sum_{k=1}^{c} a_{jk}z_{kj} + \epsilon_i + O_{l^2}(\cdot), \]
where \(z_{kj} \equiv z_{kj}(\frac{i}{n}) := \alpha_k(\frac{i}{n})x_{i-j}\) for \(j \ge 1\) and \(z_{k0} = \alpha_k(\frac{i}{n})\). Details of term \(O_{l^2}(\cdot)\) can be found in [2]. We can estimate all the \(a^{'}_{jk}s\) using ordinary least squares regression with a diverging number of predictors.In particular, we write all \(a_{jk}\) \(j=0,1,2,3...b,\) \(k=1,2,...,c\) as a vector \(\boldsymbol{\beta} \in \mathbb{R^{n-b}}\) and \(\boldsymbol{Y}\) is the design matrix and given by, \[ Y = \left(\begin{array}{cc} Z^i_{10} & ...&Z^i_{c0}&...&Z^i_{1b}&...&Z^i_{cb}\\ ... & ...&...&...&...&...&...\\ Z^n_{10} & ...&Z^n_{c0}&...&Z^n_{1b}&...&Z^n_{cb} \end{array}\right) \in R^{[(n-b)]*[(b+1)*c]} \]
Where \(Z^i_{kj}=\alpha_k(\frac{i}{n})x_{i-j}, j \ge 1, i > b\) and \(Z^i_{k0}=\alpha_k(\frac{i}{n}), i > b\). Then the OLS estimator for \(\boldsymbol{\beta}\) can be written as \(\hat \beta = (Y^*Y)^{-1}Y^*x\) where \(x= (x_{b+1}, ..., x_n)^* \in R^{n-b} \\\). After estimating \(a^{'}_{jk}s\), \(\phi_j(\frac{i}{n})\) is estimated as,
\[ \hat \phi_j(\frac {i}{n}) = \hat \beta^* \mathbb{B}_j(\frac{i}{n}) \]
Where \(\mathbb{B}_j(\frac{i}{n})\) has \(b+1\) blocks and the \(j\)-th block is \(B(\frac{i}{n})=(\alpha_1(\frac{i}{n}),...,\alpha_c(\frac{i}{n}))^* \in R^c\), \(j=0,1,2..,b\) where \(\alpha_i\) are basis functions and zeros otherwise.
fix.fit(time.series,5,1, type = "Legen", 600)
# Some other example can be used
#fix.fit(time.series,5,1, type = "Cheby")
#fix.fit(time.series,2,1, type = "db10")
#fix.fit(time.series,3,1, type = "tri")
#fix.fit(time.series,3,1, type = "cos")
#fix.fit(time.series,8,1, type = "sin")
#fix.fit(time.series,2,1, type = "cf1")
sie.auto.fit(time.series, type = "Legen")
# Some other example can be used
#sie.auto.fit(time.series, type = "Cheby", method = "Elbow", threshold = 0.02)
#sie.auto.fit(time.series, type = "db10")
#sie.auto.fit(time.series, type = "db10", method = "CV")
#sie.auto.fit(time.series, type = "tri")
#sie.auto.fit(time.series, type = "cos")
#sie.auto.fit(time.series, type = "sin")
#sie.auto.fit(time.series, type = "cf1")
# Some other example can be used
#fix.plot(fix.fit(time.series,5,1, type = "Cheby"), "Cheby", "fitted plot")
#fix.plot(fix.fit(time.series,2,1, type = "db10"), "db10")
#fix.plot(fix.fit(time.series,3,1, type = "tri"), "tri")
#fix.plot(fix.fit(time.series,3,1, type = "cos"), "cos")
#fix.plot(fix.fit(time.series,10,1, type = "sin"), "sin")
#fix.plot(fix.fit(time.series,2,1, type = "cf1"), "cf1")
# first term indicates the result from fix.fit(), second "Legen" determine the type
fix.plot(fix.fit(time.series,5,1, type = "Legen", 600), "Legen")
## [[1]]
# Some other example can be used
#sie.auto.plot(sie.auto.fit(time.series, type = "Cheby", method = "Elbow", threshold = 0.02), "Cheby")
#sie.auto.plot(sie.auto.fit(time.series, type = "db10"), "db10")
#sie.auto.plot(sie.auto.fit(time.series, type = "db10", method = "CV"), "db10", title = "cv db10")
#sie.auto.plot(sie.auto.fit(time.series, type = "tri"), "tri")
#sie.auto.plot(sie.auto.fit(time.series, type = "cos"), "cos")
#sie.auto.plot(sie.auto.fit(time.series, type = "sin"), "sin")
#sie.auto.plot(sie.auto.fit(time.series, type = "cf1"), "cf1")
# first term indicates the result from sie.auto.fit(), second "Legen" determine the type. The return is a list which contains 3 plot related to the estimation of coefficient, Elbow point and cross validation in order.
sie.auto.plot(sie.auto.fit(time.series, type = "Legen"), "Legen")[[1]]
sie.auto.plot(sie.auto.fit(time.series, type = "Legen"), "Legen")[[2]]
sie.auto.plot(sie.auto.fit(time.series, type = "Legen"), "Legen")[[3]]
Basis on the results in [3], when \(i>b\), there exists a smooth function of time \(\rho_j(t)\), such that for each lag j, the PACF \(\rho_{i,j}\) can be well approximated by \(\rho_{j}(\frac{i}{n}) :=\phi_{j,j}(\frac{i}{n})\), where \(i =1,2,...,n\), \(b\) is the lag used in estimation. Actually, the PACF for each lag is the function of AR approximation coefficients we get from above. Thus, recall the method of estimation of AR approximation coefficients. We can get the Estimation of PACF.
fix.pacf(time.series,5,1, type = "Legen", 600)
#fix.pacf(time.series,5,2, type = "Cheby")
#fix.pacf(time.series,2,3, type = "db10")
#fix.pacf(time.series,3,6, type = "tri")
#fix.pacf(time.series,3,7, type = "cos")
#fix.pacf(time.series,8,8, type = "sin")
#fix.pacf(time.series,2,9, type = "cf1")
sie.auto.pacf(time.series, c =5, lag = 1, type = "Legen", m = 600)
#sie.auto.pacf(time.series,c =5,lag = 2, type = "Cheby")
#sie.auto.pacf(time.series,c =2,lag = 3, type = "db10")
#sie.auto.pacf(time.series,c =3,lag = 4, type = "tri")
#sie.auto.pacf(time.series,c =3,lag = 4, type = "cos")
#sie.auto.pacf(time.series,c =2,lag = 4, type = "cf1")
# Some other example can be used
#sie.plot.pacf(time.series, c=5, lag = 10, "Legen", ops = "3d")
#sie.plot.pacf(time.series, c=5, lag = 5, "Cheby")
#sie.plot.pacf(time.series, c=2, lag = 5, "db10")
#sie.plot.pacf(time.series, c=2, lag = 5, "cf1")
#sie.plot.pacf(time.series,c= 10, lag = 5, "sin")
#sie.plot.pacf(time.series, c=3, lag = 5, "cos")
#sie.plot.pacf(time.series, c=2, lag = 5, "tri")
sie.plot.pacf(time.series, c = 5, lag = 5, type = "Legen", title = "pacf plot", m = 600)
sie.plot.pacf(time.series, c = 5, lag = 5, type = "Legen", ops = "3d", m = 600)
After AR approximation, users can also use sie.predict()
function to get the predicting. The first input is time series data, and
the second input is the output from fix.fit()
or
sie.auto.fit()
function.The third parameter indicates the
number of forecasting points. Based on the results of [3], given locally
time series \(\{x_1,...,x_n \}\) with
lag \(b\), for the \(h\)-step ahead prediction, we use
\[ \hat x^{b}_{n+h} = \phi_{0,h}(1) + \sum_{j = h}^{b} \phi_{j,h}(1)x_{n+1-j}, n \ge b+h. \]
sie.predict(time.series, fix.fit(time.series,5,1, type = "Legen"), 5)
#sie.predict(time.series,sie.auto.fit(time.series, type = "Legen") ,7)
This algorithm based on multiplier bootstrap procedure and as a
practical procedure to implement the stability test. The inputs are
tuning parameters \(b_{*}\),\(c\) and \(m\), time series \(\big\{ x_i \big\}\) and sieve basis
functions.The package provides 4 functions for inference.
fix.test()
and auto.test()
are used for test
of Stability for AR Approximations. fix.pacf.test()
and
auto.pacf.test()
for test of PACF. In general, \(\alpha = 0.05\) set for significant
level.
Tuning parameters are chosen by data-driven procedure proposed in
[2]. In particular, for \(b_{*}\),\(c\).
1. Given integer \(l=\left[ 3log_{2}n
\right]\). Divide the time series into 2 parts: Training part
\(\{ x_i \}_{i=1}^{n-l}\), Validation
part \(\{ x_i \}_{i=n-l+1}^{n}\)
2. Propose \((b, c)\) and a sequence of
pairs \((b_i, c_j)\), \(i=1,..,u\), \(j =
1,...,v\)
3.For each \((b_i, c_j)\), fit a
time-varying AR(\(b_i\)) with \(c_j\) sieve basis expansion using the
training data set.
4.Using the validation part and find \((b_{i0}, c_{j0})\) with the minimum sample
MSE of forecast.Let \(\hat x_{n-l+1, ij}, ...,
\hat x_{n, ij}\) be the forecast of \(x_{n-l+1}, ..., x_n\), respectively using
the parameter pair \((b_i, c_j)\),
\((b_{i0}, c_{j0})\) is given,
\[ (b_{i0}, c_{j0}) := \underset{((i,j):1\le i\le u, 1\le j \le v)}{\operatorname{argmin}} \frac{1}{l} \sum_{k=n-l+1}^{n} (x_k - \hat x_{k,ij})^2. \]
For \(m\), Minimum volatility(MV)
method is used to choose window size \(m\) for the scalar covariance
function.
1. Given large value \(m_{n_0}\),
choose a sequence of window sizes \(m_{-h_0+1}
<...< m_1<m_2<...<m_{n_0} <...<m_{n_0+h_0}\)
with a neighborhood control parameter \(h_0
> 0\) and get \(\hat
\Omega_{m_j}\) by replacing \(m\) with \(m_j\), \(j =
-h_0+1,2,...,n_0+h_0\), where \(\hat
\Omega\) is given,
\[ \hat \Omega := \frac {1}{(n-m-b+1)m} \sum_{i=b+1}^{n-m} \left[\left(\sum_{j=i}^{i+m} h_j\right)\otimes\left(B(\frac{i}{n})\right) \right] \times \left[\left(\sum_{j=i}^{i+m} h_j\right)\otimes\left(B(\frac{i}{n})\right) \right]^*. \] \(\otimes\) is Kronecker product.
2. For each \(m_j\), \(j=1,2,...,m_{n_0}\), calculate matrix norm
error of \(\hat \Omega_{m_j}\) in the
\(h_0\)-neighborhood, and we choose the
\(m\) by, \[
\hat m := \underset{m_1 \le m \le m_{n_0}}{\operatorname{argmin}} se(m)
\] where \(se(m)\) is given,
\[
se(m_j) := se \left( \big\{ \hat \Omega_{m_j+k}\big\}_{k=-h_0}^{h_0}
\right) = \left[ \frac{1}{2h_0} \sum_{k=-h_0}^{h_0} ||\hat
\Omega_{m_j+k}- \bar{\hat\Omega}_{m_j} ||^2 \right]^{1/2},
\] where \(\bar{\hat\Omega}_{m_j} =
\sum_{k=-h_0}^{h_0} \hat \Omega_{m_j+k} /(2h_0 + 1)\).
We use \(h_0=3\) and adopt this choice
in the package, Empirical speaking, \(m =
n^{(\frac{1}{3})}\) is a good choice where n is the size of data
set.
Basis on section 3 in [2]. Formally, for test of Stability for AR Approximations, the null hypothesis is
\[ H_0:\phi_j(\cdot) \: \text{is a constant function on} \: [0,1], \ j = 1,2,...,b_*. \]
To test \(H_0\), we proceed to provide \(\mathcal{L}^2\) test statistics in terms of the AR approximation estimation,
\[ T = \sum_{j=1}^{b_*} \int_{0}^{1} \left( \hat \phi_j(t) -\bar{\hat \phi_j} \right)^2 dt, \bar{\hat \phi_j} = \int_{0}^{1} \hat \phi_j(t)dt. \] T should be small under the null because \(H_0\) is equivalent to \(\phi_j(t) = \bar \phi_j\) for \(j = 1,2,...,b_*\) where \(\bar{\phi_j} = \int_{0}^{1} \phi_j(t)dt\).
In this package, a high-dimensional multiplier bootstrap statistics is used to mimic the distributions of \(nT\). For some positive integer \(m\), denote
\[ \hat \Phi = \frac {1}{\sqrt{n-m-b_{*}+1}\sqrt{m}} \sum_{i=b_{*}+1}^{n-m} [(\sum_{j=i}^{i+m}\hat h_j)\otimes(B(\frac{i}{n}))] R_i \] where \(R_i, i= b_*+1,...,n-m\) are i.i.d standard Gaussian. and \(\hat \Phi \in R^{(b^*+1)*c}\). \(\hat h_i = x_i \hat \epsilon^{b_{*}}_{i} \in R^{b_*+1}\), where \(x_i= (1, x_{i-1},...,x_{i-b_{*}})^*\) and \(\hat \epsilon^{b_{*}}_{i}\) is given by,
\[ \hat \epsilon^{b_{*}}_{i}=x_i - \hat \phi_0(\frac {i}{n}) - \sum_{j=1}^{b^*} \hat \phi_j(\frac {i}{n})x_{i-j} \]
where \(\hat \phi_j(\frac {i}{n}) = \hat \beta^* B_j(\frac{i}{n})\) obtained in the estimation process. Now, calculating \(\mathcal {\hat T} \in R^{[(b_*+1)*c][(b_*+1)*c]}\) where \(\mathcal {\hat T}:= \hat \Phi^*\hat \Gamma\hat \Phi\). where,
\[ \hat \Gamma := \hat \Sigma^{-1}I_{b_*c}\mathcal W\hat \Sigma^{-1} \\ \hat \Sigma^{-1} = n(Y^*Y)^{-1} \in R^{[(b_*+1)*c][(b_*+1)*c]} \] \(I_{b_*c} \in R^{[(b_*+1)*c][(b_*+1)*c]}\) is diagonal matrix whose non-zero entries are ones in the lower \((b_*c)(b_*c)\) major part. \(W= I - \bar B\bar B^* \in R^{c*c}\) and \(\mathcal W\) is a diagonal block matrix with diagonal block \(W\). In details,
\[ \mathcal W = \left(\begin{array}{cc} W & ...&0\\ ... & ...&...\\ 0 & ...&W \end{array}\right) \in R^{[(b_*+1)*c][(b_*+1)*c]} \]
\[ W = \left(\begin{array}{cc} 1-\alpha_1\alpha_1&-\alpha_1\alpha_2 & ...&-\alpha_1\alpha_c\\ -\alpha_2\alpha_1 & 1-\alpha_2\alpha_2&...&...\\ ... & ...&...&... \\ -\alpha_c\alpha_1&...&...&1-\alpha_c\alpha_c \end{array}\right) \in R^{c*c} \]
\[ \bar B = \int_{0}^{1} B(t) \; dt = \left(\begin{array}{cc} \alpha_1 = \int_{0}^{1} \alpha_1(t) \; dt \\ ... \\ \alpha_c = \int_{0}^{1} \alpha_c(t) \; dt \end{array}\right) \in R^c \] the \(I \in R^{c*c}\) is the diagonal matrix. Now, we can calculate \(nT\) and generate B(the default is 1000 in the package) i.i.d copies of \(\{\hat \Phi^{(k)}\}^{B}_{k=1}\). Then, we get \(\mathcal {\hat T_k}\) for \(\{\hat \Phi^{(k)}\}^{B}_{k=1}\). Let \(\mathcal {\hat T_{(1)}} \le\mathcal {\hat T_{(2)}} \le \cdot \cdot \cdot \le \mathcal {\hat T_{(B)}}\) be the order statistics of \(\mathcal {\hat T_k}\), \(k=1,2,...,B\). Reject \(H_0\) at level \(\alpha\) if \(nT > \mathcal {\hat T_{[B(1-\alpha)]}}\) where \([x]\) denotes the largest integer smaller or equal to \(x\). The p value is given by \(1-\frac{B^*}{B}\) where \(B^*=max\{r: \mathcal{\hat T_r} \le nT\}\).
fix.test(time.series, 5, 1, type = "Legen", 600, m = 9)
#fix.test(time.series,5,1, type = "Cheby")
#fix.test(time.series,2,1, type = "db10")
#fix.test(time.series,3,1, type = "tri")
#fix.test(time.series,3,1, type = "cos")
#fix.test(time.series,10,1, type = "sin")
#fix.test(time.series,2,1, type = "cf1", 500)
auto.test(time.series, type = "Legen")
#auto.test(time.series, type = "Cheby", method = "Elbow", threshold = 0.02)
#auto.test(time.series, type = "db10")
#auto.test(time.series, type = "db10", method = "CV")
#auto.test(time.series, type = "tri")
#auto.test(time.series, type = "cos")
#auto.test(time.series, type = "sin")
#auto.test(time.series, type = "cf1")
Test of PACF is processed by means of the multiplier bootstrap procedure for test of stability. the null hypothesis is
\[ H_0: \text{The PACF is approximatedly of lag} \ b_*. \]
To test \(H_0\), we proceed to provide test statistics denotes as \(T\) in terms of the AR approximation estimation,
\[ T = \sum_{j=b_*+1}^{b_0} \int_{0}^{1} \left( \hat \phi_j(t) \right)^2 dt. \] where \(b_0\) is a large number and \(b_0 > b_*\). \(T\) should be small under the null because \(H_0\) is equivalent to \(\left( \hat \phi_j(t) \right)^2 = 0\) for \(j > b_*\). Recall the statistics used in Test of Stability for AR Approximations. Only one term is modified which is \(\hat \Gamma\), Suppose the true lag for time series data is \(b_*\) and given number of lags with \(\left\{1,..,b_*,..,b_0 \right\}\)
\[ \hat \Gamma_j := \hat \Sigma^{-1}I_{(b_0-b_j)*c}\hat \Sigma^{-1} \] where \(j = 1,..,b_*, .., b_0\). We will get \(b_0\) p values basis on the procedure.Given significant level \(\alpha\). The true \(b_*\) is the smallest \(b_j\) s.t p value is larger than \(\alpha\).
fix.pacf.test(time.series,5, type = "Legen", B.s = 800, m = 9)
#fix.pacf.test(time.series,5, type = "Cheby")
#fix.pacf.test(time.series,2, type = "db10")
#fix.pacf.test(time.series,3, type = "tri")
#fix.pacf.test(time.series,3, type = "cos")
#fix.pacf.test(time.series,10, type = "sin")
#fix.pacf.test(time.series,2, type = "cf1")
auto.pacf.test(time.series, 3, 8, type = "Legen")
#auto.pacf.test(time.series, type = "Legen", method = "Elbow")
#auto.pacf.test(time.series, type = "Cheby", method = "Elbow", threshold = 0.02)
#auto.pacf.test(time.series, type = "db10", method = "CV")
#auto.pacf.test(time.series, type = "tri")
#auto.pacf.test(time.series, type = "cos")
#auto.pacf.test(time.series, type = "sin")
#auto.pacf.test(time.series, type = "cf1")
This test table shows some results of test with Legendere, basis functions. The experiments were executed by authors. In detail, we use the AR1 and AR2 stationary and non-stationary models to simulate the probability of error type one and the probability of error type two in order to verify the accuracy and power of the test. n is the data size. c, b and m are tuning parameters. Nonstationary AR1 and AR2 are given in the General simulation setup section. The stationary AR1 and stationary AR2 are given,
Stationary AR1 process, \[
x_i = 0.5x_{i-1} + \epsilon_i
\] where \(\epsilon_i\) are
i.i.d Gaussian with variance \(v\) for
\(i = 1, ..., n\)
Stationary AR2 process, \[
x_i = 0.4x_{i-1} + 0.6x_{i-2} + \epsilon_i
\] where \(\epsilon_i\) are
i.i.d Gaussian with variance \(v\) for
\(i = 1, ..., n\)
Test results are shown,
Basis | Time series | n | c | b | m | Times of reject/Times of testing |
---|---|---|---|---|---|---|
Legendre | stationary AR1 | 786 | 2 | 1 | 8 | 15/300 |
Legendre | nonstationary AR1 | 512 | 5 | 1 | 9 | 1000/1000 |
Legendre | stationary AR2 | 1024 | 2 | 2 | 9 | 45/1000 |
Legendre | nonstationary AR2 | 786 | 2 | 2 | 9 | 1000/1000 |
Legendre | nonstationary AR2 | 786 | 2 | 2 | MV method | 1000/1000 |
trigonometric | stationary AR1 | 1024 | 2 | 1 | 9 | 56/1000 |
trigonometric | stationary AR2 | 1024 | 2 | 2 | 9 | 14/300 |
trigonometric | nonstationary AR1 | 1024 | 2 | 1 | 9 | 999/1000 |
trigonometric | nonstationary AR2 | 1024 | 2 | 2 | 9 | 300/300 |
db10 | stationary AR1 | 786 | 2 | 1 | 8 | 7/150 |
db10 | stationary AR2 | 786 | 2 | 2 | 8 | 12/150 |
db10 | nonstationary AR1 | 786 | 2 | 1 | 8 | 150/150 |
db10 | nonstationary AR2 | 786 | 2 | 2 | 8 | 150/150 |
sie.plot.pacf()
provided in this package for PACF plot.
This section perform one PACF plot by simulation data and 3 examples
about plotting PACF for time series by real data. The first one is U.K.
National Accounts data This data set was obtained from the Office for
National Statistics contains values of the U.K.gross-valued added (GVA),
which is a component of gross domestic product (GDP). Observations are
recorded quarterly from quarter one 1955 to quarter three 2010 and
consists of 223 observations.The second data set is Precipitation in
Eastport, U.S (need finding) it has monthly precipitation in millimeters
from January 1887 until December 1950 (768 observations in total) at a
location in Eastport. The third dataset is Euro-Dollar exchange rate
from January 1999 until October 2017 and it is obtained from EuroStat.
We use log returns of the monthly Euro-Dollar exchange rate. Some data
clean process may need to be involved such as removing outliers or
taking difference.
sie.plot.pacf(time.series.2, 5, 8, type = "Legen", title = "pacf plot")
sie.plot.pacf(time.series.2, 5, 8, type = "Legen", ops = "3d")
#sie.plot.pacf(time.series.2, 5, 1, "Legen", title = "pacf plot")
#sie.plot.pacf(time.series.2, 5, 10, "Legen", ops = "3d")
#sie.plot.pacf(time.series.2, 5, 5, "Cheby")
#sie.plot.pacf(time.series.2, 2, 5, "db10")
#sie.plot.pacf(time.series.2, 2, 5, "cf1")
# sie.plot.pacf(time.series.2, 5, 5, "sin")
# sie.plot.pacf(time.series.2, 3, 5, "cos")
#sie.plot.pacf(time.series.2, 2, 5, "tri")
t4.2 = read.csv(file = "4.2table.csv")
timese4.2 = as.numeric(t4.2[82:304,2])
timese4.2 = diff(diff(timese4.2))
timese4.2 = timese4.2[-170]
timese4.2 = timese4.2[-211]
timese4.2 = timese4.2[-170]
plot.ts(timese4.2, xlab = "t", ylab = "gross-valued added (GVA)")
sie.plot.pacf(timese4.2, c = 2, lag = 5, type = "db10", title = "pacf plot")
# 4.3 Precipitation in Eastport, U.S (need finding)
# The left panel in Figure 4 shows monthly precipitation in millimetres from January 1887 until December 1950 (768 obser- vations) at a location in Eastport.
t4.3 = read.table(file = "PEAS.txt",encoding = "ASCII")
timese4.3 = c()
for(i in 1:dim(t4.3)[1]){
timese4.3 = append(timese4.3, t4.3[i,])
}
timese4.3 = unname(unlist(timese4.3))
plot.ts(timese4.3, xlab = "t", ylab = "Precipitation in Eastport, U.S")
sie.plot.pacf(timese4.3, c = 2, lag = 5, type = "Legen", title = "pacf plot")
t4.4 = read.table(file = 'table4.4.csv', sep = '\t', header = TRUE)
timese4.4 = c()
for(i in 1:dim(t4.4)[1]){
timese4.4[i] = as.numeric(unlist(strsplit(t4.4[i,], ","))[length(unlist(strsplit(t4.4[i,], ",")))])
}
timese4.4 = diff(log(timese4.4))
plot.ts(timese4.4, xlab = "t", ylab = "Euro-Dollar exchange rate(log)")
sie.plot.pacf(timese4.4, c = 3, lag = 5, type = "Legen", title = "pacf plot")
[1] Ding, Xiucai, and Zhou, Zhou. “Estimation and inference for precision matrices of nonstationary time series.” The Annals of Statistics 48(4) (2020): 2455-2477.
[2] Ding, Xiucai, and Zhou, Zhou. “Auto-regressive approximations to non-stationary time series, with inference and applications.” Available online, 2021.
[3] Ding, Xiucai, and Zhou, Zhou. “On the partial autocorrelation function for locally stationary time series: characterization, estimation, inference and applications.” Available online, 2022.
[4] Chen, Xiaohong. “Large Sample Sieve Estimation of Semi-Nonparametric Models.” Handbook of Econometrics, 6(B): 5549–5632,2007.
[5] Zhou, Zhou, “Heteroscedasticity and Autocorrelation Robust Structural Change Detection.” Journal of the American Statistical Association, 108(502): 726–740,2013.
UC Davis, xcading@ucdavis.edu, xiucaiding89@gmail.com↩︎
UC Davis, mrcqian@ucdavis.edu↩︎