1 Introduction

The package involves estimation and inference for locally-stationary time series using the tools and methods from [1,2,3]. The estimation procedure is nonparametric and based on the sieve methods as in [4]. The inference procedure is based on a bootstrapping procedure as in [5]. In addition, the package also provides functions to generate Daubechies and Coiflet wavelet by Cascade algorithm and to process data visualization.

library(ggplot2)
library(Sie2nts)
library(plotly)

2 Basis functions contained in the package

The package contains various orthogonal basis functions on [0,1].bs.gene() function aims at generating the value of k-th basis function. bs.plot() function gives the plot of basis functions. The Option type for all functions in this package is used for choosing basis. “Legen” for Legendre polynomials, “Cheby” for first kind Chebyshev polynomials, “tri” for trigonometric polynomials, “cos” for cosine polynomials, “sin” for sine polynomials, “Cspli” for splines means Class of splines functions. “db1” to “db20” for Daubechies1 wavelet basis to Daubechies20 wavelet basis and “cf1” to “cf5” for Coiflet1 wavelet basis to Coiflet5 wavelet basis. Brief definitions are given for understanding.

2.1 Legendre polynomials

The Legendre polynomial of degree n can be obtained using Rodrigue’s formula \[ P_{n}(x) = \frac{1}{2^nn!} \frac{d^n}{dx^n}(x^2 -1)^n, -1 \le x\le1. \]

In this package, we use the normalized Legendre polynomial

\[ P^*_{n}(x) = \begin{cases} 1, \quad &n=0;\\ \sqrt{2n+1}P_n(2x-1), \quad &n>0. \end{cases} \tag{1} \]

# The first parameter "Legen" represents the type of basis and the second parameter 5 determines the 5-th basis function of Legendre basis.300 indicates the number of values got from k-th order of Legendre basis function, the default value is 500. There are some others options when other type of basis is chosen, more details in the manual description. 

bs.gene("Legen", 5, 300)

#The first parameter "Legen" represents the type of basis and the second parameter 5 determines the 5-th basis function of Legendre basis. There are some others options when other type of basis is chosen, more details in the manual description. 

bs.plot("Legen", 5, title = "Legen basis")

2.2 Chebyshev polynomials

Chebyshev polynomials kind 1 is used in this package and it can be obtain by recurrence relation

\[ \begin{aligned} T_0(x) & = 1 \\ T_1(x) & = x \\ T_{n+1}(x) & = 2xT_n(x) - T_{n-1}(x) \end{aligned} \]

We get the normalized score by Riemann integral for each basis to normalization.

bs.gene("Cheby", 5)

bs.plot("Cheby", 5, title = "Chebyshev basis")

2.3 trigonometric, cosine and sine polynomials

These 3 polynomials are defined in [4]. We give a brief summary. Let \(\text{TriPol}(J_n)\) denote the space of trigonometric polynomials on [0, 1] of degree \(J_n\) or less; we also give the normalize score, that is,

\[ \text{TriPol}(J_n) = \Big\{a_0 + \sum_{k=1}^{J_n}[a_k\cos(2k\pi x) + b_k\sin(2k\pi x)], x \in[0,1]: a_k = b_k = \sqrt{2} \in \mathcal{R}, a_0 = 1 \Big\} \]

bs.gene("tri", 3)

bs.plot("tri", 3, title = "trigonometric basis") # the number of lines in the plot is 1 + 2*(3-1) = 5

Let \(CosPol(J_n)\) denote the space of cosine polynomials on [0, 1] of degree \(J_n\) or less; we also give the normalize score, that is,

\[ \text{CosPol}(J_n) = \Big\{a_0 + \sum_{k=1}^{J_n}a_k\cos(k\pi x), x \in[0,1]: a_k = \sqrt{2}, a_0 = 1 \Big\} \]

bs.gene("cos", 5)

bs.plot("cos", 5, title = "cosine basis")

Let \(SinPol(J_n)\) denote the space of sine polynomials on [0, 1] of degree \(J_n\) or less; we also give the normalize score, that is,

\[ \text{SinPol}(J_n) = \Big\{\sum_{k=1}^{J_n}a_k\sin(k\pi x), x \in[0,1]: a_k = \sqrt{2} \Big\} \]

bs.gene("sin", 5)

bs.plot("sin", 5, title = "sine basis")

2.4 Class of Splines functions

Univariate splines is defined in [4], we also use the function splineDesign() in the R base library splines to generate Splines functions. We get the normalized score by Riemann integral for each basis to normalization. Let \(J_n\) knots satisfy (2.17) in [4] page 5571. \(Spl(r, J_n)\) denote the space of splines of order r (or of degree m ≡ r − 1) with \(J_n\).Since,

\[ \text{Spl}(r, J_n) = \Bigg\{\sum_{k=0}^{r-1}a_kx^k + \sum_{j=1}^{J_n} b_j[\max\{x-t_j, 0 \}]^{r-1},x \in[0,1]: a_k,b_j \in \mathcal R \Bigg\} \]

In this option, the first input “c” is knots plus 2 that represent 0 and 1. “or” indicates the order of splines, so the number of basis is number of knots + 2 - 2 plus the number of order. When functions automatically choose the number of basis for splines, the number is not less than the order of spline.

bs.gene("Cspli", 3, 500, c=5, or =3)  # c-2+or

bs.plot("Cspli", 3, or = 3, "Spline basis")

2.5 Cascade Algorithm

This package also provides function generate_wv_table() to generate scaling function of Daubechies and Coiflets wavelet by Cascade Algorithm. Users can input the low-pass filter coefficients to generate. With more number of iterations function will return more accurate wavelet table. However, the number of points should carefully determine. For a given \(J_n\) and \(J_0\), we will consider the following periodized wavelets on [0, 1], for \(N \in \mathbb{N}\),

\[ \Bigg\{ \varphi_{J_nk}(x), 0 \le k \le 2^{J_n} -1 \Bigg\} \] Where \(\varphi(x)\) is the scaling wavelet function satisfies the recursion equation \(\varphi(x) = \sqrt{2} \sum_{k=0}^{2N-1} h_k\varphi(2x -k)\) supported on \([0, 2N -1]\) and \(h_0, ..., h_{2N-1}\) are the constant filter coefficients satisfying for \(l = 0,1,...,N-1\)

\[ \sum_{k=2l}^{2N-1+2l} h_kh_{k-2l} = \begin{cases} 1, \quad &l=0,\\ 0, \quad &l \ne0. \end{cases} \tag{1} \] When users input \(k\) as parameter for number of basis for Daubechies and Coiflet wavelet. The return will have \(2^k\) basis. \(k=1,2,3\) are recommend to use. Higher \(k\) is time consuming.

# This example generate Daubechies 2 wavelet scaling function
coef = c(0.48296291314469025, 0.836516303737469, 0.22414386804185735, -0.12940952255092145) 
tb = generate_wv_table(coef, 31680, 5)
plot(tb)

2.6 Daubechies wavelet

bs.gene("db10", 3) # it returns whole table of Daubechies10 wavelet basis with 2^3 number of basis

bs.plot("db10", 3, "Daubechies10 wavelet basis")

## Warning: package 'stringr' was built under R version 4.2.2

2.7 Coiflet wavelet

bs.gene("cf1", 3) # it returns whole table of Coiflet1 wavelet basis with 2^3 number of basis

bs.plot("cf1", 3, "Coiflet1 wavelet basis")

3 General simulation setup

We consider the AR(1) and AR(2) models to check our results. This user’s guide contains 2 functions for generating 2 different types of time series named generate_AR1(), generate_AR2(). For each function, the input n indicates the size of the sample and v is the variance of the sample. Parameter v is for adjusting purpose. The output is one vector contain n length time series basis on 2 different models respectively.
generate_AR1() simulates the time series, \[ x_i = 0.2 +0.6\cos(2\pi\frac{i}{n})x_{i-1} + \epsilon_i \] where \(\epsilon_i\) are i.i.d Gaussian with variance \(v\) for \(i = 1, ..., n\)

generate_AR2() simulates the time series, \[ x_i = 0.6\sin(2\pi\frac{i}{n})x_{i-1} + 0.4\cos(2\pi\frac{i}{n})x_{i-2} + \epsilon_i \] where \(\epsilon_i\) are i.i.d Gaussian with variance \(v\) for \(i = 1, ..., n\)

4 Estimation

This package contains 4 functions for estimation. fix.fit() and fix.pacf() focus on users specific and it requires that users select the number of basis, the lag for time series manually for getting AR approximation coefficients and estimation of PACF. sie.auto.fit() and sie.auto.pacf() will get estimation of AR approximation coefficients and estimation of PACF automatically basis on different rules. This package provides 3 rules.The first choice is “LOOCV” uses Leave-One-Out-Cross-Validation. The second choice is “CV” which uses the Cross-Validation method, it takes \(3*log_2(n)\) size as validation set where \(n\) is the number of total observations. The third choice is “Elbow”. This method similar as “LOOCV”, however, it need to set the threshold manually. The function will choose the smallest tuning parameters once the value of LOOCV is less than threshold and get the estimation. Users can use fix.plot() (for fix.fit()), sie.auto.plot() (for sie.auto.fit()) and sie.plot.pacf() to present the plot of results of estimation. In addition, sie.plot.pacf() can plot the PACF with 2 dimensions or 3 dimensions.

4.1 Estimation of AR approximation coefficients

The section 3 in [2] shows the process of estimation of AR approximation coefficients. Given locally time series \(\big\{ x_i \big\}\) and denoted nonparametric sieve estimators of \(\big\{\phi_j(\cdot)\big\}\). It can be approximated via a finite and diverging term basis expansion, given \(\big\{\alpha_k(t)\big\}\) some pre-chosen basis function on [0, 1]. we have that,

\[ \phi_j(\frac{i}{n}) = \sum_{k=1}^{c} a_{jk}\alpha_{k}(\frac{i}{n}) + O(c^{-d}), 0 \le j \le b, i > b \]

Moreover, for i > b, write

\[ x_i = \sum_{j = 0}^{b} \sum_{k=1}^{c} a_{jk}z_{kj} + \epsilon_i + O_{l^2}(\cdot), \]

where \(z_{kj} \equiv z_{kj}(\frac{i}{n}) := \alpha_k(\frac{i}{n})x_{i-j}\) for \(j \ge 1\) and \(z_{k0} = \alpha_k(\frac{i}{n})\). Details of term \(O_{l^2}(\cdot)\) can be found in [2]. We can estimate all the \(a^{'}_{jk}s\) using ordinary least squares regression with a diverging number of predictors.In particular, we write all \(a_{jk}\) \(j=0,1,2,3...b,\) \(k=1,2,...,c\) as a vector \(\boldsymbol{\beta} \in \mathbb{R^{n-b}}\) and \(\boldsymbol{Y}\) is the design matrix and given by, \[ Y = \left(\begin{array}{cc} Z^i_{10} & ...&Z^i_{c0}&...&Z^i_{1b}&...&Z^i_{cb}\\ ... & ...&...&...&...&...&...\\ Z^n_{10} & ...&Z^n_{c0}&...&Z^n_{1b}&...&Z^n_{cb} \end{array}\right) \in R^{[(n-b)]*[(b+1)*c]} \]

Where \(Z^i_{kj}=\alpha_k(\frac{i}{n})x_{i-j}, j \ge 1, i > b\) and \(Z^i_{k0}=\alpha_k(\frac{i}{n}), i > b\). Then the OLS estimator for \(\boldsymbol{\beta}\) can be written as \(\hat \beta = (Y^*Y)^{-1}Y^*x\) where \(x= (x_{b+1}, ..., x_n)^* \in R^{n-b} \\\). After estimating \(a^{'}_{jk}s\), \(\phi_j(\frac{i}{n})\) is estimated as,

\[ \hat \phi_j(\frac {i}{n}) = \hat \beta^* \mathbb{B}_j(\frac{i}{n}) \]

Where \(\mathbb{B}_j(\frac{i}{n})\) has \(b+1\) blocks and the \(j\)-th block is \(B(\frac{i}{n})=(\alpha_1(\frac{i}{n}),...,\alpha_c(\frac{i}{n}))^* \in R^c\), \(j=0,1,2..,b\) where \(\alpha_i\) are basis functions and zeros otherwise.

fix.fit(time.series,5,1, type = "Legen", 600)

# Some other example can be used
#fix.fit(time.series,5,1, type = "Cheby")
#fix.fit(time.series,2,1, type = "db10")
#fix.fit(time.series,3,1, type = "tri")
#fix.fit(time.series,3,1, type = "cos")
#fix.fit(time.series,8,1, type = "sin")
#fix.fit(time.series,2,1, type = "cf1")

sie.auto.fit(time.series, type = "Legen")

# Some other example can be used
#sie.auto.fit(time.series, type = "Cheby", method = "Elbow", threshold = 0.02)
#sie.auto.fit(time.series, type = "db10")
#sie.auto.fit(time.series, type = "db10", method = "CV")
#sie.auto.fit(time.series, type = "tri")
#sie.auto.fit(time.series, type = "cos")
#sie.auto.fit(time.series, type = "sin")
#sie.auto.fit(time.series, type = "cf1")

# Some other example can be used
#fix.plot(fix.fit(time.series,5,1, type = "Cheby"), "Cheby", "fitted plot")
#fix.plot(fix.fit(time.series,2,1, type = "db10"), "db10")
#fix.plot(fix.fit(time.series,3,1, type = "tri"), "tri")
#fix.plot(fix.fit(time.series,3,1, type = "cos"), "cos")
#fix.plot(fix.fit(time.series,10,1, type = "sin"), "sin")
#fix.plot(fix.fit(time.series,2,1, type = "cf1"), "cf1")

# first term indicates the result from fix.fit(), second "Legen" determine the type
fix.plot(fix.fit(time.series,5,1, type = "Legen", 600), "Legen")

## [[1]]

# Some other example can be used
#sie.auto.plot(sie.auto.fit(time.series, type = "Cheby", method = "Elbow", threshold = 0.02), "Cheby")
#sie.auto.plot(sie.auto.fit(time.series, type = "db10"), "db10")
#sie.auto.plot(sie.auto.fit(time.series, type = "db10", method = "CV"), "db10", title = "cv db10")
#sie.auto.plot(sie.auto.fit(time.series, type = "tri"), "tri")
#sie.auto.plot(sie.auto.fit(time.series, type = "cos"), "cos")
#sie.auto.plot(sie.auto.fit(time.series, type = "sin"), "sin")
#sie.auto.plot(sie.auto.fit(time.series, type = "cf1"), "cf1")

# first term indicates the result from sie.auto.fit(), second "Legen" determine the type. The return is a list which contains 3 plot related to the estimation of coefficient, Elbow point and cross validation in order. 

sie.auto.plot(sie.auto.fit(time.series, type = "Legen"), "Legen")[[1]]

sie.auto.plot(sie.auto.fit(time.series, type = "Legen"), "Legen")[[2]]

sie.auto.plot(sie.auto.fit(time.series, type = "Legen"), "Legen")[[3]]

4.2 Estimation of PACF

Basis on the results in [3], when \(i>b\), there exists a smooth function of time \(\rho_j(t)\), such that for each lag j, the PACF \(\rho_{i,j}\) can be well approximated by \(\rho_{j}(\frac{i}{n}) :=\phi_{j,j}(\frac{i}{n})\), where \(i =1,2,...,n\), \(b\) is the lag used in estimation. Actually, the PACF for each lag is the function of AR approximation coefficients we get from above. Thus, recall the method of estimation of AR approximation coefficients. We can get the Estimation of PACF.

fix.pacf(time.series,5,1, type = "Legen", 600)
#fix.pacf(time.series,5,2, type = "Cheby")
#fix.pacf(time.series,2,3, type = "db10")
#fix.pacf(time.series,3,6, type = "tri")
#fix.pacf(time.series,3,7, type = "cos")
#fix.pacf(time.series,8,8, type = "sin")
#fix.pacf(time.series,2,9, type = "cf1")

sie.auto.pacf(time.series, c =5, lag = 1, type = "Legen", m = 600)
#sie.auto.pacf(time.series,c =5,lag = 2, type = "Cheby")
#sie.auto.pacf(time.series,c =2,lag = 3, type = "db10")
#sie.auto.pacf(time.series,c =3,lag = 4, type = "tri")
#sie.auto.pacf(time.series,c =3,lag = 4, type = "cos")
#sie.auto.pacf(time.series,c =2,lag = 4, type = "cf1")

# Some other example can be used
#sie.plot.pacf(time.series, c=5, lag = 10, "Legen", ops = "3d") 
#sie.plot.pacf(time.series, c=5, lag = 5, "Cheby") 
#sie.plot.pacf(time.series, c=2, lag = 5, "db10") 
#sie.plot.pacf(time.series, c=2, lag = 5, "cf1") 
#sie.plot.pacf(time.series,c= 10, lag = 5, "sin") 
#sie.plot.pacf(time.series, c=3, lag = 5, "cos") 
#sie.plot.pacf(time.series, c=2, lag = 5, "tri") 

sie.plot.pacf(time.series, c = 5, lag = 5, type = "Legen", title = "pacf plot", m = 600)

sie.plot.pacf(time.series, c = 5, lag = 5, type = "Legen", ops = "3d", m = 600)

4.3 Prediction based on AR approximation

After AR approximation, users can also use sie.predict() function to get the predicting. The first input is time series data, and the second input is the output from fix.fit() or sie.auto.fit() function.The third parameter indicates the number of forecasting points. Based on the results of [3], given locally time series \(\{x_1,...,x_n \}\) with lag \(b\), for the \(h\)-step ahead prediction, we use

\[ \hat x^{b}_{n+h} = \phi_{0,h}(1) + \sum_{j = h}^{b} \phi_{j,h}(1)x_{n+1-j}, n \ge b+h. \]

sie.predict(time.series, fix.fit(time.series,5,1, type = "Legen"), 5)
#sie.predict(time.series,sie.auto.fit(time.series, type = "Legen") ,7)

5 Inference

This algorithm based on multiplier bootstrap procedure and as a practical procedure to implement the stability test. The inputs are tuning parameters \(b_{*}\),\(c\) and \(m\), time series \(\big\{ x_i \big\}\) and sieve basis functions.The package provides 4 functions for inference. fix.test() and auto.test() are used for test of Stability for AR Approximations. fix.pacf.test() and auto.pacf.test() for test of PACF. In general, \(\alpha = 0.05\) set for significant level.

5.1 Choosing tuning parameters \(b_{*}\),\(c\) and \(m\)

Tuning parameters are chosen by data-driven procedure proposed in [2]. In particular, for \(b_{*}\),\(c\).
1. Given integer \(l=\left[ 3log_{2}n \right]\). Divide the time series into 2 parts: Training part \(\{ x_i \}_{i=1}^{n-l}\), Validation part \(\{ x_i \}_{i=n-l+1}^{n}\)
2. Propose \((b, c)\) and a sequence of pairs \((b_i, c_j)\), \(i=1,..,u\), \(j = 1,...,v\)
3.For each \((b_i, c_j)\), fit a time-varying AR(\(b_i\)) with \(c_j\) sieve basis expansion using the training data set.
4.Using the validation part and find \((b_{i0}, c_{j0})\) with the minimum sample MSE of forecast.Let \(\hat x_{n-l+1, ij}, ..., \hat x_{n, ij}\) be the forecast of \(x_{n-l+1}, ..., x_n\), respectively using the parameter pair \((b_i, c_j)\), \((b_{i0}, c_{j0})\) is given,

\[ (b_{i0}, c_{j0}) := \underset{((i,j):1\le i\le u, 1\le j \le v)}{\operatorname{argmin}} \frac{1}{l} \sum_{k=n-l+1}^{n} (x_k - \hat x_{k,ij})^2. \]

For \(m\), Minimum volatility(MV) method is used to choose window size \(m\) for the scalar covariance function.
1. Given large value \(m_{n_0}\), choose a sequence of window sizes \(m_{-h_0+1} <...< m_1<m_2<...<m_{n_0} <...<m_{n_0+h_0}\) with a neighborhood control parameter \(h_0 > 0\) and get \(\hat \Omega_{m_j}\) by replacing \(m\) with \(m_j\), \(j = -h_0+1,2,...,n_0+h_0\), where \(\hat \Omega\) is given,

\[ \hat \Omega := \frac {1}{(n-m-b+1)m} \sum_{i=b+1}^{n-m} \left[\left(\sum_{j=i}^{i+m} h_j\right)\otimes\left(B(\frac{i}{n})\right) \right] \times \left[\left(\sum_{j=i}^{i+m} h_j\right)\otimes\left(B(\frac{i}{n})\right) \right]^*. \] \(\otimes\) is Kronecker product.

2. For each \(m_j\), \(j=1,2,...,m_{n_0}\), calculate matrix norm error of \(\hat \Omega_{m_j}\) in the \(h_0\)-neighborhood, and we choose the \(m\) by, \[ \hat m := \underset{m_1 \le m \le m_{n_0}}{\operatorname{argmin}} se(m) \] where \(se(m)\) is given, \[ se(m_j) := se \left( \big\{ \hat \Omega_{m_j+k}\big\}_{k=-h_0}^{h_0} \right) = \left[ \frac{1}{2h_0} \sum_{k=-h_0}^{h_0} ||\hat \Omega_{m_j+k}- \bar{\hat\Omega}_{m_j} ||^2 \right]^{1/2}, \] where \(\bar{\hat\Omega}_{m_j} = \sum_{k=-h_0}^{h_0} \hat \Omega_{m_j+k} /(2h_0 + 1)\).

We use \(h_0=3\) and adopt this choice in the package, Empirical speaking, \(m = n^{(\frac{1}{3})}\) is a good choice where n is the size of data set.

5.2 Test of Stability for AR Approximations

Basis on section 3 in [2]. Formally, for test of Stability for AR Approximations, the null hypothesis is

\[ H_0:\phi_j(\cdot) \: \text{is a constant function on} \: [0,1], \ j = 1,2,...,b_*. \]

To test \(H_0\), we proceed to provide \(\mathcal{L}^2\) test statistics in terms of the AR approximation estimation,

\[ T = \sum_{j=1}^{b_*} \int_{0}^{1} \left( \hat \phi_j(t) -\bar{\hat \phi_j} \right)^2 dt, \bar{\hat \phi_j} = \int_{0}^{1} \hat \phi_j(t)dt. \] T should be small under the null because \(H_0\) is equivalent to \(\phi_j(t) = \bar \phi_j\) for \(j = 1,2,...,b_*\) where \(\bar{\phi_j} = \int_{0}^{1} \phi_j(t)dt\).

In this package, a high-dimensional multiplier bootstrap statistics is used to mimic the distributions of \(nT\). For some positive integer \(m\), denote

\[ \hat \Phi = \frac {1}{\sqrt{n-m-b_{*}+1}\sqrt{m}} \sum_{i=b_{*}+1}^{n-m} [(\sum_{j=i}^{i+m}\hat h_j)\otimes(B(\frac{i}{n}))] R_i \] where \(R_i, i= b_*+1,...,n-m\) are i.i.d standard Gaussian. and \(\hat \Phi \in R^{(b^*+1)*c}\). \(\hat h_i = x_i \hat \epsilon^{b_{*}}_{i} \in R^{b_*+1}\), where \(x_i= (1, x_{i-1},...,x_{i-b_{*}})^*\) and \(\hat \epsilon^{b_{*}}_{i}\) is given by,

\[ \hat \epsilon^{b_{*}}_{i}=x_i - \hat \phi_0(\frac {i}{n}) - \sum_{j=1}^{b^*} \hat \phi_j(\frac {i}{n})x_{i-j} \]

where \(\hat \phi_j(\frac {i}{n}) = \hat \beta^* B_j(\frac{i}{n})\) obtained in the estimation process. Now, calculating \(\mathcal {\hat T} \in R^{[(b_*+1)*c][(b_*+1)*c]}\) where \(\mathcal {\hat T}:= \hat \Phi^*\hat \Gamma\hat \Phi\). where,

\[ \hat \Gamma := \hat \Sigma^{-1}I_{b_*c}\mathcal W\hat \Sigma^{-1} \\ \hat \Sigma^{-1} = n(Y^*Y)^{-1} \in R^{[(b_*+1)*c][(b_*+1)*c]} \] \(I_{b_*c} \in R^{[(b_*+1)*c][(b_*+1)*c]}\) is diagonal matrix whose non-zero entries are ones in the lower \((b_*c)(b_*c)\) major part. \(W= I - \bar B\bar B^* \in R^{c*c}\) and \(\mathcal W\) is a diagonal block matrix with diagonal block \(W\). In details,

\[ \mathcal W = \left(\begin{array}{cc} W & ...&0\\ ... & ...&...\\ 0 & ...&W \end{array}\right) \in R^{[(b_*+1)*c][(b_*+1)*c]} \]

\[ W = \left(\begin{array}{cc} 1-\alpha_1\alpha_1&-\alpha_1\alpha_2 & ...&-\alpha_1\alpha_c\\ -\alpha_2\alpha_1 & 1-\alpha_2\alpha_2&...&...\\ ... & ...&...&... \\ -\alpha_c\alpha_1&...&...&1-\alpha_c\alpha_c \end{array}\right) \in R^{c*c} \]

\[ \bar B = \int_{0}^{1} B(t) \; dt = \left(\begin{array}{cc} \alpha_1 = \int_{0}^{1} \alpha_1(t) \; dt \\ ... \\ \alpha_c = \int_{0}^{1} \alpha_c(t) \; dt \end{array}\right) \in R^c \] the \(I \in R^{c*c}\) is the diagonal matrix. Now, we can calculate \(nT\) and generate B(the default is 1000 in the package) i.i.d copies of \(\{\hat \Phi^{(k)}\}^{B}_{k=1}\). Then, we get \(\mathcal {\hat T_k}\) for \(\{\hat \Phi^{(k)}\}^{B}_{k=1}\). Let \(\mathcal {\hat T_{(1)}} \le\mathcal {\hat T_{(2)}} \le \cdot \cdot \cdot \le \mathcal {\hat T_{(B)}}\) be the order statistics of \(\mathcal {\hat T_k}\), \(k=1,2,...,B\). Reject \(H_0\) at level \(\alpha\) if \(nT > \mathcal {\hat T_{[B(1-\alpha)]}}\) where \([x]\) denotes the largest integer smaller or equal to \(x\). The p value is given by \(1-\frac{B^*}{B}\) where \(B^*=max\{r: \mathcal{\hat T_r} \le nT\}\).

fix.test(time.series, 5, 1, type = "Legen", 600, m = 9)
#fix.test(time.series,5,1, type = "Cheby")
#fix.test(time.series,2,1, type = "db10")
#fix.test(time.series,3,1, type = "tri")
#fix.test(time.series,3,1, type = "cos")
#fix.test(time.series,10,1, type = "sin")
#fix.test(time.series,2,1, type = "cf1", 500)

auto.test(time.series, type = "Legen")
#auto.test(time.series, type = "Cheby", method = "Elbow", threshold = 0.02)
#auto.test(time.series, type = "db10")
#auto.test(time.series, type = "db10", method = "CV")
#auto.test(time.series, type = "tri")
#auto.test(time.series, type = "cos")
#auto.test(time.series, type = "sin")
#auto.test(time.series, type = "cf1")

5.3 Test of PACF

Test of PACF is processed by means of the multiplier bootstrap procedure for test of stability. the null hypothesis is

\[ H_0: \text{The PACF is approximatedly of lag} \ b_*. \]

To test \(H_0\), we proceed to provide test statistics denotes as \(T\) in terms of the AR approximation estimation,

\[ T = \sum_{j=b_*+1}^{b_0} \int_{0}^{1} \left( \hat \phi_j(t) \right)^2 dt. \] where \(b_0\) is a large number and \(b_0 > b_*\). \(T\) should be small under the null because \(H_0\) is equivalent to \(\left( \hat \phi_j(t) \right)^2 = 0\) for \(j > b_*\). Recall the statistics used in Test of Stability for AR Approximations. Only one term is modified which is \(\hat \Gamma\), Suppose the true lag for time series data is \(b_*\) and given number of lags with \(\left\{1,..,b_*,..,b_0 \right\}\)

\[ \hat \Gamma_j := \hat \Sigma^{-1}I_{(b_0-b_j)*c}\hat \Sigma^{-1} \] where \(j = 1,..,b_*, .., b_0\). We will get \(b_0\) p values basis on the procedure.Given significant level \(\alpha\). The true \(b_*\) is the smallest \(b_j\) s.t p value is larger than \(\alpha\).

fix.pacf.test(time.series,5, type = "Legen", B.s  = 800, m = 9)
#fix.pacf.test(time.series,5, type = "Cheby")
#fix.pacf.test(time.series,2, type = "db10")
#fix.pacf.test(time.series,3, type = "tri")
#fix.pacf.test(time.series,3, type = "cos")
#fix.pacf.test(time.series,10, type = "sin")
#fix.pacf.test(time.series,2, type = "cf1")

auto.pacf.test(time.series, 3, 8, type = "Legen")
#auto.pacf.test(time.series, type = "Legen", method = "Elbow")
#auto.pacf.test(time.series, type = "Cheby", method = "Elbow", threshold = 0.02)
#auto.pacf.test(time.series, type = "db10", method = "CV")
#auto.pacf.test(time.series, type = "tri")
#auto.pacf.test(time.series, type = "cos")
#auto.pacf.test(time.series, type = "sin")
#auto.pacf.test(time.series, type = "cf1")

5.4 Test results table

This test table shows some results of test with Legendere, basis functions. The experiments were executed by authors. In detail, we use the AR1 and AR2 stationary and non-stationary models to simulate the probability of error type one and the probability of error type two in order to verify the accuracy and power of the test. n is the data size. c, b and m are tuning parameters. Nonstationary AR1 and AR2 are given in the General simulation setup section. The stationary AR1 and stationary AR2 are given,

Stationary AR1 process, \[ x_i = 0.5x_{i-1} + \epsilon_i \] where \(\epsilon_i\) are i.i.d Gaussian with variance \(v\) for \(i = 1, ..., n\)

Stationary AR2 process, \[ x_i = 0.4x_{i-1} + 0.6x_{i-2} + \epsilon_i \] where \(\epsilon_i\) are i.i.d Gaussian with variance \(v\) for \(i = 1, ..., n\)

Test results are shown,

Basis	Time series	n	c	b	m	Times of reject/Times of testing
Legendre	stationary AR1	786	2	1	8	15/300
Legendre	nonstationary AR1	512	5	1	9	1000/1000
Legendre	stationary AR2	1024	2	2	9	45/1000
Legendre	nonstationary AR2	786	2	2	9	1000/1000
Legendre	nonstationary AR2	786	2	2	MV method	1000/1000
trigonometric	stationary AR1	1024	2	1	9	56/1000
trigonometric	stationary AR2	1024	2	2	9	14/300
trigonometric	nonstationary AR1	1024	2	1	9	999/1000
trigonometric	nonstationary AR2	1024	2	2	9	300/300
db10	stationary AR1	786	2	1	8	7/150
db10	stationary AR2	786	2	2	8	12/150
db10	nonstationary AR1	786	2	1	8	150/150
db10	nonstationary AR2	786	2	2	8	150/150

6 PACF plot with Real Data Example

sie.plot.pacf() provided in this package for PACF plot. This section perform one PACF plot by simulation data and 3 examples about plotting PACF for time series by real data. The first one is U.K. National Accounts data This data set was obtained from the Office for National Statistics contains values of the U.K.gross-valued added (GVA), which is a component of gross domestic product (GDP). Observations are recorded quarterly from quarter one 1955 to quarter three 2010 and consists of 223 observations.The second data set is Precipitation in Eastport, U.S (need finding) it has monthly precipitation in millimeters from January 1887 until December 1950 (768 observations in total) at a location in Eastport. The third dataset is Euro-Dollar exchange rate from January 1999 until October 2017 and it is obtained from EuroStat. We use log returns of the monthly Euro-Dollar exchange rate. Some data clean process may need to be involved such as removing outliers or taking difference.

6.1 Simulation AR(2) (Appriximation)

sie.plot.pacf(time.series.2, 5, 8, type = "Legen", title = "pacf plot")

sie.plot.pacf(time.series.2, 5, 8, type = "Legen", ops = "3d")

#sie.plot.pacf(time.series.2, 5, 1, "Legen", title = "pacf plot") 
#sie.plot.pacf(time.series.2, 5, 10, "Legen", ops = "3d") 
#sie.plot.pacf(time.series.2, 5, 5, "Cheby") 
#sie.plot.pacf(time.series.2, 2, 5, "db10") 
#sie.plot.pacf(time.series.2, 2, 5, "cf1") 
# sie.plot.pacf(time.series.2, 5, 5, "sin") 
# sie.plot.pacf(time.series.2, 3, 5, "cos") 
#sie.plot.pacf(time.series.2, 2, 5, "tri")

6.2 gross-valued added (GVA)

t4.2 = read.csv(file = "4.2table.csv")

timese4.2 = as.numeric(t4.2[82:304,2])
timese4.2 = diff(diff(timese4.2))

timese4.2 = timese4.2[-170]
timese4.2 = timese4.2[-211]
timese4.2 = timese4.2[-170]

plot.ts(timese4.2, xlab = "t", ylab = "gross-valued added (GVA)")

sie.plot.pacf(timese4.2, c = 2, lag = 5, type = "db10", title = "pacf plot")

6.3 Precipitation in Eastport, U.S

# 4.3 Precipitation in Eastport, U.S (need finding)
# The left panel in Figure 4 shows monthly precipitation in millimetres from January 1887 until December 1950 (768 obser- vations) at a location in Eastport.
t4.3 = read.table(file = "PEAS.txt",encoding = "ASCII")
timese4.3 = c()
for(i in 1:dim(t4.3)[1]){
  timese4.3 = append(timese4.3, t4.3[i,])
}
timese4.3 = unname(unlist(timese4.3))

plot.ts(timese4.3, xlab = "t", ylab = "Precipitation in Eastport, U.S")

sie.plot.pacf(timese4.3, c = 2, lag = 5, type = "Legen", title = "pacf plot")

6.4 Euro-Dollar exchange rate

t4.4 = read.table(file = 'table4.4.csv', sep = '\t', header = TRUE)

timese4.4 = c()
for(i in 1:dim(t4.4)[1]){
  timese4.4[i] = as.numeric(unlist(strsplit(t4.4[i,], ","))[length(unlist(strsplit(t4.4[i,], ",")))])

}
timese4.4 = diff(log(timese4.4))
plot.ts(timese4.4, xlab = "t", ylab = "Euro-Dollar exchange rate(log)")

sie.plot.pacf(timese4.4, c = 3, lag = 5, type = "Legen", title = "pacf plot")

Reference

[1] Ding, Xiucai, and Zhou, Zhou. “Estimation and inference for precision matrices of nonstationary time series.” The Annals of Statistics 48(4) (2020): 2455-2477.

[2] Ding, Xiucai, and Zhou, Zhou. “Auto-regressive approximations to non-stationary time series, with inference and applications.” Available online, 2021.

[3] Ding, Xiucai, and Zhou, Zhou. “On the partial autocorrelation function for locally stationary time series: characterization, estimation, inference and applications.” Available online, 2022.

[4] Chen, Xiaohong. “Large Sample Sieve Estimation of Semi-Nonparametric Models.” Handbook of Econometrics, 6(B): 5549–5632,2007.

[5] Zhou, Zhou, “Heteroscedasticity and Autocorrelation Robust Structural Change Detection.” Journal of the American Statistical Association, 108(502): 726–740,2013.

UC Davis, xcading@ucdavis.edu, xiucaiding89@gmail.com ↩︎
UC Davis, mrcqian@ucdavis.edu ↩︎

A user’s guide to the R package ‘Sie2nts’

Xiucai Ding¹, Chen Qian²

1 Introduction

2 Basis functions contained in the package

2.1 Legendre polynomials

2.2 Chebyshev polynomials

2.3 trigonometric, cosine and sine polynomials

2.4 Class of Splines functions

2.5 Cascade Algorithm

2.6 Daubechies wavelet

2.7 Coiflet wavelet

3 General simulation setup

4 Estimation

4.1 Estimation of AR approximation coefficients

4.2 Estimation of PACF

4.3 Prediction based on AR approximation

5 Inference

5.1 Choosing tuning parameters \(b_{*}\),\(c\) and \(m\)

5.2 Test of Stability for AR Approximations

5.3 Test of PACF

5.4 Test results table

6 PACF plot with Real Data Example

6.1 Simulation AR(2) (Appriximation)

6.2 gross-valued added (GVA)

6.3 Precipitation in Eastport, U.S

6.4 Euro-Dollar exchange rate

Reference

A user’s guide to the R package ‘Sie2nts’

Xiucai Ding1, Chen Qian2

1 Introduction

2 Basis functions contained in the package

2.1 Legendre polynomials

2.2 Chebyshev polynomials

2.3 trigonometric, cosine and sine polynomials

2.4 Class of Splines functions

2.5 Cascade Algorithm

2.6 Daubechies wavelet

2.7 Coiflet wavelet

3 General simulation setup

4 Estimation

4.1 Estimation of AR approximation coefficients

4.2 Estimation of PACF

4.3 Prediction based on AR approximation

5 Inference

5.1 Choosing tuning parameters \(b_{*}\),\(c\) and \(m\)

5.2 Test of Stability for AR Approximations

5.3 Test of PACF

5.4 Test results table

6 PACF plot with Real Data Example

6.1 Simulation AR(2) (Appriximation)

6.2 gross-valued added (GVA)

6.3 Precipitation in Eastport, U.S

6.4 Euro-Dollar exchange rate

Reference

Xiucai Ding¹, Chen Qian²