Using robust FPCA to identify outliers in functional time series, with applications to the electricity market

  1. Juan M. Vilar 1
  2. Paula Raña 1
  3. Germán Aneiros 1
  1. 1 Departamento de Matemáticas, Universidade da Coruña
Journal:
Sort: Statistics and Operations Research Transactions

ISSN: 1696-2281

Year of publication: 2016

Volume: 40

Issue: 2

Pages: 321-348

Type: Article

Abstract

This study proposes two methods for detecting outliers in functional time series. Both methods take dependence in the data into account and are based on robust functional principal component analysis. One method seeks outliers in the series of projections on the first principal component. The other obtains uncontaminated forecasts for each data set and flags as outliers those observations whose residuals have an unusually high norm. A simulation study shows the performance of the proposed procedures and the need to take dependence in the time series into account. Finally, the usefulness of the methodology is illustrated on two real data sets from the electricity market: daily curves of electricity demand and price in mainland Spain for the year 2012.
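As a rough illustration of the idea behind the first method, the sketch below projects discretized daily curves on a first principal component and flags unusual scores. It is a simplified stand-in and not the procedure of the paper: it uses an ordinary SVD on median-centered curves instead of robust FPCA, it does not adjust for dependence, and the function names, the simulated data and the 3.5 cutoff are all assumptions made for this example.

```python
import numpy as np

def first_pc_scores(curves):
    """Scores of each curve on the first principal component.

    Assumption: curves are centered with the pointwise median and the
    component comes from a plain SVD, which is NOT a robust FPCA.
    """
    center = np.median(curves, axis=0)
    centered = curves - center
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]

def flag_by_mad(values, threshold=3.5):
    """Indices whose distance to the median exceeds threshold * scaled MAD."""
    med = np.median(values)
    mad = 1.4826 * np.median(np.abs(values - med))
    return np.flatnonzero(np.abs(values - med) > threshold * mad)

# Hypothetical data: 100 daily curves observed at 24 hourly points,
# with an AR(1) daily level to mimic dependence between days.
rng = np.random.default_rng(0)
hours = np.arange(24)
n_days = 100
level = np.zeros(n_days)
for t in range(1, n_days):
    level[t] = 0.6 * level[t - 1] + rng.normal(scale=0.5)

curves = (
    10
    + 3 * np.sin(2 * np.pi * hours / 24)
    + level[:, None]
    + rng.normal(scale=0.2, size=(n_days, 24))
)
curves[40] += 8.0  # inject one artificial magnitude outlier on day 40

scores = first_pc_scores(curves)
flagged = flag_by_mad(scores)
print(flagged)
```

The median and MAD are used only so that the injected curve does not inflate its own cutoff. Because the rule treats the scores as independent, it may also flag ordinary days in a strongly dependent series, which is the situation the abstract says the proposed methods are designed to handle.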
