Modified Classification Method of Multivariate Time Series Based on Shapelets

Authors: Karpenko A.P., Sotnikov P.I. Published: 12.04.2017
Published in issue: #2(113)/2017  
DOI: 10.18698/0236-3933-2017-2-46-65

Category: Informatics, Computer Engineering and Control | Chapter: System Analysis, Control, and Information Processing  
Keywords: time series, classification, shapelets, genetic algorithm, brain-computer interface

We consider the classification of multivariate time series using the shapelet paradigm. Instead of an exhaustive search over all subsequences of the original time series, we propose using a genetic algorithm for shapelet discovery. Shapelet discovery is treated as a single-objective optimization problem: the quality of a candidate serves as the objective function, and the decision variables are the candidate attributes that define its position in the original dataset. We also propose measuring shapelet quality by classification accuracy. The accuracy is assessed on a new dataset in which each object represents the vector of distances from a shapelet to the original time series. We evaluate the efficiency of the proposed modifications on known electroencephalogram (EEG) recordings obtained from subjects performing a spelling task with a P300-based brain-computer interface (BCI). The results show that these modifications can reduce the search space by nearly 99% with no loss of classification accuracy.
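The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate encoding `(series index, channel, start, length)`, the truncation selection, the uniform crossover, and the leave-one-out 1-NN classifier used as the quality measure are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import numpy as np

def shapelet_distance(shapelet, series):
    """Minimum Euclidean distance from a shapelet to any equal-length window."""
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return float(np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min())

def fitness(X, y, cand):
    """Candidate quality: leave-one-out 1-NN accuracy on the distance feature.
    (Assumed stand-in classifier, not taken from the paper.)"""
    i, c, start, length = cand
    shapelet = X[i, c, start:start + length]
    d = np.array([shapelet_distance(shapelet, x[c]) for x in X])
    gap = np.abs(d[:, None] - d[None, :])
    np.fill_diagonal(gap, np.inf)          # a series may not vote for itself
    return float((y[gap.argmin(axis=1)] == y).mean())

def repair(X, cand, min_len):
    """Clamp candidate attributes so they point at a valid subsequence."""
    n, ch, T = X.shape
    i, c, start, length = cand
    i = int(np.clip(i, 0, n - 1))
    c = int(np.clip(c, 0, ch - 1))
    length = int(np.clip(length, min_len, T))
    start = int(np.clip(start, 0, T - length))
    return (i, c, start, length)

def random_candidate(X, rng, min_len):
    n, ch, T = X.shape
    length = int(rng.integers(min_len, T // 2 + 1))
    start = int(rng.integers(T - length + 1))
    return (int(rng.integers(n)), int(rng.integers(ch)), start, length)

def discover_shapelet(X, y, pop_size=20, generations=15, min_len=4, seed=0):
    """Genetic search over candidate positions instead of exhaustive search."""
    rng = np.random.default_rng(seed)
    pop = [random_candidate(X, rng, min_len) for _ in range(pop_size)]
    for _ in range(generations):
        scores = np.array([fitness(X, y, c) for c in pop])
        order = np.argsort(-scores)
        elite = [pop[k] for k in order[:pop_size // 2]]   # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = (elite[k] for k in rng.integers(len(elite), size=2))
            mask = rng.integers(0, 2, size=4)             # uniform crossover
            child = [ai if m else bi for ai, bi, m in zip(a, b, mask)]
            g = rng.integers(4)                           # mutate one attribute
            child[g] += int(rng.integers(-3, 4))          # shift by -3..3
            children.append(repair(X, child, min_len))
        pop = elite + children
    scores = [fitness(X, y, c) for c in pop]
    best = int(np.argmax(scores))
    return pop[best], scores[best]

# Synthetic two-channel data: class 1 carries a bump on channel 1.
rng = np.random.default_rng(1)
X = rng.normal(0.0, 0.3, size=(30, 2, 40))
y = np.array([0, 1] * 15)
for k in np.flatnonzero(y == 1):
    p = int(rng.integers(0, 30))
    X[k, 1, p:p + 8] += 3 * np.hanning(8)

best, acc = discover_shapelet(X, y)
print("best candidate (series, channel, start, length):", best)
print("leave-one-out accuracy:", acc)
```

With the default settings the sketch performs a few hundred fitness evaluations, whereas an exhaustive search would score every subsequence of every channel of every series. The data here are synthetic, so the output says nothing about the EEG results reported in the paper.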

