Research Progress in Feature-Based Time Series Clustering Methods

DOI: 10.11820/dlkxjz.2012.10.008, PP. 1307-1317

Keywords: clustering, time series, time series features, data mining

Abstract:

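Time series clustering partitions a set of objects into groups according to their similarity, revealing both the shared characteristics of objects within a group and the differences between groups. When the series are high-dimensional, traditional time series clustering methods are sensitive to noise, a suitable similarity measure is hard to define, and the resulting clusters are often difficult to interpret. When the data contain missing values or the series have unequal lengths, these methods are also hard to apply. To address these problems, some researchers have proposed feature-based time series clustering methods, which not only overcome these difficulties but can also uncover similarity in the intrinsic features of the series. Organized by the different features of time series, this paper reviews research progress in feature-based time series clustering methods, analyzes and comments on them, and concludes with an outlook on future research.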
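To make the idea concrete, the following is a minimal sketch of feature-based clustering and is not taken from the paper. It assumes four illustrative features (mean, standard deviation, lag-1 autocorrelation, linear trend slope), marks missing values with `None`, and uses a plain k-means with a deterministic farthest-point initialization. The function names `features`, `standardize`, and `kmeans` and the toy data are hypothetical. Each series is reduced to a fixed-length feature vector, so series of unequal length or with gaps can be compared directly.

```python
import math

def features(series):
    """Map one series (None = missing value) to a fixed-length feature vector."""
    xs = [x for x in series if x is not None]  # assumption: simply drop missing points
    n = len(xs)                                # requires at least two observed values
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    std = math.sqrt(var)
    # Lag-1 autocorrelation of the observed values (0 for a constant series).
    if var > 0:
        acf1 = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1)) / (n * var)
    else:
        acf1 = 0.0
    # Least-squares slope against the index 0..n-1.
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (x - mean) for t, x in enumerate(xs)) / denom
    return [mean, std, acf1, slope]

def standardize(rows):
    """Z-score each feature column so that no single feature dominates the distance."""
    cols = []
    for col in zip(*rows):
        m = sum(col) / len(col)
        s = math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        cols.append([(v - m) / s for v in col])
    return [list(r) for r in zip(*cols)]

def kmeans(points, k, iters=20):
    """Plain k-means with farthest-point initialization (deterministic)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data: unequal lengths and missing values.
series = [
    [1, 2, 3, None, 5, 6, 7, 8],   # rising, one gap
    [0, 1, 2, 3, 4],               # rising, shorter
    [1, -1, 1, -1, 1, -1],         # oscillating
    [2, -2, None, -2, 2, -2, 2],   # oscillating, one gap
]
feature_rows = [features(s) for s in series]
labels = kmeans(standardize(feature_rows), k=2)
print(labels)
```

On this toy input the two rising series fall into one cluster and the two oscillating series into the other, even though no two series have the same length. Dropping missing points before computing the autocorrelation is a simplification; a real application would choose features and gap handling suited to the data.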

