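Time series clustering method based on centered Copula function similarity measure - Journal of Shaanxi Normal University Periodical Press website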

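Journal of Shaanxi Normal University (Natural Science Edition)

Special Topic on Data Mining

Time series clustering method based on centered Copula function similarity measure

ZHEN Yuanting, YE Jimin*, LI Guorong

(School of Mathematics and Statistics, Xidian University, Xi'an 710126, Shaanxi, China)

YE Jimin, male, professor, doctoral supervisor; research interests: blind signal processing, statistical learning methods, and data mining. E-mail: jmye@mail.xidian.edu.cn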

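Abstract:

To address the nonlinear time series data that are widespread in practice, a new method is proposed for clustering time series with general dependency structures. Exploiting the property that the centered Copula function can effectively measure independence between random variables, the method uses the centered Copula process to capture the dynamic dependency structure of a time series and uses the Cramér-von Mises statistic to construct a new similarity measure for time series clustering; a consistent nonparametric estimator of this measure and an equivalent form that is convenient to compute are also given. Experimental results show that a hierarchical clustering algorithm based on the new similarity measure is suitable not only for nonlinear time series data but also achieves relatively high clustering quality on time series with linear dependency structures and on real data.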

Keywords:

similarity measure; centered Copula; nonlinear time series; independence; dynamic dependency structure

Received:

2020-09-23

CLC number:

O231

Document code:

Article ID:

1672-4291(2021)01-0029-08

Funding:

National Natural Science Foundation of China (61573014)

DOI:

Time series clustering method based on centered Copula function similarity measure

ZHEN Yuanting, YE Jimin*, LI Guorong

(School of Mathematics and Statistics, Xidian University, Xi'an 710126, Shaanxi, China)

Abstract:

A new method based on the centered Copula function is proposed for clustering time series with general dependency structures. Because the centered Copula function can capture the dependency structure between two variables, the centered Copula process is used to capture the dynamic dependency structure of a time series. Using the Cramér-von Mises statistic of the centered Copula process, a new similarity measure for time series is constructed, and a consistent nonparametric estimator is given together with an equivalent form that is easy to compute. Simulation studies with a hierarchical clustering algorithm show that the proposed similarity measure is suitable not only for nonlinear time series data but also achieves high clustering quality on time series with linear dependency structures and on real data, namely domestic provincial GDP data.
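
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's estimator: the abstract does not state the exact form of the similarity measure, so the sketch assumes one plausible reading, in which the centered Copula at lag h is the empirical copula of the pairs (X_t, X_{t+h}) minus the independence copula uv, and the dissimilarity between two series is a Cramér-von Mises-type mean squared difference of these functions over a grid, summed over lags. The function names and the parameters `max_lag` and `grid_size` are hypothetical choices made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import rankdata


def centered_copula_grid(x, lag, grid):
    """Empirical copula of (x_t, x_{t+lag}) on grid x grid, minus u*v."""
    x = np.asarray(x, dtype=float)
    a, b = x[:-lag], x[lag:]
    n = len(a)
    u = rankdata(a) / n  # rank-based pseudo-observations in (0, 1]
    v = rankdata(b) / n
    below_u = (u[:, None] <= grid[None, :]).astype(float)
    below_v = (v[:, None] <= grid[None, :]).astype(float)
    # emp[i, j] = fraction of pairs with u <= grid[i] and v <= grid[j]
    emp = below_u.T @ below_v / n
    return emp - np.outer(grid, grid)


def cvm_dissimilarity(x, y, max_lag=3, grid_size=20):
    """Assumed CvM-type distance: grid-averaged squared difference, summed over lags."""
    grid = np.linspace(0.05, 0.95, grid_size)
    total = 0.0
    for lag in range(1, max_lag + 1):
        # The u*v term cancels in this difference; it is kept only to
        # mirror the centered form described in the abstract.
        diff = centered_copula_grid(x, lag, grid) - centered_copula_grid(y, lag, grid)
        total += np.mean(diff ** 2)
    return float(np.sqrt(total))


def cluster_series(series, n_clusters, max_lag=3):
    """Average-linkage hierarchical clustering on the pairwise dissimilarities."""
    m = len(series)
    dist = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            dist[i, j] = dist[j, i] = cvm_dissimilarity(series[i], series[j], max_lag)
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

Calling `cluster_series(list_of_arrays, n_clusters=2)` returns one integer label per series. Because the measure depends only on ranks, it is unaffected by monotone transformations of each series, which is one reason copula-based measures are attractive for nonlinear dependence. The choice of average linkage here is also an assumption.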

Keywords:

similarity measure; centered Copula; nonlinear time series; independence; dynamic dependency structure