## Heuristic segmentation of a nonstationary time series

*Physical Review E* 69, 021108 (2004)

### Abstract

Many phenomena, both natural and human-influenced, give rise to signals whose statistical properties change under time translation, i.e., are nonstationary. For some practical purposes, a nonstationary time series can be seen as a concatenation of stationary segments. However, the exact segmentation of a nonstationary time series is a hard computational problem that cannot be solved exactly by existing methods. For this reason, heuristic methods have been proposed. Using one such method, it has been reported that for several cases of interest (e.g., heart beat data and Internet traffic fluctuations) the distribution of durations of these stationary segments decays with a power-law tail. A potential technical difficulty that has not been thoroughly investigated is that a nonstationary time series with a (scale-free) power-law distribution of stationary segments is harder to segment than other nonstationary time series because of the wider range of possible segment lengths. Here, we investigate the validity of a heuristic segmentation algorithm recently proposed by Bernaola-Galván *et al.* [Phys. Rev. Lett. 87, 168105 (2001)] by systematically analyzing surrogate time series with different statistical properties. We find that if a given nonstationary time series has stationary periods whose lengths are distributed as a power law, the algorithm can split the time series into a set of stationary segments with the correct statistical properties. We also find that the estimated power-law exponent of the distribution of stationary-segment lengths is affected by (i) the minimum segment length and (ii) the ratio $R = \sigma_{\epsilon} / \sigma_{\bar{x}}$, where $\sigma_{\bar{x}}$ is the standard deviation of the mean values of the segments and $\sigma_{\epsilon}$ is the standard deviation of the fluctuations within a segment.
Furthermore, we determine that the performance of the algorithm is generally not affected by uncorrelated noise spikes or by weak long-range temporal correlations of the fluctuations within segments.
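
To make the two quantities named above concrete, the following is a minimal sketch of how a surrogate series of this kind might be generated. It is not the procedure used in the paper. The function name `make_surrogate`, its parameters, the Gaussian choice for segment means and fluctuations, the inverse-transform sampling of segment lengths, and the `max_len` cap are all assumptions made for illustration.

```python
import numpy as np

def make_surrogate(n_segments, exponent, min_len, R,
                   sigma_mean=1.0, max_len=10_000, seed=0):
    """Concatenate stationary segments with power-law distributed lengths.

    Assumptions (not taken from the paper): segment lengths follow a density
    p(L) ~ L**(-exponent) for L >= min_len, segment means and within-segment
    fluctuations are Gaussian, and lengths are capped at max_len to bound
    memory use.
    """
    if exponent <= 1.0:
        raise ValueError("exponent must be > 1 for a normalizable power law")
    rng = np.random.default_rng(seed)

    # Inverse-transform sampling of a continuous power law, floored to integers.
    # u is in [0, 1), so (1 - u) is in (0, 1] and every length is >= min_len.
    u = rng.random(n_segments)
    raw = min_len * (1.0 - u) ** (-1.0 / (exponent - 1.0))
    lengths = np.minimum(np.floor(raw), max_len).astype(int)

    # One mean per segment, drawn with standard deviation sigma_mean.
    means = rng.normal(0.0, sigma_mean, size=n_segments)

    # R = sigma_eps / sigma_mean fixes the within-segment fluctuation size.
    sigma_eps = R * sigma_mean
    series = np.concatenate([
        m + rng.normal(0.0, sigma_eps, size=n)
        for m, n in zip(means, lengths)
    ])

    # Indices at which a new segment starts (true change points).
    boundaries = np.cumsum(lengths)[:-1]
    return series, lengths, boundaries


series, lengths, boundaries = make_surrogate(
    n_segments=200, exponent=2.2, min_len=20, R=0.5
)
```

In this sketch, a larger `R` means the within-segment fluctuations are large relative to the jumps between segment means, which would be expected to make the true boundaries harder to detect. The returned `boundaries` array gives the ground truth against which a segmentation algorithm's output could be compared.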