Estimating entropy rate from censored symbolic time series: a test for time-irreversibility
Estimating entropy rate from censored symbolic time series: a test for time-irreversibility

R. Salgado-García^{1, a)} and Cesar Maldonado^{2}
1) Centro de Investigación en Ciencias-IICBA, Physics Department, Universidad Autónoma del Estado de Morelos. Avenida Universidad 1001, colonia Chamilpa, CP 62209, Cuernavaca Morelos, Mexico.
2) IPICYT/División de Control y Sistemas Dinámicos. Camino a la Presa San José 2055, Lomas 4a. sección, C.P. 78216, San Luis Potosí, S.L.P., Mexico.

a) Electronic mail: raulsg@uaem.mx

arXiv:2009.11351v2 [cond-mat.stat-mech] 10 Jan 2021
(Dated: 12 January 2021)

In this work we introduce a method for estimating the entropy rate and the entropy production rate from finite symbolic time series. From the point of view of statistics, estimating entropy from a finite series can be interpreted as a problem of estimating parameters of a distribution from a censored or truncated sample. We use this point of view to give estimations of the entropy rate and the entropy production rate, assuming that they are parameters of a (limit) distribution. The last statement is actually a consequence of the fact that the distribution of estimations obtained from recurrence-time statistics satisfies the central limit theorem. We test our method using time series coming from Markov chain models, discrete-time chaotic maps, and a real DNA sequence from the human genome.

Entropy rate as well as entropy production rate are fundamental properties of stochastic processes and deterministic dynamical systems. For instance, in dynamical systems the entropy rate is closely related to the largest Lyapunov exponent: the positivity of the entropy rate is a signature of the presence of chaos. Similarly, the entropy production rate is a measure of the degree of irreversibility of a given system; thus, in some sense, a nonzero entropy production rate states how far a system is from equilibrium. However, estimating either the entropy rate or the entropy production rate is not a trivial task. One of the main limitations to giving precise estimations of these quantities is the fact that observed data (time series) are always finite, while the entropy rate and the entropy production rate are asymptotic quantities, defined as limits that would require infinitely long time series. We use recurrence-time statistics combined with the theory of censored samples from statistics to propose sampling schemes and define censored estimators for the entropy rate and the entropy production rate, taking advantage of the finiteness of the observed data.

I. INTRODUCTION

Entropy rate and entropy production rate are two quantities playing a central role in equilibrium and nonequilibrium statistical mechanics. On the one hand, the entropy rate (also called Kolmogorov-Sinai entropy) is closely related to the thermodynamical entropy^{1,2}, which is a fundamental quantity in the context of equilibrium statistical mechanics. On the other hand, entropy production has a prominent role in the development of nonequilibrium statistical mechanics^{3-5}. Both the entropy rate and the entropy production rate have a rigorous definition in dynamical systems and stochastic processes (see Ref. 6 for complete details). The entropy production rate quantifies, in some way, the degree of time-irreversibility of a given system from a microscopic point of view, which in turn tells us how far such a system is from thermodynamic equilibrium^{4,5,7}. Moreover, the time-irreversibility of certain dynamical processes in nature might be an important feature because it would imply the influence of nonlinear dynamics or non-Gaussian noise on the dynamics of the system^{8}. All these features of time-irreversibility have encouraged the study of this property in several systems. For instance, in Ref. 9 it has been found that real DNA sequences would be spatially irreversible, a property that has been explored with the aim of understanding the intriguing statistical features of the actual structure of the genome. The fact that DNA might be spatially irreversible has been used to propose a mechanism of noise-induced rectification of particle motion^{10} that would be important in the study of biological processes involving DNA transport.

Testing the irreversibility of time series has also been the subject of intense research. For example, in Ref. 8 a symbolic-dynamics approach has been proposed to determine whether a time series is time-irreversible or not. Another important study has been reported in Ref. 11, where the authors introduced a method for determining the time-irreversibility of time series by using a visibility-graph approach. That approach has also been used to understand the time-reversibility of non-stationary processes^{12}. The possibility of determining this temporal asymmetry has also led to attempts to understand the dynamics of several processes beyond physical systems. In Ref. 13 the time-irreversibility of financial time series has been explored as a feature that could be used for ranking companies for optimal portfolio design. In Ref. 14 the time-irreversibility of human heartbeat time series has been studied, relating this property to aging and disease of individuals. Moreover, time-irreversibility has also been used to understand several properties of classical music^{15}.

In the literature one can find many estimators of the entropy rate in symbol sequences produced by natural phenomena, as well as in dynamical systems, random sequences, or even natural languages taken from written texts.
Perhaps the most used method for entropy estimation is the empirical approach, in which one estimates the probability of the symbols using their empirical frequency in the sample; this is then used to estimate the entropy rate directly from its definition^{16,17}. One can find many works in this direction trying to find better, unbiased and well-balanced estimators (see Ref. 18 and references therein). One can go further by asking for the consistency and the fluctuation properties of these estimators. For instance, in Refs. 19 and 20 there are explicit and rigorous fluctuation bounds, under some mild additional assumptions, for these so-called "plug-in" estimators. On the other hand, within the same empirical approach, there are also estimators for the relative empirical entropy as a quantification of the entropy production^{21,22}.

From another point of view, the problem of estimating the entropy rate of stationary processes has also been studied using the recurrence properties of the source. This is another major technique used in the context of stationary ergodic processes on the space of infinite sequences, in areas such as information theory, probability theory and the ergodic theory of dynamical systems (we refer the interested reader to Ref. 23 and the references therein). The basis of this approach is the Wyner-Ziv-Ornstein-Weiss theorem, which establishes the almost sure asymptotic convergence of the logarithm of the recurrence time of a finite sample (scaled by its length) to the entropy rate^{23}. This result uses the Shannon-McMillan-Breiman theorem, which in turn can be thought of as an ergodic theorem for the entropy^{23}. Under this approach it is possible to define estimators using quantities such as the return time, hitting time, and waiting time, among others^{24}. Here we will use the term "recurrence time" as a comprehensive term for those mentioned before. Moreover, it is possible to obtain very precise results on the consistency and fluctuations of these estimators by applying the available results on the distribution of these quantities^{25-27}.

In the setting of Gibbs measures in the thermodynamic formalism, one can also find consistent estimators for the entropy rate defined from the return, hitting, and waiting times, and one also has precise statements on their fluctuations, such as the central limit theorem^{28}, large deviation bounds and fluctuation bounds^{20,28}. The same occurs within the study of the estimation of the entropy production rate: in the context of Markov chains applied to the quantification of the irreversibility or time-reversal asymmetry see Refs. 7 and 29; for Gibbsian sources see Ref. 30, as well as Refs. 30 and 31 for their fluctuation properties.

Nonetheless, for real systems, determining the value of the entropy rate and the entropy production rate is not a trivial task. This is because these quantities are obtained as limit values of the logarithm of recurrence times as the sample length goes to infinity. This is a fundamental limitation, since observations are always finite. So, instead of having the true value of the entropy rate or the entropy production rate, one always obtains a finite-time approximation. This makes us believe that there is a need to define estimators for finite samples, using the point of view of the recurrence times.

The article is organized as follows. In Section II we give a summary of the asymptotic properties of the estimators based on the recurrence-time statistics. We also describe the method used for estimating parameters of the normal distribution from a given censored sample. In Section III we propose our sampling schemes for estimating the entropy rate and the reversed entropy rate using the recurrence-time statistics. There, we also describe the method that will be used for implementing the estimations in real data. In Section IV we test the methodology established in Section III for estimating the entropy rate and the reversed entropy rate in an irreversible three-state Markov chain. We compare our estimations with the exact values, which can actually be computed. In Section V we implement the proposed estimation method in deterministic chaotic systems, an n-step Markov chain and a real DNA sequence. Finally, in Section VI we give the main conclusions of our work.

II. ENTROPY RATE AND ENTROPY PRODUCTION RATE

A. Recurrence time statistics

Consider a finite set A, which we will refer to as the alphabet. Let X := {X_n : n ∈ N} be a discrete-valued stationary ergodic process generated by the law P, whose realizations are infinite sequences of symbols taken from A; that is, the set of all possible realizations is a subset of A^N. Here we denote by x = x_1 x_2 x_3 ... an infinite realization of the process X. Let ℓ be a positive integer; we denote by x_1^ℓ the string of the first ℓ symbols of the realization x. A finite string a := a_1 a_2 a_3 ... a_ℓ comprised of ℓ symbols will be called either an ℓ-word or an ℓ-block; we may use one or the other without making any distinction. We will say that the ℓ-word a "occurs" at the kth site of the trajectory x if x_k^{k+ℓ−1} = a. An alternative notation for the ℓ-block at the kth site of x will be x(k, k+ℓ−1).

Next, we introduce the return time, the waiting time and the hitting time. Let us consider a finite string a_1^ℓ made out of symbols of the alphabet A. Given two independent realizations x and y, let x_1^ℓ and y_1^ℓ be their first ℓ symbols; then the return, the waiting and the hitting time are defined as follows,

    ρ_ℓ := ρ_ℓ(x) := inf{ k > 1 : x_k^{k+ℓ−1} = x_1^ℓ },      (1)
    ω_ℓ := ω_ℓ(x, y) := inf{ k ≥ 1 : y_k^{k+ℓ−1} = x_1^ℓ },   (2)
    τ_ℓ := τ_ℓ(a_1^ℓ, x) := inf{ k ≥ 1 : x_k^{k+ℓ−1} = a_1^ℓ }, (3)

respectively.

Wyner and Ziv (see for instance Ref. 32) proved that for a stationary ergodic process, the quantity (1/ℓ) log ρ_ℓ converges to the entropy rate in probability, and that for stationary ergodic Markov chains, (1/ℓ) log ω_ℓ also converges to the entropy rate h in probability. That is, these quantities grow exponentially fast with ℓ and their limit rate equals the entropy rate in probability. Later, Ornstein and Weiss^{33} showed that for stationary ergodic processes

    lim_{ℓ→∞} (1/ℓ) log ρ_ℓ = h,   P-a.s.   (4)
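To fix ideas, the following minimal Python sketch computes the three quantities of eqs. (1)–(3) by brute-force search on symbol strings. The function names and conventions (1-based positions as in the text, `None` for an occurrence that is never observed) are ours, not part of the original formulation.

```python
def hitting_time(word, x):
    """tau_ell(a, x) = inf{k >= 1 : x_k^{k+ell-1} = a}, eq. (3).
    Positions are 1-based as in the text; None means "never observed"."""
    ell = len(word)
    for k in range(1, len(x) - ell + 2):
        if x[k - 1:k - 1 + ell] == word:
            return k
    return None

def return_time(x, ell):
    """rho_ell(x) = inf{k > 1 : x_k^{k+ell-1} = x_1^ell}, eq. (1)."""
    t = hitting_time(x[:ell], x[1:])  # search from the 2nd symbol onwards
    return None if t is None else t + 1

def waiting_time(x, y, ell):
    """omega_ell(x, y) = inf{k >= 1 : y_k^{k+ell-1} = x_1^ell}, eq. (2)."""
    return hitting_time(x[:ell], y)

# Example on a short binary string:
x = "0110100110010110"
print(return_time(x, 3), waiting_time(x, x[::-1], 3))
```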
For the waiting time, it was proved by Shields^{23} that for stationary ergodic Markov chains one has

    lim_{ℓ→∞} (1/ℓ) log ω_ℓ = h,   P × P-a.s.   (5)

These theorems are based on the Shannon-McMillan-Breiman theorem, which claims that −(1/ℓ) log P([x_1^ℓ]) converges almost surely to the entropy rate h, where [x_1^ℓ] stands for the cylinder set [x_1^ℓ] := {z ∈ A^N : z_1^ℓ = x_1^ℓ}. Furthermore, in Ref. 27, Kontoyiannis obtained strong approximations of the recurrence and waiting times in terms of the probability of a finite vector, which in turn allowed him to obtain almost sure convergence of the waiting time for ψ-mixing processes, extending previous results for Markov chains. He also obtained an almost sure invariance principle for log ρ_ℓ and log ω_ℓ, which implies that these quantities satisfy a central limit theorem and a law of the iterated logarithm^{27}.

In the same spirit, the works of Abadi and collaborators^{24-26} provide very precise results for the approximation of the distribution of the hitting times (properly rescaled) by an exponential distribution, under mild mixing conditions for the process. They also give sharp bounds for the error term of this exponential-distribution approximation. This enables one to obtain bounds for the fluctuations of the entropy estimators using hitting times^{20,28,31}.

B. Asymptotic behavior of the estimators

We are interested in estimating the entropy and entropy production rates; moreover, we need to ensure that their estimators have good convergence and fluctuation properties, since this is what enables our method. Here, we are interested in estimators defined by recurrence times, for which one can find very precise asymptotic results regarding their fluctuations. It is known^{33} that

    lim_{ℓ→∞} (1/ℓ) log ρ_ℓ = h,   (6)

almost surely for ergodic processes; thus one can use the return time as an estimator of the entropy rate. Furthermore, under the Gibbsian assumption, it has been proved that the random variable (log ρ_ℓ − ℓh)/√ℓ converges in law to a normal distribution as ℓ tends to infinity^{34}.

The waiting and hitting times are also used as estimators. For instance, it has been proved that, using the waiting time,

    lim_{ℓ→∞} (1/ℓ) log ω_ℓ(x, y) = h,   (7)

for P × P almost every pair (x, y), where the distribution P is a Gibbs measure^{28}. This is obtained from an approximation of (1/ℓ) log ω_ℓ by −(1/ℓ) log P([x_1^ℓ]) which, by the Shannon-McMillan-Breiman theorem, converges almost surely to the entropy rate. The same log-normal fluctuations were also proved for the waiting times, i.e.,

    lim_{ℓ→∞} P × P{ (log ω_ℓ − ℓh)/(σ√ℓ) < t } = N(0, 1)((−∞, t]),   (8)

where in this case σ² = lim_{ℓ→∞} (1/ℓ) ∫ (log ω_ℓ − ℓh)² d(P × P). So, in the context of Gibbs measures, asymptotic normality is fulfilled for both the return times and the waiting times. This also holds for exponential φ-mixing processes. Moreover, a large deviations principle is satisfied for both quantities as well^{28} (with some additional restrictions in the case of the return time). For the case of the hitting times, one has to overcome the bad statistics produced by very short returns, for which the approximation changes (see Ref. 35).

In the same context, one can find fluctuation bounds for both the plug-in estimators and the waiting- and hitting-time estimators^{20}. One of the main tools used is concentration inequalities, which are valid for very general mixing processes. Using the concentration phenomenon, one can obtain non-asymptotic results, that is, upper bounds for the fluctuations of the entropy estimator which are valid for every n, where n denotes the length of the sample.

Next, for the estimation of the entropy production rate, two estimators of the entropy production were introduced in Ref. 30. The entropy production was defined as a trajectory-valued function quantifying the degree of irreversibility of the process producing the samples, in the following way: let P be the law of the process and let us denote by P^r the law of the time-reversed process; then the entropy production rate is the relative entropy rate of the process with respect to the time-reversed one,

    e_p = h(P | P^r) := lim_{ℓ→∞} H_ℓ(P | P^r) / ℓ,   (9)

where

    H_ℓ(P | P^r) := Σ_{x_1^ℓ ∈ A^ℓ} P([x_1^ℓ]) log( P([x_1^ℓ]) / P([x̄_1^ℓ]) ).   (10)

Here x̄_1^ℓ stands for the word x_1^ℓ reversed in order. The estimators defined in Ref. 30 using the hitting and waiting times are given as follows:

    S_ℓ^τ(x) := log( τ_{x̄_1^ℓ}(x) / τ_{x_1^ℓ}(x) ),   (11)

where τ_{x_1^ℓ}(x) := inf{ k ≥ 1 : x_{k+1}^{k+ℓ} = x_1^ℓ }. Notice that this estimator actually quantifies the logarithm of the first time the word x_1^ℓ appears in the reversed sequence divided by the first return time of the first ℓ symbols of x. For the estimator using the waiting time, one has in an analogous way:

    S_ℓ^ω(x, y) := log( ω_ℓ^r(x, y) / ω_ℓ(x, y) ),   (12)

where ω_ℓ(x, y) := τ_{x_1^ℓ}(y) and ω_ℓ^r(x, y) := τ_{x̄_1^ℓ}(y). In the context of Gibbs measures or exponential ψ-mixing^{30}, the fluctuation properties of such estimators have been studied and their consistency has also been proved; that is, P × P-almost surely we have

    lim_{ℓ→∞} S_ℓ^ω / ℓ = e_p,   (13)

as well as, P-almost surely,

    lim_{ℓ→∞} S_ℓ^τ / ℓ = e_p.   (14)
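A sketch of the estimators (11) and (12) follows, reusing `hitting_time`, `return_time` and `waiting_time` from the snippet above. Interpreting the denominator of eq. (11) as the first return time of x_1^ℓ, as the text indicates, is our reading of the definition; the names are ours.

```python
import math

def S_tau(x, ell):
    """Hitting-time estimator of eq. (11): first occurrence time of the
    reversed ell-word over the first return time of x_1^ell, both in x.
    Returns None whenever one of the two times is not observed (censored)."""
    t_rev = hitting_time(x[:ell][::-1], x)  # tau of the time-reversed word
    t_ret = return_time(x, ell)             # first return of the word itself
    if t_rev is None or t_ret is None:
        return None
    return math.log(t_rev / t_ret)

def S_omega(x, y, ell):
    """Waiting-time estimator of eq. (12), computed on an independent
    realization y: log(omega^r_ell / omega_ell)."""
    t = waiting_time(x, y, ell)             # omega_ell(x, y)
    t_rev = hitting_time(x[:ell][::-1], y)  # omega^r_ell(x, y)
    if t is None or t_rev is None:
        return None
    return math.log(t_rev / t)

# By eqs. (13)-(14), S_omega(x, y, ell)/ell and S_tau(x, ell)/ell
# approximate e_p for large ell.
```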
Asymptotic normality also holds; in that case, the asymptotic variance of the estimator coincides with that of the entropy production. In the same reference the authors also obtain a large deviation principle for the waiting-time estimator. Later, in Ref. 31, fluctuation bounds were obtained for the same estimators introduced in Ref. 30 under the same setting. This result is interesting from the practical point of view, since it provides bounds that are valid for finite time and not only in the asymptotic sense.

Here, we will use the approach defined in Ref. 7 for the estimation of the entropy production rate, since we want to compare it with the exact results one is able to obtain for Markov chains. In Ref. 7 it is shown that, for Markov processes, the entropy production rate can be obtained as the difference between the entropy rate and the reversed entropy rate. For more general systems, the entropy production is defined in an analogous way^{5}. The reversed entropy rate is defined as the entropy rate of the process reversed in time, i.e., as if we were estimating the entropy rate of the process evolving backwards in time. From the practical point of view, in a time series, the entropy production rate may be estimated as the difference between the entropy rate and the entropy rate estimated from the reversed time series. To implement the latter methodology using the recurrence-time statistics, in Section III we will define the reversed recurrence times, which will allow us to give estimations of the reversed entropy rate and, eventually, the corresponding estimations of the entropy production rate as a measure of time-irreversibility of the process. It is important to mention that our methodology can still be applied beyond Markov chains; nevertheless, in those cases one expects to obtain results displaying the irreversibility as a consequence of the positivity of the entropy production, and not the exact results.

C. Parameter estimation of a normal distribution from censored data

Let us denote by Θ_ℓ the random variable whose realizations are estimations of the ℓ-block entropy rate obtained by the recurrence-time statistics. To be precise, Θ_ℓ can be defined as

    Θ_ℓ = (1/ℓ) log(T_ℓ),   (15)

where T_ℓ can be the return, hitting, or waiting time random variable. As pointed out above, Θ_ℓ satisfies the central limit theorem regardless of the choice of recurrence-time statistics. This fact enables us to assume that Θ_ℓ has a normal distribution, with mean h_ℓ and variance σ_ℓ². As mentioned before, one of the problems arising in implementing this estimator for real time series is that the recurrence time T_ℓ is censored from above by a prescribed finite value T_c. From eq. (15), it is clear that the random variable Θ_ℓ becomes censored from above by a finite value h_c := log(T_c)/ℓ, which will be referred to as the censoring entropy. Taking this observation into account, we can state our problem as follows: given a sample set {h_i : 1 ≤ i ≤ m} of independent realizations of Θ_ℓ, we wish to estimate h_ℓ and σ_ℓ knowing that such a sample is censored from above by h_c.

It is important to remark that, since the realizations of Θ_ℓ are censored from above by h_c, any sample set H := {h_i : 1 ≤ i ≤ m} of (independent) realizations of Θ_ℓ will contain numerically undefined realizations, i.e., h_i such that h_i > h_c. We will refer to these numerically undefined values as censored realizations or censored samples. Those samples with a well-defined numerical value will be called uncensored samples or realizations. We will see below that censored sample data will be used for the estimation of h_ℓ and σ_ℓ.

Let m := |H| be the size of the sample and let us assume that the total number of uncensored realizations in the sample set H is exactly k, with k < m. Then the total number of censored realizations in H is m − k. Since the realizations are assumed to be independent (a usual hypothesis in statistics), k can be seen as a realization of a random variable with binomial distribution. Thus, the fraction p̂ := k/m of uncensored samples with respect to the total number of realizations in H is an estimation of the parameter p of the above-mentioned binomial distribution. As we said above, Θ_ℓ has a normal distribution, implying that the parameter p is given by

    p = Φ( (h_c − h_ℓ) / σ_ℓ ),   (16)

where Φ is the distribution function of a standard normal random variable, i.e.,

    Φ(x) := (1/√(2π)) ∫_{−∞}^{x} e^{−y²/2} dy.   (17)

In Appendix A, following calculations from Ref. 36, we show that the parameters h_ℓ and σ_ℓ² can be estimated using the censored sample as follows:

    ĥ = h̄ + ζ̂ (h_c − h̄),        (18)
    σ̂² = s² + ζ̂ (h_c − h̄)²,     (19)

where h̄ is the sample mean of the uncensored samples and s² the corresponding sample variance, i.e.,

    h̄ := (1/k) Σ_{i=1}^{k} h_i,           (20)
    s² := (1/k) Σ_{i=1}^{k} (h_i − h̄)².   (21)

Additionally, ζ̂ is defined as

    ζ̂ := φ(ξ̂) / ( p̂ ξ̂ + φ(ξ̂) ),   (22)

where ξ̂ is obtained by means of the inverse normal distribution function as

    ξ̂ := Φ^{−1}(p̂).   (23)
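The estimation equations (18)–(23) translate directly into code. The sketch below uses only the Python standard library (`statistics.NormalDist` provides φ, Φ and Φ⁻¹); the function name is ours, and the sketch assumes 0 < k < m, since Φ⁻¹ is undefined at p̂ = 0 or 1.

```python
from statistics import NormalDist

def censored_normal_estimate(uncensored, m, h_c):
    """Estimate (h_ell, sigma_ell^2) of a normal sample censored from above
    at h_c, via eqs. (18)-(23). `uncensored` holds the k observed values
    h_i <= h_c; m is the total sample size, censored samples included."""
    std = NormalDist()                       # standard normal: pdf/cdf/inv_cdf
    k = len(uncensored)
    p_hat = k / m                            # fraction of uncensored samples
    xi_hat = std.inv_cdf(p_hat)              # eq. (23)
    zeta_hat = std.pdf(xi_hat) / (p_hat * xi_hat + std.pdf(xi_hat))  # eq. (22)
    h_bar = sum(uncensored) / k                          # eq. (20)
    s2 = sum((h - h_bar) ** 2 for h in uncensored) / k   # eq. (21)
    h_hat = h_bar + zeta_hat * (h_c - h_bar)             # eq. (18)
    var_hat = s2 + zeta_hat * (h_c - h_bar) ** 2         # eq. (19)
    return h_hat, var_hat
```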
III. SAMPLING SCHEMES FOR ESTIMATING ENTROPY RATE FROM RECURRENCE-TIME STATISTICS

As we said above, we are interested in estimating the entropy rate and the entropy production rate from an observed trajectory. The trajectory, in this context, stands for a finite-length symbolic sequence x = x_1 x_2 x_3 ... x_n which is assumed to be generated by some process with an unknown law P. As we saw in Section II, we have to assume that the process complies with appropriate mixing properties, such as exponential φ-mixing or Gibbs, in order for the central limit theorem to be valid. The next step is to obtain samples of the recurrence-time statistics, i.e., we need to establish a protocol for extracting samples of return, waiting or hitting times from the sequence x. The method we use for extracting samples is similar to the one introduced in Ref. 37, which is used for estimating the symbolic complexity and, particularly, the topological entropy of a process. After that, we will define the estimators of the entropy rate and entropy production rate, using the fact that the observed samples might be censored.

A. Return time

First, we establish the method for obtaining samples of the return time. Given a sequence x of size 2n, take two non-negative integers ℓ and ∆ such that ℓ < ∆ ≪ n. Then define the set M_ℓ^ρ = {a_i : a_i = x(i∆ + 1, i∆ + ℓ), 0 ≤ i < m}, where m := ⌊n/∆⌋, of words of length ℓ evenly ∆-spaced along the first half of the trajectory x. In Fig. 1 we show a schematic representation of how the sample words in M_ℓ^ρ are collected from the trajectory x.

FIG. 1. Selection of sample words for return-time statistics.

Next, we define the sample sets of return times R_ℓ and reversed return times R̄_ℓ as follows. First, we associate to each word a ∈ M_ℓ^ρ the censored return time ρ_ℓ^{(n)}(a, x) and the reversed return time ρ̄_ℓ^{(n)}(a, x), as follows,

    ρ_ℓ^{(n)}(a, x) := inf{ t > 1 : x_{k+t}^{k+t+ℓ−1} = a,  t ≤ n,  a := x_k^{k+ℓ−1} },   (24)
    ρ̄_ℓ^{(n)}(a, x) := inf{ t > 1 : x_{k+t}^{k+t+ℓ−1} = ā,  t ≤ n,  a := x_k^{k+ℓ−1} }.   (25)

Observe that ā stands for the block a with its symbols in reversed order. Next, R_ℓ and R̄_ℓ are defined by

    R_ℓ := { t ∈ N : ρ_ℓ^{(n)}(a) = t, a ∈ M_ℓ^ρ },   (26)
    R̄_ℓ := { t ∈ N : ρ̄_ℓ^{(n)}(a) = t, a ∈ M_ℓ^ρ }.   (27)

It is necessary to stress that the values in the above-defined sample sets are not necessarily all numerically well-defined (or uncensored). This is because the return time defined in eq. (24) is actually censored from above. Notice that we impose the condition that ρ_ℓ^{(n)} take a value no larger than n. This is imposed for two reasons. On the one hand, the return time cannot be arbitrarily large due to the finiteness of the trajectory x. On the other hand, although it is possible that the return time for some sample words might be larger than n and still well-defined, it is not convenient for the statistics. Let us explain this point in more detail. If we take a sample word a located at the kth site, its corresponding return time can in principle be at most as large as n + k − ℓ. This happens when the word a occurs (by chance) at the (n + k − ℓ)th site. Since all the sample words in M_ℓ^ρ are located at different sites along x, it is clear that their corresponding return-time values have different upper bounds. Therefore, if we do not impose a homogeneous upper bound, the collection of return-time samples results in inhomogeneously censored data. As we have seen in Section II C, having a homogeneous bound (homogeneously censored data) is crucial for implementing our estimators.

FIG. 2. Uncensored and censored return-time values. Suppose that a sample word a occurs at the kth site along a finite trajectory x of length 2n. In order to get ρ_ℓ^{(n)}(a) we look for the occurrence of a along x, from the (k+1)th symbol to the (k+n)th symbol of x; this section of the trajectory is written x(k+1, k+n). (A) If a is found in x(k+1, k+n), then ρ_ℓ^{(n)}(a) is numerically well defined, thus called uncensored. (B) If a is found in x but not in the section x(k+1, k+n), we consider ρ_ℓ^{(n)}(a) censored (numerically undefined). (C) Finally, if we do not observe any other occurrence of a in x beyond the (k+1)th symbol, ρ_ℓ^{(n)}(a) is clearly numerically undefined, hence censored.

In the following, we will refer to this homogeneous upper bound for return times as the censoring time and, whenever convenient, it will alternatively be denoted by T_c. In Fig. 2 we give an illustrative description of the censoring of the samples.

Once we have the return-time sample set R_ℓ, we introduce the estimators of the entropy rate and the entropy production rate. As we saw in Section II, if we take a return-time value t from the sample set R_ℓ, then the quantity log(t)/ℓ can be interpreted as a realization of the block entropy rate h_ℓ which, in the limit ℓ → ∞, obeys the central limit theorem. This fact enables us to adopt the following hypothesis: for finite ℓ, the value log(t)/ℓ is a realization of a normal random variable with (unknown) mean h_ℓ and variance σ_ℓ². Then the sample sets

    H_ℓ^ρ := { h = log(t)/ℓ : t ∈ R_ℓ },   (28)
    H̄_ℓ^ρ := { h = log(t)/ℓ : t ∈ R̄_ℓ },   (29)

can be considered as sets of realizations of normal random variables censored from above by the quantity h_c := log(T_c)/ℓ = log(n)/ℓ, which we call the censoring entropy. The estimation procedure for the block entropy is then essentially the one described in Section II C. Here we summarize the steps for performing the estimation of h_ℓ for return-time statistics.

1. Given a finite sample trajectory or symbolic sequence x of size 2n, define the censoring time as half of the size of the sample trajectory, i.e., T_c = n. Fix the number m of sample words (blocks) to be collected and the size ℓ of the block to be analyzed. Next, define the spacing ∆ := ⌊n/m⌋ and the sample set M_ℓ^ρ of evenly ∆-spaced words lying along the first half of the trajectory x, i.e.,

    M_ℓ^ρ = { a_i : a_i = x(i∆ + 1, i∆ + ℓ), 0 ≤ i < m }.

2. Define the sets of return-time samples and reversed return-time samples as

    R_ℓ := { t ∈ N : ρ_ℓ^{(n)}(a) = t, a ∈ M_ℓ^ρ },   (30)
    R̄_ℓ := { t ∈ N : ρ̄_ℓ^{(n)}(a) = t, a ∈ M_ℓ^ρ }.   (31)

3. Using the previous sets of return-time samples, define the sets of block entropy and reversed block entropy samples

    H_ℓ^ρ := { h = log(t)/ℓ : t ∈ R_ℓ },   (32)
    H̄_ℓ^ρ := { h = log(t)/ℓ : t ∈ R̄_ℓ }.   (33)

4. Next, define the rate of uncensored sample values p̂ := k/m, where m is the total number of samples in H_ℓ^ρ and k is the number of uncensored samples in H_ℓ^ρ (hence there are m − k censored samples in H_ℓ^ρ).

5. For 1 ≤ i ≤ k, denote by h_i each of the uncensored samples in H_ℓ^ρ. Their sample mean and variance are given as follows:

    h̄ := (1/k) Σ_{i=1}^{k} h_i,           (34)
    s² := (1/k) Σ_{i=1}^{k} (h_i − h̄)².   (35)

6. Define the sample functions (see Section II C and Appendix A for details)

    ζ̂ := φ(ξ̂) / ( p̂ ξ̂ + φ(ξ̂) ),   (36)
    ξ̂ := Φ^{−1}(p̂),                  (37)

where φ(x) = e^{−x²/2}/√(2π) is the probability density function of the standard normal distribution and Φ its (cumulative) distribution function.

7. Finally, the estimations of the mean of the block entropy and its variance using the return-time estimator are given by

    ĥ_ℓ = h̄ + ζ̂ (h_c − h̄),        (38)
    σ̂_ℓ² = s² + ζ̂ (h_c − h̄)²,     (39)

where h_c is the censoring entropy, defined as h_c := log(T_c)/ℓ.

8. Repeat steps 4–7 for the set R̄_ℓ in order to obtain an estimation of the reversed block entropy rate, which allows one to estimate the block entropy production rate just by taking the difference between the reversed block entropy and the block entropy^{7}, as follows:

    ê_p := ĥ_ℓ^R − ĥ_ℓ.   (40)
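A compact sketch stitching steps 1–8 together is given below. It reuses `censored_normal_estimate` from Section II C's snippet; the helper names and the 0-based indexing are ours (Python ≥ 3.8 for the assignment expression).

```python
import math

def censored_return_time(x, k, ell, n, reverse=False):
    """Censored (reversed) return time of the word at 0-based site k,
    eqs. (24)-(25): first t in {2, ..., n} such that x(k+t, k+t+ell-1)
    equals the word (or its mirror); None if the sample is censored."""
    target = x[k:k + ell][::-1] if reverse else x[k:k + ell]
    for t in range(2, n + 1):
        if x[k + t:k + t + ell] == target:
            return t
    return None

def return_time_scheme(x, ell, m, reverse=False):
    """Steps 1-7 of Sec. III A: m evenly spaced ell-words on the first half
    of x, censored return times, then the censored-normal estimator."""
    n = len(x) // 2
    delta = n // m                      # spacing Delta = floor(n / m)
    h_c = math.log(n) / ell             # censoring entropy, T_c = n
    hs = [math.log(t) / ell             # uncensored block-entropy samples
          for i in range(m)
          if (t := censored_return_time(x, i * delta, ell, n, reverse))]
    return censored_normal_estimate(hs, m, h_c)

# Step 8, eq. (40): entropy production from the two estimates.
# ep_hat = return_time_scheme(x, ell, m, reverse=True)[0] \
#        - return_time_scheme(x, ell, m, reverse=False)[0]
```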
B. Waiting time

The waiting-time estimator for the block entropy requires two distinct trajectories. In practical situations, we normally have one single trajectory. In order to overcome this problem, we split the original sequence into two equal-sized parts. Since we assume sufficiently rapid mixing, it is possible to regard the second half of the sample as independent of the first half, provided that the size of the sample is large enough. Thus, one may consider the two parts of the sample as two independent trajectories. After that, we collect m different ℓ-words at random along one of those trajectories. This collection is denoted by M_ℓ^ω, and will play the role of the set of sample words, in the same way as the set M_ℓ^ρ did in Section III A. A schematic representation of this sampling procedure is shown in Fig. 3.

FIG. 3. Selection of sample words for the waiting-time statistics.

The next step consists in defining the censored waiting time corresponding to each word in the sample M_ℓ^ω. Let x be the trajectory consisting of 2n symbols. Assume that the samples are randomly collected from the segment x(n+1, 2n−ℓ). Then we define the censored waiting time and the censored reversed waiting time for a ∈ M_ℓ^ω as follows,

    ω_ℓ^{(n)}(a, x) := inf{ t ≥ 1 : x_t^{t+ℓ−1} = a },   (41)
    ω̄_ℓ^{(n)}(a, x) := inf{ t ≥ 1 : x_t^{t+ℓ−1} = ā }.   (42)
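A minimal sketch of this word-sampling step and of the censored waiting times of eqs. (41)–(42); the explicit seed and function names are our own illustration.

```python
import random

def sample_words(x, ell, m, seed=0):
    """Draw the set M_ell^omega: m random ell-words from the second half
    of the 2n-symbol trajectory x (segment x(n+1, 2n-ell))."""
    n = len(x) // 2
    rng = random.Random(seed)
    return [x[k:k + ell] for k in (rng.randrange(n, 2 * n - ell)
                                   for _ in range(m))]

def censored_waiting_time(word, x, n, reverse=False):
    """Censored (reversed) waiting time, eqs. (41)-(42): first occurrence of
    the word (or its mirror) within the first n symbols of x; None if the
    occurrence would exceed the censoring time n."""
    target = word[::-1] if reverse else word
    ell = len(target)
    for t in range(1, n + 1):
        if x[t - 1:t - 1 + ell] == target:
            return t
    return None
```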
It is important to notice that both the waiting time and the reversed waiting time are bounded from above by n, i.e., the sample waiting times are homogeneously censored by n. The rest of the method follows the lines of the one described in Section III A. Here we summarize the main steps:

1. Given a finite sample trajectory x of size 2n, set the censoring time T_c = n, equal to half of the size of the sample trajectory. Fix the number m of sample words to be collected and the size ℓ of the block. Next, collect m different words at random along the symbolic sequence x(n+1, 2n). We denote by M_ℓ^ω this collection of ℓ-words.

2. Define the sets of waiting-time samples and reversed waiting-time samples as

    W_ℓ := { t ∈ N : ω_ℓ^{(n)}(a, x) = t, a ∈ M_ℓ^ω },   (43)
    W̄_ℓ := { t ∈ N : ω̄_ℓ^{(n)}(a, x) = t, a ∈ M_ℓ^ω }.   (44)

3. From the sets of waiting-time samples define the sets of block entropy and reversed block entropy samples

    H_ℓ^ω := { h = log(t)/ℓ : t ∈ W_ℓ },   (45)
    H̄_ℓ^ω := { h = log(t)/ℓ : t ∈ W̄_ℓ }.   (46)

4. Define the rate of uncensored sample values as p̂ := k/m, where m is the total number of samples in H_ℓ^ω and k is the number of uncensored samples in H_ℓ^ω (thus, the remaining m − k samples are censored).

5. For 1 ≤ i ≤ k, denote by h_i each of the uncensored samples in H_ℓ^ω. Their sample mean and variance are given as follows:

    h̄ := (1/k) Σ_{i=1}^{k} h_i,           (47)
    s² := (1/k) Σ_{i=1}^{k} (h_i − h̄)².   (48)

6. Define the sample functions (see Section II C and Appendix A for details)

    ζ̂ := φ(ξ̂) / ( p̂ ξ̂ + φ(ξ̂) ),   (49)
    ξ̂ := Φ^{−1}(p̂),                  (50)

where φ(x) = e^{−x²/2}/√(2π) is the probability density function of the standard normal distribution and Φ its (cumulative) distribution function.

7. Finally, the estimations of the mean of the block entropy and its variance using the waiting-time estimator are given by

    ĥ_ℓ = h̄ + ζ̂ (h_c − h̄),        (51)
    σ̂_ℓ² = s² + ζ̂ (h_c − h̄)²,     (52)

where, again, h_c is the censoring entropy defined as above.

8. Repeat steps 4–7 for the set H̄_ℓ^ω in order to obtain an estimation of the reversed block entropy rate, which allows one to estimate the block entropy production rate by taking the difference between the reversed block entropy and the block entropy^{7}, as follows:

    ê_p := ĥ_ℓ^R − ĥ_ℓ.   (53)

C. Hitting time

The hitting-time estimator requires a set of sample words which should be drawn at random from the process that generates the observed trajectory x. Although we do not know the law of the process, we can still avoid this problem if the set of sample words is obtained by choosing ℓ-words at random from another observed trajectory. However, this is the very same method we used for collecting the sample words for the waiting-time estimator. From the statistical point of view, the hitting-time and waiting-time methods can therefore be regarded as the same method.

IV. ESTIMATION TESTS

Now, we will implement the above-defined methods for estimating the block entropy and entropy production rates. First of all, we will perform numerical simulations in order to implement control test statistics, which will be compared with the numerical experiments using our methods.

In Section III we established two methods for estimating block entropies by using either the return-time statistics or the waiting-time statistics. These methods assume that we only have a single "trajectory" or, better said, symbolic sequence, obtained by making an observation of real life. Our purpose here is to test the estimators themselves, and not the sampling methods. The latter means that we will implement the estimators (20) and (21) for both the return-time and the waiting-time statistics, without referring to the sampling schemes mentioned in Section III. This is possible because we have access to an unlimited number of sequences, which are produced numerically with a three-state Markov chain. In this sense we have control of all the parameters involved in the estimators, namely, the length of the block ℓ, the entropy threshold h_c (by which the recurrence-time samples are censored) and the sampling size |H_ℓ|. After that, we will implement the estimation method described in Section III using a single sequence obtained from the Markov chain defined below. The latter is a numerical experiment done to emulate an observation of real life, where the accessible sample symbolic sequences are rather limited.
A. Finite-state Markov chain

For numerical purposes we consider a Markov chain whose set of states is defined as A = {0, 1, 2}. The corresponding stochastic matrix P : A × A → [0, 1] is given by

        (  0      q     1−q )
    P = ( 1−q     0      q  ),        (54)
        (  q     1−q     0  )

where q is a parameter such that q ∈ [0, 1]. It is easy to see that this matrix is doubly stochastic and that the unique invariant probability vector π = πP is given by π = (1/3, 1/3, 1/3). Moreover, it is easy to compute the entropy rate and the time-reversed entropy rate; indeed, they are given by

    h(q) = −q log(q) − (1−q) log(1−q),        (55)
    h^R(q) = −(1−q) log(q) − q log(1−q).      (56)

Additionally, the corresponding entropy production rate is given by

    e_p(q) = (2q − 1) log( q / (1−q) ).       (57)

The behavior of the entropy rate and the entropy production rate can be observed in Figure 4. We will use this model to generate symbolic sequences in order to test the estimators.

FIG. 4. Entropy rate and entropy production rate. (a) Behavior of the entropy rate h and the reversed entropy rate h^R as a function of the parameter q, using the exact formulas given in eqs. (55) and (56). (b) Behavior of the entropy production rate as a function of q, using the exact formula (57).

B. Statistical features of estimators for censored data

The first numerical experiment we perform is intended to show the statistical properties of the estimators without implementing the sampling schemes introduced above. To this end, we produce a censored sample set of 5 × 10^4 return times obtained from several realizations of the three-state Markov chain. We obtain each of those return times as follows. First we initialize the Markov chain at the stationary state (i.e., we choose the first symbol at random using the stationary vector of the chain) and let the chain evolve. This procedure generates a sequence which grows in time, say a_1, a_2, ..., a_t. The evolution of the Markov chain is stopped at time t either when the first ℓ-word a_1, a_2, ..., a_ℓ appears again, that is, if a_{t−ℓ+1}, a_{t−ℓ+2}, ..., a_t = a_1, a_2, ..., a_ℓ, or when the time t − ℓ + 1 exceeds a given bound T_c. Then, the corresponding return time (for the ith realization) will be either ρ_i := t − ℓ + 1 or an undefined value ρ_i > T_c.

Once we have collected the sample set of return times {ρ_i}, we obtain a set of block entropy rates by means of the equation

    h_i = log(ρ_i)/ℓ,   (58)

whenever ρ_i is numerically defined. Of course, we might obtain some numerically undefined sample block entropies h_i > h_c due to the censored return times.

Analogously, we obtain a sample set of reversed entropy rates. That is, we let the Markov chain evolve and stop its evolution at time t when the first ℓ-word a_1, a_2, ..., a_ℓ appears reversed in the realization, i.e., a_{t−ℓ+1}, a_{t−ℓ+2}, ..., a_t = a_ℓ, a_{ℓ−1}, ..., a_1, or when the time t − ℓ + 1 exceeds the given upper bound T_c. The reversed return time for realization i will be ρ̄_i = t − ℓ + 1, or it is numerically undefined if t − ℓ + 1 > T_c. We then obtain the sample set {h̄_i} by means of h̄_i = log(ρ̄_i)/ℓ.

Notice that this procedure involves two parameters that can freely vary. These are the block length ℓ and h_c (or, equivalently, T_c), where h_c is an upper bound for the possibly observed block entropy rates, thus censoring the corresponding sample set.
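For concreteness, a sketch of the ingredients of this control experiment: the chain of eq. (54) and the exact rates (55)–(57). Function names and the explicit seed are ours; the output is a string so it plugs into the string-based helpers above.

```python
import math
import random

def exact_rates(q):
    """Exact entropy rate, reversed entropy rate and entropy production rate
    of the three-state chain, eqs. (55)-(57)."""
    h = -q * math.log(q) - (1 - q) * math.log(1 - q)
    h_rev = -(1 - q) * math.log(q) - q * math.log(1 - q)
    e_p = (2 * q - 1) * math.log(q / (1 - q))   # equals h_rev - h
    return h, h_rev, e_p

def simulate_chain(q, n, seed=0):
    """Trajectory of the doubly stochastic matrix of eq. (54); the invariant
    vector is uniform, so we start from a uniformly random state."""
    rng = random.Random(seed)
    s = rng.randrange(3)
    out = []
    for _ in range(n):
        out.append(s)
        # from state s: go to (s + 1) mod 3 w.p. q, to (s + 2) mod 3 w.p. 1 - q
        s = (s + 1) % 3 if rng.random() < q else (s + 2) % 3
    return "".join(map(str, out))
```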
Then, we analyze statistically the sample sets of block entropy rates and reversed block entropy rates for several values of the free parameters. In Figure 5 we show the histogram of the relative frequencies of the block entropy rate for ℓ = 10, q = 0.60 and several values of h_c. Correspondingly, in Figure 6 we show the histogram of the relative frequencies of the reversed block entropy rate for ℓ = 10, q = 0.60 and several values of h_c.

FIG. 5. Return-time entropy density for q = 0.60 and ℓ = 10. (a) h_c = 0.48, (b) h_c = 0.57, (c) h_c = 0.66, (d) h_c = 0.75, (e) h_c = 0.84, (f) h_c = 1.02.

FIG. 6. Reversed return-time entropy density for q = 0.60 and ℓ = 10. (a) h_c = 0.48, (b) h_c = 0.57, (c) h_c = 0.66, (d) h_c = 0.75, (e) h_c = 0.84, (f) h_c = 1.02.

We can appreciate how the density of the block entropy rate is censored while ℓ is kept fixed. If the value of h_c is small, the return time is numerically undefined for most of the samples because the samples are censored from above. This is seen, for instance, in Figure 5a, in which h_c takes the smallest value among the displayed graphs. In this case, approximately only 25% of the samples are numerically well-defined, resulting in the "partial" histogram displayed in Figure 5a. In Figure 5b, the value of h_c is increased, causing the histogram to "grow". In the remaining graphs, from Figure 5c to Figure 5f, this tendency is clear: as we increase the value of h_c, the number of numerically defined samples grows, thus gradually completing the corresponding histogram. Something similar occurs for the reversed block entropy shown in Figure 6.

On the other hand, if we keep h_c constant and vary the block length ℓ, we can appreciate the evolution of the histogram towards a normal-like distribution. We show this effect in Figure 7 for q = 0.60 and h_c = 1.155 fixed. This is in agreement with the central limit theorem, as we have mentioned in previous sections. In Figure 7 we show the histograms for ℓ = 6, 9, 12, 15, 18 and 19 (panels (a)–(f), respectively). We observe that for the lowest value of ℓ the histogram is rather irregular, which means that the central limit theorem is still not well manifested for the block entropy rate. We can also observe that, increasing the block length, the histograms progressively evolve towards a bell-shaped distribution, reminiscent of the normal one. This shows that an estimation using our approach could be more accurate for large values of the block length, due to the central limit theorem.

FIG. 7. Entropy estimated by means of the return-time statistics for the three-state Markov chain. We show the histograms of the estimated entropy density for q = 0.60, h_c = 1.155 and (a) ℓ = 6, (b) ℓ = 9, (c) ℓ = 12, (d) ℓ = 15, (e) ℓ = 18, (f) ℓ = 19. We obtained the corresponding histograms using 5 × 10^4 sample words in each case.

Once we have the sample set of block entropy rates, we use the estimation procedure for censored data as described in Section III. We perform this procedure for the entropy rates and reversed entropy rates obtained from the return-time and the waiting-time statistics.

In Figure 8 we show the estimation of the block entropy rate and the reversed block entropy rate using the return-time statistics. In Figure 8a, the displayed curves (solid black lines) show the behavior of the estimation of the block entropy rate as a function of the censoring bound h_c for several values of ℓ. This figure exhibits two important features of our estimation technique. First, the estimation of the entropy rate has large fluctuations for small h_c; the smaller h_c, the larger the observed statistical errors, as expected. Second, the larger ℓ, the better the estimation. The latter can be inferred from the fact that the curve with the largest value of ℓ in Figure 8a is closest to the exact entropy rate (solid red line). A similar behavior occurs for the reversed block entropy rate estimations shown in Figure 8b.

FIG. 8. Return-time entropy estimations as a function of h_c for several values of ℓ. Panel (a): behavior of ĥ as we increase the entropy threshold h_c for ℓ = 6 (filled squares), ℓ = 9 (filled triangles), ℓ = 12 (filled circles), ℓ = 15 (X's), ℓ = 18 (stars), ℓ = 19 (plus signs). Panel (b): behavior of the estimated reversed entropy for the same parameter values used in panel (a).

For the waiting-time statistics an analogous behavior occurs. In Figure 9 the curves for the estimations of the block entropy rate are shown in panel (a), and those for the reversed block entropy rate in panel (b). As expected, the estimations for small values of the censoring bound h_c have large fluctuations, which gradually decrease as h_c is increased. This is clearly observed in Figure 9, where the black solid lines deviate largely from the exact value (solid red line) for small values of h_c. Concerning the value of ℓ, it is clear that for the largest value of ℓ the estimation is closer to the exact entropy rate for h_c large enough (see the insets in Figure 9).

All these observations allow us to state that, for obtaining the best estimations (as far as possible within the present scheme), we should keep h_c as large as possible.
Similarly, in order to ensure the validity of the central limit theorem, we should take the block length ℓ as large as possible.

FIG. 9. Waiting-time entropy estimations as a function of h_c for several values of ℓ. Panel (a): behavior of ĥ as we increase the entropy threshold h_c for ℓ = 6 (filled squares), ℓ = 9 (filled triangles), ℓ = 12 (filled circles), ℓ = 15 (X's), ℓ = 18 (stars), ℓ = 19 (plus signs). Panel (b): behavior of the estimated reversed entropy for the same parameter values used in panel (a).

Now, we turn our attention to the implementation of the estimations of the block entropy rate using the schemes described in Section III. For this purpose, we first generate a single sequence of N = 12 × 10^6 symbols by means of the three-state Markov chain. Then, we implement the sampling schemes for the return-time and the waiting-time statistics. In each case, we collect m = 5 × 10^4 sample words, which correspond to m = 5 × 10^4 samples of block entropy rates and reversed block entropy rates. These sample sets contain both numerically defined and undefined samples, the latter due to the censoring. In this case, the censoring bound for the entropy rate h_c is determined by

    h_c = log(N/2) / ℓ.   (59)

We should emphasize that in the present case we have control over only a single parameter, which we take as the length of the block ℓ. Contrary to the numerical experiments exposed above, in this case h_c is no longer a free parameter; it is actually determined by the length N of the symbolic sequence and ℓ, the length of the block. Consequently, changes in the value of ℓ imply changes in the value of h_c. The latter is important for two reasons: on the one hand, in order to assure the validity of the central limit theorem, we should take ℓ as large as possible (actually, the true entropy rate is obtained in the limit ℓ → ∞). On the other hand, it is desirable to have as many non-censored samples as possible, i.e., it is convenient for h_c to be as large as possible. However, in practice, we cannot comply with both requirements at once because of expression (59): the larger ℓ, the shorter h_c, whenever the length N of the symbolic sequence is kept constant (which commonly occurs in real-world observed data).

An important consequence of the latter is that we cannot make ℓ as large as we want. Actually, the maximal block length that can be used for entropy estimations is determined by the accuracy we would like to obtain. That is, for a large ℓ we have a short censoring upper bound, implying that only a few samples of block entropy rates are numerically well-defined. This entails a loss of accuracy since, the fewer the numerically well-defined samples, the larger the variance of the estimators. This phenomenon can be observed in Figure 10 for several values of the parameter q of the three-state Markov chain defined in Section IV A.

In Figure 10a we show the estimation of the block entropy rate as a function of ℓ using the return-time statistics. The red lines show the exact value of the entropy rate obtained with eq. (55), while the black lines correspond to the estimations of the block entropy rate using the return-time statistics under the sampling scheme described in Section III A. Figure 10b shows the same as Figure 10a, but using the waiting-time statistics. Figures 10c and 10d show the corresponding curves for the reversed block entropy rate for the return-time and the waiting-time statistics, respectively.

FIG. 10. Estimation of block entropy rate as a function of ℓ. Black lines stand for the estimated block entropy rate and red lines for the exact entropy rate. We show the curves corresponding to the Markov chain parameter q = 0.50 (solid lines), q = 0.60 (dotted lines), q = 0.70 (dashed lines), q = 0.80 (dotted-dashed lines) and q = 0.90 (double-dotted-dashed lines). (a) Block entropy rate estimations using the return-time statistics. (b) Same as in (a) using the waiting-time statistics. (c) Reversed block entropy rate estimations using the return-time statistics. (d) Same as in (c) using the waiting-time statistics.

TABLE I. Block entropy estimations using the return-time statistics.

      Parameters           Estimations
  q     ℓ*    p̂        ĥ          ∆ĥ         ∆ĥ/h
  0.5   22    0.6066   0.692325   0.000822   0.001187
  0.6   23    0.5488   0.669906   0.003106   0.004636
  0.7   25    0.5718   0.607702   0.003162   0.005203
  0.8   30    0.5880   0.496892   0.003510   0.007064
  0.9   30    0.9250   0.328234   0.003151   0.009600
TABLE II. Block entropy estimations using the waiting-time statistics.

      Parameters           Estimations
  q     ℓ*    p̂        ĥ          ∆ĥ         ∆ĥ/h
  0.5   22    0.6096   0.691518   0.001630   0.002357
  0.6   23    0.5460   0.670477   0.002535   0.003781
  0.7   26    0.4590   0.609711   0.001153   0.001891
  0.8   30    0.5926   0.496028   0.004374   0.008818
  0.9   30    0.9242   0.333756   0.008673   0.025986

TABLE III. Time-reversed block entropy estimations using the return-time statistics.

      Parameters           Estimations
  q     ℓ*    p̂        ĥ^R        ∆ĥ^R       ∆ĥ^R/h^R
  0.5   22    0.6104   0.691449   0.001698   0.002456
  0.6   21    0.4620   0.751255   0.002850   0.003794
  0.7   17    0.4504   0.933973   0.015810   0.016928
  0.8   12    0.5586   1.272741   0.059438   0.046701
  0.9    8    0.4642   1.981793   0.101070   0.050999

TABLE IV. Time-reversed block entropy estimations using the waiting-time statistics.

      Parameters           Estimations
  q     ℓ*    p̂        ĥ^R        ∆ĥ^R       ∆ĥ^R/h^R
  0.5   22    0.6094   0.692784   0.000363   0.000524
  0.6   21    0.4612   0.751243   0.002861   0.003808
  0.7   16    0.6494   0.938406   0.011377   0.012124
  0.8   12    0.5286   1.285067   0.047112   0.036661
  0.9    8    0.4472   1.999906   0.082957   0.041480

Observe that all the curves of the estimated block entropy rate have a common behavior, which we anticipated above: there is a special value of ℓ for which the estimation seems to be optimal, while for small and large values of ℓ the estimated entropy deviates visibly from the exact value. This phenomenon is produced because the estimators become better as ℓ increases, but at the same time the number of numerically well-defined samples decreases due to the censoring. A criterium for obtaining an optimal ℓ* might not be unique, so here we use a simple one. First of all, once the value of ℓ is chosen, the censoring entropy rate is fixed according to eq. (59). This bound, in turn, determines the number of numerically well-defined samples: the shorter h_c, the lower the number k of numerically well-defined samples we have. Due to relationship (59) we can also say that the larger ℓ, the lower the number k of numerically well-defined samples. A simple way to optimize this interplay between ℓ and k = k(ℓ) is to take the block length ℓ* for which k(ℓ*) is as close as possible to half of the sample size m; a code sketch of this rule is given at the end of this section.

Using this criterium we compute the optimal block length ℓ*, and the corresponding estimated value of the entropy rate, for several values of the parameter q of the Markov chain. In Tables I and II we show the estimated entropy rate ĥ and the optimal value ℓ* for q = 0.50, q = 0.60, q = 0.70, q = 0.80 and q = 0.90, for the return-time and the waiting-time statistics respectively. We also show a comparison of the estimated block entropy rate with the corresponding exact values. We can appreciate from these tables that the relative error ∆ĥ/h (the relative difference between the estimation and the exact value) is lower than 0.06. Moreover, for q = 0.50 and q = 0.60, the relative errors are even less than 1%. In Tables III and IV we show the estimations of the reversed entropy rate, and the corresponding optimal ℓ*, for the return-time and the waiting-time statistics respectively.

In Figure 11 we show both the block entropy rate (panel a) and the reversed block entropy rate (panel b) as a function of the parameter q. In that figure, the estimations corresponding to the return-time and waiting-time statistics are compared with the exact value. We observe that the return-time and waiting-time statistics have approximately the same accuracy. From Figure 11 we can also see an interesting behavior of the estimation: the larger the entropy rate, the larger the deviation from the exact result. This effect can actually be explained as follows. First, we should bear in mind that the return time and waiting time can be interpreted as measures of the recurrence properties of the system. Specifically, the entropy rate itself can, in some way, be interpreted as a measure of the recurrence time per unit length of the word (this is a consequence of the fact that the logarithm is a one-to-one function). Thus, it becomes clear that the larger the entropy rate, the larger the recurrence times in the system. Since all the samples are censored from above, a system having larger recurrence times will have larger errors in the estimations. Therefore we may say that a system with a large entropy rate will exhibit large statistical errors in its estimations. Despite this effect, we observe in Figure 11 that the errors in the estimations are sufficiently small for practical applications.

Finally, in Table V we show the entropy production rate of the system, obtained by taking the difference between the block entropy rate and the reversed block entropy rate, for both the return- and the waiting-time statistics. It is important to remark that these recurrence statistics are consistent with each other, having moderate deviations (statistical errors) when compared with the exact values.

TABLE V. Entropy production estimations from return- and waiting-time statistics.

            Return                  Waiting
  q       ê_p         ∆e_p        ê_p        ∆e_p
  0.50    −0.000876   0.000876    0.001266   0.001266
  0.60     0.081349   0.000256    0.080766   0.000327
  0.70     0.326271   0.012648    0.328695   0.010224
  0.80     0.775849   0.055928    0.789039   0.042738
  0.90     1.653559   0.104221    1.666150   0.091630
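The block-length criterion used above (take ℓ* such that k(ℓ*) is as close as possible to m/2) can be sketched as follows, reusing `censored_return_time` from Section III A's snippet. The linear scan over candidate block lengths is our simplification; since k(ℓ) decreases with ℓ, a bisection would work as well.

```python
def optimal_block_length(x, m, ell_max=40):
    """Pick ell* whose number k(ell) of uncensored return-time samples
    is closest to m / 2, scanning candidate block lengths."""
    n = len(x) // 2
    delta = n // m
    def k_of(ell):
        # count the numerically well-defined (uncensored) samples
        return sum(censored_return_time(x, i * delta, ell, n) is not None
                   for i in range(m))
    return min(range(2, ell_max + 1), key=lambda ell: abs(k_of(ell) - m / 2))
```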
FIG. 11. Estimation of the block entropy rate and the reversed block entropy rate as a function of q. (a) Block entropy rate estimated from the return-time statistics (black filled circles) and the waiting-time statistics (red filled squares). We also show the corresponding exact values of the entropy rate of the system (black solid line) for comparison. (b) Same as in panel (a), but for the reversed entropy rate.

V. EXAMPLES

In this section we apply our methodology for estimating the entropy rate in some well-studied systems for which either the entropy rate or the (positive) Lyapunov exponents are known. We also include a couple of examples of estimating the entropy production rate, to show the performance of our estimator in analyzing the time-reversibility (or time-irreversibility) of a process from a finite time series.

A. Entropy rate for chaotic maps

For one-dimensional chaotic maps, a theorem of Hofbauer and Raith^{38} allows one to compute the entropy rate by means of the Lyapunov exponent and the fractal dimension of the corresponding invariant measure. We use the results reported in Ref. 39 for the entropy rate estimated from the Lyapunov exponents as reference values. We test our methodology on three chaotic maps: a Lorenz-like transformation, the logistic map and the Manneville-Pomeau map.

The first chaotic map we use to exemplify our estimator for the entropy rate is a Lorenz-like map L : [0, 1] → [0, 1] defined as

    L(x) := 1 − ((3 − 6x)/4)^{3/4}   if 0 ≤ x < 1/2,
            ((6x − 3)/4)^{3/4}        if 1/2 ≤ x ≤ 1.      (60)

It is clear that this map has a generating partition defined by {[0, 1/2), [1/2, 1]}, allowing a direct symbolization of the time series.

FIG. 12. Estimation of block entropy rate as a function of the censoring time T_c for several chaotic maps. In all these numerical experiments we obtained a symbolic sequence 8 × 10^6 symbols long. We obtained a sample set of 2 × 10^4 words following the waiting-time sampling scheme. We fix the censoring time T_c and compute the corresponding estimations for the entropy rate; we repeat the estimation for several values of the censoring time T_c. Solid lines represent the waiting-time estimations and dashed lines represent the estimation of the Lyapunov exponent reported in Ref. 39 for (a) the Lorenz-like map, (b) the logistic map (a = 3.8) and (c) the Manneville-Pomeau map (z = 31/16).

The second chaotic map we use is the well-known family of logistic maps defined as

    K_a(x) := a x (1 − x).   (61)

We take a = 3.6, a = 3.8 and a = 4, corresponding to the entropy rates (estimated from Lyapunov exponents) reported in Ref. 39. As in the case of the Lorenz-like map, the generating partition is given by {[0, 1/2), [1/2, 1]}, which is the one we use for the symbolization of the time series.

Finally, we test our method on the Manneville-Pomeau maps defined as

    M_z(x) := x + x^z  (mod 1),   (62)

which is a family of chaotic maps exhibiting dynamics with long-range correlations. This family is parametrized by z ∈ R^+. We concentrate on parameter values within the interval 1 < z < 2, for which the map admits a unique absolutely continuous invariant measure^{39}. Additionally, for such parameter values the dynamics has a power-law decay of correlations. We use the parameter values z_1 = 3/2, z_2 = 7/4, z_3 = 15/8, z_4 = 31/16, z_5 = 63/32, and z_6 = 127/64.
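A sketch of the symbolization step used throughout this section, shown for the logistic map of eq. (61) with the generating partition {[0, 1/2), [1/2, 1]}; the initial condition and transient length are arbitrary choices of ours.

```python
def symbolize_logistic(a, n, x0=0.1234, transient=1000):
    """Binary symbolization of the logistic map K_a(x) = a x (1 - x):
    emit '0' if x < 1/2 and '1' otherwise."""
    x = x0
    for _ in range(transient):       # discard a transient before sampling
        x = a * x * (1 - x)
    out = []
    for _ in range(n):
        out.append("0" if x < 0.5 else "1")
        x = a * x * (1 - x)
    return "".join(out)

# The resulting string can be fed directly to the sampling schemes above,
# e.g. return_time_scheme(symbolize_logistic(4.0, 2 * 10**6), ell=12, m=5000).
```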