Approximate and Sample Entropies Complexity Metrics

Chesnokov Yuriy

Jun 17, 2008

GPL3

The article presents C++ code for estimating approximate and sample entropies, suitable for the analysis of biological signals.

Introduction

Many kinds of signals are routinely studied with spectral analysis methods: medical (HRV, ECG, EEG, EMG), geological, musical, and so on. Alongside these methods there is a group of complexity metrics that estimate how complex a signal is. Consider a sine wave and random noise: the sine wave is a simple, regular signal, while the noise is far more complex. Approximate entropy (ApEn) and sample entropy (SmEn, often abbreviated SampEn elsewhere) quantify this degree of complexity. Both are well suited to short, noisy signals. They also work directly on the original signal, whereas some other complexity metrics require the signal to be quantized to a fairly small alphabet first. For these reasons they are widely used in medicine for HRV data analysis, and in EEG data analysis they can be used to assess complex neuronal activity.

Background

For a more detailed explanation, see my article on the application of ApEn and SmEn to the analysis of HRV data for the prediction of paroxysmal atrial fibrillation. Only the formulas for ApEn and SmEn are given here.

[Formula images: ApEn, SmEn]
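As a rough guide, the quantities computed by the code in the next section can be written out as follows. This is a sketch derived from the code itself, not a copy of the original formulas; the symbols d_m, C_i, A, B and σ (the std argument) are notation assumed here for illustration, with x the signal and indices starting at 0 as in the code.

```latex
\begin{aligned}
d_m(i,j) &= \max_{0 \le k < m} \lvert x_{i+k} - x_{j+k} \rvert \\
C_i^{m} &= \#\{\, j : 0 \le j < N-m,\ d_m(i,j) \le r\sigma \,\} \\
C_i^{m+1} &= \#\{\, j : 0 \le j < N-m,\ d_{m+1}(i,j) \le r\sigma \,\} \\
\mathrm{ApEn}(m,r,N) &= \frac{1}{N-m} \sum_{i=0}^{N-m-1} \ln\frac{C_i^{m}}{C_i^{m+1}} \\
B &= \#\{\, (i,j) : 0 \le i < j < N-m,\ d_m(i,j) \le r\sigma \,\} \\
A &= \#\{\, (i,j) : 0 \le i < j < N-m,\ d_{m+1}(i,j) \le r\sigma \,\} \\
\mathrm{SmEn}(m,r,N) &= \ln\frac{B}{A} = -\ln\frac{A}{B}
\end{aligned}
```

The code skips a term (ApEn) or returns 0 (SmEn) when one of the counts is zero. Textbook definitions of ApEn differ from this form in small normalisation details, since they use N-m+1 patterns of length m and N-m patterns of length m+1.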

Using the Code

The ApEn code is shown below:

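#include <math.h>   // fabs, log

// Approximate entropy estimate of data[0..N-1].
// m: pattern length, r: tolerance factor, std: standard deviation of the signal.
double ApEn(const double* data, unsigned int m, double r, unsigned int N, double std)
{
        // Fewer than m + 1 samples: nothing to compare, and the unsigned N - m below would wrap around
        if (N <= m)
                return 0.0;

        const double err = std * r;     // absolute tolerance
        double sum = 0.0;

        for (unsigned int i = 0; i < N - m; i++) {
                int Cm = 0, Cm1 = 0;    // matches of length m and m+1 for the pattern starting at i
                for (unsigned int j = 0; j < N - m; j++) {
                        bool eq = true;
                        //m - length series
                        for (unsigned int k = 0; k < m; k++) {
                                // fabs, not abs: abs() may resolve to the int version and truncate the difference
                                if (fabs(data[i+k] - data[j+k]) > err) {
                                        eq = false;
                                        break;
                                }
                        }
                        if (eq) Cm++;

                        //m+1 - length series
                        if (eq && fabs(data[i+m] - data[j+m]) <= err)
                                Cm1++;
                }

                // for err >= 0 the self-match (j == i) makes both counts at least 1
                if (Cm > 0 && Cm1 > 0)
                        sum += log((double)Cm / (double)Cm1);
        }

        return sum / (double)(N - m);
}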

The SmEn code is shown next:

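#include <math.h>   // fabs, log

// Sample entropy estimate of data[0..N-1].
// m: pattern length, r: tolerance factor, std: standard deviation of the signal.
double SmEn(const double* data, unsigned int m, double r, unsigned int N, double std)
{
        // Fewer than m + 1 samples: nothing to compare, and the unsigned N - m below would wrap around
        if (N <= m)
                return 0.0;

        const double err = std * r;     // absolute tolerance
        int Cm = 0, Cm1 = 0;            // matching pairs of length m and m+1

        for (unsigned int i = 0; i < N - m; i++) {
                // j starts at i + 1: each pair is counted once and self-matches are excluded
                for (unsigned int j = i + 1; j < N - m; j++) {
                        bool eq = true;
                        //m - length series
                        for (unsigned int k = 0; k < m; k++) {
                                // fabs, not abs: abs() may resolve to the int version and truncate the difference
                                if (fabs(data[i+k] - data[j+k]) > err) {
                                        eq = false;
                                        break;
                                }
                        }
                        if (eq) Cm++;

                        //m+1 - length series
                        if (eq && fabs(data[i+m] - data[j+m]) <= err)
                                Cm1++;
                }
        }

        // the ratio is undefined when no pairs match; 0.0 is returned in that case
        if (Cm > 0 && Cm1 > 0)
                return log((double)Cm / (double)Cm1);
        else
                return 0.0;
}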

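N is the number of samples in the signal pointed to by data, m is the length of the compared patterns, and std is the standard deviation of the signal, which scales the tolerance (err = std * r). r is typically set to 0.2. Both functions count how many patterns of length m are similar (every sample within r * std) and how many of those remain similar at length m+1, and then take the logarithm of the ratio of the two counts. ApEn does this for every start position, self-match included, and averages the logarithms; SmEn counts each pair of patterns once, excludes self-matches, and takes a single logarithm of the total counts.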
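To try the sine wave versus noise comparison from the Introduction, something like the following sketch could be used. None of these helpers are part of the article's code: stdDev, sampleEntropy, makeSine and makeNoise are hypothetical names. sampleEntropy restates the pair counting of SmEn in compact form so that the block compiles on its own, and the noise comes from a small hand-written linear congruential generator so the result does not depend on the standard library's random distributions.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Population standard deviation of the signal (hypothetical helper).
double stdDev(const std::vector<double>& x)
{
    if (x.empty()) return 0.0;
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= static_cast<double>(x.size());
    double var = 0.0;
    for (double v : x) var += (v - mean) * (v - mean);
    return std::sqrt(var / static_cast<double>(x.size()));
}

// Compact restatement of the SmEn pair counting (assumed equivalent, not the article's code).
double sampleEntropy(const std::vector<double>& x, std::size_t m, double r)
{
    const std::size_t n = x.size();
    if (n <= m) return 0.0;                 // not enough samples to form a pattern of length m + 1
    const double tol = r * stdDev(x);       // absolute tolerance r * std
    long long cm = 0, cm1 = 0;              // matching pairs of length m and m + 1
    for (std::size_t i = 0; i < n - m; ++i) {
        for (std::size_t j = i + 1; j < n - m; ++j) {
            bool eq = true;
            for (std::size_t k = 0; k < m && eq; ++k)
                eq = std::fabs(x[i + k] - x[j + k]) <= tol;
            if (!eq) continue;
            ++cm;
            if (std::fabs(x[i + m] - x[j + m]) <= tol) ++cm1;
        }
    }
    if (cm == 0 || cm1 == 0) return 0.0;    // same fallback as SmEn
    return std::log(static_cast<double>(cm) / static_cast<double>(cm1));
}

// Sine wave with the given period, measured in samples.
std::vector<double> makeSine(std::size_t n, double period)
{
    const double twoPi = 6.283185307179586;
    std::vector<double> x(n);
    for (std::size_t i = 0; i < n; ++i)
        x[i] = std::sin(twoPi * static_cast<double>(i) / period);
    return x;
}

// Uniform pseudo-random noise in [0, 1) from a 32-bit linear congruential generator.
std::vector<double> makeNoise(std::size_t n, std::uint32_t seed)
{
    std::vector<double> x(n);
    std::uint32_t state = seed;
    for (std::size_t i = 0; i < n; ++i) {
        state = state * 1664525u + 1013904223u;               // wraps modulo 2^32
        x[i] = static_cast<double>(state >> 8) / 16777216.0;  // top 24 bits scaled to [0, 1)
    }
    return x;
}
```

With m = 2 and r = 0.2, sampleEntropy(makeNoise(500, 1u), 2, 0.2) should come out clearly larger than sampleEntropy(makeSine(500, 50.0), 2, 0.2), which is the behaviour expected of a complexity metric. When calling the article's ApEn or SmEn directly, the value returned by a helper such as stdDev would be passed as the std argument.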

History

17th June, 2008: Initial post