Phase Space Representation of Intrinsic Mode Function: An Effective Approach to Classify Human Lung Sounds Signals Associated with COPD and Bronchiectasis
Sibghatullah I. Khan, G. Ganesh Kumar
Abstract:
Bronchiectasis and chronic obstructive pulmonary disease (COPD) are common human lung diseases. In general, an expert pulmonologist carries out preliminary screening and detection of these lung abnormalities by listening to the adventitious lung sounds. The present paper is an attempt towards the automatic discrimination of adventitious lung sounds of Bronchiectasis and COPD from normal lung sounds of healthy subjects. To classify lung sounds into normal and adventitious categories, we obtain features from the phase space representation (PSR). First, empirical mode decomposition (EMD) is applied to lung sound signals to obtain intrinsic mode functions (IMFs). The IMFs are then processed to construct two-dimensional (2D) and three-dimensional (3D) PSRs. The feature space comprises the 95% confidence ellipse area computed from the 2D PSR and the interquartile range (IQR) of Euclidean distances computed from the 3D PSR. The process is carried out for the first four IMFs corresponding to normal and adventitious lung sound signals. The computed features show a significant ability to discriminate between the two categories of lung sound signals. For classification, we use the least squares support vector machine (LS-SVM) with two kernels, namely polynomial and radial basis function (RBF). A grid search algorithm is used to find the optimum values of the LS-SVM hyperparameters σ and γ. Simulation results on the ICBHI 2017 lung sound dataset show that the proposed method effectively classifies normal and adventitious lung sound signals. LS-SVM with the RBF kernel provides the highest classification accuracy of 97.67% over the feature space constituted by the first, second, and fourth IMFs.
Keywords: Adventitious lung sound signals (ALS), Empirical mode decomposition (EMD), Intrinsic mode function (IMF), Phase space reconstruction (PSR), Least square support vector machine (LS-SVM).
Introduction:
Respiratory diseases arising from lung abnormalities account for 7% of global mortality [1]. Persistent infection and inflammation in the lungs cause Bronchiectasis, whereas allergies, smoking, and pollution result in COPD [2]. Bronchiectasis decreases the ability of the lungs to clear out mucus, which in turn increases the vulnerability of the lungs to infection [3]. In general, the preliminary screening of lung diseases includes auscultation of the chest [4]. The lung sounds originating from these diseases are termed adventitious or abnormal lung sounds. Adventitious lung sounds include subcategories called wheezes and crackles. Generally, these adventitious lung sounds are high-pitched [5] and contain musical characteristics (in the case of wheezes) [5] and spikes (in the case of crackles) [6]. A significant part of lung disease diagnosis is to detect these adventitious lung sounds in the preliminary stage of the disease. Early detection of lung disease is strongly associated with a reduction in further prevalence [7]. The features obtained from lung sound signals are useful for an effective preliminary diagnosis of lung disease [8]. To extract suitable diagnostic features, lung sounds can be treated as either stationary or non-stationary signals. Moreover, the features can be either linear or non-linear. Treating lung sounds as stationary signals, researchers have used various time and frequency domain features [9,10,11,12] to classify lung sounds into normal and adventitious categories. Linear prediction (LP) [13] and Mel frequency cepstral coefficients (MFCC) have been used for analysis and classification of normal and adventitious lung sounds [14,15,16,17,18]. Considering the non-stationarity of lung sound signals, researchers have used multiresolution signal processing approaches [19,20,21].
Techniques based on the wavelet transform have been employed to analyze and classify lung sound signals into normal and adventitious categories [22,23,24]. Non-linear signal processing techniques such as the Lyapunov exponent [25], fractal dimension [26,27], and approximate entropy [28] have proved useful in providing valuable diagnostic information. Adventitious lung sounds exhibit a higher value of approximate entropy than their normal counterparts [28]. Considering the above-mentioned facts, we employed the empirical mode decomposition (EMD) method for analysis and classification of normal and adventitious lung sound signals [29]. Methods based on EMD first extract the intrinsic mode functions (IMFs), which have proved useful in classifying normal and adventitious lung sound signals. A modification of the EMD method known as ensemble EMD (EEMD) has also been used to analyze lung sound signals [30].
The method proposed in this paper relies on the phase space representation (PSR) of the lung sound signal obtained through the EMD process. To obtain the PSR, two parameters, namely the time lag and the embedding dimension, are required. In this work, these two values were kept constant. The IMFs of lung sound signals are amplitude and frequency modulated (AM-FM) components. This property of IMFs is useful in defining new features [31,32,33]. Considering this aspect, we have constructed two- and three-dimensional PSRs (2D-PSR and 3D-PSR) from the IMFs. In the proposed approach, we compute the 95% confidence ellipse area from the 2D-PSR and the interquartile range (IQR) of Euclidean distances from the 3D-PSR. The 2D and 3D PSR parameters from the first four IMFs are used to construct a feature set, which is fed to a least squares support vector machine (LS-SVM) to classify normal and adventitious lung sound signals.
The paper sequence is as follows: Section 2 describes the methodology, which includes dataset, EMD method, computation of 2D and 3D PSR parameters, and LS-SVM classifier. Section 3 shows experimental results, and section 4 covers discussion. The paper concludes in section 5.
2. Methodology
2.1. Dataset
We use the ICBHI 2017 lung sound database [34], available online at (https://bhichallenge.med.auth.gr/ICBHI_2017_Challenge). The dataset contains human lung sound signals recorded by three research teams, namely the School of Health Sciences, University of Aveiro (ESSUA), Aristotle University of Thessaloniki (AUTH), and the University of Coimbra (UC), for various diseases, broadly classified into normal and adventitious categories. The adventitious category includes human lung sounds corresponding to six categories of lung diseases, namely Asthma, Bronchiectasis, Bronchiolitis, COPD, lower respiratory tract infection (LRTI), and upper respiratory tract infection (URTI). The details of the ICBHI dataset can be found in [34]. In the present study, a total of 172 lung sound signals are included, of which 90 belong to the normal category and 82 belong to the adventitious category. The adventitious category combines 66 COPD and 16 Bronchiectasis lung sound signals.
2.2. Empirical mode decomposition (EMD)
EMD is used to decompose non-linear and non-stationary signals into a finite number of amplitude and frequency modulated (AM-FM) components, termed intrinsic mode functions (IMFs) [35]. This process is signal-dependent, and no assumption is made about the stationarity or linearity of the signal. EMD has been successfully used in the past for non-linear and non-stationary signal analysis, such as gear fault signal analysis [38,39], center of pressure signal analysis [36,37], speech signal analysis [40], and the analysis of electrocardiogram [41] and electromyogram signals [42]. When a signal is decomposed by the EMD method, each resultant band-limited IMF must satisfy two necessary conditions [35]:
- The number of extrema (maxima and minima) and the number of zero crossings in each IMF must either be equal or differ by at most one.
- For each IMF, the mean value of the envelopes defined by the local minima and the local maxima must be zero.
The first condition corresponds to the narrowband requirement, whereas the second ensures the elimination of redundant fluctuations due to asymmetric waveforms [35].
EMD uses a sifting process to derive IMFs from a signal $x(t)$. Sifting is an iterative process; the complete process is given in the following steps [35]:
1. From the signal $x(t)$, extract the local minima and maxima.
2. To define the boundary specified by the minima and maxima, compute the envelopes $e_m(t)$ and $e_M(t)$ by joining all the points corresponding to the minima and maxima, respectively.
3. Define the mean of $e_m(t)$ and $e_M(t)$ as:
$$a(t) = \frac{e_m(t) + e_M(t)}{2} \tag{1}$$
4. Define $h(t)$ as:
$$h(t) = x(t) - a(t) \tag{2}$$
5. Verify whether $h(t)$ satisfies the IMF conditions.
6. Iterate through steps 1-5, with $h(t)$ in place of $x(t)$, until $h(t)$ becomes an IMF.
After successful computation of the first IMF, assign $c_1(t) = h(t)$. To compute successive IMFs, define $r_1(t) = x(t) - c_1(t)$, where $r_1(t)$ is the residual signal, which serves as the new signal in place of $x(t)$. The process of sifting continues until the residual becomes monotonic, and consequently, further IMF extraction is not possible.
The signal can be reconstructed by summing all IMFs and the final residual [35]:
$$x(t) = \sum_{m=1}^{M} c_m(t) + r_M(t) \tag{3}$$
where $M$ is the total number of IMFs extracted and $r_M(t)$ is the final residual. Fig. 1 and Fig. 2 show the plots of the first seven IMFs for normal and adventitious lung sound signals, respectively.
Fig.1. Empirical mode decomposition of the normal lung sound signal for one breath cycle
Fig. 2. Empirical mode decomposition of the abnormal lung sound signal.
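The sifting procedure described above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this work: the cubic-spline envelopes, the pinning of the envelopes to the end samples, the simplified stopping criterion (`tol`, `max_iter`) and all function names are assumptions introduced here.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema


def envelope_mean(x):
    """Mean of the upper and lower envelopes, or None if x has too few extrema."""
    n = np.arange(len(x))
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if len(maxima) < 2 or len(minima) < 2:
        return None
    # Assumption: both envelopes are pinned to the end samples to limit overshoot.
    up_idx = np.concatenate(([0], maxima, [len(x) - 1]))
    lo_idx = np.concatenate(([0], minima, [len(x) - 1]))
    upper = CubicSpline(up_idx, x[up_idx])(n)
    lower = CubicSpline(lo_idx, x[lo_idx])(n)
    return (upper + lower) / 2.0


def sift_one_imf(x, max_iter=50, tol=0.05):
    """Repeat steps 1-5 until the candidate changes little (simplified criterion)."""
    h = x.copy()
    for _ in range(max_iter):
        a = envelope_mean(h)
        if a is None:
            break
        h_new = h - a
        change = np.sum((h - h_new) ** 2) / (np.sum(h ** 2) + 1e-12)
        h = h_new
        if change < tol:
            break
    return h


def emd(x, max_imfs=7):
    """Return (imfs, residual) such that imfs.sum(axis=0) + residual equals x."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        if envelope_mean(residual) is None:
            break  # residual has too few extrema for further sifting
        imf = sift_one_imf(residual)
        imfs.append(imf)
        residual = residual - imf
    return np.array(imfs), residual
```

By construction, summing the returned IMFs and the residual recovers the input signal, which mirrors the reconstruction property of EMD.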
2.3. Phase space representation
Phase space reconstruction is a useful technique to capture the non-linear dynamics of a signal. A dynamic system contains two parts, i.e., state and dynamics [43]. At a particular time instance, the system information is referred to as the state, whereas the rule governing the evolution of the state with respect to time is the dynamics of the system. To visualize the evolution of the dynamic behavior of a time-varying signal, we use the phase space representation (PSR). A lung sound signal can be represented as a time series vector $V = \{v_1, v_2, \ldots, v_K\}$, where $K$ is the total number of data points. In the time delay method, the reconstructed phase space is expressed as [44]:
$$Y_k = \left(v_k, v_{k+\tau}, v_{k+2\tau}, \ldots, v_{k+(d-1)\tau}\right), \quad k = 1, 2, \ldots, K-(d-1)\tau \tag{4}$$
where $\tau$ and $d$ are the time lag and embedding dimension, respectively.
With $d$ = 2 or 3, the PSR can be used to visualize the signal behavior. In the present study, we have opted for embedding dimensions of 2 and 3 because of the simplicity of visualization. As mentioned in [45], we use a time lag value of 1 to reconstruct the phase space. The two-dimensional (2D) PSR is obtained with $d$ = 2, and with $d$ = 3, the PSR is referred to as the three-dimensional (3D) PSR. It may be noted that the 2D-PSR is the same as the Poincaré plot, which finds applications in variability measurement of biomedical signals [46]. The following subsections provide the procedure to extract features from the 2D and 3D PSRs of IMFs.
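As an illustration of the time delay method, the following Python sketch builds the PSR points of a signal for a given embedding dimension and time lag. The function name `phase_space` and its parameter names are hypothetical; only the time lag of 1 and the embedding dimensions of 2 and 3 come from the text.

```python
import numpy as np


def phase_space(v, d=2, tau=1):
    """Return an array whose rows are the delay vectors (v[k], v[k+tau], ..., v[k+(d-1)tau])."""
    v = np.asarray(v, dtype=float)
    n_points = len(v) - (d - 1) * tau
    if n_points <= 0:
        raise ValueError("signal too short for the requested d and tau")
    # Column i holds the signal delayed by i * tau samples.
    return np.column_stack([v[i * tau: i * tau + n_points] for i in range(d)])


psr_2d = phase_space([1, 2, 3, 4, 5], d=2, tau=1)  # 4 points in the plane
psr_3d = phase_space([1, 2, 3, 4, 5], d=3, tau=1)  # 3 points in space
```

For the five-sample toy signal, the 3D-PSR consists of the points (1, 2, 3), (2, 3, 4) and (3, 4, 5).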
2.3.1. Ellipse area computation from 2D-PSR
The symmetric IMFs are AM-FM components and are capable of providing significant features for discriminating normal and adventitious lung sound signals. The elliptical nature of the PSR for sinusoidal signals has been demonstrated in [47]. Consequently, the PSRs of IMFs, which are oscillatory, are expected to exhibit elliptical patterns. Fig. 3 and Fig. 4 show the two-dimensional phase space reconstructions (2D-PSR) of the IMFs obtained by applying EMD to normal and adventitious lung sound signals, respectively. In these figures, the elliptical patterns are visible in the 2D-PSRs of the IMFs. The authors in [48,49] proposed a strategy to compute the area of an ellipse containing 95% of the data points for analyzing center of pressure (COP) signals. Moreover, in [50], epileptic seizure and seizure-free EEG signals were classified using second-order difference plots (SODP) of IMFs, with the 95% confidence ellipse area as a feature. In the present work, we use the 95% confidence ellipse area computed from the 2D-PSR of IMFs as a feature to classify normal and adventitious lung sound signals. The procedure for computing the 95% confidence ellipse area from the 2D-PSR is as follows:
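The plot of the vector $X(n)$ versus $Y(n)$ is the 2D-PSR. First, the mean values of $X(n)$ and $Y(n)$ are calculated as [48,49]:
$$S_X = \sqrt{\frac{1}{N}\sum_{n=0}^{N-1} X(n)^2} \tag{5}$$
$$S_Y = \sqrt{\frac{1}{N}\sum_{n=0}^{N-1} Y(n)^2} \tag{6}$$
$$S_{XY} = \frac{1}{N}\sum_{n=0}^{N-1} X(n)\,Y(n) \tag{7}$$
where $N$ is the number of points in the 2D-PSR.
Define the parameter $L$ as [48,49]:
$$L = \sqrt{\left(S_X^2 + S_Y^2\right)^2 - 4\left(S_X^2 S_Y^2 - S_{XY}^2\right)} \tag{8}$$
The parameters $a$ and $b$ are then given by:
$$a = 1.7321\sqrt{S_X^2 + S_Y^2 + L} \tag{9}$$
$$b = 1.7321\sqrt{S_X^2 + S_Y^2 - L} \tag{10}$$
Using the computed parameters $a$ and $b$, the ellipse area (with 95% of the data points) can be calculated as [31,48,49]:
$$A_{ellipse} = \pi a b \tag{11}$$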
Fig. 3. 2D-PSR of the first seven IMFs of the normal lung sound signal.
Fig. 4. 2D-PSR of the first seven IMFs of the abnormal lung sound signal.
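The ellipse area feature can be sketched in Python as below. The sketch assumes the form of the 95% confidence ellipse commonly used for zero-mean data in the cited COP and SODP literature, including the constant 1.7321; the function name and the clipping of small negative round-off values are assumptions made here for illustration.

```python
import numpy as np


def ellipse_area_95(x, y):
    """95% confidence ellipse area of the point cloud (x, y), assuming zero-mean data."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sx2 = np.mean(x ** 2)          # S_X squared
    sy2 = np.mean(y ** 2)          # S_Y squared
    sxy = np.mean(x * y)           # S_XY
    # Clip at zero to guard against tiny negative values caused by round-off.
    disc = max((sx2 + sy2) ** 2 - 4.0 * (sx2 * sy2 - sxy ** 2), 0.0)
    L = np.sqrt(disc)
    a = 1.7321 * np.sqrt(sx2 + sy2 + L)
    b = 1.7321 * np.sqrt(max(sx2 + sy2 - L, 0.0))
    return np.pi * a * b


# For an IMF stored in a 1D array `imf`, the 2D-PSR with time lag 1 pairs
# each sample with its successor:
imf = np.sin(2 * np.pi * np.arange(200) / 20.0)   # synthetic stand-in for an IMF
area = ellipse_area_95(imf[:-1], imf[1:])
```

For points spread uniformly on a unit circle the two axes coincide, whereas for points lying on a straight line the minor axis, and hence the area, collapses to zero.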
2.3.2. IQR of Euclidean distances computed from 3D-PSR
The 3D-PSR is useful in visualizing the dynamics of the system. If the vectors $Y(n)$ and $Z(n)$ represent delayed versions of the vector $X(n)$, then the plot of these three vectors results in the 3D-PSR. From the 3D-PSR, the Euclidean distance of each point $(X(n), Y(n), Z(n))$ from the origin is first calculated as [44]:
$$D(n) = \sqrt{X(n)^2 + Y(n)^2 + Z(n)^2} \tag{12}$$
We then compute the interquartile range (IQR) of the Euclidean distances $D(n)$. The IQR quantifies the variability in data and is the range between the 25th and 75th percentiles [50,51]. It therefore describes the dispersion of the central 50% of the observations, which makes it insensitive to outliers. Fig. 5 and Fig. 6 show the 3D-PSR plots of the first four IMFs of normal and adventitious lung sound signals, respectively. In this paper, the IQR is used as a feature to discriminate normal and adventitious lung sound signals.
Fig. 5. 3D-PSR plot of normal lung sound signal for first four IMFs
Fig. 6. 3D-PSR plot of adventitious lung sound signal for first four IMFs
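The IQR feature can be illustrated with the short Python sketch below. The function name is hypothetical, and the percentile interpolation is NumPy's default (linear), which may differ slightly from the convention used by other software.

```python
import numpy as np


def iqr_of_distances(imf, tau=1):
    """IQR of the Euclidean distances of the 3D-PSR points of `imf` from the origin."""
    v = np.asarray(imf, dtype=float)
    n = len(v) - 2 * tau
    # The three coordinates are the signal and its two delayed versions.
    x, y, z = v[:n], v[tau:tau + n], v[2 * tau:2 * tau + n]
    dist = np.sqrt(x ** 2 + y ** 2 + z ** 2)
    q75, q25 = np.percentile(dist, [75, 25])
    return q75 - q25


# Toy signal: the 3D-PSR points are (3,4,0), (4,0,0), (0,0,0), (0,0,0),
# so the distances are 5, 4, 0, 0.
toy_iqr = iqr_of_distances([3, 4, 0, 0, 0, 0])
```

A constant signal places every 3D-PSR point at the same distance from the origin, so its IQR is zero; larger fluctuations in amplitude widen the spread of distances.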
2.4. Least squares support vector machine
The support vector machine (SVM) classifier is based on supervised learning theory and is widely employed in pattern recognition tasks [49]. SVM constructs an optimal hyperplane in a higher-dimensional feature space to separate the classes.
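Consider a feature space containing data points $\{x_i, y_i\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^d$ is the $i$th input vector and $y_i$ is the $i$th output class label; $y_i$ can take the value of either +1 or -1, representing the two classes. The SVM classifier function to discriminate the two classes is given as [52]:
$$f(x) = \operatorname{sign}\left[w^{T} g(x) + b\right] \tag{13}$$
where $w$ and $b$ are the weight vector and bias term in the $d$-dimensional feature space, respectively. The function $g(x)$ maps $x$ into the $d$-dimensional feature space. The core principle of SVM is to determine the optimal hyperplane that maximizes the distance from the data points of each class to the hyperplane. This maximization can be stated as an optimization problem with inequality constraints [52]. For classifying biomedical signals, the least squares SVM (LS-SVM) has been frequently used [53,54,55]. In LS-SVM, the optimization problem can be stated as [52]:
$$\min_{w,b,e} J(w, b, e) = \frac{1}{2} w^{T} w + \frac{\gamma}{2}\sum_{i=1}^{N} e_i^2 \tag{14}$$
subject to the equality constraints:
$$y_i\left[w^{T} g(x_i) + b\right] = 1 - e_i \tag{15}$$
where $i = 1, 2, \ldots, N$, $e_i$ is the error variable, and $\gamma$ is the regularization parameter. For equation (14), the Lagrangian can be defined as:
$$\mathcal{L}(w, b, e; \alpha) = J(w, b, e) - \sum_{i=1}^{N} \alpha_i \left\{ y_i\left[w^{T} g(x_i) + b\right] - 1 + e_i \right\} \tag{16}$$
where $\alpha_i$ are the Lagrange multipliers. Solving equation (16) results in the decision function [52,56]:
$$f(x) = \operatorname{sign}\left[\sum_{i=1}^{N} \alpha_i y_i K(x, x_i) + b\right] \tag{17}$$
In equation (17), $K(x, x_i)$ is a kernel function. In this work, the following kernel functions are used:
- The radial basis function (RBF) kernel: It is defined as [57]
$$K(x, x_i) = \exp\left(-\frac{\lVert x - x_i \rVert^2}{2\sigma^2}\right) \tag{18}$$
In equation (18), σ is the hyperparameter. A grid search algorithm is employed to find the optimum value of σ.
- Polynomial kernel function: It is defined as [57]
$$K(x, x_i) = \left(x^{T} x_i + 1\right)^{p} \tag{19}$$
where $p$ is the degree of the polynomial.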
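To make the LS-SVM training step concrete, the sketch below solves the standard LS-SVM dual linear system with NumPy. It is an illustration under stated assumptions, not the MATLAB implementation used in this work: the RBF width convention exp(-||x - x'||^2 / (2σ^2)), the function names, and the toy data and hyperparameter values are all assumptions introduced here.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """RBF kernel matrix between the rows of A and B (assumed 2*sigma^2 convention)."""
    sq = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def lssvm_train(X, y, gamma, sigma):
    """Solve the LS-SVM dual system; returns the multipliers alpha and the bias b."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y                          # constraint: sum_i alpha_i * y_i = 0
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]


def lssvm_predict(X_train, y_train, alpha, b, X_new, sigma):
    """Evaluate sign(sum_i alpha_i * y_i * K(x, x_i) + b) for each row of X_new."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)


# Toy XOR-like data (hypothetical, not lung sound features).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1.0, 1.0, 1.0, -1.0])
alpha, b = lssvm_train(X, y, gamma=100.0, sigma=0.5)
pred = lssvm_predict(X, y, alpha, b, X, sigma=0.5)
```

Unlike the standard SVM, which requires a quadratic programming solver, the equality constraints of LS-SVM reduce training to a single linear system, which is why the sketch needs only `np.linalg.solve`.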
For evaluating the effectiveness of the classifier, we use performance metrics, namely, sensitivity, specificity, accuracy, precision, recall, and F-score. To compute these metrics, four parameters, namely true positive (TP), true negative (TN), false positive (FP), and false negative (FN), are required. These parameters are obtained from the resultant confusion matrix. TP and TN denote correctly classified positive and negative instances, respectively, whereas incorrectly classified positive and negative instances are denoted by FP and FN, respectively.
The performance measures are defined as follows:
- Sensitivity (SEN):
It is defined as:
$$\mathrm{SEN} = \frac{TP}{TP + FN} \tag{20}$$
It quantifies the ability of the classifier model to predict positive class labels correctly [58].
- Specificity (SPE):
It is defined as:
$$\mathrm{SPE} = \frac{TN}{TN + FP} \tag{21}$$
In contrast to sensitivity, it is the ability of the classifier model to predict negative instances correctly [58].
- Accuracy (ACC): It quantifies the ability of the classifier model to predict the correct positive and negative classes out of the total samples [58]. It is defined as:
$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \tag{22}$$
- Positive predictive value (PPV): It is defined as:
$$\mathrm{PPV} = \frac{TP}{TP + FP} \tag{23}$$
It quantifies the proportion of instances predicted as positive that are actually positive [58].
- Negative predictive value (NPV): It is defined as:
$$\mathrm{NPV} = \frac{TN}{TN + FN} \tag{24}$$
In contrast to PPV, it quantifies the proportion of instances predicted as negative that are actually negative [58].
- F1-Score (F1): It is defined as the harmonic mean of the positive predictive value and the sensitivity of the test [59]:
$$\mathrm{F1} = \frac{2 \cdot \mathrm{PPV} \cdot \mathrm{SEN}}{\mathrm{PPV} + \mathrm{SEN}} \tag{25}$$
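The performance measures listed above follow directly from the four confusion matrix counts, as the Python sketch below shows. The function name and the counts in the example are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the performance measures from confusion matrix counts."""
    sen = tp / (tp + fn)                      # sensitivity (recall)
    spe = tn / (tn + fp)                      # specificity
    acc = (tp + tn) / (tp + tn + fp + fn)     # accuracy
    ppv = tp / (tp + fp)                      # positive predictive value (precision)
    npv = tn / (tn + fn)                      # negative predictive value
    f1 = 2.0 * ppv * sen / (ppv + sen)        # harmonic mean of PPV and SEN
    return {"SEN": sen, "SPE": spe, "ACC": acc, "PPV": ppv, "NPV": npv, "F1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=8, tn=9, fp=1, fn=2)
```

With these counts, the sensitivity is 0.8, the specificity is 0.9 and the accuracy is 0.85.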
3. Results
Fig. 1 and Fig. 2 show the first seven IMFs of normal and adventitious lung sound signals. Further, Fig. 3 and Fig. 4 show the corresponding 2D-PSRs for the first seven IMFs. We computed the area parameter by including 95% of the data points (95% confidence ellipse area) from the 2D-PSR plots of the first seven IMFs. From the plots of the IMFs and the corresponding 2D-PSRs, it is observed that only the first four IMFs have significant variability. Consequently, only the first four IMFs were considered for the computation of 3D-PSR plots. Fig. 5 and Fig. 6 show the 3D-PSRs computed from the first four IMFs of normal and adventitious lung sound signals, respectively. To determine the spread of the points in 3D phase space, we computed the IQR of Euclidean distances for each IMF. We did not split the signal into windows; instead, the complete signal is considered for analysis. Thus, the approach does not require segmentation of the lung sound signal into breathing cycles, which reduces the time and computational complexity. To verify the discrimination ability of the constructed feature space, we applied the Kruskal-Wallis statistical test [60] to both classes for all feature sets.
Fig. 7 depicts the result of the Kruskal-Wallis statistical test performed on the ellipse area parameter of the first four IMFs. Fig. 8 depicts the corresponding result for the IQR feature computed from the 3D-PSR, which also shows good discrimination ability. The probability (p) values [60] resulting from the Kruskal-Wallis test for both features are given in the captions of Fig. 7 and Fig. 8.
Based on the encouraging results of the Kruskal-Wallis test, the two features, namely the ellipse area from the 2D-PSR plot and the IQR of Euclidean distances from the 3D-PSR plot, are included to form the feature space. In this work, we used MATLAB to implement the proposed method for the classification of normal and adventitious lung sound signals.
Fig. 7. Comparison of the estimation of the ellipse area for normal and adventitious lung sound signals for first four intrinsic mode functions (IMFs): IMF1(p<0.001), IMF2 (p<0.001), IMF3 (p<0.001), IMF4 (p < 0.001).
Fig. 8. Comparison of the estimation of the IQR computed from 3D-PSR for normal and adventitious lung sound signals for first four intrinsic mode functions (IMFs): IMF1(p<0.001), IMF2 (p<0.001), IMF3 (p<0.001), IMF4 (p < 0.001).
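A Kruskal-Wallis test of a single feature across the two classes can be run as sketched below using `scipy.stats.kruskal`. The feature values are synthetic placeholders generated only to make the example runnable; they are not the values measured in this study, and only the group sizes (90 and 82) are taken from the dataset description.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(0)
# Synthetic placeholder feature values (e.g. ellipse area of one IMF).
feature_normal = rng.normal(loc=1.0, scale=0.1, size=90)
feature_adventitious = rng.normal(loc=2.0, scale=0.1, size=82)

# The test is rank-based and makes no normality assumption.
statistic, p_value = kruskal(feature_normal, feature_adventitious)
print(f"H = {statistic:.2f}, p = {p_value:.3g}")
```

A small p value indicates that the feature distributions of the two classes differ, which is the property required of a discriminative feature.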
The constructed feature vectors are then fed to the LS-SVM classifier. Polynomial and radial basis function kernels have been employed in LS-SVM to evaluate the performance of the classifier. Moreover, the optimum values of hyperparameters have been obtained using a grid search algorithm. To ensure the reliability and stability of the LS-SVM classifier, tenfold cross-validation is used [61]. The MATLAB code implementation of LS-SVM is available online at [62].
The performance of LS-SVM with the polynomial and radial basis function kernels is shown in Tables 1 and 2, respectively. The joint feature space is constructed by combining features from the 2D-PSR (95% confidence ellipse area) and the 3D-PSR (IQR of Euclidean distances). The combinations IMF1 & IMF2, IMF1 & IMF3, IMF1 & IMF4, IMF2 & IMF3, IMF2 & IMF4, IMF3 & IMF4, IMF1 & IMF2 & IMF3, IMF1 & IMF2 & IMF4, IMF1 & IMF3 & IMF4, and IMF2 & IMF3 & IMF4 are denoted by F12, F13, F14, F23, F24, F34, F123, F124, F134, and F234, respectively. The LS-SVM classifier is evaluated on each of these feature sets with both kernels.
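The ten feature sets can be enumerated programmatically, as in the Python sketch below. The helper `select_features` and the assumed column layout (ellipse area and IQR stored side by side for each IMF, so that a set of three IMFs yields six features) are illustrative assumptions rather than details stated in the text.

```python
from itertools import combinations

import numpy as np

# All pairs and triples of the first four IMFs, named as in the text.
imf_sets = [c for k in (2, 3) for c in combinations(range(1, 5), k)]
names = ["F" + "".join(str(i) for i in c) for c in imf_sets]


def select_features(features, imfs):
    """Pick the columns of the chosen IMFs.

    Assumed layout: `features` has 8 columns ordered as
    [area_IMF1, iqr_IMF1, area_IMF2, iqr_IMF2, ..., area_IMF4, iqr_IMF4].
    """
    cols = [c for i in imfs for c in (2 * (i - 1), 2 * (i - 1) + 1)]
    return features[:, cols]


demo = np.arange(16, dtype=float).reshape(2, 8)   # placeholder feature matrix
f124 = select_features(demo, (1, 2, 4))
```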
Table 1. Performance of LS-SVM Classifier with Polynomial Kernel
Features Space | ACC | SPE | SEN | PRE | REC | F-SCORE |
F12 | 79.65 | 94.34 | 73.11 | 60.98 | 94.34 | 74.07 |
F13 | 80.81 | 94.55 | 74.36 | 63.41 | 94.55 | 75.91 |
F14 | 80.81 | 92.98 | 74.78 | 64.63 | 92.98 | 76.26 |
F23 | 79.65 | 97.96 | 72.36 | 58.54 | 97.96 | 73.28 |
F24 | 79.07 | 97.92 | 71.77 | 57.32 | 97.92 | 72.31 |
F34 | 80.23 | 96.15 | 73.33 | 60.98 | 96.15 | 74.63 |
F123 | 81.40 | 94.64 | 75.00 | 64.63 | 94.64 | 76.81 |
F124 | 80.81 | 92.98 | 74.78 | 64.63 | 92.98 | 76.26 |
F134 | 80.81 | 94.55 | 74.36 | 63.41 | 94.55 | 75.91 |
F234 | 79.07 | 96.00 | 72.13 | 58.54 | 96.00 | 72.73 |
Table 2. Performance of LS-SVM Classifier with RBF Kernel
Features Space | ACC | SPE | SEN | PRE | REC | F-SCORE |
F12 | 93.60 | 96.10 | 91.58 | 90.24 | 96.10 | 93.08 |
F13 | 90.70 | 98.53 | 85.58 | 81.71 | 98.53 | 89.33 |
F14 | 91.28 | 94.67 | 88.66 | 86.59 | 94.67 | 90.45 |
F23 | 88.37 | 95.59 | 83.65 | 79.27 | 95.59 | 86.67 |
F24 | 91.28 | 95.89 | 87.88 | 85.37 | 95.89 | 90.32 |
F34 | 90.12 | 95.77 | 86.14 | 82.93 | 95.77 | 88.89 |
F123 | 90.70 | 97.14 | 86.27 | 82.93 | 97.14 | 89.47 |
F124 | 97.67 | 98.75 | 96.74 | 96.34 | 98.75 | 97.53 |
F134 | 96.51 | 98.72 | 94.68 | 93.90 | 98.72 | 96.25 |
F234 | 80.81 | 100.00 | 73.17 | 59.76 | 100.00 | 74.81 |
From Table 1, it is evident that LS-SVM does not perform satisfactorily with the polynomial kernel function. Additionally, the classifier performance changes little across the feature sets. This reflects the inherent variations and non-linear separability of the feature space, which the polynomial kernel fails to capture. From Table 2, it is evident that the highest classification accuracy is achieved when LS-SVM is used with the radial basis function kernel: the feature set denoted by F124 results in the maximum classification accuracy of 97.67%. The maximum values of the performance parameters attained by the classifier are highlighted by the bold entries in Table 2. It is important to note that the maximum accuracies are attained using the radial basis function kernel rather than the polynomial kernel. The remaining performance measures follow the same trend, and their values are higher for the radial basis function kernel. The feature space constituted by the ellipse area parameter (from 2D-PSR) and the IQR (from 3D-PSR) of IMF1, IMF2, and IMF4 provides the best classification performance compared with the rest of the combinations. The receiver operating characteristic (ROC) curve for the LS-SVM classifier with the F124 feature space and the RBF kernel is shown in Fig. 9. The area under this ROC curve (AUC) is 0.99. A comparison of the proposed method with existing methods on the same dataset is given in Table 3. From Table 3, the effectiveness of the proposed method for the classification of normal and adventitious lung sound signals is evident.
Fig. 9. ROC plot of LS-SVM classifier with RBF kernel function.
4. Discussion
Lung sound signals possess non-stationary and non-linear characteristics, which makes the EMD method suitable for analyzing them. The lung sound signals are decomposed by EMD, resulting in symmetric IMFs which have an oscillatory nature. The IMF possesses the behavior of a data-adaptive filter with high passband gain [63]; consequently, the frequency content of each IMF decreases with its order, i.e., the first IMF contains the highest frequencies, and the frequencies decrease in the following IMF components. Using the EMD process, IMFs are extracted, and for the first four IMFs, 2D and 3D PSRs have been constructed to form the feature space. From Fig. 3 and Fig. 4, it is evident that the 2D-PSRs of the IMFs of lung sound signals exhibit an elliptical pattern. This encourages us to compute the ellipse area from the 2D-PSR by including 95% of the data points, termed the 95% confidence ellipse area. In addition to the suitability of the EMD method for analyzing non-linear and non-stationary signals, the elliptical patterns in the 2D-PSRs of the computed IMFs aid the computation of the 95% confidence ellipse area parameter; hence, the use of EMD for decomposing lung sound signals is justified.
From Fig. 3 and Fig. 4, it is evident that the area of the PSR corresponding to the normal lung sound signal is smaller than that of the adventitious lung sound signal. The larger area for the adventitious lung sound signal is indicative of the higher variability and amplitude it contains. The 3D-PSR represents the distribution of Euclidean distances of different points in the 3D phase space. To quantify the spread of the points in the 3D phase space, the IQR of Euclidean distances is computed for each IMF of the lung sound signal. The sudden transients and spikes present in lung sound signals contribute outliers. The IQR is robust to these outliers because it measures the dispersion of the central 50% of the points.
The LS-SVM is a well-known classifier used in various applications such as bioinformatics [63]. It is known for its high performance, superior accuracy, and ability to be trained on a dataset containing a small number of instances.
K-fold cross-validation is used to compute the classification performance. In this method, the dataset is randomly divided into K subsets of equal size [61]. In each training and testing iteration, one of the K subsets is regarded as the testing subset, and the remaining subsets are used for training. The process is repeated K times so that each subset serves once as the testing subset. Finally, the average of all K testing results is computed to estimate the result of K-fold validation. For real-world datasets, K = 10 is the commonly recommended value, and this case is known as tenfold cross-validation [61].
In this study, the performance of the LS-SVM classifier in classifying normal and adventitious lung sound signals is assessed by tenfold cross-validation. Another useful parameter to quantify the overall classifier performance is the area under the ROC curve (AUC). The larger the AUC, the better the classifier accuracy over varying threshold values [57]. Fig. 9 shows the ROC curve for the proposed classifier; the AUC achieved is 0.99, indicating that the classifier discriminates well between normal and adventitious lung sound signals. Moreover, the parameters, namely sensitivity, specificity, positive predictive value, negative predictive value, and F1-score, attain their maximum values with the F124 feature set and the RBF kernel of the LS-SVM classifier. A comparison in terms of accuracy of the proposed method with other existing methods for classification of normal and adventitious lung sound signals is tabulated in Table 3.
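The K-fold procedure can be sketched in Python as follows. The function names, the fixed random seed and the `evaluate` callback are hypothetical; note that 172 signals cannot be split into ten subsets of exactly equal size, so the sketch uses folds of 17 or 18 samples, which is an assumption about how the remainder is handled.

```python
import numpy as np


def kfold_indices(n_samples, k=10, seed=0):
    """Randomly partition sample indices into k folds of near-equal size."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)


def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    folds = kfold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        scores.append(evaluate(train_idx, test_idx))
    return float(np.mean(scores))


folds = kfold_indices(172, k=10)
# Placeholder evaluation: fraction of the data held out in each fold.
mean_fraction = cross_validate(172, lambda tr, te: len(te) / 172.0, k=10)
```

In practice, `evaluate` would train the classifier on the training indices and return, for example, the accuracy on the held-out fold.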
Authors | Method | Classification Accuracy |
Demir et al. [64] | STFT-CNN-SVM, standalone CNN | 65.5% for STFT-CNN-SVM, 63.09% for standalone CNN |
G. Altan et al. [65] | Hilbert-Huang transform with CNN | 93.67% |
Saraiva et al. [66] | MFCC with CNN | 74% |
D. Perna et al. [67] | MFCC-RNN | 81% |
R. Liu et al. [68] | LMFB-CNN | 81.62% |
Present work | EMD-IMF-Ellipse area from 2D-PSR and IQR from 3D-PSR followed by LS-SVM classifier | 97.67% |
Table 3 shows the accuracies achieved by the existing and proposed methodologies for the classification of normal and adventitious lung sound signals. It should be noted that the comparison given in Table 3 includes only experiments performed on the ICBHI 2017 dataset.
In [64], STFT has been used to convert lung sound signals into images, followed by CNN for classification. Separate classification performance has been studied for CNN-SVM and standalone CNN. The classification accuracy achieved in [64] is 65.5% and 63.09% for CNN-SVM and standalone CNN, respectively. The authors in [65] used statistical features from the Hilbert-Huang transform with a CNN-based deep learning classifier to classify adventitious lung sounds indicative of COPD. The classification accuracy achieved by that method is 93.67% for 12-channel lung sound signals. In [66], the authors employed Mel frequency cepstral coefficients (MFCC) to convert one-dimensional lung sound signals into two-dimensional images. CNN with the ADAM optimizer has been employed for the classification task. The classification accuracy achieved using this method is 74%. In [67], an RNN-based approach was presented to classify lung sound signals into various categories. The authors in [67] used windowing techniques to segment lung sound signals with various frame sizes. Feature extraction includes the computation of 13 MFCCs. For classification, the authors employed RNN models using the ADAM optimization algorithm. For a binary classification task, the authors achieved a maximum classification accuracy of 81% with a 250 ms frame length. In [68], an STFT-based log Mel filter bank (LMFB) has been used to convert a one-dimensional lung sound signal into three-dimensional images. A convolutional neural network (CNN) has been used to classify normal and adventitious lung sounds from the ICBHI 2017 dataset. The classification accuracy obtained by the method presented in [68] is 81.62%.
In the present study, we proposed EMD to extract IMFs, followed by the construction of 2D and 3D PSRs. From the PSRs, we computed two features, namely the 95% confidence ellipse area from the 2D-PSR and the IQR of Euclidean distances from the 3D-PSR. These two features constitute our feature space, over which the LS-SVM classifier is implemented. LS-SVM with the RBF kernel results in the highest classification accuracy of 97.67% with tenfold cross-validation. From Table 3, it is clear that the proposed method outperforms the other existing methods in terms of accuracy. It is worth noting that the proposed methodology uses EMD, which is suited to analyzing non-linear and non-stationary signals.
Additionally, the time domain analysis makes the proposed method feasible to implement in real-time systems. To ensure robustness and reliability in classification, tenfold cross-validation has been used. The proposed method of lung sound classification could be integrated with telemedicine applications to develop a respiratory diagnostic expert system. However, as the data included in the present study are limited to the ICBHI 2017 dataset, out-of-sample datasets need to be included to establish the clinical diagnostic ability of the method.
Conclusion:
In the present paper, empirical mode decomposition (EMD) is used to extract IMFs of lung sound signals because of their non-stationary and non-linear nature. Owing to their symmetric nature, the extracted IMFs have been found useful for constituting a feature space for the classification of normal and adventitious lung sound signals. The feature space has been formed from two- and three-dimensional phase space representations (2D and 3D PSR) of the first four IMFs. The PSRs of the IMFs have been constructed with fixed values of time lag and embedding dimension. From the 2D-PSR, the 95% confidence ellipse area has been computed, which is considered as the first feature. The second feature is the interquartile range (IQR) computed from Euclidean distances of the 3D-PSR. The lung sound signals were not segmented and were used as a whole for analysis. The feature set is then fed to the LS-SVM classifier for the classification of normal and adventitious lung sound signals. With LS-SVM, the performance of the polynomial and radial basis function (RBF) kernels has been evaluated. The RBF kernel with hyperparameters σ = 0.0011 and γ = 472.66 has provided the maximum classification accuracy. Moreover, LS-SVM with the RBF kernel resulted in good performance for other parameters such as sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), and F1-score. Accordingly, it can be concluded that the decision boundary formed using the RBF kernel in the proposed feature space is useful to classify normal and adventitious lung sound signals.
Future work may consider integrating new features with the features proposed in this study to further improve classification accuracy. Moreover, future studies may consider adaptive kernel selection to obtain the optimum hyperplane. The proposed features may also be integrated with deep learning algorithms to improve classification accuracy and to enable multiclass classification of adventitious lung sound signals. The inclusion of larger datasets and more rigorous quantitative and qualitative analysis would help enable real-time implementation of the proposed methodology. Finally, it will be useful to integrate the proposed methodology with emerging deep learning techniques to aid in the development of an expert system that classifies various biomedical sound signals, such as phonocardiograms (human heart sounds) and bowel sounds, into normal and abnormal categories.