CHAPTER 10
ANALYSIS OF ALTERNATIVE FEATURES FOR MODEL BUILDING

The VFD model tested on a passenger vehicle is analysed for performance improvement using alternative features. The use of wavelet features is presented in Section 10.2, followed by feature fusion in Section 10.3 and spectral features in Section 10.4.

10.1 INTRODUCTION

The development of a comprehensive model for misfire and other multi-class vehicle fault detection using a single low-cost sensor requires the assessment of alternative signal features. The aim is to identify ways of achieving 100% classification accuracy for all the vehicle faults under consideration. In this chapter, alternative features based on spectral information and wavelet decomposition of the time series signals are considered. This attempt is necessary because the model developed using statistical features could not reach 100% classification accuracy in all classes, as observed from the results presented in Chapter 9.

Alternative features were consciously avoided at the initial stage because all the options, namely the Discrete Wavelet Transform (DWT), the Discrete Fourier Transform (DFT) computed using the Fast Fourier Transform (FFT), and the Power Spectral Density (PSD), involve intensive computation and require a complex onboard computing infrastructure in the vehicle. This increases the cost of the setup and challenges the aim of developing a low-cost system. A judicious decision based on the cost-benefit relationship needs to be taken after performance analysis.

The vibration signature of the engine block is acquired as the base signal from which the transforms are obtained to build the new sets of features. Misfire and other fault simulations are as described in Section 4.5.2.

The following data transforms were used as base formulations for new feature extraction:

Discrete Wavelet Transform of vibration signals
   o Haar
   o Daubechies db2 to db9
Feature fusion
   o DWT features with statistical features of the vibration signal
Spectral decomposition of vibration signals
   o Discrete Fourier Transform using the Fast Fourier Transform (FFT)
   o Power Spectral Density (PSD)

The frequency plots of all the conditions are presented in Figure 10.1. The plots show the presence of noise over a wide bandwidth.

Figure 10.1 Frequency plots of various vehicle faults (panels: Good, Misfire, Engine high rpm, Gear knock, Choking, Low tyre pressure; horizontal axis: frequency in Hz)

Moreover, different faults are represented by different frequencies; hence the frequency plots cannot be used directly to interpret the vehicle condition.

Figure 10.2 PSD plots of various vehicle faults (panels: Misfire, Good, Engine high rpm, Gear knock, Low tyre pressure, Choking)

The PSD plots presented in Figure 10.2 also show an almost constant energy content over the entire frequency band considered. Hence the direct use of these features might not be feasible. An alternative feature formulation based on this spectral information is needed.

10.2 MODEL ANALYSIS USING DISCRETE WAVELET TRANSFORMS

The DT-CFS-KD and RF-CFS-KD models, which use CFS-based FSS followed by Kononenko discretisation of the data, are evaluated with Haar and Daubechies wavelet features. The parameters for the DWT-based system are presented in Table 10.1.
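
A single-level Haar decomposition of the kind evaluated in this section can be sketched as follows. This is a minimal illustration and not the implementation used in the study: the function names are hypothetical, and the choice of sub-band energy as the feature is an assumption, since the exact features derived from the wavelet coefficients are not listed here.

```python
import math


def haar_dwt(signal):
    """Single-level Haar DWT; returns (approximation, detail) coefficient lists."""
    samples = list(signal)
    if len(samples) % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        samples.append(samples[-1])
    scale = 1.0 / math.sqrt(2.0)
    approx = [(samples[i] + samples[i + 1]) * scale for i in range(0, len(samples), 2)]
    detail = [(samples[i] - samples[i + 1]) * scale for i in range(0, len(samples), 2)]
    return approx, detail


def band_energy(coeffs):
    """Sum of squared coefficients in one sub-band."""
    return sum(c * c for c in coeffs)


def haar_features(signal):
    """Hypothetical feature set: energy of each sub-band (an assumption)."""
    approx, detail = haar_dwt(signal)
    return {
        "approx_energy": band_energy(approx),
        "detail_energy": band_energy(detail),
    }
```

For an even-length segment the Haar transform is orthonormal, so the two energies add up to the energy of the original segment. The Daubechies wavelets db2 to db9 use longer filters and need a boundary-handling rule, which is part of the reason for the higher computation time.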

Table 10.1 Classifier parameters for DWT based model in the passenger car

Parameters for evaluation          Decision tree                        Random forest
Model performance evaluation       10-fold stratified cross-validation  10-fold stratified cross-validation
Model building time                0.01 s                               0.01 s
Total number of instances          1200                                 1200
Correctly classified instances     985                                  969
Incorrectly classified instances   215                                  231
Classification accuracy (%)        82.1                                 80.8
Misfire detection accuracy (%)     100                                  100
Mean absolute error                0.0681                               0.0687
Root mean squared error            0.1901                               0.1984
MDL correction                     Incorporated                         Incorporated
Number of leaves                   46                                   -
Size of the tree                   56 levels                            10 trees
Features used                      db7                                  db7

The results presented in Table 10.1 are not very encouraging: overall classification accuracies of 82.1% and 80.8% were recorded for DT-CFS-KD and RF-CFS-KD respectively. However, both models recorded 100% for misfire detection. The confusion matrix presented in Table 10.4 shows that no condition is misclassified as misfire and that misfire is detected with 100% accuracy. The good condition is largely misclassified as choking (102 instances), with a further 15 instances misclassified as low tyre pressure, and a similar trend is observed in the choking condition. The other results are comparable with those of the DT-CFS-KD and RF-CFS-KD models. The time taken for feature extraction and classification using wavelets is very high compared with models using statistical and histogram features, as observed from Tables 10.2 and 10.3.

Table 10.2 Model performance using wavelet features with decision tree

                 Haar    db2     db3     db4     db5     db6     db7     db8     db9
Multi class (%)  81.6    80.8    81.5    81.3    81.3    82.3    82.1    81.1    81.9
Misfire (%)      92      98.5    98      99      99      99      100     98.5    99.5
Time taken (s)   733     792     801     799     856     851     974     970     978
MAE              0.0716  0.0703  0.069   0.0694  0.0689  0.0663  0.0681  0.0695  0.0676
RMSE             0.1916  0.1931  0.1903  0.1907  0.1903  0.1876  0.1901  0.1902  0.1875

From the results presented in Tables 10.2 and 10.3, only db7 achieves 100% classification accuracy for misfire; its overall accuracy is 80.8% with random forest and 82.1% with the decision tree algorithm. The results are not very satisfactory given the extremely high computational load of the wavelet decomposition.

Table 10.3 Model performance using wavelet features with random forest

                 Haar    db2     db3     db4     db5     db6     db7     db8     db9
Multi class (%)  81.7    82.4    80.6    79.8    81.5    82.9    80.8    80.7    80.8
Misfire (%)      93      99.5    99      99      99      99      100     99      99.5
Time taken (s)   734     790     803     800     857     851     975     970     978
MAE              0.0707  0.0685  0.0678  0.0687  0.0674  0.064   0.0687  0.0676  0.0668
RMSE             0.1902  0.1907  0.1897  0.1932  0.1904  0.1856  0.1984  0.1884  0.1889

Table 10.4 Decision tree confusion matrix for model using db7 features (columns: actual condition; rows: classified as)

           Good   Mis1   GnoK   Choke  GrHiRpm  20Psi
Good       82     0      0      53     0        38
Mis1       0      200    0      0      0        0
GnoK       1      0      197    0      0        0
Choke      102    0      0      129    0        15
GrHiRpm    0      0      0      0      200      0
20Psi      15     0      3      18     0        177

The use of DWT features with a single level of decomposition and db wavelets performs satisfactorily but is not a very encouraging choice. Additionally, a large number of good conditions are misclassified as choking and vice versa. Hence this model is not recommended for consideration.

10.3 MODEL PERFORMANCE ANALYSIS USING FEATURE FUSION

A new formulation using feature fusion was also considered for evaluation.
The statistical features of the time series vibration signal and the wavelet-based features are used jointly to evaluate whether performance improves beyond that of statistical or DWT features used individually. The results of the analysis using decision tree and random forest are presented in Tables 10.5 and 10.6. In this analysis, random forest achieved a higher multi-class performance than the decision tree. The feature fusion model delivers 95.5% multi-class accuracy and 100% misfire detection using statistical features, db2 features and random forest. A comparable performance is observed with the combination of statistical features, Haar features and decision tree, at 94.1% multi-class accuracy and 100% misfire detection. Since random forest performs better than the decision tree, it is recommended. Both models also recorded among the lowest mean absolute error and root mean squared error.

Table 10.5 Model performance using feature fusion and decision tree

                 Haar    db2     db3     db4     db5     db6     db7     db8     db9
Multi class (%)  94.1    93.9    93.8    94      93.75   93.7    94.1    94.4    94.4
Misfire (%)      100     100     100     100     100     100     100     100     100
MAE              0.0279  0.0285  0.0288  0.0277  0.0292  0.0291  0.0276  0.0265  0.0269
RMSE             0.1297  0.1296  0.1314  0.1291  0.1334  0.1318  0.1305  0.1276  0.1263

Table 10.6 Model performance using feature fusion and random forest

                 Haar    db2     db3     db4     db5     db6     db7     db8     db9
Multi class (%)  95.2    95.5    95.3    95.5    95      95.2    94.8    95.3    94.9
Misfire (%)      99.5    100     100     99.5    100     100     100     100     100
MAE              0.0247  0.0244  0.0255  0.0256  0.025   0.0248  0.0246  0.0242  0.0258
RMSE             0.1127  0.1109  0.1122  0.1147  0.1133  0.1123  0.1129  0.1093  0.1155

Analysing the results presented in Table 10.7, it is evident that the random forest based model using db7 wavelet features alone attains a classification accuracy of 80.8% in multi-class mode and 100% in two-class (misfire) mode.
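
For reference, the fusion evaluated in this section joins the two feature sets of each signal segment into a single vector before feature subset selection and classification. The sketch below illustrates the idea under stated assumptions: the four statistics are a hypothetical subset standing in for the feature set of Section 3.2.1, the wavelet part uses Haar sub-band energies as an assumed feature, and all function names are illustrative.

```python
import math
import statistics


def statistical_features(signal):
    """Hypothetical subset of time-domain statistics (stand-in for Section 3.2.1)."""
    return [
        statistics.mean(signal),
        statistics.pstdev(signal),
        min(signal),
        max(signal),
    ]


def haar_band_energies(signal):
    """Energy of the single-level Haar approximation and detail sub-bands.

    Samples are taken in pairs; a trailing unpaired sample is dropped.
    """
    scale = 1.0 / math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx_energy = sum(((a + b) * scale) ** 2 for a, b in pairs)
    detail_energy = sum(((a - b) * scale) ** 2 for a, b in pairs)
    return [approx_energy, detail_energy]


def fuse_features(signal):
    """Feature fusion: concatenate the statistical and wavelet feature vectors."""
    return statistical_features(signal) + haar_band_energies(signal)
```

Each segment then contributes one fused vector, so the cost of extracting the wavelet part is added to the cost of the statistical part for every segment.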
The feature fusion results indicate that the classification accuracy in multi-class mode varies between 94.8% and 95.5%, and that 100% is achieved in two-class mode except with Haar and db4. Compared with the statistical features alone, which recorded 94.6% and 100% as presented in Table 9.3 of Section 9.3.1, the increase in classification accuracy is less than one percentage point.

Table 10.7 Random forest model performance comparison

            Wavelet features               Statistical and wavelet feature fusion
Feature     Multi class (%)  Misfire (%)   Multi class (%)  Misfire (%)
Haar        81.7             93            95.2             99.5
db2         82.4             99.5          95.5             100
db3         80.6             99            95.3             100
db4         79.8             99            95.5             99.5
db5         81.5             99            95               100
db6         82.9             99            95.2             100
db7         80.8             100           94.8             100
db8         80.7             99            95.3             100
db9         80.8             99.5          94.9             100

The use of feature fusion is encouraging; the only setback for this system is the time taken to formulate the DWT-based features, as observed from Tables 10.2 and 10.3. Nevertheless, feature fusion as a concept is very encouraging from this analysis.

10.4 MODEL PERFORMANCE ANALYSIS USING SPECTRAL INFORMATION

Frequency domain analysis is more commonly used for rotary machines, where the vibration can easily be segregated into distinct frequency regions and the frequency variation in each region identified. Moreover, the minimal presence of noise renders such applications more effective. The use of the frequency domain for vehicle fault detection is challenged by the presence of noise over a wide band of frequencies, requiring multiple filters and processes. In addition, the use of frequency makes the system less reliable over time, as the frequency or vibration signature of the system changes due to wear and tear. However, features extracted from the spectral information of the engine block can be used for effective model building. Detection of misfire alone could be possible using DFT (Horner 1995).
Many of the reported systems referred to in Section 2.4 conducted their experiments on a stationary engine or vehicle, which reduces the applicability of the results.

10.4.1 Spectral feature formulation

The engine block vibration acquired using the accelerometer is transformed to the frequency domain using the Discrete Fourier Transform (DFT). Direct computation of the DFT is computationally intensive; hence an efficient algorithm, the Fast Fourier Transform (FFT) using the Cooley and Tukey algorithm (Cooley and Tukey 1965), was adopted to compute the DFT. The conversion of the time domain signal to the frequency domain is the first step in any frequency domain analysis. The PSD essentially identifies the energy content in a frequency or band of frequencies. Since the FFT and PSD data contain a large portion of noise, their use as a base for statistical feature formulation is evaluated. The use of the mean of the FFT signal for pump fault identification has been reported by Al-Hashmi (Al-Hashmi 2008). The statistical features presented in Section 3.2.1 are extracted using the DFT and PSD as base inputs independently. The extracted features are processed using FSS and discretisation before being fed as input to the classification algorithms for model building and evaluation.

10.4.2 Model performance analysis using spectral features

The model performance based on spectral features is presented in Table 10.8. Spectral features are not suitable even for misfire detection alone, since they do not achieve the mandatory 100% misfire classification accuracy. Of the two, DFT performs better than PSD. The multi-class identification accuracy of DFT reaches 93.3%, whereas it is very poor at 59.9% for PSD. An interesting observation is that, in spite of its poor multi-class accuracy, PSD records 92% misfire detection accuracy.
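
The spectral feature formulation of Section 10.4.1 can be sketched as follows: the segment is transformed with a radix-2 Cooley-Tukey FFT, a PSD estimate is formed from the squared magnitudes, and statistics are taken over each spectrum independently. This is an illustrative sketch only. The periodogram scaling |X_k|^2 / N is one common convention and is an assumption here, the three statistics are a hypothetical subset of the features of Section 3.2.1, and the function names are not from the study.

```python
import cmath
import statistics


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(x):
    """One-sided magnitude spectrum |X_k| for k = 0 .. N/2."""
    return [abs(c) for c in fft(x)[: len(x) // 2 + 1]]


def periodogram(x):
    """One-sided PSD estimate |X_k|^2 / N (assumed scaling convention)."""
    n = len(x)
    return [m * m / n for m in magnitude_spectrum(x)]


def spectral_features(x):
    """Hypothetical statistics taken over the DFT and PSD bases independently."""
    features = {}
    for name, base in (("fft", magnitude_spectrum(x)), ("psd", periodogram(x))):
        features[name + "_mean"] = statistics.mean(base)
        features[name + "_std"] = statistics.pstdev(base)
        features[name + "_max"] = max(base)
    return features
```

The segment length must be a power of two for this simple FFT; a practical system would pad or truncate each segment, or use a library FFT.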
The time taken for building the model is also considerably higher than with statistical features, for which it is less than one second, as presented in Table 9.3 of Section 9.3.1.

Table 10.8 Model performance using DFT and PSD features

                 Discrete Fourier transform features   Power spectral density features
                 Decision tree    Random forest        Decision tree    Random forest
Multi class (%)  93.3             92.7                 59.9             58.4
Misfire (%)      96.5             96                   92               92
Time taken       102              103                  126              126
MAE              0.0303           0.0315               0.1564           0.1547
RMSE             0.1323           0.1411               0.2803           0.2797

Based on the results, it is concluded that the use of spectral features is not very encouraging for building the model.

10.5 CONCLUSION

Wavelet features alone give about 82% overall accuracy, and feature fusion returns 94 to 95%, compared with 94.6% for the statistical features reported in Chapter 9. The heavy computational load of calculating two diverse sets of features, including wavelets, makes this an unfavourable choice. In this analysis, an impressive performance is delivered by the feature fusion model using statistical features, db2 features and random forest, with 95.5% multi-class accuracy and 100% misfire detection. A comparable performance is observed with the combination of statistical features, Haar features and decision tree, at 94.1% multi-class accuracy and 100% misfire detection. Since random forest performs better, that model is recommended. Both models also recorded among the lowest mean absolute error and root mean squared error.