Prediction of Failure Time and Remaining Useful Life in Aviation Systems: Predictors, models, and challenges

In many important industries, such as aerial transportation, offshore wind turbine (OWT) structures, and nuclear power plants, assets that have reached or are near the end of their useful life often remain in acceptable structural condition for continued use. Safe continued operation with the required modifications and assessment is therefore more cost-effective than replacing them with a new system. To achieve this goal, many studies have addressed the prediction of failure time and remaining useful life, especially in systems that require a very high level of reliability. The present review examines articles that predict the remaining useful life or failure time in aviation systems from three perspectives: 1. Methods and algorithms, especially Machine Learning algorithms, whose use in the field of Prognosis and Health Management has grown in recent years. 2. Predictors, including historical variables such as working life history, environmental conditions, mechanical loads, failure records, asset age, and maintenance information, as well as sensor variables and indicators that can be continuously monitored in each system, such as noise, temperature, vibration, and pressure. 3. Challenges of research on predicting the failure time of flying systems. The literature assessment shows that the use of diagnostic and prognostic outputs to identify possible defects and their origin, check the system's health, and predict the remaining useful life (RUL) is increasing due to market needs.


Introduction
Prognosis is a frequently used term in medicine: it is a prediction of the patient's future state based on the patient's clinical condition and the available medical facilities, and the prognosis of various diseases plays an important role in the physician's clinical decision making [1]. The same interpretation applies in the industrial world, where the patient can be an industrial system, a device, or a component. The prediction of the system's health status from monitoring data then affects the diagnosis and maintenance decisions for the system. Due to long usage and customers' needs, many assets in different industries, such as aviation and military structures, offshore wind turbine (OWT) structures, and nuclear power plants, have reached or are near the end of their useful life. However, they still have acceptable structural conditions for further use. These assets are valuable and expensive; thus, owners prefer to keep operating them for economic reasons and to avoid the burden of replacement. The prediction of remaining useful life is technically possible and economically beneficial because the cost of the required maintenance activities is much lower than the cost of substitution with a new system [2]. This goal can be achieved by monitoring the system's health and maintaining reliability within an acceptable level. Condition-based maintenance estimates the probable failure time of the system or part and assures safe operation until that time. Prognosis and Health Management (PHM) predicts the remaining useful life (RUL) using diagnostic and prognostic outputs and follows optimal maintenance policies for systems and equipment. PHM identifies potential faults and their origin and checks the system conditions to balance the highest level of availability against the lowest cost [3].
The time before a system fails and loses acceptable performance is called the remaining useful life [4,5].
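As a minimal illustration of this definition, the sketch below computes the remaining useful life at every operating cycle of a unit whose failure cycle is already known from a run-to-failure record. The function name `rul_labels` and the choice of cycles as the time unit are assumptions made for illustration; they are not taken from the cited works.

```python
from typing import List


def rul_labels(failure_cycle: int) -> List[int]:
    """Return the RUL at each cycle 1..failure_cycle.

    The RUL at cycle t is the number of cycles left before failure,
    i.e. failure_cycle - t, so it reaches 0 at the failure cycle.
    """
    if failure_cycle < 1:
        raise ValueError("failure_cycle must be at least 1")
    return [failure_cycle - t for t in range(1, failure_cycle + 1)]


# A hypothetical unit that fails at cycle 5.
print(rul_labels(5))  # [4, 3, 2, 1, 0]
```

Labels of this kind are what a data-driven model would be trained to reproduce from sensor readings taken at the corresponding cycles.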
The purpose of remaining useful life prediction is to anticipate the time of failure before it happens, according to the conditions that the system experienced in the past and its present condition [6]. The purpose of PHM is to prevent risks and financial losses that usually are not compensable by predicting the air vehicles' remaining useful life or other health condition indicators. Aviation industries are now seeking prognostic and condition-based predictive maintenance rather than preventive maintenance and periodic inspections. Developing aviation health monitoring systems requires advanced data acquisition systems and tedious, expensive, and time-consuming flight tests. However, the existence of such systems leads to a significant reduction in maintenance costs, man-hours, and financial losses. Models that can identify system defects using key characteristics and warn of system failures before they occur are very useful and necessary for safe flight [7]. To establish predictive maintenance systems instead of preventive maintenance, it is necessary to define models that can predict the remaining useful life and the time of failure. Machine learning tools have grown in recent years and have shown acceptable performance in this field. The published articles on machine learning (ML) in the field of PHM from 2013 to 2019, grouped by model type, are shown in Figure 1, which shows the growing trend of these tools in this field [8]. The purpose of this study is to review the articles that predict the failure time and remaining useful life in the field of aviation accidents from three perspectives: 1. Predictors and variables. 2. Methods and algorithms, especially ML algorithms, which have been growing in recent years in PHM. 3. Challenges.

98 / IJRRS / Vol. 4/ Issue 2/ 2021 M. Babaee, J. Gheidar-Kheljani, M. Khazaee, M. Karbasian

Predictors and Variables
ML models predict the output variable from selected system characteristics, or input variables. ML models have higher prediction capability if, firstly, enough data is available and, secondly, the input variables are correctly selected and reflect the health status of the system. Model inputs play a key role in reducing the prediction error and reaching a reliable failure time prediction. Some articles consider three categories to define the inputs of the prediction model: technical health, design records, and environmental conditions [9,10]. The technical status of a system is measured with information such as working life history, environmental conditions, mechanical loads, failure records, asset age, maintenance information, and indicators that can be continuously monitored in each system, such as noise, temperature, vibration, and pressure. Shafiee presents a structured framework for RUL estimation to support life-extension decisions. It suggests that the health of vital parts should be checked first. Since the number of these parts is high, especially in complex systems, and checking all of them is difficult, the key parts can be identified with techniques such as failure mode and effects analysis (FMEA), fault tree analysis (FTA), and event tree analysis (ETA). This method is rational because evaluating and controlling the components that are in unfavorable conditions and have a destructive effect on the life and reliability of the system minimizes the risk for the whole system and increases the remaining useful life. Then, all data, including supervisory, operational, environmental, and breakdown data, are collected and checked [11]. In real systems, the critical components and parts can have more than one failure mode; therefore, such studies identify the dominant failure mechanism, which is decisive in the prediction of failure time [9].
Raw data can be prepared for learning algorithms through normalization, statistical indicators, signal processing approaches, and dimensionality reduction. In this way, ML algorithms work better, and the final model has higher validity and accuracy; this is called feature engineering [12]. The purpose of feature selection is to specify a subset of variables from the entire raw data, determine the optimal effective input data, and minimize the adverse effects of noise and irrelevant information as much as possible [13]. Figure 2 shows the place of feature engineering in ML models. If the selected variables are too many or too few, the model will produce false alarms. In a valid model, the false alarm rate should be as low as possible and the correct detection rate as high as possible [3]. The feature engineering defined for one system is specific to it and does not generalize to others. In addition, it depends on the knowledge and experience of the system experts, so its correct implementation is difficult. Deep learning (DL) networks help here by finding features through structural matching between the model and the data, so DL models do not need feature engineering [14]. DL will be discussed in more detail in the next section.

[...] for the data-driven approaches is the most important part of the problem. Data unavailability can be a key challenge in these approaches [30]. The data can be from sensors or event data [31].
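The preparation steps named in the feature engineering discussion above (normalization, statistical indicators, and a reduction of the input set) can be sketched as follows. This is only an illustrative sketch using the Python standard library: the function names, the non-overlapping window scheme, and the rule of dropping constant sensors as a crude stand-in for feature selection are all assumptions, not the procedure of any cited study.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List


def zscore(values: List[float]) -> List[float]:
    """Normalize a signal to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant signal carries no information
    return [(v - mu) / sigma for v in values]


def window_features(signal: List[float], size: int) -> List[Dict[str, float]]:
    """Compute statistical indicators over consecutive non-overlapping windows."""
    features = []
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        features.append({
            "mean": mean(window),
            "rms": math.sqrt(sum(v * v for v in window) / size),
            "peak": max(abs(v) for v in window),
        })
    return features


def drop_constant(columns: Dict[str, List[float]], tol: float = 1e-9) -> Dict[str, List[float]]:
    """Crude feature selection (assumption): discard sensors whose readings never change."""
    return {name: vals for name, vals in columns.items() if pstdev(vals) > tol}


# Hypothetical raw data: one informative vibration channel and one constant setting.
raw = {
    "vibration": [0.0, 2.0, 0.0, -2.0, 1.0, 3.0, 1.0, -1.0],
    "setting": [5.0] * 8,
}
kept = drop_constant(raw)
feats = window_features(kept["vibration"], size=4)
print(list(kept), len(feats))
```

In practice the window length and the set of indicators depend on the system and on expert knowledge, which is the difficulty the text points out.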
Sensor data refers to the mechanical, magnetic, or electrical parameter values obtained from system measurement, such as voltage, current, temperature, vibration, and pressure. Event data includes the repair and maintenance procedures that operators and employees have performed and recorded under the different conditions of the systems and equipment [28]. DL is a subset of ML, and ML is a very useful subset of artificial intelligence. Deep neural networks (DNN) have more neurons and hidden layers than conventional neural networks; consequently, they have many more weights and a higher learning ability. Convolutional neural networks (CNN), recurrent neural networks (RNN), autoencoders (AE), and generative adversarial networks (GAN) are among the DNNs. The application of ML tools has recently made extensive progress in the field of reliability, failure time prediction, and PHM. The wide application of ML in reliability engineering can be studied in [32].
Figure 3 shows the categories of ML, which include a) supervised learning, b) unsupervised learning, c) semi-supervised learning, and d) reinforcement learning. Table 1 introduces some of the most important types of ML models for the four mentioned categories along with their advantages and disadvantages [32]. Examples of machine learning models used in the field of reliability include basic linear regression (LR) [33], polynomial response surface (PRS) [34], random forest regression (RFR) [35], decision tree regression (DTR) [36][37][38][39], and Bayesian networks [40,41]. In addition, deep learning networks have emerged as a very effective tool for pattern recognition, with the potential to improve the performance of current intelligent prognostics. Newer tools such as DNN models [42][43][44], long short-term memory (LSTM) networks [45,46], RNN [47], CNN [48,49], AE networks [50], and Support Vector Regression (SVR) [51,52] have been used in the field of reliability in recent years with good results. [...] [53]. Khalif et al., working with the turbofan engine degradation data set available from NASA, estimated the RUL of the equipment directly from the sensor values. Their method does not require estimating degradation modes or a failure threshold; a support vector regression model was used to determine the relationship between health indicators and sensor values [54]. Jun et al. proposed a PHM technique to increase the useful life of tactical missiles by examining the application of classical techniques, engineering approaches to data acquisition architecture, life degradation factor analysis, and the life prediction process [55]. Another study predicted helicopter accidents in the United States with the help of ML tools; among the different ML techniques, DNN showed the best performance [18]. Ma et al.
designed a PHM system to predict the remaining useful life of an aircraft engine, combining life prediction and fault diagnosis with checking of the system's health status. For this purpose, performance degradation characteristics were measured with several sensors on the engine, and AE and logistic regression models were used [56]. Liu et al. predicted the remaining useful life of an aircraft engine using a data-driven model based on CNN; by detecting the risk of equipment failure and consequently reducing losses, they demonstrated the effectiveness of the proposed structure [57]. Farsi developed a convolutional neural network that takes raw vibration sensor data transformed to the frequency domain. This algorithm can distinguish a defective bearing from a healthy one and then determine the location and size of the damage [58]. De Potter et al. designed a predictive maintenance model based on the remaining useful life for an aircraft fleet. In this model, the forecasts are updated periodically; alarms are created based on the evaluation of these forecasts over time, and maintenance is planned when an alarm is activated. The model, built with a convolutional network, was designed for a fleet of 20 planes with 2 types of turbofan engines [59]. Lee and Mitty Kay (2022) proposed a maintenance plan with high reliability and reasonable cost. For this purpose, they evaluated various traditional and predictive maintenance strategies, using Gaussian process learning models and adaptive sampling for all maintenance plans. They concluded that predictive maintenance based on remaining useful life prognosis was superior to the other plans [60]. Subagia et al.
investigated the relationship between helicopter accidents and helicopter configuration by examining 825 accidents from 2005 to 2015. They considered only characteristics such as the number of main rotor blades, the number of engines, rotor diameter, and take-off weight, and used a logistic regression model with the probability of an accident as the response variable [61]. Sampaio et al. proposed a method that simulates the data collected from a vibration system for an engine. The simulated values are split into training and test datasets for a neural network that predicts the failure time. A model was built to simulate normal engine vibrations and measurements instead of using real accelerometers, and ML tools were used to make predictions [62]. Zhao et al. focused on the remaining useful life prediction of an aircraft engine in a progressive degradation mode. They pointed out that there is a certain relationship between the degradation process and the remaining useful life, and they trained a neural network to learn the degradation pattern that reflects this relationship [63]. Celikmih et al. used a Multilayer Perceptron (MLP) as an artificial neural network (ANN), together with support vector regression (SVR) and linear regression (LR), to predict aircraft system failures. The aircraft equipment maintenance and failure data were collected over two years, and nine input variables were determined. A two-step hybrid data preparation model was proposed to improve the prediction of the number of failures. In the first step, a feature selection method evaluates the features to find the most and least effective parameters. In the second step, a modified K-means algorithm removes noisy or inconsistent data [64]. Liu et al. predicted the remaining useful life of aircraft engines with 7 input variables using LSTM networks [65].
Another study by Zhang investigated aviation accidents and the probability of death in the NTSB accident database using LSTM [66]. In addition, Zhang et al. constructed a Bayesian network to show causal relationships in the sequence of NTSB incidents [67].
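The alarm-based maintenance model reviewed above [59] updates RUL forecasts periodically and raises alarms from the evolution of those forecasts. A minimal sketch of one possible alarm rule is given below. The rule itself (an alarm once a fixed number of consecutive forecasts fall at or below a threshold), the function name `first_alarm`, and the numeric values are assumptions chosen for illustration; the source does not state the rule used in [59].

```python
from typing import List, Optional


def first_alarm(rul_forecasts: List[float], threshold: float, consecutive: int) -> Optional[int]:
    """Return the index of the forecast update at which the alarm fires, or None.

    Assumed rule: the alarm fires once `consecutive` successive RUL forecasts
    are at or below `threshold`. A single low forecast followed by a higher
    one resets the count, which damps the effect of noisy predictions.
    """
    if consecutive < 1:
        raise ValueError("consecutive must be at least 1")
    run = 0
    for i, rul in enumerate(rul_forecasts):
        run = run + 1 if rul <= threshold else 0
        if run >= consecutive:
            return i
    return None


# Hypothetical sequence of periodically updated RUL forecasts (in flight cycles).
forecasts = [120.0, 90.0, 45.0, 70.0, 48.0, 40.0, 30.0]
print(first_alarm(forecasts, threshold=50.0, consecutive=2))  # 5
```

Requiring several consecutive low forecasts trades a later alarm for fewer false alarms; maintenance would then be scheduled at or after the returned update.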

Current Challenges
To examine the challenges more closely and to reveal the research opportunities and gaps in this field, 7 leading articles are compared in detail in Table 2. One of the existing challenges in studies of failure time (especially in aerial accidents) is the limited availability of run-to-failure sensor data from real conditions. Most of the studies on the failure time of aerial accidents have used simulation data, e.g., the C-MAPSS data from the NASA data bank, which simulates aircraft engine sensor data up to the moment of failure, or other data that are mostly simulated. Table 2 shows that the proposed methods depend heavily on the training data. However, not enough failure data from aviation systems is currently available to reach the required level of confidence. The unavailability of run-to-failure sensor data is a major limitation in calculating the failure time and predicting accidents, and it can be addressed in future research. Clustering methods and PCA (principal component analysis) can be added to these methods to minimize the required data and optimize the test conditions.
The second challenge is that the failure time prediction is usually made as a static calculation based on historical data, so the degrading condition of the system is not considered. The output of the proposed method can be updated using data fusion with monitoring signals. In this way, the maintenance credit for the air vehicles can be demonstrated to the standards authorities, and the limitations of the training dataset can be compensated to some extent with monitoring data.
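One simple way to picture such an update is inverse-variance weighting, a standard data fusion rule for two independent Gaussian estimates. The sketch below fuses a static, history-based RUL estimate with an estimate derived from current monitoring signals. Treating both estimates as Gaussian and independent is an assumption made here for illustration, as are the function name and the numbers; the reviewed articles do not prescribe this particular rule.

```python
from typing import Tuple


def fuse_estimates(prior_mean: float, prior_var: float,
                   obs_mean: float, obs_var: float) -> Tuple[float, float]:
    """Fuse two independent Gaussian RUL estimates by inverse-variance weighting.

    prior_*: static estimate from historical data.
    obs_*:   estimate derived from current monitoring signals.
    Returns the fused mean and variance; the fused variance is never larger
    than the smaller of the two input variances.
    """
    if prior_var <= 0 or obs_var <= 0:
        raise ValueError("variances must be positive")
    weight = prior_var / (prior_var + obs_var)  # weight given to the monitoring estimate
    fused_mean = prior_mean + weight * (obs_mean - prior_mean)
    fused_var = prior_var * obs_var / (prior_var + obs_var)
    return fused_mean, fused_var


# Hypothetical values: history suggests 200 cycles (variance 400),
# monitoring suggests 150 cycles (variance 100).
mean_rul, var_rul = fuse_estimates(200.0, 400.0, 150.0, 100.0)
print(mean_rul, var_rul)  # approximately 160.0 and 80.0
```

Because the monitoring estimate has the smaller variance in this example, the fused value moves most of the way toward it, which is how a degrading condition would correct a static prediction.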
The third challenge is related to the balance between model-based and data-driven prediction algorithms. Neither of them alone achieves the required precision and validity domain. Thus, a fair combination of the two methods, augmented with ML techniques, can be used to cope with the dynamic behavior of the air vehicles' health. The innovative idea is to use nonphysical models to determine the residuals for failure time prediction and for distinguishing faults. This method can avoid the problems of the model-based approach and allows the available data to be used for direct training instead of physical model parameter estimation. In addition, the nonphysical models can be valid over a wider range of flight conditions and aircraft configurations.
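The residual idea can be sketched as follows: a nonphysical (data-driven) model predicts the expected sensor value, the residual is the difference between the measurement and that prediction, and a fault is flagged when the residual leaves the band observed under healthy operation. The linear surrogate model, the three-sigma band, and all names and numbers below are assumptions for illustration only.

```python
from statistics import mean, pstdev
from typing import Callable, List


def residuals(model: Callable[[float], float],
              inputs: List[float], measured: List[float]) -> List[float]:
    """Residual = measured value minus the value predicted by the surrogate model."""
    return [y - model(x) for x, y in zip(inputs, measured)]


def fault_flags(healthy_residuals: List[float],
                new_residuals: List[float], k: float = 3.0) -> List[bool]:
    """Flag residuals lying more than k standard deviations from the healthy mean."""
    mu = mean(healthy_residuals)
    sigma = pstdev(healthy_residuals)
    return [abs(r - mu) > k * sigma for r in new_residuals]


def surrogate(x: float) -> float:
    """Hypothetical surrogate assumed to have been trained on healthy data."""
    return 2.0 * x + 1.0


healthy = residuals(surrogate, [1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.1, 8.9])
incoming = residuals(surrogate, [5.0, 6.0], [11.05, 14.0])
print(fault_flags(healthy, incoming))  # [False, True]
```

The second incoming measurement deviates from the surrogate prediction by about 1.0, far outside the healthy residual spread of about 0.1, so only it is flagged.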

Discussion and Conclusion
ML algorithms are widely used in many engineering fields and show satisfactory performance in real applications. In the present work, the prediction of failure time in the field of aviation accidents was investigated. The tools, models, predictors, variables, and challenges in this field were discussed. The investigation showed that ML models deliver increasingly good performance in the aviation industry and are considered powerful tools for predicting failure time and RUL. The variables and factors used as accident predictors were reported mainly for airplanes and helicopters in different studies. For helicopter accidents, the maximum take-off weight, the main rotor diameter, the number of main rotor blades, the type of engine, and the number of engines were among the variables used in some studies. The C-MAPSS dataset with 21 variables was used in several studies to analyze airplane engine failures.
Due to the complications of the model-based or physics-based approaches, researchers have widely used data-driven approaches. On the other hand, having a database with accurate and sufficient data is considered the first step and the first challenge of using data-driven approaches.
Sometimes run-to-failure sensor data is unavailable, making it hard to analyze and predict failure based on data. Given the reliance on simulation data in aviation industry articles and the high cost and risk of flight testing, this issue seems to be one of the most important challenges for future studies in this field.