An overview of reinforcement learning and deep reinforcement learning for condition-based maintenance

Document Type: Review Article


School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran


Condition-based maintenance (CBM) bases maintenance decisions on the actual deterioration condition of the components. A CBM problem consists of a chain of states representing successive stages of deterioration and a set of maintenance actions, which makes it a sequential decision-making problem. Reinforcement learning (RL) is a subfield of machine learning proposed for automated decision-making. This article provides an overview of the reinforcement learning and deep reinforcement learning methods that have been applied so far to condition-based maintenance optimization.
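To make the sequential decision-making view concrete, the following minimal sketch applies tabular Q-learning to a toy single-component maintenance problem. It is an illustration under stated assumptions and is not taken from any of the reviewed studies: the four deterioration states, the two actions, the cost values, the degradation probability, and all function names are hypothetical.

```python
import random

# Hypothetical toy model; every number below is an assumption for illustration.
N_STATES = 4          # 0 = as good as new, ..., 3 = failed
ACTIONS = (0, 1)      # 0 = do nothing, 1 = replace
REPLACE_COST = 5.0
FAILURE_COST = 20.0   # charged for every epoch spent in the failed state
P_DEGRADE = 0.3       # chance of moving one deterioration stage further


def step(state, action, rng):
    """Simulate one decision epoch and return (next_state, reward)."""
    if action == 1:
        # Replacement restores the component to the as-good-as-new state.
        return 0, -REPLACE_COST
    if state < N_STATES - 1 and rng.random() < P_DEGRADE:
        state += 1
    reward = -FAILURE_COST if state == N_STATES - 1 else 0.0
    return state, reward


def train(episodes=3000, horizon=30, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        # Random start states so that every deterioration stage is visited.
        state = rng.randrange(N_STATES)
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward = step(state, action, rng)
            # Q-learning update: bootstrap from the best next-state value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the action with the highest learned value in each state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)  # one action index per deterioration state
```

Under these assumed costs, the learned policy should leave a new component alone and replace a failed one. The choice made in the intermediate states depends on the cost ratio and the degradation probability.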

