Target Interception in Uncertain Environment Using ART2-Based Reinforcement Learning

Document Type: Original Research Article

Authors

1 Department of Computer Science, Faculty of Basic Sciences, University of Lorestan, Khorramabad, Iran

2 Department of Computer Engineering, Faculty of Technology and Engineering, University of Lorestan, Khorramabad, Iran

3 Department of Computer Engineering, Faculty of Technology and Engineering, University of Arak, Arak, Iran

Abstract

Tracking a moving target with a mobile robot is a crucial problem in robotics. This paper presents a novel approach for tracking a moving target in an uncertain environment containing various obstacles, even when the target's trajectory and speed change continuously and are unknown in advance. The proposed method uses reinforcement learning, a technique widely applied to motion-planning problems; however, applying reinforcement learning in uncertain dynamic environments is challenging because the state space is continuous. To address this issue, the algorithm employs an ART2 neural network to classify the state space. Additionally, to reach the target faster, an interception point is predicted from the target's speed and direction together with the robot's speed, and the robot selects its next move so as to approach this predicted point while avoiding both static and dynamic obstacles. Simulation results demonstrate the efficiency of the algorithm: the robot reaches the target without colliding with any obstacle.
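The abstract gives no formulas for the predicted point, but the idea corresponds to a standard constant-velocity interception computation. The Python sketch below illustrates it under the assumption that the target's velocity is locally constant between sensor updates; the function name and signature are hypothetical illustrations, not taken from the paper.

import math

# Minimal sketch (not from the paper): earliest straight-line interception
# point for a target assumed to move with constant velocity.
def predict_interception_point(robot_pos, target_pos, target_vel, robot_speed):
    """Return the point where the robot can first meet the target, or None.

    Solves |target_pos + target_vel * t - robot_pos| = robot_speed * t
    for the smallest t > 0, assuming the target's velocity stays constant.
    """
    rx = target_pos[0] - robot_pos[0]   # relative position of the target
    ry = target_pos[1] - robot_pos[1]
    vx, vy = target_vel

    # Squaring the distance equation gives a quadratic in t:
    # (|v|^2 - s^2) t^2 + 2 (r . v) t + |r|^2 = 0
    a = vx * vx + vy * vy - robot_speed * robot_speed
    b = 2.0 * (rx * vx + ry * vy)
    c = rx * rx + ry * ry

    if abs(a) < 1e-9:              # equal speeds: equation degenerates to linear
        if abs(b) < 1e-9:
            return None            # no closing motion, interception impossible
        t = -c / b
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return None            # target is too fast to be caught
        sq = math.sqrt(disc)
        roots = [(-b - sq) / (2.0 * a), (-b + sq) / (2.0 * a)]
        positive = [t for t in roots if t > 1e-9]
        if not positive:
            return None
        t = min(positive)

    if t <= 0.0:
        return None
    return (target_pos[0] + vx * t, target_pos[1] + vy * t)

For example, with the robot at the origin moving at speed 2 and the target at (10, 0) moving with velocity (0, 1), the sketch returns roughly (10.0, 5.77), the earliest point a straight-line course can meet the target.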
