Complete coverage path planning for multi-connected free-form surface grinding based on reinforcement learning

Zhen Zhu; Bing-Zhou Xu; Chang-Qing Shen; Xiao-Jian Zhang; Si-Jie Yan; Han Ding

doi:10.1007/s40436-025-00570-z

Advances in Manufacturing, 2026, Vol. 14, Issue 2: 359-376

1. State Key Laboratory of Intelligent Manufacturing Equipment and Technology, School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, People's Republic of China;
2. HUST-Wuxi Research Institute, Wuxi 214174, Jiangsu, People's Republic of China

Received date: 2024-05-04

Revised date: 2024-07-02

Online published: 2026-04-27

Supported by

This work was supported by the National Key Research and Development Program of China (Grant No. 2022YFB4700501), the National Natural Science Foundation of China (Grant Nos. 52375495, 52188102), and the Fundamental Research Funds for the Central Universities (Grant No. HUST: 23JYCXJJ034).

Abstract

Coverage path planning (CPP) is an essential process in robotic grinding, particularly given the increasing demand for large-scale multi-connected free-form surfaces, such as high-speed rail shells, car shells, and aeronautical parts. Owing to the multi-connectivity of these surfaces, achieving full coverage with a single continuous path is challenging. Additionally, large curvatures make the path spacing difficult to control, leaving some areas uncovered. Existing methods often fail to optimize continuity and coverage rate simultaneously, resulting in redundant tool-feeding and tool-lifting operations that significantly reduce processing efficiency. To address this, a novel method for free-form surface CPP is proposed based on reinforcement learning (RL), which enables the learning of an optimal path with optimized continuity and coverage rate. Specifically, to regulate the path spacing, a uniform grid map is constructed based on the least-squares conformal mapping (LSCM) method, which parameterizes the grinding surface onto a two-dimensional (2D) plane with controllable distortion. Furthermore, a set of CPP-specific evaluation criteria (CEC) is designed to evaluate the path through several key factors, including coverage rate, continuity, and smoothness. Finally, a grinding path is generated using the CEC-guided RL framework. The method was verified through several simulations, and a grinding experiment on a high-speed rail head surface was conducted as a typical application. The results showed high path continuity and coverage rates, demonstrating the method's potential for addressing CPP problems in different manufacturing scenarios.
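To make the three factors named in the abstract concrete, the following is a minimal illustrative sketch of how a path on a 2D grid map with a hole could be scored for coverage rate, continuity, and smoothness. It is not the paper's CEC: the function names `evaluate_path` and `score`, the use of tool lifts as a continuity measure, the use of direction changes as a smoothness measure, and the weights are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, col) index into the grid map


def evaluate_path(free: List[List[bool]], path: List[Cell]) -> Dict[str, float]:
    """Score a grid path (hypothetical metrics, not the paper's CEC).

    coverage: fraction of free cells visited at least once
    lifts:    moves between non-adjacent cells (proxy for tool lifting)
    turns:    direction changes between consecutive adjacent moves
    """
    free_cells = {
        (r, c) for r, row in enumerate(free) for c, ok in enumerate(row) if ok
    }
    visited = set(path) & free_cells
    coverage = len(visited) / len(free_cells) if free_cells else 0.0

    lifts = 0
    turns = 0
    prev_step = None
    for a, b in zip(path, path[1:]):
        step = (b[0] - a[0], b[1] - a[1])
        if abs(step[0]) + abs(step[1]) != 1:
            # Not a 4-neighbour move: the tool must lift and re-feed.
            lifts += 1
            prev_step = None
            continue
        if prev_step is not None and step != prev_step:
            turns += 1
        prev_step = step
    return {"coverage": coverage, "lifts": lifts, "turns": turns}


def score(m: Dict[str, float], w_cov: float = 1.0,
          w_lift: float = 0.1, w_turn: float = 0.01) -> float:
    """Combine the metrics into one scalar; the weights are arbitrary assumptions."""
    return w_cov * m["coverage"] - w_lift * m["lifts"] - w_turn * m["turns"]


# A 3x3 map with a blocked centre cell (a simple multi-connected region).
free = [[True, True, True],
        [True, False, True],
        [True, True, True]]

# One continuous loop around the hole.
ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
# Row-by-row sweep that has to jump over the hole and between rows.
sweep = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), (2, 2)]

ring_metrics = evaluate_path(free, ring)
sweep_metrics = evaluate_path(free, sweep)
```

Both example paths cover every free cell, but the row-by-row sweep needs several lifts while the loop needs none, so a scalar of this kind would favour the continuous loop. In an RL setting a quantity like this could serve as an episode-level reward signal, although how the paper actually builds its reward from the CEC is not specified in the abstract.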

The full text can be downloaded at https://doi.org/10.1007/s40436-025-00570-z

Key words: Coverage path planning (CPP); Grinding; Reinforcement learning (RL); Multi-connected free-form surface; Reward function construction

Cite this article

Zhen Zhu, Bing-Zhou Xu, Chang-Qing Shen, Xiao-Jian Zhang, Si-Jie Yan, Han Ding. Complete coverage path planning for multi-connected free-form surface grinding based on reinforcement learning[J]. Advances in Manufacturing, 2026, 14(2): 359-376. DOI: 10.1007/s40436-025-00570-z

