Journal article
Performance Analysis of Different Reward Functions in Reinforcement Learning for the Scheduling of Modular Automotive Production Systems
Publication details
Authors: Gelfgren, J.; Bill, E.; Luther, T.; Hagemann, S.; Wenzel, S.
Publisher: Elsevier
Year of publication: 2024
Journal: Procedia CIRP
Pages: 81–86
Volume: 126
ISSN: 2212-8271
eISSN: 2212-8271
DOI of first publication:
Language: English
Conventional, linear production lines struggle to meet the new flexibility requirements of the automotive market. Modular production has the potential to radically improve production flexibility. However, scheduling modular production systems remains an open research question. Reinforcement learning (RL) is a form of artificial intelligence that shows great potential for scheduling complex modular production systems.
Nonetheless, the performance of RL agents heavily depends on their reward function, and designing an optimal reward function is highly complex. This paper addresses this research gap and systematically compares six different reward functions across a variety of modular production scenarios. In addition, a new learning method using resets is proposed and its performance is compared with the standard learning approach. The results suggest that dense reward functions perform better than a sparse one, although there are major case-to-case discrepancies. The proposed learning method outperforms the standard learning method by 7 % on average; however, its performance varies more across scenarios than that of the standard approach.
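The contrast between sparse and dense reward functions that the abstract refers to can be illustrated with a minimal toy sketch. This is not the paper's implementation; the job-scheduling quantities, reward shapes, and penalty weight below are hypothetical choices for illustration only.

```python
# Illustrative sketch: sparse vs. dense reward signals for a toy
# scheduling episode. All function names and constants are assumptions,
# not taken from the paper under discussion.

def sparse_reward(jobs_done: int, total_jobs: int,
                  makespan: float, deadline: float) -> float:
    """Sparse signal: feedback only at episode end.

    Returns 1.0 only if every job finished within the deadline;
    the agent receives no guidance during the episode itself.
    """
    if jobs_done == total_jobs and makespan <= deadline:
        return 1.0
    return 0.0


def dense_reward(job_finished: bool, idle_time: float) -> float:
    """Dense signal: feedback at every scheduling step.

    Rewards each completed job immediately and penalises station
    idle time, giving the agent a gradient to learn from early on.
    """
    return (1.0 if job_finished else 0.0) - 0.1 * idle_time


# A step that finishes a job with 2 time units of idling still yields
# a positive, informative reward under the dense scheme:
print(dense_reward(True, 2.0))   # 0.8
# ...while the sparse scheme stays silent until the episode ends:
print(sparse_reward(3, 5, 50.0, 100.0))  # 0.0
```

The design trade-off the abstract hints at shows up even here: the dense signal accelerates learning but encodes the designer's assumptions (the 0.1 idle-time weight), which can bias the policy differently from scenario to scenario.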
Keywords
Modular Production, Production Scheduling, Reinforcement Learning, Reward Functions