Automation of Production Planning and Control

A deep dive into production control with intelligent agents

Journal: Industry 4.0 Science
Issue: Volume 41, 2025, Edition 5, Pages 86-93
Open Access: https://doi.org/10.30844/I4SE.25.5.84

Abstract

How can artificial intelligence (AI) automate production planning and control? This study examines its potential to enhance efficiency in modern production environments. The focus is on establishing a robust data infrastructure that integrates real-time, historical, and contextual data to create a solid basis for AI models. Reinforcement learning (RL) is applied to aid automation. A roadmap for implementation, focusing on practical application, is presented. This roadmap incorporates simulation-based training methods and outlines strategies for continuous improvement and adaptation of production processes.

Article

The increasing complexity of modern production environments poses significant challenges for companies when planning and controlling their production [1, 2]. In this context, AI can offer novel perspectives: through iterative learning, it can develop optimal strategies that adapt to changing production conditions and automate planning and control decisions [3, 4].

However, the successful application of AI in production planning and control requires a robust and scalable data infrastructure that must fulfill a wide range of requirements. Ensuring the availability and quality of real-time and historical data, as well as seamless integration into existing systems such as Manufacturing Execution Systems (MES) and Enterprise Resource Planning (ERP) systems, is paramount. In the following sections, these challenges will be examined in detail, drawing on current research.

The article aims to present a strategy for companies to automate production planning and control using AI and to provide a detailed description of the potential of automating production control using intelligent agents.

Production planning and control basics

Production planning and control is a comprehensive term encompassing all strategic and operational measures necessary to plan, control, and monitor manufacturing processes efficiently [5, 6]. The overarching objective of production planning and control is to ensure on-time and cost-efficient production with an optimal use of resources (mainly machinery and personnel), thereby achieving high logistical performance at minimal cost [5]. According to the approaches advanced by Nyhuis et al. [5] and Schmidt et al. [6], production planning and control comprises the following elements:

  • Planning: defining production programs, calculating material requirements, and allocating capacities.
  • Control: coordinating production processes by checking availability, dispatching production orders, forming meaningful sequences in processing, and adapting capacities.
  • Controlling: analyzing deviations and initiating corrective measures.

The Hanoverian Supply Chain Model (HaLiMo) is a theoretical framework that organizes the tasks and processes involved in production planning and control according to their chronological and logical sequence (see Fig. 1) [6]. This integrated approach ensures that production processes remain flexible and responsive, a paramount quality in dynamic markets.

Figure 1: Hanoverian Supply Chain Model (following [6]).

Artificial intelligence in the Hanoverian supply chain model

Manufacturing companies are among the primary beneficiaries of digital transformation [7]. The increasing availability of data presents a range of opportunities for enhancing efficiency, improving quality, and reducing production costs [8]. AI applications are being used successfully in various areas to achieve these goals and are also the subject of intensive research. AI solutions are having a particularly strong impact on production planning and control (see [4, 9, 10]).

Within the HaLiMo framework, a range of use cases for AI applications are identified, with the potential to contribute to achieving holistic automation of production planning and control through a multi-agent system approach. In principle, each of the eleven primary production planning and control tasks (see Fig. 1) can be automated using an appropriate AI approach. The multi-agent system can then be implemented so that the individual AI solutions used to automate production planning and control are linked (central AutoPPC) according to their information and material flows (see Fig. 1). 

The added value for both research and industry lies in a future-proof data infrastructure that provides a holistic basis for intelligent, adaptive, and automated production planning and control. Such an infrastructure incorporates both real-time and historical data and can be integrated seamlessly into existing systems. In the existing literature (see [9, 10]), AI is typically used to optimize individual production planning and control subtasks. There is little holistic linkage of the kind offered by a framework model such as HaLiMo; providing this linkage is the key contribution of this article.

The subsequent discussion focuses on automating production control tasks using intelligent Reinforcement Learning (RL) agents. To achieve this, the fundamentals of RL are outlined, followed by a demonstration of how companies can develop intelligent agents to automate discrete production control tasks and integrate them comprehensively within their organization.

RL is a subfield of machine learning in which an agent learns to optimize its actions by interacting with an environment. The agent operates in a defined environment characterized by distinct states. By executing actions, the agent changes the current state and receives feedback as a reward. This reward indicates the extent to which the selected action contributes to achieving the long-term goal [11, 9]. The context and schematic structure are illustrated in Figure 2.

Figure 2: Schematic structure of a reinforcement learning agent (following [11]).

The objective is to devise a strategy that maximizes the cumulative reward over numerous interactions. The existing literature thoroughly outlines the fundamental concepts and methodologies of this approach [11]. Preliminary approaches already demonstrate RL’s potential to automate individual production planning and control tasks and to cope with complexity [12].

For instance, it has been demonstrated that RL agents can outperform established industry standards in time management [3].
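
The interaction loop described in this section (state, action, reward, cumulative reward) can be illustrated with a minimal tabular Q-learning sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy environment (a machine set up for one of two product families, where a changeover lowers the reward), the function names, and the hyperparameters.

```python
import random

N_STATES, N_ACTIONS = 2, 2  # state: current machine setup; action: product family to run next


def step(state, action):
    """Hypothetical toy environment: running the family the machine is set up for avoids a changeover."""
    reward = 1.0 if action == state else 0.2  # a changeover lowers the reward
    return action, reward  # the machine is now set up for the chosen family


def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:  # explore: pick a random action
            action = rng.randrange(N_ACTIONS)
        else:  # exploit: pick the action with the highest estimated value
            action = max(range(N_ACTIONS), key=lambda a: q[state][a])
        next_state, reward = step(state, action)
        # Move the estimate toward reward + discounted value of the best next action.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Return the best-valued action for each state."""
    return [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(N_STATES)]
```

In this toy setting the learned greedy policy keeps the current setup in both states, since avoiding changeovers maximizes the discounted cumulative reward. Real production control problems have far larger state and action spaces and typically call for function approximation rather than a table.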

Data infrastructure requirements

The performance of RL models depends on the quality and variety of the underlying data and information. In production environments, this data can be categorized into three main types:

  1. Real-time data: Production plants continuously generate data streams reflecting the current state of machines, material flow, and production progress. This is a consequence of integrating sensors, IoT devices, and MES [13].
  2. Historical data: Extensive archives of past production processes are necessary to train and validate RL models. Historical data primarily includes production logs with information on planned values, processing times, and capacities [11, 14].
  3. Context and metadata: Beyond numerical data, information regarding production plans, job priorities, resource availability, and other organizational parameters is crucial. This contextual data, often derived from ERP systems, provides the framework for interpreting production data [15].
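
As a rough illustration of how the three data categories above might be combined into a single observation for an RL agent, consider the following sketch. All class names, field names, and the choice of features are hypothetical; an actual state representation depends on the production system in question.

```python
from dataclasses import dataclass


@dataclass
class RealTimeSnapshot:
    """Real-time data, e.g. from sensors, IoT devices, or the MES (fields are assumptions)."""
    machine_busy: bool
    queue_length: int


@dataclass
class HistoricalStats:
    """Aggregates derived from past production logs (field is an assumption)."""
    mean_processing_time_min: float


@dataclass
class OrderContext:
    """Context and metadata, e.g. from the ERP system (fields are assumptions)."""
    priority: int
    hours_to_due_date: float


def build_state(rt: RealTimeSnapshot, hist: HistoricalStats, ctx: OrderContext) -> list[float]:
    """Flatten the three data categories into one numeric state vector."""
    return [
        float(rt.machine_busy),
        float(rt.queue_length),
        hist.mean_processing_time_min,
        float(ctx.priority),
        ctx.hours_to_due_date,
    ]
```

Keeping the three sources in separate types makes it explicit which system each feature originates from, which helps when tracing data quality problems back to their source.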

Because the performance of an RL agent is closely linked to data quality, incomplete or inconsistent data sets can lead to misinterpretation and faulty learning processes. Quality assurance processes must therefore be implemented to check the decisions of the RL agent for plausibility and completeness [9, 11]. Processing real-time data additionally requires modern streaming technologies and data pipelines that can handle large data volumes quickly [16]. The diversity of data formats from disparate sources poses a further challenge; data lakes and ETL (Extract, Transform, Load) processes facilitate efficient aggregation and pre-processing [3, 11].
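
A simple completeness and plausibility check on incoming records, of the kind that could run inside an ETL step, might look as follows. This is a minimal sketch; the required fields and the plausibility rule (durations must be positive) are assumptions chosen for illustration.

```python
# Hypothetical record schema for a production log entry.
REQUIRED_FIELDS = ("order_id", "machine_id", "planned_minutes", "actual_minutes")


def check_record(record: dict) -> list[str]:
    """Return a list of quality issues found in one record (empty list = record passes)."""
    issues = []
    # Completeness: every required field must be present and not None.
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            issues.append(f"missing {field}")
    # Plausibility: durations must be strictly positive.
    for field in ("planned_minutes", "actual_minutes"):
        value = record.get(field)
        if value is not None and value <= 0:
            issues.append(f"implausible {field}: {value}")
    return issues
```

Records with a non-empty issue list could be quarantined or corrected before they reach training, so that faulty entries do not distort what the agent learns.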

Integration of the RL agent into existing systems

Integrating the RL agent into the existing IT landscape is a prerequisite for realizing synergies between traditional systems and modern algorithms. The primary focus is on MES and ERP systems. A central responsibility of the MES is the real-time control and monitoring of production processes [17], so the RL agent must be able to access the data provided by the MES directly [17, 11].

In addition, the agent must be able to provide feedback to the system through control commands. ERP systems manage company-wide data that extends beyond the production process [17]. To support the RL agent, data from the ERP system must be integrated and contextualized. This often requires specialized interfaces that enable bidirectional communication between the systems [18].
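
The bidirectional communication described here (reading data from the MES, writing control commands back) can be sketched as a small interface. The names `MesGateway`, `InMemoryMes`, and `control_step` are hypothetical, the in-memory class merely stands in for a real MES connection, and the earliest-due-date rule is a placeholder for the decision a trained agent would make.

```python
from typing import Optional, Protocol


class MesGateway(Protocol):
    """Hypothetical interface between the agent and an MES."""

    def read_queue(self, machine_id: str) -> list[str]: ...

    def dispatch(self, machine_id: str, order_id: str) -> None: ...


class InMemoryMes:
    """Stand-in for a real MES connection, useful for tests and simulation."""

    def __init__(self, queues: dict[str, list[str]]):
        self.queues = {machine: list(orders) for machine, orders in queues.items()}
        self.dispatched: list[tuple[str, str]] = []

    def read_queue(self, machine_id: str) -> list[str]:
        return list(self.queues[machine_id])

    def dispatch(self, machine_id: str, order_id: str) -> None:
        self.queues[machine_id].remove(order_id)
        self.dispatched.append((machine_id, order_id))


def control_step(mes: MesGateway, machine_id: str, due_dates: dict[str, float]) -> Optional[str]:
    """Read the queue (MES to agent), choose an order, and send the command back (agent to MES)."""
    queue = mes.read_queue(machine_id)
    if not queue:
        return None
    # Placeholder decision rule: earliest due date, using context data such as ERP due dates.
    chosen = min(queue, key=lambda order_id: due_dates[order_id])
    mes.dispatch(machine_id, chosen)
    return chosen
```

Coding the agent against an interface rather than a concrete system means the same control logic can run against a simulation during training and against the real MES later.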

Implementation roadmap—from concept to practice

The successful introduction of an RL agent in practical production planning and control requires a systematic approach that considers both technical and organizational aspects. This section outlines a detailed and practical roadmap that companies can use to guide them from conceptual planning to training, integration, and continuous optimization. The approach is grounded in the widely recognized Cross Industry Standard Process for Data Mining (CRISP-DM), a standardized procedure for conducting data mining projects [19]. The schematic structure and procedure are illustrated in Figure 3.

Figure 3: Implementation roadmap for integrating a Reinforcement Learning agent in production planning and control.

Before developing a company-specific RL agent, companies must take a thorough inventory of their existing IT and production systems. This includes documenting the available data sources (for example, real-time and contextual data) and the interfaces used for MES and ERP systems. Furthermore, quantifiable key performance indicators (KPIs) must be established so that the effectiveness of the RL agent can be measured [9].
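
To make the idea of quantifiable KPIs concrete, the following sketch computes two common logistical indicators, the on-time rate and the mean throughput time, from a list of completed orders. The choice of these two KPIs and the record layout (release, due, and completion times in days) are assumptions for illustration; the article does not prescribe specific KPIs.

```python
def on_time_rate(orders: list[dict]) -> float:
    """Share of orders completed on or before their due date."""
    on_time = sum(1 for order in orders if order["completed"] <= order["due"])
    return on_time / len(orders)


def mean_throughput_time(orders: list[dict]) -> float:
    """Average time between order release and completion."""
    return sum(order["completed"] - order["release"] for order in orders) / len(orders)
```

Computing the same KPIs before and after introducing the agent gives a baseline against which its decisions can be judged.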

The subsequent stage in the process is consolidating the requisite data sources. Ensuring the quality and consistency of the data is paramount and should be facilitated through automated testing and data cleansing processes [11, 14]. Integrating the RL agent into production is contingent upon creating a digital twin or simulation model of the production system. This facilitates risk-free testing and training of the agent under simulated production conditions. Concurrently, a simulation platform should be developed that maps various scenarios, including extreme or rare production situations. This establishes a realistic environment where the RL agent can learn iteratively and refine its decision-making strategies [4, 11].
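
The role of a simulation model that also covers rare situations can be hinted at with a deliberately small example: a single machine that processes jobs in a given sequence, with an optional breakdown as a disruption scenario. This is far simpler than a digital twin; the function name, the breakdown model (the job running when the outage starts is delayed by its duration), and the data layout are all assumptions.

```python
from typing import Optional


def simulate(
    sequence: list[str],
    processing_times: dict[str, float],
    breakdown: Optional[tuple[float, float]] = None,
) -> dict[str, float]:
    """Return the completion time of each job on a single machine that starts at time 0.

    breakdown is (start_time, duration) or None for an undisturbed run.
    """
    clock = 0.0
    completion = {}
    for job in sequence:
        finish = clock + processing_times[job]
        if breakdown is not None:
            start, duration = breakdown
            # The outage begins while this job is running: the job is paused and resumes afterwards.
            if clock <= start < finish:
                finish += duration
        completion[job] = finish
        clock = finish  # the next job starts as soon as the machine is free
    return completion
```

Running the same job sequence with and without the breakdown shows how a disruption propagates to all later jobs, which is the kind of scenario an agent should encounter during training rather than for the first time in production.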

Initial training in the simulation begins at this stage. The RL agent is first exposed to historical and simulated real-time data so that it can develop fundamental decision strategies, and the model is improved through systematic evaluation and refinement. After successful preliminary training, real-time data is integrated gradually. The transition to a productive environment is emulated through incremental tests, which make the model more robust against unexpected events. Once connected to the production systems, the agent can implement decisions based on real-time feedback while feeding relevant data back to the higher-level system [17].

Before comprehensive implementation, the RL agent should be deployed in a pilot phase in a controlled area. This pilot phase validates the system’s functionality under real-world conditions while concurrently monitoring and assessing security, data protection, and performance aspects in actual operational settings. After integration, continuous monitoring is imperative. This entails observing the performance of the RL agent, its decisions, and the repercussions of these decisions on production processes. 
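
Continuous monitoring could, for example, track a KPI over a sliding window and raise a flag when it falls noticeably below a baseline. The class below is a minimal sketch; the class name, the windowing approach, and the threshold logic are assumptions rather than a method proposed in the article.

```python
from collections import deque


class KpiMonitor:
    """Flags when the rolling mean of a KPI drops below baseline minus tolerance."""

    def __init__(self, baseline: float, tolerance: float, window: int):
        self.threshold = baseline - tolerance
        self.values = deque(maxlen=window)  # keeps only the most recent observations

    def update(self, value: float) -> bool:
        """Record a new KPI value; return True if an alert should be raised."""
        self.values.append(value)
        if len(self.values) < self.values.maxlen:
            return False  # not enough observations yet
        rolling_mean = sum(self.values) / len(self.values)
        return rolling_mean < self.threshold
```

An alert of this kind would not by itself explain why performance dropped, but it can trigger a review of the agent's recent decisions or a fallback to the previous control logic.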

Actual ‘realization’ can begin after the six steps above have been completed. In this context, realization is the practical execution of the production planning and control tasks, which the RL agent subsequently automates. The schematic sequence of the seven steps illustrated in Figure 3 should not be interpreted as a one-time process. Instead, it must be executed repeatedly and adapted to new conditions.

The system should be designed to learn continuously from new data and feedback. Feedback loops between production, IT, and the developers of the RL agent allow the model to be adjusted and optimized continuously. Implementing an RL agent also changes work processes and requires new skills.

Consequently, companies should allocate resources to training and continuing education programs at an early stage. The objective of such programs is to build competence in using the new technology while addressing concerns and reservations that may arise when adopting these new methods. This approach fosters not only acceptance but also long-term success.

This roadmap provides companies with a structured approach to successfully transitioning from the conceptual planning stage to the productive use of an RL agent in production planning and control. Companies can minimize risks and achieve sustainable competitive advantages by systematically considering technical, organizational, and security-related aspects.

Challenges and prospects

Implementing a holistic data infrastructure for RL in production planning and control poses numerous challenges, including the fact that relevant data is often stored in isolated systems that do not communicate effectively. To overcome these silos, it is necessary to implement not only technical but also organizational measures to achieve company-wide data integration.

In addition to data integration, the scalability and performance of the RL agent also present challenges. Production environments are characterized by high data volumes and varying loads. The data infrastructure must therefore be designed to scale flexibly and handle peak loads. This challenge can be met through distributed database systems and cloud solutions designed for high availability and performance.

A further challenge is the need for interdisciplinary collaboration to successfully implement and integrate such projects into a company’s production process. The success of RL projects in production planning and control is contingent upon close collaboration between data scientists, IT specialists, production logisticians, and security officers.

It is only through an interdisciplinary approach that the complex challenges of data integration and security can be overcome. Implementing an RL agent should be regarded as a continuous process rather than a one-time undertaking, as it involves the ongoing allocation of resources. The data infrastructure must be flexible to accommodate future requirements, technological advancements, and evolving production processes. Consequently, regular evaluations and updates are imperative.

In conclusion, implementing RL in production planning and control requires a robust and future-proof data infrastructure. Essential prerequisites include providing high-quality real-time and historical data and seamless integration into existing MES and ERP systems. By leveraging modern technologies such as digital twins and advanced data preprocessing, companies can achieve sustainable enhancement of their production systems’ performance and secure long-term competitive advantages. Interdisciplinary collaboration and continuous data infrastructure development are pivotal to success in an increasingly digitized and networked industrial environment.

HaLiMo is proposed as a suitable framework for achieving this objective in a structured manner, owing to its holistic character and its consideration of the interactions and the information and material flows in production logistics. A vision for the future is the automation of production planning and control in its entirety, realized by automating and integrating all production planning and control tasks.

Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – SFB 1153 – 252662854.

The original German version of this article can be accessed via DOI: 10.30844/I4SD.25.5.86


Bibliography

[1] Mrugalska, B.; Wyrwicka, M.K.: Towards Lean Production in Industry 4.0. Procedia Engineering 182 (2017), pp. 466-473.
[2] Mütze, A.; Lucht, T.; Nyhuis, P.: Logistics-Oriented Production Configuration Using the Example of MRO Service Providers. IEEE Access 10 (2022), pp. 20328-20344.
[3] Altenmüller, T.; Stüker, T.; Waschneck, B.; Kuhnle, A.; Lanza, G.: Reinforcement learning for an intelligent and autonomous production control of complex job-shops under time constraints. Production Engineering 14 (2020) 3, pp. 319-328.
[4] Panzer, M.; Bender, B.: Deep reinforcement learning in production systems: a systematic literature review. International Journal of Production Research 60 (2022) 13, pp. 4316-4341.
[5] Nyhuis, P.; Wiendahl, H.-P.: Logistische Kennlinien: Grundlagen, Werkzeuge und Anwendungen, Springer, Berlin, 2012.
[6] Schmidt, M.; Nyhuis, P.: Produktionsplanung und -steuerung im Hannoveraner Lieferkettenmodell: Innerbetrieblicher Abgleich logistischer Zielgrößen. Springer Vieweg, Berlin, 2021, p. 217.
[7] Sui, X.; Jiao, S.; Wang, Y.; Wang, H.: Digital transformation and manufacturing company competitiveness. Finance Research Letters 59 (2024), p. 104683.
[8] Zhang, Q.; Li, S.; Li, Z.; Xing, Y.; Yang, Z.; Dai, Y.: CHARM: A Cost-Efficient Multi-Cloud Data Hosting Scheme with High Availability. IEEE Transactions on Cloud Computing 3 (2015) 3, pp. 372-386.
[9] Wang, Y.-C.; Usher, J. M.: Application of reinforcement learning for agent-based production scheduling. Engineering Applications of Artificial Intelligence 18 (2005) 1, pp. 73-82.
[10] Hiller, T.; Demke, T. M.; Nyhuis, P.: Throughput Time Predictions Along the Order Fulfilment Process. IEEE Access 12 (2024), pp. 9705-9718.
[11] Brunton, S. L.; Kutz, J. N.: Data-Driven Science and Engineering. Cambridge University Press, 2019.
[12] Stricker, N.; Kuhnle, A.; Sturm, R.; Friess, S.: Reinforcement learning for adaptive order dispatching in the semiconductor industry. CIRP Annals 67 (2018) 1, pp. 511-514.
[13] Peschke, F.; Eckardt, C.: Flexible Produktion durch Digitalisierung: Entwicklung von Use Cases. Hanser, München, 2019, p. 249.
[14] Kaelbling, L. P.; Littman, M. L.; Moore, A. W.: Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research 4 (1996), pp. 237-285.
[15] Pistorius, J.: Industrie 4.0 – Schlüsseltechnologien für die Produktion: Grundlagen, Potenziale und Anwendungen. Springer, Berlin, 2020, p. 190.
[16] Jordan, M. I.; Mitchell, T. M.: Machine learning: Trends, perspectives, and prospects. Science 349 (2015) 6245, pp. 255-260.
[17] Berić, D.; Stefanović, D.; Lalić, B.; Ćosić, I.: The Implementation of ERP and MES Systems as a Support to Industrial Management Systems. International Journal of Industrial Engineering and Management 9 (2018) 2, pp. 77-86.
[18] Tao, F.; Qi, Q.; Liu, A.; Kusiak, A.: Data-driven smart manufacturing. Journal of Manufacturing Systems 48 (2018), pp. 157-169.
[19] Wirth, R.; Hipp, J.: CRISP-DM: Towards a standard process model for data mining. Proceedings of the 4th international conference on the practical applications of knowledge discovery and data mining, 1 (2000), pp. 29-39.
