Abstract: Ensuring worker safety remains a critical challenge in modern manufacturing environments. Industry 5.0 reorients the prevailing manufacturing paradigm toward more human-centric operations. Using a design science research methodology, we identify three essential requirements for next-generation safety training systems: high accuracy, low latency, and low cost. We introduce a multimodal chatbot powered by large language models that meets these design requirements. The chatbot uses retrieval-augmented generation (RAG) to ground its responses in curated regulatory and technical documentation. To evaluate our solution, we developed a domain-specific benchmark of expert-validated question-and-answer pairs for three representative machines: a Bridgeport manual mill, a Haas TL-1 CNC lathe, and a Universal Robots UR5e collaborative robot. We tested 24 RAG configurations using a full-factorial design and assessed them with automated evaluations of correctness, latency, and cost. Our top two configurations were then evaluated by ten industry experts and academic researchers. Our results show that retrieval strategy and model configuration have a significant impact on performance. The top configuration (selected for chatbot deployment) achieved an accuracy of 86.66%, an average latency of 10.04 seconds, and an average cost of $0.005 per query. Overall, our work provides three contributions: an open-source, domain-grounded safety training chatbot; a validated benchmark for evaluating AI-assisted safety instruction; and a systematic methodology for designing and assessing AI-enabled instructional and immersive safety training systems for Industry 5.0 environments.
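
As an illustration of the full-factorial design mentioned above, the following minimal Python sketch enumerates every combination of a set of factor levels and then picks the best-scoring configurations. The abstract does not name the factors or their levels, so `FACTORS`, its keys, and its values are hypothetical and chosen only so that the grid contains 4 × 3 × 2 = 24 configurations. The ranking rule in `select_top` (highest accuracy first, ties broken by lower latency and then lower cost) is likewise an assumption, not the authors' stated procedure.

```python
from itertools import product

# Hypothetical factors and levels (not taken from the paper); they are
# chosen only so that the grid has 4 * 3 * 2 = 24 cells.
FACTORS = {
    "retriever": ["dense", "sparse", "hybrid", "hybrid_rerank"],
    "top_k": [3, 5, 10],
    "model": ["small", "large"],
}


def full_factorial(factors):
    """Return one dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


def select_top(results, n=2):
    """Return the n best result rows.

    Each row is a dict with 'accuracy', 'latency_s', and 'cost_usd' keys.
    Assumed ranking: higher accuracy first, then lower latency, then lower cost.
    """
    ordered = sorted(
        results,
        key=lambda r: (-r["accuracy"], r["latency_s"], r["cost_usd"]),
    )
    return ordered[:n]


configs = full_factorial(FACTORS)
print(len(configs))  # 24
```

In such a setup, each configuration would be run against the benchmark, its measured accuracy, latency, and cost stored in a result row, and `select_top(rows, n=2)` would return the two candidates to pass on to human evaluation.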




Abstract: This paper presents two industry-grade datasets collected at the Future Factories Lab at the University of South Carolina on December 11 and 12, 2023. The datasets were generated by a manufacturing assembly line that follows industrial standards for its actuators, control mechanisms, and transducers. Both datasets were generated simultaneously by operating the assembly line for 30 consecutive hours (with minor filtering) and collecting data from sensors installed throughout the system. During operation, defects were introduced into the assembly process by manually removing parts needed for the final assembly. One dataset is an analog time series; the other is a multi-modal time series that includes images of the system alongside the analog data. The datasets were generated to provide tools for research on enhancing intelligence in manufacturing. Real manufacturing datasets are scarce, and datasets that contain anomalies or defects are scarcer still. These datasets aim to address this gap and give researchers a foundation for building and training artificial intelligence models applicable to the manufacturing industry. Finally, these datasets are the first iteration of published data from the Future Factories Lab and can be further adjusted to fit more researchers' needs moving forward.