Large language models (LLMs) have shown remarkable success in modeling complex sequential data, predicting the next token of a sequence, and understanding structured data. Originally developed for NLP tasks, LLMs have since been adapted to diverse domains such as computer vision, audio, and multi-modal applications [1]. LLMs have been powered by two major architectures: attention-based models and, more recently, state-space models [2]. Mamba [2], a state-space model, has gained traction in active research areas due to its ability to process very long sequences with linear-time complexity and its fast training and inference. In the manufacturing industry, LLMs are increasingly being leveraged to automate and optimize data-driven tasks, enhancing intelligent manufacturing operations [3, 4].

In this proposal, we focus on applying an LLM-based model to semiconductor fabrication, specifically for manufacturing recipes. Semiconductor recipes are a core component of the manufacturing process and contain multiple interdependent input parameters for each step. Hence, unlike conventional models that predict the next token in a sequence, the recipe completion task requires predicting several parameters for each step. Additionally, some parameters are valid only under specific conditions rather than at every step. The model must therefore predict both which parameters are valid for a given step and the correct values of those parameters, while avoiding outputs for invalid parameters.

To address the prediction of multiple values per step, we propose a decoder-only LLM architecture that treats recipe data as a sequence of tabular features. We employ Mamba [2] as the core sequence model to capture relations across steps. The model is trained autoregressively to predict the next step in a recipe, including its full set of output parameters. To handle invalid parameters, a dual-head output is used: one head predicts the validity of each parameter at the target step, and the other regresses its value if deemed valid. Preliminary results suggest this is an effective direction for applying Mamba to recipe modeling. Through this work, we envisage a state-space-based LLM as the foundation of an intelligent system that enables recipe completion and supports other downstream knowledge-based inference on semiconductor recipes.
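As a concrete illustration of the proposed design, the sketch below shows one way the dual-head, Mamba-backed step predictor could be realized in PyTorch using the `mamba_ssm` package's `Mamba` block. This is a minimal sketch under stated assumptions, not the authors' implementation: the class, dimension, and loss names (`RecipeStepModel`, `n_params`, `d_model`, `recipe_loss`) are illustrative, and `mamba_ssm` is assumed to be installed (it requires a CUDA build).

```python
# Minimal sketch of a dual-head recipe-step predictor (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F
from mamba_ssm import Mamba  # assumes the mamba_ssm package is available


class RecipeStepModel(nn.Module):
    def __init__(self, n_params: int, d_model: int = 256):
        super().__init__()
        # Embed each step's parameter values together with their validity
        # mask into one token, so a recipe becomes a token sequence.
        self.step_embed = nn.Linear(2 * n_params, d_model)
        # State-space backbone over the step sequence (decoder-only,
        # causal by construction of the recurrence).
        self.backbone = Mamba(d_model=d_model, d_state=16, d_conv=4, expand=2)
        # Head 1: per-parameter validity logits for the next step.
        self.validity_head = nn.Linear(d_model, n_params)
        # Head 2: per-parameter value regression for the next step.
        self.value_head = nn.Linear(d_model, n_params)

    def forward(self, values: torch.Tensor, valid_mask: torch.Tensor):
        # values, valid_mask: (batch, steps, n_params); mask is 0/1 floats.
        x = self.step_embed(torch.cat([values, valid_mask], dim=-1))
        h = self.backbone(x)  # (batch, steps, d_model)
        return self.validity_head(h), self.value_head(h)


def recipe_loss(validity_logits, value_pred, target_valid, target_values):
    # Classification loss on parameter validity, plus a regression loss
    # computed only where the target parameter is actually valid, so the
    # model is never penalized on values of invalid parameters.
    bce = F.binary_cross_entropy_with_logits(validity_logits, target_valid)
    mse = ((value_pred - target_values) ** 2 * target_valid).sum() \
        / target_valid.sum().clamp(min=1)
    return bce + mse
```

For autoregressive training in this sketch, the heads at step t would be supervised with the parameters of step t+1 (teacher forcing), and at inference the predicted validity mask gates which regressed values are emitted. Masking the regression loss by target validity is one simple way to realize the paper's requirement that no values be output for invalid parameters.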
References

[1] J. Wang, H. Jiang, Y. Liu, C. Ma, X. Zhang, Y. Pan, M. Liu, P. Gu, S. Xia, W. Li, et al., "A comprehensive review of multimodal large language models: Performance and challenges across different tasks," arXiv preprint arXiv:2408.01319, 2024.
[2] A. Gu and T. Dao, "Mamba: Linear-time sequence modeling with selective state spaces," arXiv preprint arXiv:2312.00752, 2023.
[3] Y. Li, H. Zhao, H. Jiang, Y. Pan, Z. Liu, Z. Wu, P. Shu, J. Tian, T. Yang, S. Xu, et al., "Large language models for manufacturing," arXiv preprint arXiv:2410.21418, 2024.
[4] H. Wang, M. Liu, and W. Shen, "Industrial-generative pre-trained transformer for intelligent manufacturing systems," IET Collaborative Intelligent Manufacturing, vol. 5, no. 2, p. e12078, 2023.