Advancing Moore’s Law requires continuous innovation in early defect detection, characterization, and process decision-making across different technology nodes and process locations. High-throughput optical inspection tools have been widely incorporated into semiconductor wafer manufacturing fabs to enable rapid detection of defect excursions and identification of mismatched process equipment. These monitors are deployed in hundreds of process locations throughout the production line. To accelerate excursion control and root cause analysis, it is crucial to develop accurate Automatic Defect Classification (ADC) models that categorize defects based on their root causes. Training individual ADC models for hundreds of process locations requires a substantial amount of manual classification data and considerable engineering time for model training and maintenance. A generic model offers advantages in terms of ease of use, simplified maintenance, and scalability to new monitors. However, the wide variation in images and defect appearances across different inline process locations presents challenges to the implementation of a generic model. To address these challenges, we designed a generic classification scheme that categorizes defects into Non-Visual Defects (NVD), surface defects, underlying defects, and scratches. Additionally, defects are classified by size to assess their impact on the die. This study presents a defect image classification approach that integrates image features extracted with deep learning models and traditional machine learning techniques. We propose a generic Random Forest classification model trained on a diverse image feature set, combining outputs from the YOLO (You Only Look Once) object detection model, convolutional neural networks (CNN), and custom-engineered features derived from the comparison between defect images and reference images. 
The deep learning-based features enrich the model’s ability to capture high-level defect patterns, while the difference features provide targeted sensitivity to deviations from a known reference by quantifying pixel-level discrepancies between defect images and their defect-free counterparts. Our experimental results demonstrate that the hybrid model achieves high accuracy and precision across diverse defect categories and process locations, significantly reducing the engineering time spent on manual classification. Combining YOLO, CNN, and custom-engineered features ensures comprehensive feature extraction, while the Random Forest’s ensemble approach provides robust classification. The architecture performs strongly across wafer surfaces, from bare wafers to patterned wafers, and is effective throughout the semiconductor manufacturing process. This research underscores the potential of combining advanced machine learning techniques with differential imaging analysis to revolutionize defect classification in industrial applications. Future work will explore the scalability of this model across different manufacturing applications and its adaptability to real-time quality control systems.
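As an illustration of the difference features and feature fusion described above, the following is a minimal sketch and not the implementation used in this study. The function names (`difference_features`, `fuse_features`), the specific statistics, the threshold value, and the placeholder YOLO and CNN vectors are all assumptions made for the example. Plain Python lists stand in for image arrays so that the sketch has no dependencies.

```python
from typing import Dict, List, Sequence

Image = List[List[int]]  # grayscale image, row-major, pixel values 0-255


def difference_features(defect: Image, reference: Image,
                        threshold: int = 30) -> Dict[str, float]:
    """Hypothetical pixel-level difference features (an assumption, not the paper's set)."""
    if not defect or not defect[0]:
        raise ValueError("images must be non-empty")
    if len(defect) != len(reference) or any(
            len(d) != len(r) for d, r in zip(defect, reference)):
        raise ValueError("defect and reference images must have the same shape")

    # Absolute per-pixel discrepancy between the defect image and its reference.
    abs_diff = [[abs(d - r) for d, r in zip(d_row, r_row)]
                for d_row, r_row in zip(defect, reference)]
    flat = [v for row in abs_diff for v in row]

    # Pixels whose discrepancy exceeds the (assumed) threshold count as changed.
    changed = [(y, x) for y, row in enumerate(abs_diff)
               for x, v in enumerate(row) if v > threshold]
    if changed:
        ys = [y for y, _ in changed]
        xs = [x for _, x in changed]
        bbox_h = max(ys) - min(ys) + 1
        bbox_w = max(xs) - min(xs) + 1
    else:
        bbox_h = bbox_w = 0

    return {
        "mean_abs_diff": sum(flat) / len(flat),
        "max_abs_diff": float(max(flat)),
        "changed_fraction": len(changed) / len(flat),
        "bbox_height": float(bbox_h),  # rough proxy for defect size
        "bbox_width": float(bbox_w),
    }


def fuse_features(yolo_outputs: Sequence[float],
                  cnn_embedding: Sequence[float],
                  diff_feats: Dict[str, float]) -> List[float]:
    """Concatenate the three feature groups into one vector for a classifier."""
    # Sort keys so the column order is stable across images.
    return (list(yolo_outputs) + list(cnn_embedding)
            + [diff_feats[k] for k in sorted(diff_feats)])


# Toy 4x4 example: a uniform reference and a defect image with three bright pixels.
reference = [[100] * 4 for _ in range(4)]
defect = [row[:] for row in reference]
defect[1][1], defect[1][2], defect[2][1] = 200, 200, 180

feats = difference_features(defect, reference)
# Placeholder values standing in for real YOLO and CNN outputs.
vector = fuse_features([0.9, 0.2], [0.1, 0.4, 0.3], feats)
print(feats)
print(len(vector))
```

In a full pipeline, one fused vector per defect image would form a row of the training matrix for a Random Forest classifier, for example scikit-learn's `RandomForestClassifier`. The choice of library is an assumption here, since the text does not name one.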