Empowering ConvNeXt for Precision Classification of Industrial Surface Defects: A Comprehensive Approach with Multi-Scale Fusion
Abstract
The industrial sector demands high precision in the classification of surface defects to ensure product quality. Traditional visual inspection is inconsistent and does not scale, which motivates an automated solution for defect detection and classification. This research introduces an enhanced ConvNeXt architecture that integrates deformable convolutions, attention mechanisms, and a multi-scale fusion approach to address the complex nature of defect imagery in industrial settings. First, deformable convolutions give the model the flexibility to adapt to the varied and irregular shapes of surface defects. Unlike standard convolutions, which sample a fixed grid, they let the network adjust its receptive field dynamically, improving its ability to capture textural and geometric detail and thereby the accuracy of feature extraction from complex industrial surfaces. Second, an attention mechanism refines the focus within these enhanced feature maps. It prioritizes the most informative parts of the image, directing computational resources towards areas with potential defects. This improves both the model’s efficiency and its effectiveness in recognizing subtle yet critical defect features that might otherwise be overlooked. Third, a multi-scale fusion strategy combines information across different scales and resolutions, ensuring comprehensive coverage and consistent performance across varying sizes and types of defects. It aggregates the detailed local features captured by the deformable convolutions with the prioritized global features enhanced by the attention mechanism, providing a robust classification output.
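The three components described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract gives no architectural details, so every function name here is hypothetical, the feature maps are single-channel nested lists, the attention is reduced to a sigmoid gate on each channel's global average (a stripped-down squeeze-and-excitation style gate), and the fusion is a simple nearest-neighbour upsample followed by addition. In a real deformable convolution the offsets would be predicted by a learned layer; here they are passed in by hand.

```python
import math

# The nine taps of a regular 3x3 kernel, as (row, column) displacements.
TAPS = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]


def bilinear_sample(fmap, y, x):
    """Read a 2D map at a fractional position; positions outside count as 0."""
    h, w = len(fmap), len(fmap[0])
    y0, x0 = math.floor(y), math.floor(x)
    fy, fx = y - y0, x - x0
    value = 0.0
    for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
        for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
            if 0 <= yy < h and 0 <= xx < w:
                value += wy * wx * fmap[yy][xx]
    return value


def deformable_conv3x3(fmap, kernel, offsets):
    """3x3 convolution whose taps are shifted by per-pixel (dy, dx) offsets.

    offsets[i][j] holds nine (dy, dx) pairs, one per tap in TAPS order.
    With all offsets zero this is an ordinary zero-padded 3x3 convolution.
    """
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k, (ky, kx) in enumerate(TAPS):
                dy, dx = offsets[i][j][k]
                acc += kernel[ky + 1][kx + 1] * bilinear_sample(
                    fmap, i + ky + dy, j + kx + dx)
            out[i][j] = acc
    return out


def channel_attention(channels):
    """Scale each channel by the sigmoid of its global average (assumed gate)."""
    gated = []
    for ch in channels:
        mean = sum(map(sum, ch)) / (len(ch) * len(ch[0]))
        gate = 1.0 / (1.0 + math.exp(-mean))
        gated.append([[gate * v for v in row] for row in ch])
    return gated


def avg_pool2(fmap):
    """2x2 average pooling with stride 2 (assumes even height and width)."""
    return [[(fmap[i][j] + fmap[i][j + 1]
              + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
             for j in range(0, len(fmap[0]), 2)]
            for i in range(0, len(fmap), 2)]


def upsample2(fmap):
    """Nearest-neighbour upsampling by a factor of 2 in both directions."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_scales(fine, coarse):
    """Add a half-resolution map to a full-resolution one after upsampling."""
    up = upsample2(coarse)
    return [[f + u for f, u in zip(fr, ur)] for fr, ur in zip(fine, up)]
```

With zero offsets and a kernel that is 1 at the centre and 0 elsewhere, `deformable_conv3x3` returns its input unchanged. Setting every offset to `(0.0, 0.5)` instead makes each output value the average of a pixel and its right-hand neighbour, which is the sense in which the sampling grid, and hence the receptive field, can bend towards irregular defect shapes.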
Experiments on diverse industrial datasets show that the proposed model substantially outperforms existing methods in both accuracy and reliability. Integrating the three techniques (deformable convolutions, attention mechanisms, and multi-scale fusion) creates a synergistic effect that significantly improves the capability of ConvNeXt for precise classification of industrial surface defects. This study demonstrates the feasibility of enhancing a sophisticated architecture like ConvNeXt for industrial applications and sets a new standard for automated defect classification systems, combining deep learning innovation with practical industrial use.