Deep learning approaches, exemplified by convolutional neural networks (CNNs), have achieved outstanding performance in machine vision tasks. However, their practical deployment is often constrained by real-world data limitations such as incomplete labels and image distortions. This dissertation proposes effective learning strategies to enhance the performance and robustness of CNNs under three representative limited-data scenarios in machine vision: partially labeled training datasets, weakly labeled training datasets, and distorted query images.
First, we propose a domain-aware semi-supervised representation learning method for image analysis. Existing methods typically assume either fully labeled or entirely unlabeled datasets, making them less practical in scenarios where only a subset of instances is labeled. To address this limitation in wafer map analysis, our method uses both labeled and unlabeled wafer maps for representation learning while enforcing rotational invariance constraints.
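As a minimal illustration of what a rotational invariance constraint can look like, the following sketch penalizes the distance between the representation of a wafer map and the representations of its rotated copies. It is not the method proposed in this dissertation: the restriction to 90-degree rotations of a square grid, the squared-distance penalty, and the names `rot90`, `rotation_penalty`, `row_sums`, and `defect_count` are all assumptions made for illustration. Because the penalty needs no class label, a term of this kind could in principle be computed on labeled and unlabeled wafer maps alike.

```python
from typing import Callable, List, Sequence

Grid = List[List[int]]  # binary wafer map: 1 = defective die, 0 = normal die


def rot90(grid: Grid) -> Grid:
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def rotation_penalty(embed: Callable[[Grid], Sequence[float]], grid: Grid) -> float:
    """Mean squared distance between embed(grid) and the embeddings of its
    90-, 180-, and 270-degree rotations. No class label is required."""
    base = embed(grid)
    total = 0.0
    rotated = grid
    for _ in range(3):
        rotated = rot90(rotated)
        other = embed(rotated)
        total += sum((a - b) ** 2 for a, b in zip(base, other))
    return total / 3


# Two toy (hypothetical) embeddings standing in for a CNN encoder.
def row_sums(grid: Grid) -> List[float]:
    """Defects per row: changes when the map is rotated."""
    return [float(sum(row)) for row in grid]


def defect_count(grid: Grid) -> List[float]:
    """Total number of defects: unchanged by rotation."""
    return [float(sum(sum(row) for row in grid))]


wafer = [
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
]
penalty_variant = rotation_penalty(row_sums, wafer)        # positive
penalty_invariant = rotation_penalty(defect_count, wafer)  # zero
```

In this toy setting, the rotation-sensitive embedding incurs a positive penalty, whereas the rotation-invariant one incurs none; minimizing such a term during training pushes a learned encoder toward the latter behavior.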
Second, we present a weakly supervised learning method that detects defective cells (fine-grained) using only module-level (coarse-grained) annotations, which substantially reduces annotation cost compared with conventional cell-level annotation. The method rests on the assumption that every cell in a normal module is non-defective, whereas a defective module contains at least one defective cell. Leveraging this weak supervision, the method achieves accurate cell-level defect detection without fine-grained annotations.
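The assumption above matches the standard multiple-instance learning setting, in which a module is a bag of cells and only the bag carries a label. The sketch below shows one common way to turn that assumption into a training signal: the module-level defect score is taken as the maximum cell-level score and compared with the module label through a binary cross-entropy loss. The max aggregation, the loss, and the names `module_score` and `module_loss` are assumptions for illustration and are not necessarily the formulation used in this dissertation.

```python
import math
from typing import Sequence


def module_score(cell_scores: Sequence[float]) -> float:
    """Module-level defect probability under the assumption that a module is
    defective if and only if at least one of its cells is defective."""
    return max(cell_scores)


def module_loss(cell_scores: Sequence[float], module_is_defective: bool,
                eps: float = 1e-7) -> float:
    """Binary cross-entropy between the aggregated score and the module label.
    Only the coarse module-level label is used; no cell-level label appears."""
    p = min(max(module_score(cell_scores), eps), 1.0 - eps)
    return -math.log(p) if module_is_defective else -math.log(1.0 - p)


# Hypothetical cell-level defect probabilities produced by a CNN.
scores = [0.1, 0.9, 0.2]
loss_if_defective = module_loss(scores, module_is_defective=True)
loss_if_normal = module_loss(scores, module_is_defective=False)
```

For a normal module, this loss pushes the highest cell score, and therefore every cell score, toward zero. For a defective module, it only requires that some cell receive a high score, which is what allows cell-level predictions to emerge from module-level labels.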
Third, we develop a distortion-robust training method for CNNs that enables reliable classification under image distortions. Rather than relying on image preprocessing or retraining with augmented data, our method incorporates consistency regularization into the supervised learning objective, encouraging the CNN to produce consistent predictions across distorted variants of an image.
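A consistency-regularized objective of this kind can be sketched as a supervised loss on the clean image plus a weighted term that penalizes disagreement between the prediction for the clean image and the predictions for its distorted variants. The sketch below is illustrative only: the use of KL divergence as the disagreement measure, the `weight` hyperparameter, and all function names are assumptions, not the exact objective proposed in this dissertation.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], label: int, eps: float = 1e-12) -> float:
    """Supervised loss: negative log-probability of the true class."""
    return -math.log(max(probs[label], eps))


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) between two predicted class distributions."""
    return sum(pi * math.log(max(pi, eps) / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


def total_loss(clean_probs: Sequence[float],
               distorted_probs: Sequence[Sequence[float]],
               label: int, weight: float = 1.0) -> float:
    """Supervised loss on the clean image plus a weighted consistency term
    averaged over the predictions for its distorted variants."""
    supervised = cross_entropy(clean_probs, label)
    consistency = sum(kl_divergence(clean_probs, q)
                      for q in distorted_probs) / len(distorted_probs)
    return supervised + weight * consistency


# Hypothetical softmax outputs for one image (true class 0) and two distorted copies.
clean = [0.8, 0.1, 0.1]
consistent = total_loss(clean, [clean, clean], label=0)
inconsistent = total_loss(clean, [[0.3, 0.6, 0.1], [0.4, 0.2, 0.4]], label=0)
```

When the predictions for the distorted copies match the clean prediction, the consistency term vanishes and only the supervised loss remains; any disagreement raises the total loss, which is the pressure that encourages distortion-robust predictions.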
The effectiveness of the proposed methods was demonstrated through experimental evaluations tailored to the specific challenges of each application domain. The results highlight the applicability of the methods and their potential to enhance machine vision-based automation in real-world industrial environments where access to high-quality data is limited.