
This project aims to develop an innovative anomaly detection system using advanced data mining and deep learning techniques to accurately identify and localize defects in manufacturing components, thereby enhancing quality control processes and reducing production losses.


title.png

The continuous advancement of technology paves the way for the progress we see in many industries nowadays – healthcare, transportation, and manufacturing, to name a few. In this blog, we take a look at manufacturing industries, explore the challenges they are facing, and experiment with today’s state-of-the-art machine learning and artificial intelligence technologies, which we think can help with those challenges.

In manufacturing industries, or any industry for that matter, quality assurance and control is one of the most vital parts of the process. It ensures that the goods, products, and/or services that go into stock and are delivered to customers achieve the highest level of quality possible. Such quality is achieved by ensuring these deliverables meet a certain standard and never fall below a set threshold. For manufacturing industries, however, ensuring such quality can be challenging for four distinct reasons (Zou et al., 2022).

  1. Rarity. In a production line, defects are rare, which is both a good thing and a bad thing. It is good because manufacturers want a high production yield; however, defining what makes a product good or bad becomes challenging when there are only a few defective samples to study and associate patterns with.
  2. Scale. Rarity becomes more daunting when coupled with the second challenge: the typical size of these defects, which can be as small as millimeters or even microns.
  3. Sensitivity. The rarity and scale challenges make the manufacturing industry performance-sensitive, requiring highly accurate models to automate the defect detection process.
  4. Spectrum. Automation itself is also a challenge, as manufacturing industries cover a wide range of domains and tasks, spanning components of all sizes and with different quality requirements.

These challenges pose a threat to the manufacturing industry, especially in the Philippines, where the industry is currently thriving. The semiconductor manufacturing industry in the Philippines has been quietly dominating the country’s export value. In 2022, 83% of the total export value from the country’s top commodity groups came from electronic products, beating out mineral products and other manufactured goods (PSA, 2022). This was also a 40% jump in export value from the 2021 figures. Of these electronic products, semiconductors accounted for 47.4%, with an equivalent export value of 1.6 trillion pesos (or $28.7B) (Statista, 2022). A single miscalculation in quality equates to millions of pesos in losses, which no business can afford. Thus, it is imperative to create a system for the early prediction and prevention of component defects in manufacturing processes.

What We Wanted

But how exactly do we create a system that can accurately detect anomalies, make such detection interpretable, and guide corrective action in the quality process? The team believes that these challenges can be solved by leveraging machine learning and recent advances in artificial intelligence (AI). The problems of defect data rarity, the scale of these defects, the required model accuracy, and the wide spectrum of domains can be addressed by combining concepts from advanced data mining, deep learning techniques, and explainable AI in the quality management of manufacturing industries.

Advanced Data Mining

One of the concepts under advanced data mining is outlier detection. What is an outlier? An outlier is a data point exhibiting significant deviation from usual patterns or expected behavior. This concept maps directly to anomaly detection: a defect in an object, in this case a manufactured product, is identified as an outlier from a set of established normal patterns.
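As a toy illustration of the idea (with made-up measurements), a simple z-score rule already captures what "deviation from the usual pattern" means:

import numpy as np

# Made-up component measurements; one clearly deviates from the rest.
data = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 15.7, 10.0])
z = (data - data.mean()) / data.std()
print(data[np.abs(z) > 2])  # flags the outlier: [15.7]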

The challenges in anomaly detection, a growing area of research in computer vision (Gudovskiy et al., 2021), closely mirror those faced by manufacturing industries: anomalies are usually rare (rarity) and have a low probability of being captured by sensors (scale). These limitations can be mitigated by leveraging deep learning techniques and explainable AI toolkits.

Deep Learning and Explainable AI

The team is working with a sponsor that faces these same challenges, if not more. In line with their attempt to implement deep learning techniques for anomaly detection on images, we researched deep learning techniques that combine image classification, object recognition, and anomaly detection/segmentation. Among the techniques we found are unsupervised anomaly detection and localization models such as the Patch Distribution Modeling framework (PaDiM) and Conditional Normalizing Flows for Anomaly Detection (CFLOW-AD).

PaDiM

PaDiM is a framework that concurrently detects and localizes anomalies in images in a one-class learning setting. It relies on two components: (1) patch embeddings, which are obtained from a pre-trained convolutional neural network (CNN), and (2) a probabilistic representation of the normal class, which is modeled with multivariate Gaussian distributions (Defard et al., 2020). The framework thus has two steps for anomaly detection and localization, namely embedding extraction and learning of normality.

padim.png

Image 1. Overview of PaDiM.

The diagram, taken from Defard et al. (2020), provides an overview of the PaDiM mechanism. The authors describe it as follows:

For each image patch corresponding to a position in the largest CNN feature map, PaDiM learns the Gaussian parameters from the set of N training embedding vectors, computed from N different training images and three different pre-trained CNN layers.
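To make this concrete, here is a toy sketch of the scoring idea for a single patch position: fit a multivariate Gaussian over the N training embeddings, then score a test patch by its Mahalanobis distance from that distribution. The numbers are made up and this is not the library's implementation, only the core idea.

import numpy as np

N, D = 200, 64  # number of training images, embedding dimension
train_embeddings = np.random.randn(N, D)  # embeddings for ONE patch position

# Gaussian parameters of the normal class at this patch position
mean = train_embeddings.mean(axis=0)
cov = np.cov(train_embeddings, rowvar=False) + 0.01 * np.eye(D)  # regularized covariance
cov_inv = np.linalg.inv(cov)

def mahalanobis(x):
    """Anomaly score of a patch embedding: distance from the normal distribution."""
    diff = x - mean
    return float(np.sqrt(diff @ cov_inv @ diff))

test_patch = np.random.randn(D)
print(mahalanobis(test_patch))  # larger distance -> more anomalous patch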

CFLOW-AD

This is a real-time model based on conditional normalizing flow framework adopted for anomaly detection with localization (Gudovskiy, 2021). This model also consists of two properties: (1) feature extraction through a CNN encoder mechanism and (2) likelihood estimation through a conditional normalizing flow network decoder.

cflow.png

Image 2. Overview of CFLOW-AD.

The diagram, taken from Gudovskiy et al. (2021), provides an overview of the CFLOW-AD mechanism. The authors describe it as follows:

The encoder is a CNN feature extractor with multi-scale pyramid pooling, which captures both global and local semantic information with the growing top-to-bottom receptive fields. Pooled feature vectors are processed by a set of decoders independently for each k-th scale. The decoder is a conditional normalizing flow network with a feature input and a conditional input, with spatial information from a positional encoder. The estimated multi-scale likelihoods are upsampled to the input size and added up to produce the anomaly map.
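The last step, upsampling the per-scale likelihoods and summing them into an anomaly map, can be sketched in a few lines (a simplified illustration with random stand-in tensors, not the paper's code):

import torch
import torch.nn.functional as F

# Stand-ins for per-scale log-likelihood maps from the flow decoders
log_likelihoods = [
    torch.randn(1, 1, 64, 64),
    torch.randn(1, 1, 32, 32),
    torch.randn(1, 1, 16, 16),
]

anomaly_map = torch.zeros(1, 1, 256, 256)
for ll in log_likelihoods:
    # Upsample each scale's likelihood map to the input resolution...
    up = F.interpolate(ll, size=(256, 256), mode="bilinear", align_corners=False)
    # ...and accumulate; low likelihood means high anomaly, hence the negation.
    anomaly_map += -up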

Both models are built on ResNet-18, an 18-layer residual learning network (He et al., 2015; Abhishek et al., 2022), which learns residual functions with reference to the layer inputs. If you want to read more about PaDiM and CFLOW-AD, and the ResNet-18 backbone used in this study, you can access the papers through these links: PaDiM, CFLOW-AD, and ResNet.
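As a refresher on what "residual functions with reference to the layer inputs" means, a minimal ResNet-style basic block looks roughly like this (a simplified sketch with an identity shortcut only, no downsampling):

import torch.nn as nn

class BasicBlock(nn.Module):
    """Simplified residual block: output = ReLU(F(x) + x)."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # the block learns F(x) relative to the input x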

OpenVINO

In addition to the models above, one other tool that can aid in the deployment and interpretation of anomaly detection results is Open Visual Inference and Neural Network Optimization (OpenVINO), a toolkit that focuses on optimizing neural network inference (Meel, n.d.).

Both models, together with OpenVINO-based deployment, are supported by Anomalib, a library for benchmarking, developing, and deploying deep learning anomaly detection algorithms. Anomalib collects anomaly detection algorithms for benchmarking on various datasets and provides ready-to-use implementations as described in their literature of origin (GitHub, n.d.). It also offers tooling for experiment management, hyperparameter optimization, and edge inference.

Getting started with anomalib by using pip install:

pip install anomalib

Import the pertinent libraries, such as anomalib, to get started:

from pathlib import Path

from anomalib.data.utils import read_image  # helper for loading images
from anomalib.deploy import OpenVINOInferencer  # wrapper for OpenVINO inference
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

import os
from PIL import Image  # used for the JPG-to-PNG conversion below

What We Did

To test our hypothesis, the team applied the two aforementioned models, together with OpenVINO inference, to a dataset of images, evaluated the results, and finally compared them with the results from the Zou et al. (2022) study. An overview of the steps is shown below: data collection, data preparation, feature extraction, model training, model evaluation, and inference.

method.png

Image 3. Project Methodology Overview.

Data Collection

The dataset used for this experiment is the Visual Anomaly (VisA) dataset, originally curated by Zou et al. (2022). VisA can also be retrieved from the Registry of Open Data on Amazon Web Services (AWS):

s3://amazon-visual-anomaly/VisA_20220922.tar/
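Assuming the AWS CLI is installed, one way to pull the archive is an anonymous copy from the public bucket (the --no-sign-request flag, typical for Open Data buckets, skips credentials):

aws s3 cp s3://amazon-visual-anomaly/VisA_20220922.tar . --no-sign-request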

It is the largest visual anomaly detection dataset to date, containing 12 classes in 3 domains, with 10,821 images in total: 9,621 normal and 1,200 anomalous. For this study, we selected six classes, namely Cashew, Chewing Gum, Fryum, Macaroni (Macaroni 1), Printed Circuit Board (PCB) 1, and PCB 3 (referred to hereinafter as PCB 2).

sample.png

Image 4. Sample Images of the Selected Subsets.

| Type | Object | Normal | Anomalous | Anomaly Classes |
| --- | --- | --- | --- | --- |
| Single Instance | Cashew | 500 | 100 | 9 |
| Single Instance | Chewing Gum | 503 | 100 | 6 |
| Single Instance | Fryum | 500 | 100 | 8 |
| Multiple Instances | Macaroni | 1,000 | 100 | 7 |
| Complex Structure | PCB1 | 1,004 | 100 | 4 |
| Complex Structure | PCB2 | 1,006 | 100 | 4 |

Table 1. Overview of the Selected VisA Subsets.

The sample images show the normal image, the anomalous image, and the segmented anomaly for each of the selected object subsets. The dataset also covers three object types: single instance and multiple instances (both with simple structures), and complex structure.

Data Preparation

The dataset includes images in the Joint Photographic Experts Group (JPEG or JPG) format. However, since the models used in the project can only process images in the Portable Network Graphics (PNG) format, the images need to be converted. The following code converts the images from JPEG to PNG for both the normal and the anomalous sets of each subset.

def convert_jpg_to_png(source_directory, destination_directory):
    """Convert every .JPG image in source_directory to .png in destination_directory."""
    # Create the destination directory if it doesn't exist
    if not os.path.exists(destination_directory):
        os.makedirs(destination_directory)

    # Iterate over the files in the source directory
    for filename in os.listdir(source_directory):
        if filename.endswith(".JPG"):
            # Open the image file
            image_path = os.path.join(source_directory, filename)
            image = Image.open(image_path)

            # Convert and save the image as .png in the destination directory
            new_filename = os.path.splitext(filename)[0] + ".png"
            destination_path = os.path.join(destination_directory, new_filename)
            image.save(destination_path, "PNG")

            print(f"Converted {filename} to {new_filename} and saved to {destination_directory}")

# Convert both the normal and the anomalous sets (shown here for the cashew subset)
convert_jpg_to_png("./cashew/Data/Images/Normal", "./cashew/normal_png/")
convert_jpg_to_png("./cashew/Data/Images/Anomaly/", "./cashew/anomaly_png/")

After this, the images are resized to 256 x 256 pixels before the data is split into training (60%), validation (20%), and test (20%) sets in preparation for model training. The following code handles the resizing and the data split.

from anomalib.data.folder import Folder
from anomalib.data.task_type import TaskType

datamodule = Folder(
    root=Path.cwd() / "cashew",
    normal_dir="normal_png",
    abnormal_dir="anomaly_png",
    normal_split_ratio=0.2,
    image_size=(256, 256),
    train_batch_size=32,
    eval_batch_size=32,
    task=TaskType.CLASSIFICATION,
)
datamodule.setup()  # Split the data to train/val/test/prediction sets.
datamodule.prepare_data()  # Create train/val/test/prediction dataloaders

i, data = next(enumerate(datamodule.val_dataloader()))
print(data.keys())
# Check image size
print(data["image"].shape)

Feature Extraction

As mentioned earlier, the team prepared two models for this study, Cflow and Padim, with an expected input size of 256 x 256, a resnet18 backbone, and selected layers for feature extraction. Layers 1, 2, and 3 are selected to represent different levels of abstraction in the backbone's feature hierarchy, capturing various levels of detail in the input images. A model instance is created from anomalib.models with the following code:

from anomalib.models import Cflow

model = Cflow(
    input_size=(256, 256),
    backbone="resnet18",
    layers=["layer1", "layer2", "layer3"],
)
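The Padim runs are instantiated the same way; a minimal sketch mirroring the Cflow setup above:

from anomalib.models import Padim

model = Padim(
    input_size=(256, 256),
    backbone="resnet18",
    layers=["layer1", "layer2", "layer3"],
)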

A list of callbacks for the anomaly detection task is also defined in the following code snippet, configuring metrics, checkpointing, post-processing, normalization, and export-to-inference options. MetricsConfigurationCallback configures the metrics for model evaluation, specifying the Area Under the Receiver Operating Characteristic curve (AU-ROC) and the Area Under the Precision-Recall curve (AU-PR).

ModelCheckpoint saves the model's weights during training based on the evaluation metric mentioned above. PostProcessingConfigurationCallback configures the post-processing methods for anomaly detection, specifying MinMax as the normalization method (also defined in MinMaxNormalizationCallback) and adaptive thresholding as the threshold method. Lastly, ExportCallback exports the trained model for inference, specifying the expected input size, the directory path, and the export mode, which is OpenVINO (ExportMode.OPENVINO).

from anomalib.data.folder import Folder
from anomalib.data.task_type import TaskType

from anomalib.post_processing import NormalizationMethod, ThresholdMethod
from anomalib.utils.callbacks import (
    MetricsConfigurationCallback,
    MinMaxNormalizationCallback,
    PostProcessingConfigurationCallback,
)
from anomalib.utils.callbacks.export import ExportCallback, ExportMode

callbacks = [
    MetricsConfigurationCallback(
        task=TaskType.CLASSIFICATION,
        image_metrics=["AUROC", "AUPR"]
    ),
    ModelCheckpoint(
        mode="max",
        monitor="image_AUROC",
    ),
    PostProcessingConfigurationCallback(
        normalization_method=NormalizationMethod.MIN_MAX,
        threshold_method=ThresholdMethod.ADAPTIVE,
    ),
    MinMaxNormalizationCallback(),
    ExportCallback(
        input_size=(256, 256),
        dirpath=str(Path.cwd()),
        filename="model",
        export_mode=ExportMode.OPENVINO,
    ),
]
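As an aside, what min-max normalization plus an adaptive threshold amount to can be sketched in plain NumPy: normalize the scores to [0, 1], then pick the threshold that maximizes F1 on labeled validation data. This is an illustrative simplification with made-up scores, not anomalib's internal code.

import numpy as np
from sklearn.metrics import f1_score

def min_max_normalize(scores):
    """Rescale raw anomaly scores to the [0, 1] range."""
    return (scores - scores.min()) / (scores.max() - scores.min() + 1e-12)

def adaptive_threshold(scores, labels, n_candidates=101):
    """Pick the threshold in [0, 1] that maximizes F1 on labeled data."""
    best_t, best_f1 = 0.0, -1.0
    for t in np.linspace(0.0, 1.0, n_candidates):
        f1 = f1_score(labels, (scores >= t).astype(int))
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t

scores = min_max_normalize(np.array([0.2, 0.5, 2.3, 3.1, 0.4]))
labels = np.array([0, 0, 1, 1, 0])  # 1 = anomalous
print(adaptive_threshold(scores, labels))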

Model Training

The model training comes next, using the following code snippet. The parameters are as follows:

  1. callbacks, the list defined in the preceding code snippet;
  2. accelerator set to auto, automatically choosing the appropriate accelerator based on the available hardware and system configuration;
  3. auto_scale_batch_size set to False, so the batch size is not automatically adjusted;
  4. check_val_every_n_epoch set to 1, so validation runs every epoch;
  5. devices set to 1, so training utilizes a single device;
  6. gpus set to None, meaning no GPUs are utilized;
  7. max_epochs set to 10, so training runs for 10 epochs;
  8. num_sanity_val_steps set to 0, so no validation steps are executed during sanity checks; and
  9. val_check_interval set to 1.0, so a full validation check is performed after every epoch.

trainer = Trainer(
    callbacks=callbacks,
    accelerator="auto",
    auto_scale_batch_size=False,
    check_val_every_n_epoch=1,
    devices=1,
    gpus=None,
    max_epochs=10,
    num_sanity_val_steps=0,
    val_check_interval=1.0,
)
trainer.fit(model=model, datamodule=datamodule)

trainer.fit starts the training process, with model set to the Cflow instance and datamodule set to the Folder data module holding the input data.

Model Evaluation

As defined earlier, the metrics used for evaluating the models are AU-PR and AU-ROC. These are threshold-agnostic evaluation metrics typically used for anomaly detection and localization tasks (Defard et al., 2020; Gudovskiy et al., 2021).

AU-PR measures the performance of the models based on precision and recall, and the AU-PR score summarizes the trade-off between precision and recall. The higher the score, the better the performance of the model (wherein a score of 1.0 indicates that the model can classify all positive samples without any false positives).

AU-ROC, on the other hand, evaluates the performance of the models based on sensitivity and specificity, and the AU-ROC score summarizes the models' ability to distinguish between positive and negative samples. The higher the score, the better the performance of the model (wherein a score of 1.0 indicates that the model achieves 100% true positive rate or sensitivity).
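Anomalib computes these metrics internally via the callback above, but a minimal standalone sketch with scikit-learn (made-up labels and scores, 1 = anomalous) shows what is being measured:

from sklearn.metrics import average_precision_score, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1]               # ground-truth labels
y_score = [0.1, 0.4, 0.8, 0.9, 0.3, 0.7]  # model anomaly scores

print("AU-ROC:", roc_auc_score(y_true, y_score))            # 1.0 here: perfect ranking
print("AU-PR: ", average_precision_score(y_true, y_score))  # 1.0 here as well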

These scores are further evaluated against those from the Zou et al. (2022) paper, which serve as the benchmark for our study. The following code snippet determines the metric scores for each model per object subset.

# Evaluate the trained model on the test set
test_results = trainer.test(model=model, datamodule=datamodule)

The result of this evaluation determines how accurately our models can predict anomalies in the input images, answering the first of the two things we wanted to do.

Inference

Using OpenVINO, inference is executed to visualize how well the model predicts anomalies on the test images. The following steps and their corresponding code snippets produce the inference results.

  • Loading a test image:
from matplotlib import pyplot as plt

image_path = "./cashew/anomaly_png/038.png"
image = read_image(path=image_path)
plt.imshow(image)
  • Loading the OpenVINO model, creating an instance of the OpenVINOInferencer responsible for loading and running inference on an OpenVINO Intermediate Representation (IR) model:
openvino_model_path = Path.cwd() / "weights" / "openvino" / "model.bin"
metadata_path = Path.cwd() / "weights" / "openvino" / "metadata.json"
print(openvino_model_path.exists(), metadata_path.exists())
inferencer = OpenVINOInferencer(
    path=openvino_model_path,  # Path to the OpenVINO IR model.
    metadata=metadata_path,  # Path to the metadata file.
    device="CPU",  # We would like to run it on an Intel CPU.
)
  • Performing inference, returning predicted outputs:
print(image.shape)
predictions = inferencer.predict(image=image)
  • Visualizing inference results, using the Visualizer class from anomalib.post_processing for a comprehensive visualization (VisualizationMode.FULL) and for two types of tasks, namely classification and segmentation (TaskType.CLASSIFICATION and TaskType.SEGMENTATION, respectively):
from anomalib.post_processing import Visualizer, VisualizationMode
from PIL import Image

visualizer = Visualizer(mode=VisualizationMode.FULL, task=TaskType.CLASSIFICATION)
output_image = visualizer.visualize_image(predictions)
Image.fromarray(output_image)
visualizer = Visualizer(mode=VisualizationMode.FULL, task=TaskType.SEGMENTATION)
output_image = visualizer.visualize_image(predictions)
Image.fromarray(output_image)

The Image.fromarray method converts the NumPy array representing the visualization result into a Python Imaging Library (PIL) image. The result of this inference shows how accurately our models can segment anomalies in the input images, answering the second question of what we wanted to achieve.

The entire methodology is carried out for both Cflow and Padim and for each of the selected object subsets (a total of 12 model training notebooks). All the notebooks used for training and inference may be accessed through this GitHub repository or this SharePoint folder.

What We Observed

The proposed methodology yields desirable results for both Cflow and Padim, classifying anomalous images and segmenting the anomalous areas within an object/component image. The following are the results for each of the selected object subsets.

Single Instance

Cashew

Who does not love to have a snack once in a while? If you are looking for a healthy snack, cashew nuts are one of your options. Cashew nuts are an important delicacy and confectionery, native to Brazil yet first introduced in India. They are kidney- or heart-shaped seeds, with colors varying from bottle green to grayish brown, especially when dried, and are a good source of bioactive compounds and proteins (Kluczkovski & Martins, 2016).

For this subset, we are looking for damages on the dried fruit which are manufactured as ready-to-eat snacks.

cashew_class.png

Image 5. Results of Classification for Cashew.

Cflow and Padim gave fairly desirable results for the cashew subset, with the latter predicting Sample 3 (see Image 5) as 100% anomalous. Note that in the predicted heat maps, the redder an area, the more anomalous the prediction. This applies across all experiments on the subsequent subsets.

cashew_seg.png

Image 6. Results of Segmentation for Cashew.

The segmentation results for both models for the cashew subset are also very good, correctly masking and segmenting the anomalous areas of the component at hand.

| Metric | Cflow | Padim |
| --- | --- | --- |
| AU-PR | 0.8858 | 0.8824 |
| AU-ROC | 0.8804 | 0.8736 |

Table 2. Model Evaluation for Cashew Subset.

Cflow slightly edges out Padim for the cashew subset, with scores of about 0.88 on both the AU-PR and AU-ROC metrics (see Table 2). Both models classify the anomalous images fairly accurately.

Chewing Gum

People often have a fixation with keeping their mouths busy. Eating, smoking, and drinking all work the mouth into overdrive. One fix for this overdrive is chewing gum, a soft, rubbery substance designed for chewing. It is a candy made by mixing a gum base with sweeteners and flavorings (West, 2023).

For this subset, similar to the previous one, we are looking at damage to the surface of the gum. Since the gum is manufactured as a smooth, glistening item, defects on this component can be easily identified.

gum_class.png

Image 7. Results of Classification for Chewing Gum.

Cflow and Padim gave highly desirable results for the chewing gum subset, with the latter predicting Samples 1 and 3 (see Image 7) as 100% anomalous, with its heat map strongly identifying the anomalous areas.

gum_seg.png

Image 8. Results of Segmentation for Chewing Gum.

The segmentation results for both models on the chewing gum subset are highly accurate, with the predicted masks correctly identifying the anomalous areas of the component.

| Metric | Cflow | Padim |
| --- | --- | --- |
| AU-PR | 0.9896 | 0.9985 |
| AU-ROC | 0.9860 | 0.9984 |

Table 3. Model Evaluation for Chewing Gum Subset.

Both models approach perfection on both evaluation metrics, with AU-PR and AU-ROC scores around 0.99 (see Table 3). Cflow and Padim can classify anomalies in the chewing gum subset almost flawlessly.

Fryum

Continuing with our fixation on overworked mouths, let us check the fryum subset. Fryums are small flakes that come in circular and star shapes, made of rice, tapioca, and potato flour (Lesson 11: Traditional Food Adjuncts, n.d.). They are deep-fried and served as a crispy tea-time snack. Again, we love to snack, yes? It is up to you whether you would consider this a healthy snack option.

For this subset, we are looking for cracks or loose parts of the fryum shapes.

fryum_class.png

Image 9. Results of Classification for Fryum.

Both models, across all the samples examined, classified the anomalous images fairly well. However, unlike the previous subsets, where Padim identified some samples with a 100% probability of being anomalous, the highest probability given in this subset is only 79% (see Image 9).

fryum_seg.png

Image 10. Results of Segmentation for Fryum.

Similar to the preceding subsets, the segmentation results for both models on the fryum subset are still highly accurate, with the predicted masks correctly identifying the anomalous areas of the component.

| Metric | Cflow | Padim |
| --- | --- | --- |
| AU-PR | 0.8256 | 0.9021 |
| AU-ROC | 0.7644 | 0.8984 |

Table 4. Model Evaluation for Fryum Subset.

Padim is the better model for the fryum subset, as its AU-PR and AU-ROC scores are nearer 1.0, in the 0.90 range (see Table 4). This means that for the fryum subset, Padim can classify anomalous images and identify anomalous pixels more accurately than Cflow.

Multiple Instances

Macaroni

Moving on to a main course, let us take a look at the macaroni subset. Derived from the Italian word maccheroni, most probably because of the pronunciation, macaroni is the name for various types of pasta shaped like long or short tubes, with walls and central holes. It is often cooked in salted water, drained when al dente, and served with tomato-based sauces with meat or fish (La Cucina Italiana, n.d.).

For this hunger-inducing subset, we are looking at raw macaroni and damage to its surface, often a hole or a chip.

mac_class.png

Image 11. Results of Classification for Macaroni.

Similar to the preceding subsets, both models, across all the samples examined, classified the anomalous images fairly well, with Padim identifying Sample 1 as 100% anomalous (see Image 11).

mac_seg.png

Image 12. Results of Segmentation for Macaroni.

Similar to the preceding subsets, the segmentation results for both models on the macaroni subset are fairly good. The challenge here, however, is that there are multiple instances of the component in one image, and the defects are quite small. Both models and their predicted masks have identified these small defects well enough to have fair segmentation results.

| Metric | Cflow | Padim |
| --- | --- | --- |
| AU-PR | 0.6700 | 0.8021 |
| AU-ROC | 0.7574 | 0.8676 |

Table 5. Model Evaluation for Macaroni Subset.

Similar to the fryum subset, Padim is the better model with higher AU-PR and AU-ROC scores. However, the scores are now lower than the previous subsets. Nevertheless, Padim can still classify anomalous images and identify anomalous pixels more accurately in this subset than Cflow.

Complex Structure

PCB1

Moving from fixations of the mouth to one of the core components in electronics, we take a look at printed circuit boards, or PCBs. These are electronic assemblies that use copper conductors to create electrical connections between components (Peterson, 2020). Their fabrication follows a specific set of steps aligned with the purpose the board will serve.

For this subset, we are looking at a number of components at once in an image. That is, there could be multiple defects in a single image of the PCB, and we want our models to detect all of these defects accurately.

pcb1_class.png

Image 13. Results of Classification for PCB 1.

Similar to the preceding subsets, both models, across all the samples examined, classified the anomalous images fairly well, with Padim identifying Sample 1 as 100% anomalous (see Image 13).

pcb1_seg.png

Image 14. Results of Segmentation for PCB 1.

Similar to the preceding subsets, the segmentation results for both models on the PCB1 subset are fairly good. Again, the challenge here is detecting all of the defects in the PCB with its different components at once.

| Metric | Cflow | Padim |
| --- | --- | --- |
| AU-PR | 0.5951 | 0.7431 |
| AU-ROC | 0.7222 | 0.8975 |

Table 6. Model Evaluation for PCB 1 Subset.

Compared with the preceding subsets, Padim and Cflow performed their worst yet on the PCB1 subset. Padim still beats Cflow on the evaluation scores; that is, it is the better model for anomaly detection and localization in this subset, as it classifies anomalous images more accurately than Cflow does.

PCB2

We take a look at another subset of PCBs which the team deemed more complex given the number of components seen in a sample image.

pcb2_class.png

Image 15. Results of Classification for PCB 2.

The models have a hard time classifying the anomalous images in this subset, with Padim giving only as high as 79% probability to Sample 2. Cflow even classified Sample 3 as non-anomalous, at 50% probability of being a normal PCB (see Image 15). Note that these samples are taken from the test set, or the anomalous images set.

pcb2_seg.png

Image 16. Results of Segmentation for PCB 2.

With the models having a hard time classifying the anomalous images, they also struggle to identify the correct areas or pixels representing the defect, with Cflow predicting no anomalies at all for Sample 3 (see Image 16).

| Metric | Cflow | Padim |
| --- | --- | --- |
| AU-PR | 0.4717 | 0.5665 |
| AU-ROC | 0.6893 | 0.7452 |

Table 7. Model Evaluation for PCB 2 Subset.

As expected, the model evaluation scores for this subset are the worst for both models. The best score is a 0.75 AU-ROC for Padim, indicating lower sensitivity on this subset. Cflow's 0.47 AU-PR score suggests a higher rate of false positives in its classification of anomalous images.

Overall Results

| Metric | PaDiM | Cflow | SimSiam | MoCo | SimCLR |
| --- | --- | --- | --- | --- | --- |
| AU-PR | 0.816 | 0.737 | 0.828 | 0.841 | 0.839 |
| AU-ROC | 0.880 | 0.800 | 0.812 | 0.830 | 0.826 |

Table 8. Comparison of Results with the Benchmark Models.

Table 8 shows the average model evaluation scores of the different models used for anomaly detection and localization. Padim and Cflow are part of our implementation, while SimSiam, MoCo, and SimCLR are from the SPot-the-Difference (SPD) study by Zou et al. (2022).

In terms of the precision-recall curve, our implementation was not able to beat the benchmark, although Padim came close at the 0.8 mark, the same range as the benchmarks. Padim, however, beats the benchmark in terms of sensitivity, with an AU-ROC score of 0.88, a 0.05 improvement over MoCo.

What We Took Away

Referring to Table 8, our overall implementation of Padim and Cflow is an effective implementation of anomaly detection and localization. The model evaluation scores are comparable to our benchmark, with Padim beating every other model in terms of sensitivity. Hence, our implementation is as effective as, if not more effective than, that of our benchmark models.

Focusing on each subset, the team finds that there is no one-model-fits-all for this anomaly detection task. In our experiment involving six of the twelve subsets of the VisA dataset, Padim's and Cflow's performances change from subset to subset. If asked for a rule of thumb in selecting the better model for a given object, the team would state that for simpler objects or components such as chewing gum and cashew, either model can be used and will provide fairly accurate results. For more complex objects, however, Padim is the better model, as it exhibits high sensitivity across all subsets while maintaining high accuracy in classifying anomalous images.

The inference time for these models, at the size of these subsets, is quite fast, so deployment is highly feasible. However, note that the team trained the models for only ten epochs across all the selected subsets. This low number of epochs is due to the limitations of the machines used for training (our personal machines). Training for more epochs, which could further improve the results of the study, would require a higher investment, for instance in hardware. The trade-off between computational performance and its cost is at play.

Future Work

For proponents looking for a project and interested in taking this to the next level, the team recommends comparing the anomaly detection performance of the models presented here with reconstruction-based methods (mentioned in Zou et al., 2022). As mentioned earlier, the team also recommends training the models for a higher number of epochs, which could give better results. Lastly, future work may include the exploration of other visual XAI techniques to see whether other inference engines can be used for the localization tasks; the techniques to be explored may include integrated gradients and attention mechanisms.

References

Abhishek, A. V. S., et al. (2022, May 5). Resnet18 Model With Sequential Layer For Computing Accuracy On Image Classification Dataset. International Journal of Creative Research Thoughts, 10(5), 2320-2882. https://ijcrt.org/papers/IJCRT2205235.pdf

Amazon Web Services. (n.d.). Visual Anomaly (VisA). https://registry.opendata.aws/visa/

Defard, T., et al. (2020, Nov 17). PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization. https://arxiv.org/pdf/2011.08785.pdf

GitHub. (n.d.). Anomalib. https://github.com/openvinotoolkit/anomalib

Gudovskiy, D., et al. (2021, Jul 27). CFLOW-AD: Real-Time Unsupervised Anomaly Detection with Localization via Conditional Normalizing Flows. https://arxiv.org/pdf/2107.12571.pdf

He, K., et al. (2015, Dec 10). Deep Residual Learning for Image Recognition. https://arxiv.org/pdf/1512.03385v1.pdf

Kluczkovski, A. M. & Martins, M. (2016). Cashew Nuts. Encyclopedia of Food and Health, 683-686. https://doi.org/10.1016/B978-0-12-384947-2.00123-9

La Cucina Italiana. (n.d.). Glossary – Maccheroni pasta. https://www.lacucinaitaliana.com/glossary/maccheroni-pasta

Lesson 11: Traditional Food Adjuncts. (n.d.). Convenience and Health Foods 3 (1+2).

Meel, V. (n.d.). What is OpenVINO? – The Ultimate Overview in 2023. viso.ai. https://viso.ai/computer-vision/intel-openvino-toolkit-overview/

Peterson, Z. (2020, Oct 5). What is a PCB? Altium. https://resources.altium.com/p/what-is-a-pcb

Philippine Statistics Authority (PSA). (2022, Oct 11). Highlights of the Philippine Export and Import Statistics August 2022 (Preliminary). https://psa.gov.ph/content/highlights-philippine-export-and-import-statistics-august-2022-preliminary

Statista Research Department. (2023, Mar 9). Export share of semiconductors Philippines 2020-2022. https://www.statista.com/statistics/1264606/philippines-export-share-of-semiconductors

West, H. (2023, May 4). Chewing Gum: Good or Bad? Healthline. https://www.healthline.com/nutrition/chewing-gum-good-or-bad

Zou, Y., et al. (2022, Jul 28). SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation. https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136900389.pdf


Generative AI Documentation

ChatGPT aided the development of this notebook/blog post through the following:

  • annotation of the code used in the model training notebooks
  • initial brief definition of terms (e.g., JPEG, PNG, AU-PR, AU-ROC) before confirming with in-text references
  • improvement of a few statements (limited to one or two) within the blog post
  • markdown syntax confirmation
