In this project, we detect anomalies in industrial product images using machine learning and deep learning models. We also explore and visualize the product images, and interpret model predictions, to better understand the results.
The dataset used in this study is the MVTec AD dataset. MVTec AD is a benchmark dataset for evaluating anomaly detection methods, with a focus on industrial inspection. It contains over 5,000 high-resolution images divided into fifteen object and texture categories. Each category includes a set of defect-free training images and a test set containing images with various types of defects as well as defect-free images.
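The train/test split described above is reflected in the dataset's folder layout: each category has a `train/good` folder and a `test` folder with one subfolder per defect type plus `good`. The following is a minimal sketch of how one category could be indexed, assuming the archive has been extracted unchanged and the images are PNG files. The function name `index_mvtec_category` and the 0/1 label convention are illustrative choices, not taken from the project's notebooks.

```python
from pathlib import Path


def index_mvtec_category(root, category):
    """Return (train_paths, test_samples) for one MVTec AD category.

    test_samples is a list of (path, label) pairs, where label is 0 for
    defect-free ("good") images and 1 for any defect type.
    """
    category_dir = Path(root) / category
    # Training data contains only defect-free images.
    train_paths = sorted((category_dir / "train" / "good").glob("*.png"))

    test_samples = []
    # Each subfolder of test/ is either "good" or the name of a defect type.
    for defect_dir in sorted((category_dir / "test").iterdir()):
        if not defect_dir.is_dir():
            continue
        label = 0 if defect_dir.name == "good" else 1
        for image_path in sorted(defect_dir.glob("*.png")):
            test_samples.append((image_path, label))
    return train_paths, test_samples
```

For example, `index_mvtec_category("data/raw", "bottle")` would index the `bottle` category, assuming the dataset was extracted into `data/raw`.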
The project consists of the following steps:
- Data Exploration & Visualization: Analyze product images to understand and identify defect patterns.
- Image Preprocessing & Feature Engineering: Prepare and transform images for ML/DL models.
- Machine Learning & Deep Learning Models: Design and train models such as Random Forest, CNNs, and transfer-learning models.
- Model Evaluation: Assess performance of trained models.
- Prediction Interpretation: Use Grad-CAM to visualize defect regions.
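To illustrate the last step, here is a minimal NumPy sketch of the core Grad-CAM computation. It assumes the feature maps of one convolutional layer and the gradients of the class score with respect to them have already been extracted from the model with the deep learning framework in use. The function name `grad_cam` and the `(K, H, W)` array layout are assumptions made for this example and may differ from the notebooks in this repo.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from one convolutional layer.

    activations: array of shape (K, H, W), the layer's feature maps.
    gradients:   array of shape (K, H, W), the gradient of the class score
                 with respect to those feature maps.
    Returns an (H, W) heatmap scaled to [0, 1].
    """
    activations = np.asarray(activations, dtype=np.float64)
    gradients = np.asarray(gradients, dtype=np.float64)
    # Channel weights: global average of the gradients over spatial positions.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    # Normalise to [0, 1]; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The resulting heatmap has the spatial size of the feature maps, so it is typically resized to the input image size before being overlaid on the product image.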
Read our report: Final Report
Explore the anomaly detection demo: Industrial Anomaly Detection App
This repo contains Jupyter notebooks, reports, and final models for our project.
Separate requirements files are provided for creating the Python environments needed to run the different components of the project:
- requirements.txt: for running the Streamlit application that demos the final models
- requirements_skw32.txt: for running any notebooks by author skw32
- requirementsi_Karine_KAVITHA.txt: for running any notebooks by author kavithaAra
- Requirementes_makhlouf_hanouti: for running any notebooks by author COMHANOUTI
- The trained segmentation models produced by Makhlouf HANOUTI are large, so they are not stored in the GitHub repository. They are available via the following Google Drive link: https://drive.google.com/drive/folders/1-NRbQwA5-CATleMWONc361DOFB_-7Qcs?usp=drive_link.
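One possible way to set up an isolated environment for a given component is sketched below. It only builds the two commands (create a virtual environment, then install one requirements file) and does not run them. The dictionary keys, the helper name `setup_commands`, and the default `.venv` directory are hypothetical choices for this example; the file names are copied from the list above.

```python
import sys
from pathlib import Path

# File names as listed above; the component keys are hypothetical labels.
REQUIREMENTS = {
    "app": "requirements.txt",
    "skw32": "requirements_skw32.txt",
    "kavithaAra": "requirementsi_Karine_KAVITHA.txt",
    "COMHANOUTI": "Requirementes_makhlouf_hanouti",
}


def setup_commands(component, env_dir=".venv"):
    """Return the commands that create a venv and install one requirements file."""
    requirements_file = REQUIREMENTS[component]
    # Virtual environments use Scripts/ on Windows and bin/ elsewhere.
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    env_python = str(Path(env_dir) / bin_dir / "python")
    return [
        [sys.executable, "-m", "venv", env_dir],
        [env_python, "-m", "pip", "install", "-r", requirements_file],
    ]
```

Each returned command can be passed to `subprocess.run(cmd, check=True)` from the repository root, for example for `setup_commands("app")` before launching the demo.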
├── LICENSE
├── README.md <- The top-level README for developers using this project.
├── data <- Should exist on your computer but not on GitHub (listed in .gitignore)
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
│
├── models <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering),
│ the creator's name, and a short `-` delimited description, e.g.
│ `1.0-alban-data-exploration`.
│
├── references <- Data dictionaries, manuals, links, and all other explanatory materials.
│
├── reports <- The reports that you'll make during this project as PDF
│ └── figures <- Generated graphics and figures to be used in reporting
│
├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
│ generated with `pip freeze > requirements.txt`
│
├── src <- Source code for use in this project.
│ ├── __init__.py <- Makes src a Python module
│ │
│ ├── features <- Scripts to turn raw data into features for modeling
│ │ └── build_features.py
│ │
│ ├── models <- Scripts to train models and then use trained models to make
│ │ │ predictions
│ │ ├── predict_model.py
│ │ └── train_model.py
│ │
│ ├── visualization <- Scripts to create exploratory and results oriented visualizations
│ │ └── visualize.py
Project based on the cookiecutter data science project template. #cookiecutterdatascience