Project 3a - Image classification example #1
- James Canova
- Sep 16, 2021
- 3 min read
Project start: 18 December 2021
Project finish: 20 December 2021
Objective: introduction to using Tensorflow, Keras, Numpy and Sklearn with a simple neural network
This project is to detect fake bank notes.
Program:
This example is from the Udemy course,
"Python for Computer Vision with OpenCV and Deep Learning", Section 8 - Deep Learning for Computer Vision, Lesson 73 - Keras Basics (ref. 2)
I can't say enough good things about this course. It explains in detail how to use the relevant software and provides a good grounding in theory without being overwhelming. Explanations are clear and concise.
The Python modules (installed in project 2b):
1) TensorFlow: an open-source library developed by Google to run machine learning and deep learning applications
2) Keras: a high-level interface to TensorFlow
3) NumPy: a Python library with functionality to handle operations on arrays, matrices and tensors
4) Sklearn (scikit-learn): a Python library with machine learning functionality, including classification and regression analysis
The relationship between TensorFlow and Keras is shown in the image below. This is taken from the book, "Hands-on Machine Learning with Scikit-Learn, Keras & TensorFlow, Concepts, Tools, and Techniques to Build Intelligent Systems" [ref. 3, p296]

The following .zip file contains a Jupyter Notebook program from the Udemy course (to run Jupyter Notebook, type jupyter-notebook):
The following .zip file contains the data required for this project:
In the data file there are a total of 1,372 samples; 33% are used for testing and the rest for training.
The inputs are characteristics of the images, not the images themselves; these occupy the first four columns of the data. The output is 0 (representing a genuine bank note) or 1 (indicating a fake bank note); this is the last column of the data.
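The split described above can be sketched with scikit-learn's train_test_split. The array below is a stand-in with the same shape as the bank note data (1,372 rows: four feature columns plus one label column); the actual loading step (e.g. reading the CSV from the .zip file) is assumed.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for the bank note data: 1,372 rows,
# four feature columns and one 0/1 label column.
rng = np.random.default_rng(42)
data = np.hstack([rng.normal(size=(1372, 4)),
                  rng.integers(0, 2, size=(1372, 1)).astype(float)])

X = data[:, :4]   # image characteristics (first four columns)
y = data[:, 4]    # 0 = genuine, 1 = fake (last column)

# Hold out 33% of the samples for testing, as in the project.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42)

print(X_train.shape, X_test.shape)  # (919, 4) (453, 4)
```

Note that 33% of 1,372 rounds up to 453 test samples, which matches the "support" column of the classification report further down.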
The input layer has four inputs. There is one intermediate layer with eight nodes. There is one output.
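A minimal sketch of that architecture in Keras might look like the following. The layer sizes (four inputs, eight hidden nodes, one output) come from the description above; the activations, optimizer, and loss are my assumptions, typical choices for binary classification.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Four inputs -> one intermediate layer of eight nodes -> one output.
model = Sequential([
    Dense(8, input_dim=4, activation='relu'),  # intermediate layer
    Dense(1, activation='sigmoid'),            # 0 = genuine, 1 = fake
])

# Assumed settings, typical for a binary classification problem.
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.summary()
```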
Classification metrics are captured with a confusion matrix and a classification report. If you do take the Udemy course, these metrics are mostly covered in lesson 67.
Confusion matrix:
(image from https://towardsdatascience.com)

where:
FP: false positive
FN: false negative
TP: true positive
TN: true negative
The confusion matrix is printed with the following command:
confusion_matrix(Y_test, predictions)
The result is:
array([[248,   9],
       [ 21, 175]])
The classification report is printed with the following command:
print(classification_report(Y_test, predictions))
The result is:
              precision    recall  f1-score   support

         0.0       0.92      0.96      0.94       257
         1.0       0.95      0.89      0.92       196

    accuracy                           0.93       453
   macro avg       0.94      0.93      0.93       453
weighted avg       0.93      0.93      0.93       453
The following definitions are from: https://towardsdatascience.com/accuracy-precision-recall-or-f1-331fb37c5cb9
1) Accuracy = (TP + TN)/(TP + TN + FP + FN)
This measure is useful because it gives an intuitive answer. However, it does not do so well if the data sets are unevenly split, i.e. there is a large difference between the number of positive and negative samples. In that case it can give a significant over-estimate of the quality of a model.
For example, suppose a data set for classifying cats vs. dogs contains 99 dogs and 1 cat. A model that always guesses "dog" would achieve 99% accuracy. If the data set were evenly split, however, the same model's accuracy would drop to 50%.
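The cats-vs-dogs example can be checked with a few lines of plain Python (labels: 1 = dog, 0 = cat; the 99/1 split is from the example above):

```python
# 99 dogs (1) and 1 cat (0); the model always guesses "dog".
y_true = [1] * 99 + [0]
y_pred = [1] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)  # 0.99

# With an even 50/50 split, the same always-"dog" model gets 50%.
y_true_even = [1] * 50 + [0] * 50
accuracy_even = sum(t == p for t, p in zip(y_true_even, y_pred)) / 100
print(accuracy_even)  # 0.5
```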
2) Precision = TP/(TP + FP)
This measure is useful when the cost of a False Positive is high, e.g. an email is mistakenly marked as spam.
3) Recall = TP/(TP + FN)
This measure is useful when the cost associated with a False Negative is high, e.g. a medical test indicates that a patient does not have an illness that they actually have.
4) F1 score = 2*(Recall * Precision)/(Recall + Precision)
This measure is like accuracy, except that it remains informative whether the data sets are evenly or unevenly split.
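As a check, all four measures can be computed by hand from the confusion matrix printed earlier, taking class 1 (fake) as the positive class: TP = 175, TN = 248, FP = 9, FN = 21.

```python
# Entries taken from the confusion matrix above (class 1 = positive).
TP, TN, FP, FN = 175, 248, 9, 21

accuracy  = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall    = TP / (TP + FN)
f1        = 2 * (recall * precision) / (recall + precision)

print(round(accuracy, 2))   # 0.93
print(round(precision, 2))  # 0.95
print(round(recall, 2))     # 0.89
print(round(f1, 2))         # 0.92
```

These agree with the "1.0" row and the accuracy line of the classification report.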
Other: tbd
If you have any problems or need clarification please contact me: jscanova@gmail.com