What is Image Classification?

Image classification means assigning labels to an image based on a predefined set of classes.

In practice, our task is to analyze an input image and return a label that categorizes it. That label always comes from a predefined set of possible categories.


Let us understand image classification through an analogy.

Explanation of image classification through the body parts example

In a fourth-standard classroom, a teacher named Smita is teaching students the organs of the body. She shows the children an image of each organ and gives it a title or label. She shows an image of a heart and tells the students that this is the heart. Similarly, she shows images of all the other organs with their labels. She repeats this exercise and revises it until the students can tell which organ looks like what.

In image classification, we teach the system by showing images and labels of predefined categories.

How do we create image classification models? How do we teach the systems to classify the images accurately? 

We need to follow some steps to create an Image Classifier. Technically, we need to follow a classification pipeline to train the system to classify images.

Classification Pipeline

Image classification block diagram

The basic idea is to build an image classification model with Convolutional Neural Networks. We use a data-driven approach instead of coding a rule-based algorithm to classify images. In a data-driven approach, we supply examples of what each category looks like and then teach our algorithm to recognize the difference between the categories using these examples.

We call these examples the training dataset. It consists of images and the label associated with each image, where every label comes from a set such as {tom, jerry, spike}.

Supervised learning depends on these labeled examples: the labels teach the system how to recognize each category. (Recall the organs of the body example, where the teacher points out which organ looks like what.)
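As a rough illustration, a training dataset can be thought of as a list of (image, label) pairs, with each label mapped to an integer that a model can work with. This is only a sketch: the file names are hypothetical placeholders, and only the category names come from the example above.

```python
# Predefined, finite set of categories (from the example above).
CLASSES = ["tom", "jerry", "spike"]

# Map each label to an integer index and back; models work with numbers.
label_to_index = {name: i for i, name in enumerate(CLASSES)}
index_to_label = {i: name for name, i in label_to_index.items()}

# Hypothetical training examples: (image file, label) pairs.
training_dataset = [
    ("images/0001.jpg", "tom"),
    ("images/0002.jpg", "jerry"),
    ("images/0003.jpg", "spike"),
    ("images/0004.jpg", "jerry"),
]


def encode_labels(dataset):
    """Return integer targets, rejecting labels outside the predefined set."""
    targets = []
    for path, label in dataset:
        if label not in label_to_index:
            raise ValueError(f"unknown label {label!r} for {path}")
        targets.append(label_to_index[label])
    return targets


targets = encode_labels(training_dataset)
print(targets)  # [0, 1, 2, 1]
```

Rejecting unknown labels early keeps the "predefined set of categories" rule explicit instead of letting a typo silently create a new class.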

Now that we know what an image classifier model is, let us understand how to create a deep-learning image classifier model step by step.

Classification Pipeline:

Image Classification steps: 1. Collect Dataset 2. Split Dataset 3. Autotune 4. Train and Test

The classification pipeline has 5 steps: 

  1. Collect Data: collect and preprocess the raw data.  
  2. Split Data: split the preprocessed data into train, validation, and test data. 
  3. Autotune: find the best parameters on the validation data. 
  4. Train: train the final model with the best parameters on the combined training and validation data. 
  5. Test: get metrics and predictions on test data. 

Step 1: Gather your Dataset

We need images and labels associated with each image. These images and their labels form our dataset. The labels should be from a finite and predefined set of categories like:

Categories – tom, jerry, spike.

Things to keep in mind:

  • The number of images from each category should be approximately uniform, for example 1000 images for Tom, 1000 for Jerry, and 1000 for Spike.
  • If we keep 2000 images for Jerry but only 1000 each for Tom and Spike, our classifier will naturally become biased toward the heavily represented category.
  • To prevent this bias, avoid class imbalance and gather a roughly equal number of images for each category.
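A quick way to catch class imbalance before training is to count the labels. The sketch below uses the image counts from the bullets above; the function name and the `max_ratio` threshold of 1.5 are illustrative assumptions, not a standard rule.

```python
from collections import Counter


def find_overrepresented(labels, max_ratio=1.5):
    """Return classes whose count exceeds max_ratio times the smallest class.

    max_ratio is an arbitrary illustrative threshold.
    """
    counts = Counter(labels)
    smallest = min(counts.values())
    return {name: n for name, n in counts.items() if n > max_ratio * smallest}


# Label counts from the imbalanced example above.
labels = ["tom"] * 1000 + ["jerry"] * 2000 + ["spike"] * 1000
print(find_overrepresented(labels))  # {'jerry': 2000}
```

An empty result means no class dominates by more than the chosen ratio.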

Step 2: Split Your Dataset

After gathering the initial data, we split it into two parts:

  1. A training set
  2. A testing set

A training set teaches our classifier what each category looks like. The classifier makes predictions on input data and then corrects itself if predictions are wrong.

After the classifier is trained, we evaluate the performance on a testing set.

You can split the data into training and testing sets in the following ways:

Pie Chart of train data and test data
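A minimal sketch of such a split is shown below. It shuffles the examples with a fixed seed and holds out a fraction for testing. The 25% test fraction, the seed, and the file names are assumptions chosen for illustration, not ratios taken from this article.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=42):
    """Shuffle a copy of the examples and hold out a fraction for testing."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical (image file, label) pairs.
names = ["tom", "jerry", "spike"]
examples = [(f"img_{i:03d}.jpg", names[i % 3]) for i in range(12)]

train_set, test_set = train_test_split(examples)
print(len(train_set), len(test_set))  # 9 3
```

The important property is that the two sets never share an image, so the test score reflects images the classifier has not seen.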

Validation Set:

The validation set is held out from the training data and used as “fake test data” so that we can tune our hyperparameters (autotuning) without touching the real test set. We generally allocate 10%-20% of the training dataset for validation.
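To make the idea concrete, the sketch below holds out 20% of a toy training set for validation and uses it to choose a hyperparameter. A tiny k-nearest-neighbour classifier on one-dimensional, made-up features stands in for a real network here, and the candidate values of k are arbitrary assumptions.

```python
import random
from collections import Counter


def split_off_validation(train_examples, val_fraction=0.2, seed=0):
    """Hold out a fraction of the training examples as a validation set."""
    shuffled = list(train_examples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(round(len(shuffled) * val_fraction)))
    return shuffled[n_val:], shuffled[:n_val]


def knn_predict(train, x, k):
    """Majority vote among the k training examples closest to feature x."""
    nearest = sorted(train, key=lambda ex: abs(ex[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, held_out, k):
    """Fraction of held-out examples that are classified correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in held_out)
    return correct / len(held_out)


def autotune(train, val, candidates=(1, 3, 5)):
    """Score each candidate k on the validation set and keep the best one."""
    scores = {k: accuracy(train, val, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Made-up 1-D features; the two classes overlap between 6 and 9.
data = [(float(i), "tom") for i in range(10)]
data += [(6.0 + i, "jerry") for i in range(10)]

train, val = split_off_validation(data)
best_k, scores = autotune(train, val)
print(len(train), len(val))  # 16 4
print(best_k, scores)
```

The test set plays no part in this choice; it is only used once the hyperparameters are fixed.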

Step 3: Train Your Network

Once our dataset splits are ready, we can start training our network. Our goal is to teach our neural network to recognize each category in our labeled data. When the model makes a mistake, it learns from the error and improves itself.
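The "learn from mistakes" loop can be sketched with a much simpler model than a Convolutional Neural Network. The example below uses a multi-class perceptron, named plainly as a stand-in, on hypothetical two-dimensional feature vectors instead of real images. The weights change only when a prediction is wrong.

```python
CLASSES = ["tom", "jerry", "spike"]

# Hypothetical 2-D feature vectors standing in for images.
train_set = [
    ((1.0, 0.1), "tom"),
    ((0.1, 1.0), "jerry"),
    ((-1.0, -0.9), "spike"),
    ((0.9, -0.1), "tom"),
    ((-0.1, 0.9), "jerry"),
    ((-0.9, -1.0), "spike"),
]


def predict(weights, x):
    """Return the class whose weight vector gives the highest score."""
    scores = [sum(w * xi for w, xi in zip(weights[c], x)) for c in CLASSES]
    return CLASSES[max(range(len(CLASSES)), key=scores.__getitem__)]


def train(examples, epochs=10):
    """Perceptron training: update the weights only on wrong predictions."""
    weights = {c: [0.0, 0.0] for c in CLASSES}
    history = []  # number of mistakes in each epoch
    for _ in range(epochs):
        mistakes = 0
        for x, label in examples:
            guess = predict(weights, x)
            if guess != label:
                mistakes += 1
                for i, xi in enumerate(x):
                    weights[label][i] += xi  # pull the correct class closer
                    weights[guess][i] -= xi  # push the wrong class away
        history.append(mistakes)
        if mistakes == 0:  # every training example is now classified correctly
            break
    return weights, history


weights, history = train(train_set)
print(history)  # [2, 0]
```

A real network replaces the mistake-driven update with gradient descent on a loss function, but the cycle of predict, compare with the label, and adjust is the same.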

Step 4: Evaluate

Finally, we need to evaluate the performance of our trained network. We present each of the images in our testing dataset to the network and ask it to predict the label for that image. We tabulate the predictions of the trained model and compare them to the actual category of each image. Thus, we can determine how many images our model classified correctly.
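The tabulation described above can be sketched in a few lines. The true and predicted labels below are made up for illustration. The function counts each (actual, predicted) pair, which gives both the number of correct classifications and a simple confusion table.

```python
from collections import Counter


def evaluate(true_labels, predicted_labels):
    """Tabulate predictions against actual labels and compute accuracy."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    # Keys are (actual, predicted) pairs; values are how often each occurred.
    confusion = Counter(zip(true_labels, predicted_labels))
    correct = sum(n for (actual, pred), n in confusion.items() if actual == pred)
    return correct, correct / len(true_labels), confusion


# Made-up test-set results.
true_labels = ["tom", "tom", "jerry", "jerry", "spike", "spike"]
predicted = ["tom", "jerry", "jerry", "jerry", "spike", "tom"]

correct, accuracy, confusion = evaluate(true_labels, predicted)
print(correct, round(accuracy, 2))  # 4 0.67
print(confusion[("tom", "jerry")])  # 1
```

The off-diagonal entries, such as ("tom", "jerry"), show which categories the model confuses with each other, which a single accuracy number hides.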

Image Classification Output

Our deep-learning image classifier is now ready, built with a data-driven approach and supervised learning.