2019, Jun 07
Detecting Malaria Cells
Introduction
This blog post will provide an example of how to properly work through a deep learning problem.
In this case I will be using this dataset from Kaggle, which contains images of human cells split into two classes: healthy cells and (malaria) infected ones.
So, the purpose of this dataset is to detect whether a cell is healthy or infected.
Dependencies
- PyTorch: neural network creation, plus tools for training and testing it
Why PyTorch
There are a few reasons why I've used PyTorch for this problem.
One thing to clarify is that this site has contributions using both major libraries, so either of the two can be chosen. Also, at the time of writing, TensorFlow is in the middle of a heavily-changing major version that is not fully developed yet. A blog post based on the old TensorFlow API might get outdated soon, so it seems more fruitful to learn some of the PyTorch functionality in the meantime.
Data Analysis
A must when training a deep learning model on images is to look at the class distribution. This is a binary classification problem, and the dataset is balanced: there are 13,780 images for each class.
These numbers can be obtained simply by counting the images in each class folder.
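For instance, a small script along these lines does the counting (a minimal sketch, assuming the ImageFolder layout used later, with one subfolder of data/ per class):

import os

datadir = 'data'
for class_name in sorted(os.listdir(datadir)):
    class_dir = os.path.join(datadir, class_name)
    if os.path.isdir(class_dir):
        # every file inside a class folder is one sample image
        print('{}: {} images'.format(class_name, len(os.listdir(class_dir))))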
It is also interesting to look at some image samples to get an idea of the problem being dealt with. One example of each class follows:
A healthy cell (top) and an infected cell (bottom).
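The snippet below is a minimal sketch of how such samples can be displayed; the folder names Uninfected and Parasitized are assumptions based on the Kaggle dataset layout, so adjust them to whatever your class folders are called.

import os
from PIL import Image
import matplotlib.pyplot as plt

datadir = 'data'
# assumed folder names; check your own class folders
for i, class_name in enumerate(['Uninfected', 'Parasitized']):
    class_dir = os.path.join(datadir, class_name)
    sample = os.path.join(class_dir, os.listdir(class_dir)[0])
    plt.subplot(2, 1, i + 1)
    plt.imshow(Image.open(sample))
    plt.title(class_name)
    plt.axis('off')
plt.show()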
The Neural Network Structure
We will see the code later, but the basic structure is composed of 3 parts:
- Convolution + Max Pooling Layers: This part consists of three two-layer groups, each formed by a Convolution Layer followed by a Max Pooling Layer. The Convolution Layers use a ReLU activation function. The purpose of this part is to extract the features from the whole image.
- Linear ReLU Layers: The next part consists of three Linear Layers, the first two with ReLU activation functions (the last one outputs the raw class scores). These layers learn the feature combinations that make a sample positive or negative. This is the 'decision' part.
- Dropout Layers: Dropout is interspersed throughout the network. A Dropout Layer randomly discards part of the activations during training, which reduces some of the overfitting the net might be doing.
Training
The usual way of training a neural network with PyTorch is to define the model as a Python class and then use it from the training script.
The class contains the whole structure definition, while the script uses it iteratively. This class lives in a file named "model.py" and will be imported later on.
import torch.optim
import torch.nn as nn
import torch.nn.functional as func


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # three convolution + max pooling groups that extract the features
        self.conv_1 = nn.Conv2d(3, 16, kernel_size=4)
        self.pool_1 = nn.MaxPool2d(kernel_size=(3, 3), stride=2)
        self.conv_2 = nn.Conv2d(16, 32, kernel_size=(4, 4))
        self.pool_2 = nn.MaxPool2d(kernel_size=(3, 3), stride=2)
        self.conv_3 = nn.Conv2d(32, 64, kernel_size=(4, 4))
        self.pool_3 = nn.MaxPool2d(kernel_size=(4, 4), stride=2)
        # three linear layers that make the 'decision'
        self.fc_1 = nn.Linear(4 * 4 * 64, 500)
        self.fc_2 = nn.Linear(500, 100)
        self.fc_3 = nn.Linear(100, 2)
        # dropout, applied between layers to reduce overfitting
        self.dropout = nn.Dropout(0.25)

    def forward(self, x):
        x = self.pool_1(func.relu(self.conv_1(x)))
        x = self.dropout(x)
        x = self.pool_2(func.relu(self.conv_2(x)))
        x = self.dropout(x)
        x = self.pool_3(func.relu(self.conv_3(x)))
        x = self.dropout(x)
        # flatten the 4x4x64 feature maps into a vector for the linear layers
        x = x.reshape(-1, 4 * 4 * 64)
        x = func.relu(self.fc_1(x))
        x = self.dropout(x)
        x = func.relu(self.fc_2(x))
        x = self.dropout(x)
        x = self.fc_3(x)
        return x

    def get_criterion_and_optimizer(self):
        # cross-entropy loss for the two classes, Adam as the optimizer
        criterion = nn.CrossEntropyLoss()
        optimizer = torch.optim.Adam(self.parameters(), lr=0.001)
        return criterion, optimizer
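The 4 * 4 * 64 input size of fc_1 deserves a word: starting from 64x64 crops, each convolution (kernel size 4, no padding) reduces each spatial dimension by 3, and each pooling roughly halves the resolution, leaving 64 feature maps of 4x4. As a quick sketch (not part of the original script), we can double-check this by pushing a dummy batch through the convolutional part:

import torch

net = Net()
dummy = torch.randn(1, 3, 64, 64)  # one fake 64x64 RGB image
x = net.pool_1(torch.relu(net.conv_1(dummy)))
x = net.pool_2(torch.relu(net.conv_2(x)))
x = net.pool_3(torch.relu(net.conv_3(x)))
print(x.shape)  # expected: torch.Size([1, 64, 4, 4]), i.e. 4 * 4 * 64 = 1024 values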
First, we will create the image loaders that will feed images to our net.
It's important to shuffle the samples, so that each batch contains a mix of both classes and the order of examples varies between training runs.
import torch
import numpy as np
from torchvision import datasets, transforms
from torch.utils.data import SubsetRandomSampler

import model

test_prop = 0.2
datadir = 'data'

# crop the images to 64x64, turn them into tensors and
# normalize every channel to the [-1, 1] range
train_transforms = transforms.Compose([transforms.RandomResizedCrop(64),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.5, 0.5, 0.5],
                                                            [0.5, 0.5, 0.5])
                                       ])
test_transforms = transforms.Compose([transforms.RandomResizedCrop(64),
                                      transforms.ToTensor(),
                                      transforms.Normalize([0.5, 0.5, 0.5],
                                                           [0.5, 0.5, 0.5])
                                      ])

train_data = datasets.ImageFolder(datadir, transform=train_transforms)
test_data = datasets.ImageFolder(datadir, transform=test_transforms)

# shuffle the sample indexes and split them into train and test subsets
num_train = len(train_data)
indexes = list(range(num_train))
n_test_samples = int(np.floor(test_prop * num_train))
np.random.shuffle(indexes)
train_idx, test_idx = indexes[n_test_samples:], indexes[:n_test_samples]
train_sampler = SubsetRandomSampler(train_idx)
test_sampler = SubsetRandomSampler(test_idx)

train_loader = torch.utils.data.DataLoader(train_data, sampler=train_sampler, batch_size=64)
test_loader = torch.utils.data.DataLoader(test_data, sampler=test_sampler, batch_size=64)
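As a quick sanity check (a sketch, not part of the original script), we can pull a single batch and confirm the tensor shapes match what the network expects:

# grab one batch: 64 images, 3 channels, 64x64 pixels each
images, labels = next(iter(train_loader))
print(images.shape)  # expected: torch.Size([64, 3, 64, 64])
print(labels[:10])   # class indexes assigned by ImageFolder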
Here we will initialize the basic constants, log some info and create the empty (untrained) network.
model_data_path = 'model.data'
n_epochs = 20

print('Detected classes: ', end='')
print(train_loader.dataset.classes)

# build the untrained network and get its loss function and optimizer
model = model.Net()
criterion, optimizer = model.get_criterion_and_optimizer()

# best test loss seen so far, used to decide when to save the model
test_loss_min = np.inf

print()
print('Starting Training; n_epochs={}'.format(n_epochs))
Now we will train, test and measure metrics once per epoch.
for epoch in range(1, n_epochs + 1):
    # set losses to 0 at the beginning of each epoch
    train_loss = 0.0
    test_loss = 0.0

    # start training mode
    print('Epoch {:02d}: Starting Training'.format(epoch))
    model.train()

    # train by batches
    for data, target in train_loader:
        # clear the gradients of all optimized variables
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        train_loss += loss.item() * data.size(0)

    # start evaluation mode
    print('Epoch {:02d}: Starting Testing'.format(epoch))
    model.eval()

    # evaluate by batches, without building the gradient graph
    with torch.no_grad():
        for data, target in test_loader:
            output = model(data)
            loss = criterion(output, target)
            test_loss += loss.item() * data.size(0)

    # calculate the average losses (each loader only covers its index subset)
    avg_train_loss = train_loss / len(train_idx)
    avg_test_loss = test_loss / len(test_idx)
    print('Epoch {:02d}: \tTraining Loss: {:.6f} \tTest Loss: {:.6f}'.format(epoch, avg_train_loss, avg_test_loss))

    # save the model if it improves on the best result we've had
    if avg_test_loss <= test_loss_min:
        print('Test loss decreased ({:.6f} --> {:.6f}). Saving model.'.format(test_loss_min, avg_test_loss))
        torch.save(model.state_dict(), model_data_path)
        test_loss_min = avg_test_loss
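After training, the best-performing weights can be restored from the checkpoint before using the model for inference; a minimal sketch:

# restore the weights with the lowest test loss and switch to eval mode
model.load_state_dict(torch.load(model_data_path))
model.eval()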
However, this problem is especially interesting because of a restriction that its very nature imposes.
This is a model to be used in the healthcare field. This means that its metrics have to be inspected even more carefully than usual, because people's health might end up depending on it. In this case (considering an infected cell the positive class) we need an extremely high recall. Recall tells us what proportion of the infected cells are actually detected, and we must keep that proportion as high as possible in order to miss as few infected patients as possible.
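Recall is defined as TP / (TP + FN), where TP counts infected cells correctly flagged and FN counts infected cells the model missed. Below is a minimal sketch of how it could be measured on the test set; note that 'Parasitized' is an assumption for the infected folder name (ImageFolder assigns class indexes alphabetically, so check train_loader.dataset.classes for the actual names).

# sketch: measure recall for the infected class on the test set
# ('Parasitized' is an assumed folder name; adjust to your dataset)
infected_idx = test_loader.dataset.classes.index('Parasitized')
true_pos, false_neg = 0, 0
model.eval()
with torch.no_grad():
    for data, target in test_loader:
        preds = model(data).argmax(dim=1)
        true_pos += ((preds == infected_idx) & (target == infected_idx)).sum().item()
        false_neg += ((preds != infected_idx) & (target == infected_idx)).sum().item()
print('Recall: {:.4f}'.format(true_pos / (true_pos + false_neg)))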