Python-Only Neural Net: Image Classification on CIFAR-10


In this article, we will walk through image classification with a neural network on the CIFAR-10 dataset, using only Python and NumPy. The CIFAR-10 dataset consists of 60,000 32×32 color images in 10 classes. We will go through the entire process, from loading and preprocessing the data to building, training, and evaluating the network.

Sections

  1. Introduction to Image Classification and CIFAR-10
  2. Loading and Preprocessing the Data
  3. Building the Neural Network
  4. Training the Neural Network
  5. Evaluating the Model
  6. Improvements and Next Steps

1. Introduction to Image Classification and CIFAR-10

Image Classification: Image classification is the task of assigning an image to one of a set of predefined classes, for example labeling an image as a cat, dog, or airplane.

CIFAR-10 Dataset: The CIFAR-10 dataset is a collection of images used for training machine learning and computer vision algorithms. It has 60,000 images divided into 10 classes (e.g., airplane, automobile, bird, cat, etc.). Each image is 32×32 pixels and is in color.
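For orientation, here is a small sketch of how the raw data is laid out. In the Python version of the dataset, each image is a flat row of 3,072 values: the first 1,024 are the red channel, the next 1,024 green, and the last 1,024 blue, all in row-major order. The `row` variable below is just a placeholder for one real data row:

import numpy as np

CLASSES = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']

# One flattened CIFAR-10 image: 1024 red, 1024 green, 1024 blue values.
row = np.zeros(3072, dtype=np.uint8)  # placeholder for a real data row

# Reshape to channels-first (3, 32, 32), then reorder to (32, 32, 3)
# so standard image tools can display it.
image = row.reshape(3, 32, 32).transpose(1, 2, 0)
print(image.shape)  # (32, 32, 3)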

2. Loading and Preprocessing the Data

Before we can use the CIFAR-10 dataset, we need to load and preprocess it. The dataset already ships split into 50,000 training and 10,000 test images, so preprocessing amounts to normalizing the pixel values.

Data Loading

We will use a helper function to load the CIFAR-10 data from the pickled batch files that ship with the Python version of the dataset (available from https://www.cs.toronto.edu/~kriz/cifar.html). The loader below assumes the archive has been extracted to the default cifar-10-batches-py directory:

import os
import pickle
import numpy as np

def unpickle(file):
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    return batch

def load_cifar10(data_dir='cifar-10-batches-py'):
    # Load the five training batches and the test batch from the
    # Python version of CIFAR-10 (extracted from the official tarball).
    X_train, y_train = [], []
    for i in range(1, 6):
        batch = unpickle(os.path.join(data_dir, f'data_batch_{i}'))
        X_train.append(batch[b'data'])
        y_train.extend(batch[b'labels'])
    X_train = np.concatenate(X_train).astype(np.float64)  # (50000, 3072)
    y_train = np.array(y_train)                           # (50000,)
    test_batch = unpickle(os.path.join(data_dir, 'test_batch'))
    X_test = test_batch[b'data'].astype(np.float64)       # (10000, 3072)
    y_test = np.array(test_batch[b'labels'])              # (10000,)
    return X_train, y_train, X_test, y_test

Data Preprocessing

We normalize the pixel values to be between 0 and 1 by dividing by 255.

# Normalize the data
X_train = X_train / 255.0
X_test = X_test / 255.0

3. Building the Neural Network

We will build a simple feedforward neural network with one hidden layer. This network will consist of an input layer, a hidden layer with ReLU activation, and an output layer with Softmax activation.

Neural Network Class

Here’s the complete implementation of the neural network:

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size

        # Initialize weights with 1/sqrt(fan_in) scaling; biases start at zero.
        self.W1 = np.random.randn(self.input_size, self.hidden_size) / np.sqrt(self.input_size)
        self.b1 = np.zeros((1, self.hidden_size))
        self.W2 = np.random.randn(self.hidden_size, self.output_size) / np.sqrt(self.hidden_size)
        self.b2 = np.zeros((1, self.output_size))

    def forward(self, X):
        # Forward pass: linear -> ReLU -> linear -> softmax
        self.z1 = np.dot(X, self.W1) + self.b1
        self.a1 = self.relu(self.z1)
        self.z2 = np.dot(self.a1, self.W2) + self.b2
        self.a2 = self.softmax(self.z2)
        return self.a2

    def backward(self, X, y, output, learning_rate):
        # Backward pass; gradients are averaged over the batch so the
        # update scale does not depend on the batch size.
        n = X.shape[0]
        one_hot_y = self.one_hot_encode(y)
        dz2 = (output - one_hot_y) / n
        dW2 = np.dot(self.a1.T, dz2)
        db2 = np.sum(dz2, axis=0, keepdims=True)
        dz1 = np.dot(dz2, self.W2.T) * self.relu_derivative(self.z1)
        dW1 = np.dot(X.T, dz1)
        db1 = np.sum(dz1, axis=0, keepdims=True)

        # Gradient-descent parameter update
        self.W2 -= learning_rate * dW2
        self.b2 -= learning_rate * db2
        self.W1 -= learning_rate * dW1
        self.b1 -= learning_rate * db1

    def train(self, X, y, epochs, batch_size, learning_rate):
        for epoch in range(epochs):
            # Shuffle once per epoch so mini-batches differ across passes.
            perm = np.random.permutation(X.shape[0])
            X, y = X[perm], y[perm]
            for i in range(0, X.shape[0], batch_size):
                X_batch = X[i:i+batch_size]
                y_batch = y[i:i+batch_size]

                output = self.forward(X_batch)
                self.backward(X_batch, y_batch, output, learning_rate)

            if epoch % 10 == 0:
                loss = self.calculate_loss(X, y)
                print(f"Epoch {epoch}, Loss: {loss:.4f}")

    def predict(self, X):
        output = self.forward(X)
        return np.argmax(output, axis=1)

    def calculate_loss(self, X, y):
        # Mean cross-entropy; the small epsilon guards against log(0).
        output = self.forward(X)
        one_hot_y = self.one_hot_encode(y)
        return -np.sum(one_hot_y * np.log(output + 1e-8)) / X.shape[0]

    @staticmethod
    def relu(x):
        return np.maximum(0, x)

    @staticmethod
    def relu_derivative(x):
        return (x > 0).astype(float)

    @staticmethod
    def softmax(x):
        # Subtract the row-wise max for numerical stability.
        exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
        return exp_x / np.sum(exp_x, axis=1, keepdims=True)

    @staticmethod
    def one_hot_encode(y):
        one_hot = np.zeros((y.size, 10))
        one_hot[np.arange(y.size), y] = 1
        return one_hot
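Before training on real data, a quick sanity check on random inputs helps confirm that the shapes line up and the softmax output is a valid probability distribution. The sizes below are arbitrary:

# Quick sanity check on random data (sizes are arbitrary).
X_dummy = np.random.randn(4, 3072)
net = NeuralNetwork(input_size=3072, hidden_size=16, output_size=10)
probs = net.forward(X_dummy)
print(probs.shape)        # (4, 10)
print(probs.sum(axis=1))  # each row sums to ~1.0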

4. Training the Neural Network

The training process involves feeding the training data through the network, computing the loss, and updating the weights using backpropagation.

if __name__ == "__main__":
    # Load and preprocess data
    X_train, y_train, X_test, y_test = load_cifar10()

    # Normalize the data
    X_train = X_train / 255.0
    X_test = X_test / 255.0

    # Initialize and train the model
    input_size = 3072  # 32 * 32 * 3
    hidden_size = 128
    output_size = 10
    model = NeuralNetwork(input_size, hidden_size, output_size)

    model.train(X_train, y_train, epochs=100, batch_size=32, learning_rate=0.01)

5. Evaluating the Model

After training the model, we evaluate its performance on both the training and testing sets by calculating the accuracy. For a single hidden layer operating on raw pixels, a test accuracy roughly in the 40-50% range is typical on CIFAR-10: well short of modern convolutional networks, but far above the 10% random-guess baseline.

    # Evaluate the model
    train_predictions = model.predict(X_train)
    test_predictions = model.predict(X_test)

    train_accuracy = np.mean(train_predictions == y_train)
    test_accuracy = np.mean(test_predictions == y_test)

    print(f"Train Accuracy: {train_accuracy}")
    print(f"Test Accuracy: {test_accuracy}")

6. Improvements and Next Steps

To improve this basic neural network, consider the following:

  • Adding More Layers: Increase the network’s depth by adding more hidden layers.
  • Regularization: Apply techniques like L1/L2 regularization or dropout to prevent overfitting; a minimal sketch combining L2 weight decay with a learning-rate schedule follows this list.
  • Data Augmentation: Enhance the training data by applying transformations like rotation, flipping, and cropping.
  • Learning Rate Schedulers: Adjust the learning rate during training for better convergence.
  • Advanced Architectures: Explore more complex architectures like Convolutional Neural Networks (CNNs), which are particularly effective for image data.

This guide has walked you through the process of building, training, and evaluating a neural network for image classification on the CIFAR-10 dataset. By following these steps and considering potential improvements, you can develop more sophisticated models and achieve better performance in your image classification tasks.
