AlexNet: An Important Architecture In 3 Easy Points

Ajay Ohri


Like human learning, machine learning models need to be improved from time to time. The 2012 ImageNet competition was won by a team from the University of Toronto led by Alex Krizhevsky, then a graduate student, and the winning network became known as AlexNet. The AlexNet architecture consists of convolutional layers, normalization layers, max-pooling layers, fully connected layers, and a softmax layer. Before looking at AlexNet, it helps to understand what a convolutional layer is. A convolutional layer slides small learned filters across pixel data to detect local patterns, and a convolutional neural network (CNN) is built by stacking such layers. For image recognition, CNNs are generally more powerful than traditional fully connected neural networks.

CNNs come in different architectures, such as the earlier LeNet-5, but AlexNet outperformed its predecessors. AlexNet won the annual ImageNet competition in 2012 with a top-5 error rate of 15.3%, about 10.9 percentage points lower than the 26.2% of the second-place entry. A gap of more than 10 percentage points speaks for itself about how large a step forward this architecture was. This is why AlexNet became one of the leading architectures to study when learning deep learning for computer vision.

  1. The Problem
  2. The Dataset
  3. AlexNet

1) The Problem

  • LeNet was one of the earliest CNNs and was popular for recognizing handwritten digits on cheques that had been digitized into pixel images. But LeNet could not process high-resolution images because of the limited computing resources of the time. 
  • Another earlier problem was the vanishing gradient. It occurred because earlier architectures used saturating activation functions such as sigmoid, whose gradients shrink toward zero for large inputs. This makes deep networks difficult to train. 
  • The next problem was to avoid overfitting and to keep activations from growing beyond a reasonable range. Overly large activations in the neurons surrounding the neuron of interest make that neuron's response harder to pick out, which complicates learning. 
  • There was a need to bring the training error down faster. 
  • Finally, there was no CNN architecture that could handle highly challenging datasets with a simple training procedure.
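
The vanishing gradient point above can be illustrated with a small sketch. It assumes a toy chain of layers in which the only factor in the backpropagated gradient is the activation derivative (weights are ignored and every layer sees the same pre-activation). The helper names `sigmoid_grad`, `relu_grad`, and `gradient_factor` are illustrative and not part of any library.

```python
import numpy as np


def sigmoid_grad(x):
    """Derivative of the sigmoid; it peaks at 0.25 when x == 0."""
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return float(x > 0)


def gradient_factor(grad_fn, pre_activation, depth):
    """Product of activation derivatives across `depth` layers.

    Toy assumption: weights are ignored and every layer has the
    same pre-activation value.
    """
    return grad_fn(pre_activation) ** depth


if __name__ == "__main__":
    for depth in (1, 5, 10):
        print(
            depth,
            gradient_factor(sigmoid_grad, 0.0, depth),  # shrinks as 0.25 ** depth
            gradient_factor(relu_grad, 1.0, depth),     # stays at 1.0
        )
```

Even in the best case for sigmoid (a pre-activation of 0), the factor shrinks geometrically with depth, while the ReLU factor stays at 1 for any positive input.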

2) The Dataset

ImageNet is one of the largest image datasets, containing more than 15 million images in more than 20,000 categories, with subcategories that each contain at least 500 images. ImageNet is also the basis of one of the most prominent annual competitions in the field, which AlexNet won in 2012. In this global contest, different software programs compete against each other at image classification and object detection. They are evaluated on a subset of ImageNet with over a million training images plus separate validation and test images.

Entries are ranked by their top-5 error rate: the fraction of test images for which the correct label is not among the model's five highest-scoring predictions. The entry with the lowest top-5 error rate wins. In 2017, SENet achieved a record-low top-5 error rate of 2.251%. This was roughly half of the estimated top-5 error rate of humans, which showed how far machine learning had come over the years. 
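
As a rough illustration of how a top-5 error rate could be computed, here is a minimal NumPy sketch. The function name `top_k_error` and the array layout (one row of class scores per image) are assumptions made for this example, not the competition's official evaluation code.

```python
import numpy as np


def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is NOT among the k top-scoring classes.

    scores: array of shape (num_samples, num_classes)
    labels: array of shape (num_samples,) with integer class ids
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Indices of the k largest scores in each row (order within the k is irrelevant).
    top_k = np.argsort(scores, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return 1.0 - hits.mean()


if __name__ == "__main__":
    # Two images, six classes (toy data).
    scores = np.array([
        [0.1, 0.2, 0.3, 0.4, 0.5, 0.0],
        [0.1, 0.2, 0.3, 0.4, 0.5, 0.0],
    ])
    labels = np.array([5, 0])
    print(top_k_error(scores, labels, k=5))
    print(top_k_error(scores, labels, k=1))
```

Setting `k=1` gives the stricter top-1 error, where only the single highest-scoring class counts as the prediction.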

3) AlexNet

AlexNet revolutionized the field of machine learning. It consists of 5 convolutional layers followed by 3 fully connected layers. It outperformed LeNet because it is deeper, has more filters per layer, and stacks convolutional layers directly on top of one another. The output of each layer is passed through an activation function. 
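
To get a feel for the size of a network with 5 convolutional and 3 fully connected layers, the sketch below counts weights and biases layer by layer. The layer shapes are assumptions taken from the commonly cited AlexNet configuration (1000 output classes, a 6×6×256 feature map feeding the first fully connected layer), and the original split of some layers across two GPUs is ignored, so treat the result as an estimate rather than an exact account of the 2012 model.

```python
# Assumed layer shapes (not stated in this article):
# (name, kernel_height, kernel_width, in_channels, out_channels)
CONV_LAYERS = [
    ("conv1", 11, 11, 3, 96),
    ("conv2", 5, 5, 96, 256),
    ("conv3", 3, 3, 256, 384),
    ("conv4", 3, 3, 384, 384),
    ("conv5", 3, 3, 384, 256),
]

# (name, inputs, outputs)
FC_LAYERS = [
    ("fc6", 6 * 6 * 256, 4096),
    ("fc7", 4096, 4096),
    ("fc8", 4096, 1000),
]


def conv_params(kh, kw, c_in, c_out):
    """Weights plus one bias per output filter."""
    return kh * kw * c_in * c_out + c_out


def fc_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out


def count_parameters():
    """Return a dict mapping layer name to its parameter count."""
    counts = {name: conv_params(kh, kw, ci, co) for name, kh, kw, ci, co in CONV_LAYERS}
    counts.update({name: fc_params(ni, no) for name, ni, no in FC_LAYERS})
    return counts


if __name__ == "__main__":
    counts = count_parameters()
    for name, n in counts.items():
        print(f"{name}: {n:,}")
    print(f"total: {sum(counts.values()):,}")
```

Under these assumptions the total comes to a little over 62 million, in line with the figure usually quoted for AlexNet, and the three fully connected layers account for the large majority of the parameters.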

  • Using the ReLU (Rectified Linear Unit) activation function after every layer lets the network reach the same training error about 6 times faster than with traditional saturating functions such as tanh or sigmoid. With this non-saturating nonlinearity, AlexNet was able to train deep CNNs at a much faster rate. 
  • To counter overfitting, AlexNet adopted the technique of dropout. In dropout, a neuron is dropped with a probability of 0.5 during training, so that it takes part in neither forward nor backward propagation for that step. The cost is that training takes roughly twice as long to converge. 
  • Another key point in the AlexNet architecture is overlapping max pooling, where the pooling window is larger than its stride so that neighboring windows overlap. Overlapping receptive fields make the network slightly harder to overfit and reduce the top-1 and top-5 error rates by 0.4% and 0.3%, respectively. 
  • AlexNet also applied Local Response Normalization, which emphasizes a strongly excited neuron and dampens the surrounding neurons, preventing activations from growing unnecessarily.
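
The four techniques above can be sketched in a few lines of NumPy. This is a simplified illustration, not the original implementation. The 3×3 pooling window with stride 2 and the normalization constants (`n=5`, `k=2`, `alpha=1e-4`, `beta=0.75`) are assumptions taken from the commonly cited AlexNet configuration, and the dropout shown here is the "inverted" variant that rescales during training, which is a swapped-in convenience rather than the original formulation.

```python
import numpy as np


def relu(x):
    """Rectified Linear Unit: negative values become zero."""
    return np.maximum(x, 0.0)


def dropout(x, p_drop=0.5, rng=None, training=True):
    """Inverted dropout: zero each unit with probability p_drop while training.

    Surviving units are scaled by 1 / (1 - p_drop) so no rescaling is
    needed at test time (an assumption of this sketch).
    """
    if not training:
        return x
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(x.shape) >= p_drop
    return x * keep / (1.0 - p_drop)


def max_pool2d(x, size=3, stride=2):
    """Max pooling over a 2-D map; windows overlap whenever size > stride."""
    h, w = x.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = x[r:r + size, c:c + size].max()
    return out


def local_response_norm(a, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Normalize each channel by the squared activity of its n neighboring channels.

    a: activations of shape (channels, height, width)
    """
    channels = a.shape[0]
    squared = a.astype(float) ** 2
    out = np.empty_like(a, dtype=float)
    for i in range(channels):
        lo = max(0, i - n // 2)
        hi = min(channels - 1, i + n // 2)
        denom = (k + alpha * squared[lo:hi + 1].sum(axis=0)) ** beta
        out[i] = a[i] / denom
    return out


if __name__ == "__main__":
    feature_map = np.arange(25, dtype=float).reshape(5, 5)
    print(max_pool2d(relu(feature_map - 10.0)))
```

With a 3×3 window and a stride of 2, a 5×5 map is reduced to 2×2, and the middle row and column are shared by neighboring windows, which is the overlap the third bullet describes.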


After reading the above, we have a basic understanding of what AlexNet is. Earlier, suitable datasets for running deep learning algorithms were not available, which is why deep learning was not popular in business or other real-world settings. AlexNet showed that it was possible, using a network of 8 layers with more than 62 million learnable parameters. It is a leading architecture that opened a new era of research in machine learning, and ready-made implementations of it are available in common deep learning libraries.

