ResNet: A Comprehensive Overview (2021)


A series of breakthroughs in computer vision has occurred in recent years. State-of-the-art results on problems such as image classification and image recognition are now common, especially since the introduction of deep convolutional neural networks. In this article, we will look at ResNet, the ResNet architecture, the ResNet model, and DenseNet vs ResNet.

  1. Revisiting ResNet
  2. Recent Variants and Interpretations of ResNet

1) Revisiting ResNet

According to the universal approximation theorem, a feedforward network with a single hidden layer can approximate any continuous function, given enough capacity. However, that layer could be huge, and the network could be prone to overfitting the data. As a result, there is a common trend among researchers toward making network architectures deeper.

However, merely stacking layers together does not produce a deeper network that works. Deep networks are difficult to train because of the well-known vanishing gradient problem: when a gradient is back-propagated to earlier layers, repeated multiplication can make it vanishingly small. As a result, as the network becomes deeper, its performance saturates or even starts to degrade rapidly. This is at odds with the usual motivation for stacking additional layers in deep neural networks, which is to solve more complex problems with better accuracy.
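The effect of repeated multiplication can be illustrated with a small sketch. It assumes, purely for illustration, that each layer contributes a single scalar factor to the back-propagated gradient; the function name `gradient_after` and the factor 0.25 (the maximum slope of the sigmoid function) are choices made for this example, not part of any particular network.

```python
# Illustrative assumption: every layer scales the back-propagated
# gradient by the same scalar factor.
def gradient_after(depth, per_layer_factor):
    """Return the gradient magnitude that reaches the first layer."""
    grad = 1.0
    for _ in range(depth):
        grad *= per_layer_factor
    return grad

# 0.25 is the maximum slope of the sigmoid activation.
shallow = gradient_after(5, 0.25)
deep = gradient_after(50, 0.25)

print(f"5 layers:  {shallow:.3e}")
print(f"50 layers: {deep:.3e}")
```

With only 5 layers the gradient is already below 0.001; with 50 layers it is far too small to produce a meaningful weight update in the earliest layers.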

The idea behind adding more layers is that they progressively learn more sophisticated features. In image recognition, for example, the first layer might learn to recognize edges, the second to distinguish patterns, the third to detect objects, and so on. However, it has been found that the standard convolutional neural network model has a maximum depth threshold. This can be seen in a plot of the error percentage on training and test data for a 20-layer network and a 56-layer network, in which the deeper network shows the higher error on both. By contrast, there is evidence that residual networks are easier to optimize and can gain accuracy from significantly increased depth.

2) Recent Variants and Interpretations of ResNet

ResNet's architecture is being extensively studied as it grows in importance in the research community. In this section, I'll first present some new ResNet-based architectures, followed by a paper that interprets ResNet as an ensemble of many smaller networks. After the first CNN-based architecture (AlexNet) won the ImageNet 2012 competition, every subsequent winning architecture has used more layers to reduce the error rate. This works for a small number of layers, but as the number of layers grows, a common deep learning problem called the vanishing/exploding gradient emerges: the gradient becomes either 0 or excessively large.

Consequently, as the number of layers increases, so do the training and test error rates. The ResNet architecture introduced the idea of the residual network to address the vanishing/exploding gradient problem. It uses a technique called skip connections: a skip connection bypasses a few layers and feeds directly into the output of a later layer. The network starts from a 34-layer plain architecture inspired by VGG-19, to which the shortcut connections are then added. These shortcut connections turn the architecture into a residual network. From what we have seen so far, increasing the depth can improve the network's accuracy as long as overfitting is avoided.
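A skip connection can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the actual ResNet implementation: it uses small fully connected layers on lists of floats instead of convolutions, omits batch normalization, and the helper names `relu`, `linear`, and `residual_block` are invented for this example.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]

def linear(v, weights):
    """Multiply a weight matrix (list of rows) by the vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def residual_block(x, w1, w2):
    """Two weight layers plus a skip connection (simplified, no bias)."""
    # F(x): the path through the stacked layers
    f = linear(relu(linear(x, w1)), w2)
    # Skip connection: add the unchanged input to F(x), then apply ReLU
    return relu([fi + xi for fi, xi in zip(f, x)])

zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 2.0]

# With all weights at zero, F(x) is zero and the block passes x through.
print(residual_block(x, zeros, zeros))  # [1.0, 2.0]
```

The point of the design is visible in the last line: even when the stacked layers contribute nothing, the input still reaches the output through the shortcut, so an added block can behave as an identity mapping instead of degrading the signal.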

However, as the depth of the network grows, the signal needed to adjust the weights, which comes from comparing the ground truth with the prediction at the network's end, becomes very small at the earlier layers. This essentially means that the earlier layers learn almost nothing. This is referred to as the vanishing gradient. The second issue with training deeper networks is that the optimization is performed over a huge parameter space, so naively adding layers leads to higher training error. This is referred to as the degradation problem. Residual networks, as seen in the figure, allow such deep networks to be trained by building the network from modules called residual blocks.


Residual networks, or ResNets, learn residual functions with reference to the layer inputs instead of learning unreferenced functions. Rather than hoping that every few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. They build networks by stacking residual blocks on top of each other: a ResNet-50, for example, has fifty layers made up of these blocks. In plain networks without shortcut connections, the 18-layer network performs better than the 34-layer network, even though its solution space is a subspace of that of the 34-layer network.
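The residual mapping described above can be written out explicitly. In the notation below, which follows the usual presentation of residual learning, H(x) is the desired underlying mapping, F(x) is the residual that the stacked layers actually fit, y is the block output, and I is the identity; the final ReLU and any projection on the shortcut are left out for simplicity.

```latex
\begin{aligned}
\mathcal{F}(x) &:= \mathcal{H}(x) - x \\
y &= \mathcal{F}(x) + x \\
\frac{\partial y}{\partial x} &= \frac{\partial \mathcal{F}}{\partial x} + I
\end{aligned}
```

The last line suggests why these blocks help with vanishing gradients: the identity term gives the gradient a direct path back to earlier layers, even when the derivative of F is small.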

When the network is deeper, ResNet outperforms its plain counterpart by a large margin. ResNet-50 is a convolutional neural network that has been trained on more than a million images from the ImageNet database. The network is 50 layers deep and can classify images into 1000 object categories, including keyboards, mice, pencils, and various animals. As a result, the network has learned rich feature representations for a wide range of images. The network's image input size is 224 by 224 pixels. There are also ResNet18, ResNet101, and ResNet152 versions.
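The numbers in these model names come from counting weighted layers. The sketch below reproduces that arithmetic, assuming the per-stage block counts of the standard ImageNet configurations from the original ResNet paper; the names `CONFIGS` and `count_layers` are made up for this example, and convolutions on projection shortcuts are not counted.

```python
# (blocks per stage, conv layers per block) for each variant.
# Basic blocks have 2 conv layers; bottleneck blocks have 3.
CONFIGS = {
    "ResNet18": ([2, 2, 2, 2], 2),
    "ResNet34": ([3, 4, 6, 3], 2),
    "ResNet50": ([3, 4, 6, 3], 3),
    "ResNet101": ([3, 4, 23, 3], 3),
    "ResNet152": ([3, 8, 36, 3], 3),
}

def count_layers(blocks_per_stage, convs_per_block):
    """Count weighted layers: all block convs, plus the first
    convolution and the final fully connected layer."""
    return sum(blocks_per_stage) * convs_per_block + 2

for name, (blocks, convs) in CONFIGS.items():
    print(name, count_layers(blocks, convs))
```

For ResNet-50, for instance, the four stages hold 3 + 4 + 6 + 3 = 16 bottleneck blocks of 3 convolutions each, giving 48 layers, plus the initial convolution and the final classifier for a total of 50.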

