Handling variable shaped & high resolution images for multi-class classification problem
Document type: Master thesis
Rights access: Open Access
Convolutional Neural Networks (CNNs) are usually trained on images of a pre-determined, fixed spatial size. Although scale invariance is considered important for visual representations, CNNs are not scale invariant with respect to the spatial resolution of the input image, since a change in image dimensions may lead to a non-linear change in their output. At the same time, there are applications (e.g. in medicine) where images come in multiple scales and shapes, leaving no room for the common transformations that deform and shrink images and thereby lose important information. Retaining high-resolution information can also be a big burden resource-wise, with high computational cost, memory and time requirements. Meanwhile, research focus has shifted from parameter optimization and connection readjustment towards improved architectural design of the network, with state-of-the-art networks such as Xception, ResNeXt, PolyNet and others exploring the effect of different transformations on CNNs' learning capacity. Instead of modifying the internals of CNNs, the METavlitó project focuses mainly on the pre-processing stage of the network in order to handle high-resolution images as well as the variability in their shape. METavlitó proposes two components: one for clustering images' resolutions into buckets, and a training component for scale-invariant learning that employs an input-agnostic architecture and decreases the average GPU memory requirements. Compared to a classic approach that applies the common pre-processing transformations (resizing and cropping) before training, our solution, using the same architecture, controls overfitting better, increases accuracy by 3 to 5%, and decreases the average GPU memory needs by approximately 43%, and thus the total training and validation time.
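The idea of grouping image resolutions into buckets can be illustrated with a minimal sketch. It is not the clustering method used in METavlitó, which the abstract does not specify. It assumes a simple grid quantization in which each dimension is rounded up to a multiple of a hypothetical `step` parameter, and the names `bucket_key` and `bucket_by_resolution` are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def bucket_key(size: Size, step: int = 256) -> Size:
    """Round each dimension up to the next multiple of `step` (assumed scheme)."""
    width, height = size
    # -(-a // b) is integer ceiling division.
    return (-(-width // step) * step, -(-height // step) * step)


def bucket_by_resolution(sizes: List[Size], step: int = 256) -> Dict[Size, List[int]]:
    """Map each bucket resolution to the indices of the images that fall into it."""
    buckets: Dict[Size, List[int]] = defaultdict(list)
    for index, size in enumerate(sizes):
        buckets[bucket_key(size, step)].append(index)
    return dict(buckets)


if __name__ == "__main__":
    image_sizes = [(300, 500), (510, 400), (1000, 200)]
    for resolution, indices in bucket_by_resolution(image_sizes).items():
        print(resolution, indices)
```

Under this assumption, images in the same bucket could be padded or resized to the bucket resolution and batched together, so that each batch has a single shape while different batches keep different resolutions.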
All rights reserved. This work is protected by the corresponding intellectual and industrial property rights. Without prejudice to any existing legal exemptions, reproduction, distribution, public communication or transformation of this work are prohibited without permission of the copyright holder