Mini VGG Network Structure Project
Description
Use a Google Colab or Jupyter notebook (.ipynb) for each experiment on one model variant. Name the files minivgg, var1, var2, var3, and var4.
Implement a mini-VGG network and several of its variants. Train and test the models on the CIFAR-10 dataset (https://www.cs.toronto.edu/~kriz/cifar.html). Investigate and compare the performance of the variants to the original model. Write half a page to one page that 1) summarizes your experiment results and discusses 2) the classification performance of the models,
as well as 3) their size (number of parameters) and 4) the computation time to train them.
Mini-VGG network structure:
Layer   Type (window size)   # filters
1       Conv 3×3             64
2       Conv 3×3             64
3       MaxPool 2×2          —
4       Conv 3×3             128
5       Conv 3×3             128
6       MaxPool 2×2          —
7       Conv 3×3             256
8       Conv 3×3             256
9       MaxPool 2×2          —
10*     Fully connected      512
11**    Softmax              —
* Note: you need a reshape (flatten) layer before this layer to reshape the data.
** Use cross-entropy loss (torch.nn.CrossEntropyLoss or tf.nn.softmax_cross_entropy_with_logits()). Feed
the loss function with the logits before softmax activation, but compute the predictions for accuracy after
softmax activation.
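As a rough guide, the table above could be realized in PyTorch along the following lines. This is a sketch, not the required implementation: padding=1 is assumed for every 3×3 conv so that only the maxpool layers shrink the feature maps, and the class name MiniVGG is illustrative.

```python
import torch
import torch.nn as nn

class MiniVGG(nn.Module):
    """Mini-VGG for 32x32 CIFAR-10 images, following the layer table above."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 8x8 -> 4x4
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                         # the "reshape" layer (*)
            nn.Linear(256 * 4 * 4, 512), nn.ReLU(),
            nn.Linear(512, num_classes),          # logits; CrossEntropyLoss
        )                                         # applies softmax internally

    def forward(self, x):
        return self.classifier(self.features(x))
```

Note that the model returns raw logits, matching note (**): pass them straight to torch.nn.CrossEntropyLoss, and apply softmax (or just argmax) only when computing accuracy.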
Report the performance of each network by doing the following:
A) Plot training loss vs validation loss
B) Plot training accuracy vs validation accuracy
C) Calculate test accuracy
1. Implement the mini-VGG model and report its performance. Use the ReLU activation function for all conv/fc layers except the last one.
2. Variant 1: Change the ReLU activation functions to SELU and Swish. Would the performance improve?
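In PyTorch these are drop-in swaps; note that nn.SiLU is PyTorch's name for Swish, i.e. x · sigmoid(x) (block shapes below are only illustrative):

```python
import torch
import torch.nn as nn

# Variant 1 activation swaps: replace each nn.ReLU() with nn.SELU() or
# nn.SiLU() (PyTorch's Swish implementation).
relu_block  = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())
selu_block  = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.SELU())
swish_block = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.SiLU())

# Sanity check of the Swish definition: SiLU(x) = x * sigmoid(x)
x = torch.randn(5)
assert torch.allclose(nn.SiLU()(x), x * torch.sigmoid(x))
```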
3. Variant 2: Remove the maxpool layers. Use stride=2 in the conv layer before each maxpool to achieve a similar size reduction. Would the performance improve?
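One way to see that the two downsampling schemes are interchangeable in shape (channel counts here are illustrative):

```python
import torch
import torch.nn as nn

# Variant 2: instead of conv -> maxpool, let the conv itself downsample
# with stride=2 (padding=1 keeps the 3x3 window centered).
pooled  = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.MaxPool2d(2))
strided = nn.Conv2d(64, 64, 3, stride=2, padding=1)

x = torch.randn(1, 64, 32, 32)
# Both map 32x32 feature maps down to 16x16.
assert pooled(x).shape == strided(x).shape == (1, 64, 16, 16)
```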
4. Variant 3: Add a few dropout layers in the model. Would the performance improve? Try 2 different ways to add the dropout layers. Describe the ones you tried and their performance.
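Two hypothetical placements you might try, sketched in PyTorch (the p values and layer sizes are illustrative): (a) a single Dropout before the fully connected layer, and (b) spatial Dropout2d after a pooling stage, which drops whole feature maps.

```python
import torch
import torch.nn as nn

# (a) Dropout on the flattened features, just before the FC layer.
fc_dropout = nn.Sequential(
    nn.Flatten(),
    nn.Dropout(p=0.5),
    nn.Linear(256 * 4 * 4, 512),
)

# (b) Dropout2d after a conv/pool block: zeroes entire channels.
conv_dropout = nn.Sequential(
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Dropout2d(p=0.25),
)
```

Remember that dropout is active only in training mode; call model.eval() before measuring validation/test accuracy so it becomes a no-op.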
5. Variant 4: Remove layers 9 and 10. Add two (1, 1) convolution layers: conv (1, 1) × 128 and conv (1, 1) × 10. Then add GlobalAveragePooling2D to merge the feature maps before passing them to softmax. This is an all-convolutional structure (no fully connected layers).
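In PyTorch, the all-convolutional head might look like the following sketch, where nn.AdaptiveAvgPool2d(1) plays the role of Keras's GlobalAveragePooling2D (the variable names are illustrative):

```python
import torch
import torch.nn as nn

# Variant 4 head: 1x1 convs + global average pooling replace the
# flatten + fully connected layers.
head = nn.Sequential(
    nn.Conv2d(256, 128, kernel_size=1), nn.ReLU(),
    nn.Conv2d(128, 10, kernel_size=1),
    nn.AdaptiveAvgPool2d(1),   # average each of the 10 maps to 1x1
    nn.Flatten(),              # (N, 10, 1, 1) -> (N, 10) logits
)

# With layers 9-10 removed, the conv stack outputs 256 maps of size 8x8.
x = torch.randn(2, 256, 8, 8)
assert head(x).shape == (2, 10)
```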