Due: October 31st at 11:59pm
In the first few slides about neural networks (also section 7.1 in chapter e-7), we discussed the expressive power of multi-layer perceptrons with a “sign” activation function. Describe in detail a multi-layer perceptron that implements the following decision boundary:
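To make the construction concrete, here is a rough sketch (in Python, using NumPy) of a two-layer perceptron with sign activations. Since the decision boundary you are asked to implement is not reproduced here, the square region 0 <= x1 <= 1, 0 <= x2 <= 1 used below, along with the specific weights and the -3.5 threshold, is only a hypothetical illustration: each hidden sign unit detects one half-plane, and the output unit ANDs them together.

import numpy as np

def sign(z):
    return np.where(z >= 0, 1.0, -1.0)

# Hidden layer: each row of W1, together with its bias in b1, defines one
# half-plane; the sign unit outputs +1 inside that half-plane, -1 outside.
W1 = np.array([[ 1.0,  0.0],    # x1 >= 0
               [-1.0,  0.0],    # x1 <= 1
               [ 0.0,  1.0],    # x2 >= 0
               [ 0.0, -1.0]])   # x2 <= 1
b1 = np.array([0.0, 1.0, 0.0, 1.0])

# Output unit: an AND of the four hidden units.  The sum of four +/-1 values
# reaches the -3.5 threshold only when all four are +1.
W2 = np.array([1.0, 1.0, 1.0, 1.0])
b2 = -3.5

def mlp(x):
    h = sign(W1 @ x + b1)        # first layer: four half-plane detectors
    return sign(W2 @ h + b2)     # second layer: +1 only if all detectors fire

print(mlp(np.array([0.5, 0.5])))   # inside the square  -> 1.0
print(mlp(np.array([2.0, 0.5])))   # outside the square -> -1.0

Your answer for Part 1 should describe the analogous weights, biases, and layer structure for the decision boundary shown in the assignment figure.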
In this part of the assignment we will explore classification of handwritten digits with neural networks, using part of the MNIST dataset, which is widely used in the machine learning community. Your task is to investigate various aspects of multi-layer neural networks on this dataset.
For simplicity, use 25 percent of the data for evaluating network performance and reserve the rest for training. Normalize the data by dividing the features by the maximum value, which scales them to the range [0,1] (since the minimum is 0). As a basis for your implementation, use the neural network code I showed in class.
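A minimal sketch of this setup, assuming scikit-learn is available for fetching MNIST and splitting the data (substitute your own loading step if the data is provided with the assignment; the subset size and random seed below are arbitrary choices):

import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

# One possible way to obtain MNIST; adapt this if the data is provided
# as a file with the assignment.
X, y = fetch_openml('mnist_784', version=1, as_frame=False, return_X_y=True)
X, y = X[:10000], y[:10000]          # use only part of the dataset

# Normalize: dividing by the maximum pixel value scales features to [0, 1].
X = X / X.max()

# Reserve 25% of the examples for evaluating network performance,
# and the remaining 75% for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)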
Here's what you need to do:
The code that was provided only uses a bias term in the first layer, not in the later layers. For 5 extra points, modify the code so that it correctly uses a bias in all layers.
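Since the in-class code is not reproduced here, the following is only a sketch of what "a bias in every layer" means; the names weights, biases, and the sigmoid activation are assumptions rather than the structure of the actual code:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Sketch: a forward pass in which *every* layer adds its own bias vector,
# not just the first.  weights[l] and biases[l] are hypothetical names.
def forward(x, weights, biases):
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)    # bias applied at each layer
    return a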
Submit your report via Canvas. Python code can be displayed in your report if it is short and helps explain what you have done. The sample LaTeX document provided in assignment 1 shows how to display Python code. Submit the Python code that was used to generate the results as a file called assignment5.py
(you can split the code into several .py files; Canvas allows you to submit multiple files). Typing
$ python assignment5.py
should generate all the tables/plots used in your report.
A few general guidelines for this and future assignments in the course:
We will take off points if these guidelines are not followed.
Grading sheet for assignment 5

Part 1: 15 points. Part 2: 85 points.

(25 points): Exploration of a network with a single hidden layer
(25 points): Exploration of a network with two hidden layers
(15 points): How to add weight decay
(15 points): Linear activation function
( 5 points): Fixing the code so it handles the bias term correctly