In this segment of the assignment we will explore classification of handwritten digits with neural networks. For that task, we will use part of the [[http://yann.lecun.com/exdb/mnist/ |MNIST]] dataset, which is very commonly used in the machine learning community.
Your task is to explore various aspects of multi-layer neural networks using this dataset.
We have prepared a {{ :assignments:mnist.tar.gz |small subset}} of the data with a given split into training and test data.
Normalize the data by dividing the features by the maximum value, which maps them to the range [0,1] (since the minimum is 0).
As a basis for your implementation, use the neural network code I showed in class.
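A minimal loading-and-normalization sketch in Python/NumPy follows. The file names are assumptions, since they depend on what the archive actually unpacks to; adjust them accordingly.

<code python>
import numpy as np

# Hypothetical file names -- replace with the actual contents of the
# unpacked mnist.tar.gz archive.
X_train = np.loadtxt("mnist_train_features.txt")
y_train = np.loadtxt("mnist_train_labels.txt")
X_test = np.loadtxt("mnist_test_features.txt")
y_test = np.loadtxt("mnist_test_labels.txt")

# Raw pixel intensities start at 0, so dividing by the maximum maps the
# features onto [0, 1]. Use the training-set maximum for both splits so
# the test data is scaled consistently.
max_val = X_train.max()
X_train = X_train / max_val
X_test = X_test / max_val
</code>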
Here's what you need to do:
* The code that was provided only uses a bias in the first layer. Modify the code so that it correctly uses a bias for all layers (one possible approach is sketched after this list). This part is only worth 5 points, and you can do the rest of the assignment with the original version of the code.
* Plot network accuracy as a function of the number of hidden units for a single-layer network with a logistic activation function. Use a range of values over which the network exhibits both under-fitting and over-fitting (a sweep-and-plot skeleton is sketched after this list).
* Plot network accuracy as a function of the number of hidden units for a two-layer network with a logistic activation function. Here, too, demonstrate performance over a range of values where the network exhibits both under-fitting and over-fitting. Does this dataset benefit from the use of more than one layer?
* Add weight decay regularization to the neural network class you used, and explain in your report how you did it (the usual L2 formulation is sketched after this list). Does the network exhibit less over-fitting on this dataset once weight decay is added?
* The provided implementation uses the same activation function in every layer. For solving regression problems we need to use a linear activation function to produce the output of the network. Explain why, and describe what changes need to be made in the code (a per-layer-activation sketch appears after this list).
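For the bias modification, one possible approach (among several) is to append a constant input of 1 before every layer, not just the first, so that each weight matrix carries an extra bias row. The sketch below is illustrative and does not reproduce the in-class code; the function names and the list-of-matrices representation are assumptions:

<code python>
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_with_bias(X, weights):
    # Each W has shape (n_inputs + 1, n_outputs); the extra row holds
    # the bias weights and multiplies the appended column of ones.
    a = X
    for W in weights:
        ones = np.ones((a.shape[0], 1))
        a = logistic(np.hstack([a, ones]) @ W)
    return a
</code>

If you take this route, remember that in the backward pass the delta associated with the appended bias column must be dropped before propagating the error to the previous layer, since the constant bias input carries no error further back.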
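For the two plotting items, a minimal sweep-and-plot skeleton could look like the following. Here train_and_score is a placeholder: it stands for whatever routine you build around the in-class network class that trains a fresh model with h hidden units and returns its test-set accuracy:

<code python>
import matplotlib.pyplot as plt

def sweep_hidden_units(train_and_score, hidden_sizes):
    # Train one network per hidden-layer size and record test accuracy.
    accuracies = [train_and_score(h) for h in hidden_sizes]
    plt.plot(hidden_sizes, accuracies, marker="o")
    plt.xscale("log")  # hidden sizes usually span orders of magnitude
    plt.xlabel("number of hidden units")
    plt.ylabel("test accuracy")
    plt.show()
</code>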
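For weight decay, the usual L2 formulation adds half of lambda times the sum of squared weights to the error function, so the gradient of every weight gains an extra lambda * W term that shrinks it toward zero at each step. A minimal sketch, assuming the weights and their gradients are kept as parallel lists of NumPy arrays:

<code python>
import numpy as np

def decayed_update(weights, grads, eta, lam):
    # Plain gradient descent plus the decay term lam * W; the update
    # modifies the weight arrays in place.
    for W, g in zip(weights, grads):
        W -= eta * (g + lam * W)
</code>

In practice the bias entries are usually excluded from the decay term, since penalizing them does little to combat over-fitting.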
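As a hint for the last item: the logistic function is bounded to (0, 1), which is fine for class probabilities but cannot produce arbitrary real-valued regression targets. One way to support this in code is to store one activation function per layer and use the identity at the output. The sketch below is illustrative, not the in-class code (biases omitted for brevity):

<code python>
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def identity(z):
    # Linear output: unbounded, and its derivative is 1, which also
    # simplifies the output-layer delta in backpropagation.
    return z

def forward_per_layer(X, weights, activations):
    # One activation function per layer; for regression, pass identity
    # as the last entry of `activations`.
    a = X
    for W, act in zip(weights, activations):
        a = act(a @ W)
    return a
</code>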
- | |||
- | The code that was provided does not really have a bias for all but the first layer. For 5 extra points, modify the code so that it correctly uses a bias for all layers. | ||