  * Consider the following statement: The set of all key support vectors is unique. Prove this, or show a counter-example.
  * In class we argued that the fraction of examples that are support vectors provides a bound on the leave-one-out error. Using the definition of key support vectors, prove that a tighter bound on the leave-one-out cross-validation error can be obtained (an empirical check of the classical bound is sketched after the equation):
$$
E_{cv} \leq \frac{\textrm{number of key support vectors}}{N},
$$
where $N$ is the number of training examples.
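As a side note, the classical bound can be checked empirically. A minimal sketch is below; scikit-learn exposes all support vectors but not the "key" subset, so it computes only the looser bound $\frac{\textrm{number of support vectors}}{N}$. The dataset and the value of $C$ are illustrative assumptions.

<code python>
# A hedged sketch (not part of the proof): empirically computing the
# classical leave-one-out bound (#support vectors)/N with scikit-learn.
# Note: scikit-learn exposes all support vectors, not the "key" subset,
# so this gives the looser of the two bounds. The dataset and C are
# illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)
clf = SVC(kernel='linear', C=1.0).fit(X, y)

n_sv = len(clf.support_)   # indices of all support vectors
print("support vectors:", n_sv, " E_cv bound:", n_sv / len(X))
</code>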
Suppose you are given a linearly separable dataset, and you are training the soft-margin SVM, which uses slack variables, with the soft-margin constant $C$ set to some positive value.
Consider the following statement:
Since increasing the $\xi_i$ can only increase the cost function of the primal problem (which we are trying to minimize), at the solution to the primal problem, i.e. the hyperplane that minimizes the primal cost function, all the training examples will have $\xi_i$ equal to zero.
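One way to probe this statement empirically (not a substitute for the argument you give) is to inspect the slack values of a fitted model. A minimal sketch, assuming a toy dataset and $C = 1.0$; scikit-learn does not expose the $\xi_i$ directly, but at the primal optimum they satisfy $\xi_i = \max(0, 1 - y_i f(x_i))$:

<code python>
# A sketch for inspecting slack values of a fitted soft-margin SVM.
# Assumptions: toy two-class data and C=1.0 (both illustrative).
# scikit-learn does not expose the xi_i directly; at the primal optimum
# they satisfy xi_i = max(0, 1 - y_i * f(x_i)).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)
y = 2 * y - 1                         # relabel to {-1, +1}

clf = SVC(kernel='linear', C=1.0).fit(X, y)
f = clf.decision_function(X)          # f(x_i) for each training example
xi = np.maximum(0, 1 - y * f)         # slack variables at the optimum

print("nonzero slacks:", int(np.sum(xi > 1e-9)), "out of", len(xi))
</code>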
Next, we will compare the accuracy of an SVM with a Gaussian kernel on the raw data with the accuracy obtained when the data is normalized to unit vectors (the values of the features of each example are divided by its norm).
This is different from standardization, which operates at the level of individual features. Normalizing to unit vectors is more appropriate for this dataset as it is sparse, i.e. most of the features are zero.
Compare accuracy as measured by the area under the ROC curve in five-fold cross-validation, where the classifier/kernel parameters are chosen by nested cross-validation, i.e. using grid search on the training set of each fold.
Use the scikit-learn [[http://scikit-learn.org/stable/tutorial/statistical_inference/model_selection.html|grid-search]] class for model selection.
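A minimal sketch of this setup is shown below, using ''GridSearchCV'' inside ''cross_val_score''; the placeholder dataset and the grid values are illustrative assumptions, not the required ones.

<code python>
# Sketch of the comparison: GridSearchCV selects C and gamma on the
# training portion of each outer fold (nested CV); the outer loop
# measures the area under the ROC curve. The placeholder dataset and
# the grid values are illustrative assumptions, not the required ones.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.preprocessing import normalize
from sklearn.svm import SVC

# Placeholder data; replace with the assignment's dataset.
X, y = make_classification(n_samples=300, n_features=50, random_state=0)

param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [0.001, 0.01, 0.1, 1]}
inner = GridSearchCV(SVC(kernel='rbf'), param_grid, scoring='roc_auc', cv=5)

# Raw features vs. each example scaled to unit Euclidean norm.
for name, data in [('raw', X), ('unit-norm', normalize(X))]:
    scores = cross_val_score(inner, data, y, scoring='roc_auc', cv=5)
    print(name, "AUC: %.3f +/- %.3f" % (scores.mean(), scores.std()))
</code>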
===== Submission =====
Submit your report via Canvas. Python code can be displayed in your report if it is short and helps explain what you have done. The sample LaTeX document provided in assignment 1 shows how to display Python code. Submit the Python code that was used to generate the results as a file called ''assignment4.py'' (you can split the code into several .py files; Canvas allows you to submit multiple files). Typing

<code>
$ python assignment4.py
</code>

should generate all the tables/plots used in your report.

===== Grading =====