{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Numpy\n",
    "\n",
    "Numpy is python's library for numerical linear algebra, and provides a wealth of functionality for working with vectors, matrices, and tensors.\n",
    "\n",
    "Our first step is to **import** the package (note the \"import as\" shortcut):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Arrays are the standard data containers in numpy, and can have any number of dimensions.\n",
    "\n",
    "Let's create a one dimensional array:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 2, 3])"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'numpy.ndarray'>\n",
      "(3,)\n"
     ]
    }
   ],
   "source": [
    "a = np.array([1, 2, 3])\n",
    "a\n",
    "print(type(a))\n",
    "print(a.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Accessing/modifying array elements:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 2 3\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "array([5, 2, 3])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "print(a[0], a[1], a[2])\n",
    "a[0] = 5                  # Change an element of the array\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Two dimensional arrays:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([[1, 2, 3],\n",
       "        [4, 5, 6]]), (2, 3))"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1 2 4\n",
      "1 2 4\n"
     ]
    }
   ],
   "source": [
    "b = np.array([[1,2,3],[4,5,6]])\n",
    "b,b.shape\n",
    "\n",
    "print(b[0, 0], b[0, 1], b[1, 0])\n",
    "print(b[0][0], b[0][1], b[1][0])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some functions for creating arrays:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 0.  0.]\n",
      " [ 0.  0.]]\n",
      "[[ 1.  1.]\n",
      " [ 1.  1.]]\n",
      "[[ 7.  7.]\n",
      " [ 7.  7.]]\n",
      "[[ 1.  0.]\n",
      " [ 0.  1.]]\n",
      "[[ 0.13849416  0.46435903  0.9068119 ]\n",
      " [ 0.86278242  0.66594523  0.71520984]\n",
      " [ 0.65630407  0.39285181  0.99686805]]\n",
      "[ 2.   2.1  2.2  2.3  2.4  2.5  2.6  2.7  2.8  2.9]\n",
      "[ 1.   1.6  2.2  2.8  3.4  4. ]\n"
     ]
    }
   ],
   "source": [
    "a = np.zeros((2,2))      # Create an array of zeros\n",
    "print(a)\n",
    "\n",
    "b = np.ones((2,2))       # Create an array of ones\n",
    "print(b)\n",
    "\n",
    "c = np.full((2,2), 7.0)  # Create a constant array (can also be done with the ones function)\n",
    "print(c)\n",
    "\n",
    "\n",
    "d = np.eye(2)            # Create a 2x2 identity matrix\n",
    "print(d)\n",
    "\n",
    "e = np.random.random((3,3))\n",
    "print(e)  \n",
    "\n",
    "f = np.arange(2, 3, 0.1)\n",
    "print(f)\n",
    "\n",
    "g = np.linspace(1., 4., 6)\n",
    "print(g)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Sidenote - getting **help** on python objects:\n",
    "\n",
    "For getting help e.g. on the Numpy **linspace** function you can do one of the following:\n",
    "\n",
    "```python\n",
    "?np.linspace\n",
    "```\n",
    "\n",
    "or\n",
    "\n",
    "```python\n",
    "help(np.linspace)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Array indexing\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1,  2,  3,  4],\n",
       "       [ 5,  6,  7,  8],\n",
       "       [ 9, 10, 11, 12]])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])\n",
    "X"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll think of this matrix as containing three feature vectors:  each row of the matrix is a vector of features, and each column is the set of values that a given feature has in the data.\n",
    "\n",
    "To access the vector of features of the first example in the dataset:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([1, 2, 3, 4]), (4,))"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "row = X[0]    # the first row of X\n",
    "row, row.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To access a column of the matrix (a single feature):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([1, 5, 9]), (3,))"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "col = X[:, 0]\n",
    "col, col.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are multiple ways of accessing sub-arrays.  The first uses **slices**, similarly to python lists:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([[ 6,  7,  8],\n",
       "        [10, 11, 12]]), (2, 3))"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "submatrix = X[1:3, 1:4]\n",
    "submatrix, submatrix.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And here's my favorite way of indexing an array, using an integer array:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1,  2,  3,  4],\n",
       "       [ 9, 10, 11, 12]])"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X[ [0, 2] ]   # extract a given set of rows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1,  3],\n",
       "       [ 5,  7],\n",
       "       [ 9, 11]])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X[:, [0,2]]  # extract a given set of columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Operations on arrays\n",
    "\n",
    "You can multiply an array by a scalar:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([2, 2])"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.array([1,1])\n",
    "x * 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add arrays:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([2, 1])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x + np.array([1,0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Dot products "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's construct a weight vector for a linear classifier:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1, -1,  2, -1])"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "w = np.array([1,-1, 2, -1])\n",
    "w"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can easily compute the dot/inner product of a row of \n",
    "X with the weight vector:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.dot(X[0], w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or you can do it for the whole matrix at once:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 5, 9])"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.dot(X, w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This can also be done using methods:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 5, 9])"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X.dot(w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "and can also be achieved using"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 5, 9])"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.inner(X, w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Other linear algebra operations\n",
    "\n",
    "Numpy has many other useful things it can do for you when it comes to vectors and matrices."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It's very easy to find the inverse of a matrix:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.66666667, -0.33333333,  0.        ],\n",
       "       [ 0.        ,  2.        , -1.        ],\n",
       "       [-0.33333333, -1.33333333,  1.        ]])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Z = np.array([[2,1,1],[1,2,2],[2,3,4]])\n",
    "np.linalg.inv(Z)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And we can easily verify that this is correct:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ True,  True,  True],\n",
       "       [ True,  True,  True],\n",
       "       [ True,  True,  True]], dtype=bool)"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.dot(Z, np.linalg.inv(Z))==np.eye(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can compute some useful statistics over your matrix such the mean and standard deviation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(6.5, array([ 5.,  6.,  7.,  8.]), array([  2.5,   6.5,  10.5]))"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])\n",
    "X.mean(), X.mean(axis=0), X.mean(axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3.4520525295346629,\n",
       " array([ 3.26598632,  3.26598632,  3.26598632,  3.26598632]),\n",
       " array([ 1.11803399,  1.11803399,  1.11803399]))"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X.std(), X.std(axis=0), X.std(axis=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are lots of other things you can do with numpy:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 2, 3, 4],\n",
       "       [5, 6, 7, 8]])"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/plain": [
       "array([1, 2, 3, 4, 5, 6, 7, 8])"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.array( [1,2,3,4] )\n",
    "y = np.array( [5,6,7,8] )\n",
    "np.vstack([x,y])\n",
    "np.hstack([x,y])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Avoid loops when you can\n",
    "\n",
    "Consider the following piece of code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 2,  2,  4],\n",
       "       [ 5,  5,  7],\n",
       "       [ 8,  8, 10],\n",
       "       [11, 11, 13]])"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])\n",
    "v = np.array([1, 0, 1])\n",
    "y = np.empty_like(x)   # Create an empty matrix with the same shape as x\n",
    "\n",
    "# Add the vector v to each row of the matrix x with an explicit loop\n",
    "for i in range(4):\n",
    "    y[i, :] = x[i, :] + v\n",
    "\n",
    "y"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we know, loops are slow in python.  There is a much more efficient way of doing this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 2,  2,  4],\n",
       "       [ 5,  5,  7],\n",
       "       [ 8,  8, 10],\n",
       "       [11, 11, 13]])"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])\n",
    "v = np.array([1, 0, 1])\n",
    "y = x + v\n",
    "y"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is called **broadcasting**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Numpy documentation\n",
    "\n",
    "These were some of the basics that are relevant to our course.  You can find more details in the  [Numpy user manual](https://docs.scipy.org/doc/numpy/user/) and the detailed [reference guide](https://docs.scipy.org/doc/numpy/reference).\n",
    "The following is a good resource for learning how to [vectorize](http://www.labri.fr/perso/nrougier/from-python-to-numpy/) Python code using Numpy for obtaining good performance."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some aspects of these tutorial were inspired by this [tutorial](http://cs231n.github.io/python-numpy-tutorial/)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}