# Numpy vs. Python Lists

While I was learning Machine Learning, I made up my mind not to use any third-party libraries (including Numpy and Pandas) until I knew what was happening in the background and felt okay using them.

It was a bit harder to write everything from scratch. Even the `dot product` had to be written by hand, because Python doesn't have one built in.
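For instance, a dot product over two plain lists can be hand-rolled like this (the same helper reappears in the full implementation below):

```python
def dot(v, w):
    # Pairwise products, summed: v . w = sum(v_i * w_i)
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

dot([1, 2, 3], [4, 5, 6])  # 1*4 + 2*5 + 3*6 = 32
```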

Finally, I felt comfortable and could clearly understand what was happening in the background (because Machine Learning is, after all, mostly math).

## Implementing Stochastic Gradient Descent

Stochastic gradient descent scans through the training examples one at a time, taking a small gradient descent step with respect to the cost of just that single example.

The cost function measures how well the hypothesis is doing on a single example:

\[\text{cost}(\Theta, (x^{(i)}, y^{(i)})) = \dfrac{1}{2}\left(h_\Theta(x^{(i)}) - y^{(i)}\right)^2\]

\[J_{\text{train}}(\Theta) = \dfrac{1}{m} \sum_{i=1}^{m} \text{cost}(\Theta, (x^{(i)}, y^{(i)}))\]
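Differentiating the single-example cost gives the update that each stochastic step applies to every parameter \(\Theta_j\), with learning rate \(\alpha\):

\[\Theta_j := \Theta_j - \alpha \left(h_\Theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}\]

This is exactly the update both implementations below perform.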

### A dumb Stochastic Gradient Descent Implementation

#### Using Lists:

```python
def dot(v, w):
    # Dot product of two plain Python lists.
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

def predict_multi(x_i, beta):
    # Hypothesis: a linear combination of the features.
    return dot(x_i, beta)

def error_multi(beta, x, y):
    # Residual for a single training example.
    return y - predict_multi(x, beta)

def stochastic_gradient_descent(x, y, theta, learning_rate=0.001):
    for _ in range(1500):
        for x_i, y_i in zip(x, y):
            # Update each parameter using the gradient of this single
            # example's cost: (h(x) - y) * x_j.
            i = 0
            for x_ii, theta_i in zip(x_i, theta):
                theta[i] = theta_i - learning_rate * (-error_multi(theta, x_i, y_i)) * x_ii
                i += 1
    return theta
```
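A quick sanity check on toy data (the arrays and the larger learning rate here are made up for illustration; the benchmark below uses the post's own `multi_data`):

```python
x = [[1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]  # rows: constant term plus one feature
y = [5.0, 7.0, 9.0]                       # generated by y = 1 + 2 * feature
theta = [0.0, 0.0]

# After 1500 epochs this lands near the true parameters [1.0, 2.0].
print(stochastic_gradient_descent(x, y, theta, learning_rate=0.01))
```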

So, the benchmark results:

```
%timeit stochastic_gradient_descent(multi_data, daily_minutes_good, theta)
1.85 s ± 21.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

#### Using Numpy:

```python
import numpy as np

def stochastic_gradient_descent_np(x, y, theta, learning_rate=0.001):
    for _ in range(1500):
        for x_i, y_i in zip(x, y):
            # Vectorized update: x_i * theta is an elementwise product,
            # np.sum reduces it to the prediction, and the whole theta
            # vector is updated in one expression.
            theta = theta - learning_rate * (np.sum(x_i * theta) - y_i) * x_i
    return theta
```

That's it. The Numpy implementation replaces the inner Python loop with a vectorized update: the elementwise product `x_i * theta` followed by `np.sum` computes the prediction, and the whole parameter vector is updated in one expression.
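To make the vectorization concrete, here is the per-example expression in isolation (toy values, just to show that elementwise-multiply-then-sum is the same dot product written by hand earlier):

```python
import numpy as np

x_i = np.array([1.0, 2.0, 3.0])
theta = np.array([0.5, 0.5, 0.5])

np.sum(x_i * theta)  # elementwise product, then reduction -> 3.0
x_i @ theta          # Numpy's own dot product gives the same 3.0
```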

Benchmark using Numpy:

```
%timeit stochastic_gradient_descent_np(np_multi_data, np_daily, np_theta)
1.96 s ± 7.53 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

Using Numpy makes the implementation noticeably simpler, but not faster here: each update touches only a tiny array, so the per-call overhead of Numpy outweighs the gains from vectorization, and the plain-list version actually wins by a small margin (1.85 s vs. 1.96 s above).
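Where Numpy does win decisively is when the vectorization spans the whole dataset instead of one example at a time. As a sketch (not the benchmarked code above), a full-batch gradient step consistent with \(J_{\text{train}}\) collapses into a couple of matrix operations:

```python
import numpy as np

def batch_gradient_step(X, y, theta, learning_rate=0.001):
    # X is the (m, n) matrix of all m examples; one step updates theta
    # using the average gradient, with no per-example Python loop.
    errors = X @ theta - y                # shape (m,): h(x_i) - y_i for all i
    gradient = X.T @ errors / len(y)      # shape (n,): (1/m) * sum of per-example gradients
    return theta - learning_rate * gradient
```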