Monday, 14 November 2016

Back.. again! Artificial neural networks have tickled my fancy

I'm really impressed with my foresight when naming this blog all those years ago, it truly does live on the edge of abandon as I haven't touched it in 3 years! I've decided to fire it back up again and will (at least try) to actively post as I delve into the world of statistical / machine learning! I just want a place to document my thoughts/learnings/projects as I delve into this fascinating area. I've decided to use Python as my language of choice as I've always wanted to learn it but never got around to it or had the chance. Now I have no excuses!

So, without further ado...

Artificial Neural Networks

I'm going to assume some familiarity with ANNs - what they are, their general structure/philosophy. See here for a comprehensive introduction.
As a first pass, I'll introduce a motivating example, go through the math and then implement (in Python) an ANN from scratch.

Definition of the problem

We'll look at a classification problem in 2 dimensions (to make visualisation easier, read: possible) where our old friend Logistic Regression struggles. It will highlight a key fact that needs to be understood in any sort of machine learning / statistical problem; understanding when and what tools to use to attack a problem.

The data:

We'll use a toy dataset from sklearn called make_moons, with the following call:

 import pandas as pd
import sklearn.datasets
import seaborn as sb
import sklearn.linear_model

X,y = sklearn.datasets.make_moons(200, noise=0.2)
df = pd.DataFrame()
df['x1'] = X[:,0]
df['x2'] = X[:,1]
df['z'] = y
sb.lmplot('x1','x2',hue='z', data=df, fit_reg = False) 

The task:

We want to calculate the decision boundary between the two (blue and green) classes. At a glance it's clearly non-linear, so let's see how a Logistic Regression classification goes!

Results:

We can see that the Logistic Regression can only fit a linear decision boundary (see below) - even though the actual decision boundary is decidedly non-linear.

We can see why the Logistic Regression decision boundary is linear from the following:
$$P_{boundary} \equiv \frac{1}{1+e^{- \theta \cdot x}} = 0.5$$
$$\implies 1 = e^{- \theta \cdot x}$$
$$\implies \theta \cdot x = 0$$ which defines a plane (line in 2 dimensions).

In part 2 I will define an ANN that we will work through to establish exactly how it works and how it is trained. We will then construct and implement it in python to get an idea of the decision boundaries it can produce.