In the last post, we discussed how a perceptron could ‘learn’ the best set of weights and biases to classify some data. We discussed what ‘best’ really means in the context of minimising some error function. Finally, we motivated the need to move to a continuous model of the perceptron. Now, we will motivate the need to connect several perceptrons together in order to build more complex non-linear models.

So far, all the models we’ve built using the perceptron have been linear. In reality, a data set can rarely be classified by a simple line. …

In the last post, we introduced the concept of a perceptron and how it can be used to model a linear classifier.

A perceptron takes in *n* input features, *x*, multiplies each by a corresponding weight, *w*, adds a bias term, and finally applies an activation function to the result to spit out a number. Previously, we used a step function as the activation in our example, where we tried to find a model that would classify whether or not a plant would grow based on the time it spent in the sun and the amount of water it was given.
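To make the description above concrete, here is a minimal sketch of a single perceptron in plain Python. The function names, the weights, the bias, and the convention that the step function fires at zero or above are all assumptions chosen for illustration, not values taken from the earlier posts.

```python
def step(z):
    """Step activation (assumed convention): 1 if z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0


def perceptron(x, w, b, activation=step):
    """Multiply each feature by its weight, add the bias, apply the activation."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    return activation(z)


# Hypothetical plant example: features are (time in the sun, amount of water).
weights = [1.0, 1.0]  # illustrative values only
bias = -1.5           # illustrative value only

print(perceptron([1.0, 1.0], weights, bias))  # weighted sum is 0.5, so the output is 1
print(perceptron([0.0, 1.0], weights, bias))  # weighted sum is -0.5, so the output is 0
```

Passing the activation in as an argument means the step function could later be swapped for a continuous one without touching the weighted-sum logic.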

The general task of any model is to take in some known quantities and predict some unknown quantity we care about.

This might mean predicting the price of a house, given its location and the number of bedrooms, or predicting the probability of passing a test, given previous mock test scores. The general idea is the same: take in some things we know, and figure out something we don’t.

So how do we go about building these models? Let’s take the following problem: we want to predict if a plant will grow, given the input features *time spent in…*
