3.1 Neural Network Overview

- Logistic regression as the building block: a neural network stacks many logistic-regression-like computations
- Forward and backward calculation
3.2 Neural Network Representation

2-layer NN (by convention, the input layer is not counted as a layer)
- Input layer: the feature vector x
- Hidden layer: parameters W[1] and b[1]
- Output layer: the prediction y-hat, with parameters W[2] and b[2]
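The layer sizes below are a minimal sketch (3 input features, 4 hidden units, 1 output unit are arbitrary choices, not from the notes) showing how the parameter shapes follow from the representation above:

```python
import numpy as np

# Hypothetical sizes: 3 input features, 4 hidden units, 1 output unit.
n_x, n_h, n_y = 3, 4, 1

rng = np.random.default_rng(0)
W1 = rng.standard_normal((n_h, n_x)) * 0.01  # hidden-layer weights W[1]
b1 = np.zeros((n_h, 1))                      # hidden-layer biases b[1]
W2 = rng.standard_normal((n_y, n_h)) * 0.01  # output-layer weights W[2]
b2 = np.zeros((n_y, 1))                      # output-layer bias b[2]

print(W1.shape, b1.shape, W2.shape, b2.shape)
```

Each weight matrix has one row per node in its layer and one column per input coming from the previous layer.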
3.3 Computing a neural network output
Repeat the logistic regression computation at every node


Each circle (node) represents two computation steps:
- z = wx + b
- a = sigmoid(z)
Subscripts denote the node index within a layer
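The per-node steps above can be sketched as a forward pass for a single example; the sizes (3 inputs, 4 hidden units) are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 2-layer network: 3 inputs, 4 hidden units, 1 output.
rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((4, 3)), np.zeros((4, 1))
W2, b2 = rng.standard_normal((1, 4)), np.zeros((1, 1))

x = rng.standard_normal((3, 1))   # one input example as a column vector

# Each node does the two steps: z = wx + b, then a = sigmoid(z).
# Stacking the hidden nodes' weight rows into W1 does all 4 nodes at once.
z1 = W1 @ x + b1      # shape (4, 1): one z per hidden node
a1 = sigmoid(z1)
z2 = W2 @ a1 + b2     # shape (1, 1)
y_hat = sigmoid(z2)   # predicted probability
```

Row i of W1 holds the weights of hidden node i, so the matrix product computes every node's z in one step.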
3.4 Vectorizing across multiple examples

Given m training examples, for each layer l after the input:
- input: the previous layer's output A[l-1] (with A[0] = X)
- apply the formulas:
  - Z[l] = W[l] A[l-1] + b[l]
  - A[l] = sigmoid(Z[l])
Notation:
- Square brackets [l]: layer l
- Round brackets (i): training example i
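A minimal vectorized sketch of the formulas above, stacking the m examples as columns of X (the sizes m = 5, 3 inputs, 4 hidden units are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
m = 5                                  # number of training examples
X = rng.standard_normal((3, m))        # column i is example x(i)

W1, b1 = rng.standard_normal((4, 3)), np.zeros((4, 1))
W2, b2 = rng.standard_normal((1, 4)), np.zeros((1, 1))

# Vectorized forward pass over all m examples at once:
Z1 = W1 @ X + b1       # (4, m); b1 broadcasts across the m columns
A1 = sigmoid(Z1)
Z2 = W2 @ A1 + b2      # (1, m)
A2 = sigmoid(Z2)       # column i is the prediction for example i
```

No explicit loop over examples is needed; NumPy broadcasting adds b1 to every column.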
3.5 Explanation for vectorized implementation

- To simplify the justification, ignore b (set b = 0)
- W[1]x(i) gives a column vector
- Z[l]: column i holds example i's z values at layer l

For a multi-layer NN, just repeat the two steps layer by layer.
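The justification can be checked numerically: with b = 0, column i of the vectorized product W[1]X equals W[1]x(i) computed one example at a time (sizes here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
W1 = rng.standard_normal((4, 3))
X = rng.standard_normal((3, 5))        # 5 examples as columns; b = 0

Z1 = W1 @ X                            # vectorized computation

# Column i of Z1 matches the per-example product W1 @ x(i).
for i in range(X.shape[1]):
    zi = W1 @ X[:, i:i+1]              # one example at a time
    assert np.allclose(Z1[:, i:i+1], zi)
```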
3.6 One hidden layer NN - Activation function

- g(z) denotes the activation function
- The sigmoid function is one choice of activation function
- tanh, the hyperbolic tangent function, ranges from -1 to 1 and is centered at 0; if z is very large or very small, the slope becomes very small
- tanh almost always works better than sigmoid, because its mean is close to 0, which has the effect of centering the data
- The activation function can be different in different layers
- Use square-bracket superscripts, e.g. g[1], g[2], to distinguish the layers' activation functions

Pros and cons:
- Sigmoid: rarely used, except in the output layer for binary classification
- tanh: almost always works better than sigmoid
- ReLU: a = max(0, z); the default choice
- Leaky ReLU: a = max(0.01z, z); keeps a small slope for z < 0
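The four activation functions above can be sketched directly in NumPy (the 0.01 leaky slope is the value given in the notes; the sample z values are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # range (0, 1)

def tanh(z):
    return np.tanh(z)                 # range (-1, 1), centered at 0

def relu(z):
    return np.maximum(0.0, z)         # a = max(0, z); the default choice

def leaky_relu(z, slope=0.01):
    return np.maximum(slope * z, z)   # a = max(0.01z, z)

z = np.array([-2.0, 0.0, 2.0])
print(relu(z))                        # negative inputs clipped to 0
print(leaky_relu(z))                  # negative inputs scaled by 0.01
```

Note that for large |z|, both sigmoid and tanh saturate (their slope approaches 0), which is the reason ReLU-style functions are preferred in hidden layers.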