3.1 Neural Network Overview

  • Builds on logistic regression, repeated across layers
  • Forward computation produces the output; backward calculation computes the derivatives

3.2 Neural Network Representation

A 2-layer NN (the input layer is not counted)

  • Input layer: the vector x (also written a^[0])
  • Hidden layer: activations a^[1], with parameters W^[1] and b^[1]
  • Output layer: the prediction y-hat = a^[2], with parameters W^[2] and b^[2]
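The parameter shapes implied above can be sketched as follows (a minimal sketch; the layer sizes n_x, n_h, n_y are assumptions for illustration):

```python
import numpy as np

# Assumed sizes: 3 input features, 4 hidden units, 1 output unit
n_x, n_h, n_y = 3, 4, 1

x = np.random.randn(n_x, 1)      # input layer: column vector x

W1 = np.random.randn(n_h, n_x)   # hidden layer parameters W^[1], b^[1]
b1 = np.zeros((n_h, 1))
W2 = np.random.randn(n_y, n_h)   # output layer parameters W^[2], b^[2]
b2 = np.zeros((n_y, 1))

print(W1.shape, b1.shape, W2.shape, b2.shape)
# (4, 3) (4, 1) (1, 4) (1, 1)
```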

3.3 Computing a neural network output

Each node repeats the logistic regression computation.

  • Each circle (node) represents two steps:

    1. z = w^T x + b
    2. a = sigmoid(z)
  • The subscript denotes the node index within the layer
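The two steps inside a single node can be sketched like this (the weight, bias, and input values are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([0.2, -0.4, 0.1])   # weights for this one node (assumed values)
b = 0.05
x = np.array([1.0, 2.0, 3.0])    # input vector (assumed values)

z = np.dot(w, x) + b             # step 1: z = w^T x + b
a = sigmoid(z)                   # step 2: a = sigmoid(z)
print(z, a)
```

A layer just runs this pair of steps once per node, each node with its own row of weights.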

3.4 Vectorizing across multiple examples

Given m training examples, for layer l (l >= 1):

  • Input: the previous layer's output A^[l-1] (with A^[0] = X, the examples stacked as columns)
  • Apply the formulas:
    • Z^[l] = W^[l] A^[l-1] + b^[l]
    • A^[l] = sigmoid(Z^[l])

Notations:

  • Square brackets [l]: layer l, e.g. Z^[1]
  • Round brackets (i): training example i, e.g. x^(i)
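The vectorized formulas above can be sketched for one layer (a minimal sketch; the sizes n_prev, n, m are assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One layer, vectorized over m examples
n_prev, n, m = 3, 4, 5
np.random.seed(0)

A_prev = np.random.randn(n_prev, m)  # previous layer's output, one column per example
W = np.random.randn(n, n_prev)
b = np.zeros((n, 1))                 # broadcast across the m columns

Z = W @ A_prev + b                   # column i of Z is z for example i
A = sigmoid(Z)
print(Z.shape, A.shape)              # (4, 5) (4, 5)
```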

3.5 Explanation for vectorized implementation

  • To simplify the justification, ignore b (set b = 0)
  • W x^(i) gives a column vector
  • Z^[l]: stacking these column vectors side by side, column i of Z^[l] is the z value of example i at layer l

For a multi-layer NN, just repeat these two steps layer by layer.
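The column-correspondence claim is easy to check numerically: with b = 0, column i of W X equals W x^(i). A minimal sketch (sizes are assumptions):

```python
import numpy as np

np.random.seed(1)
W = np.random.randn(4, 3)
X = np.random.randn(3, 5)        # 5 examples stacked as columns

Z = W @ X                        # vectorized over all examples at once
z_2 = W @ X[:, 2]                # single example i = 2 on its own

print(np.allclose(Z[:, 2], z_2)) # the two computations agree
```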

3.6 One hidden layer NN - Activation function

  • Use g(z) to denote the activation function
  • The sigmoid function is one choice of activation function
  • The tanh function (hyperbolic tangent) ranges from -1 to 1 and is centered at 0; if z is very large or very small, its slope is very small
  • tanh almost always works better than sigmoid for hidden units, because its mean is close to 0, which has the effect of centering the data
  • The activation function can differ from layer to layer
  • Use square brackets, e.g. g^[1], to denote the layer

Pros and cons:

  • Sigmoid: rarely use it, except in the output layer for binary classification
  • tanh: almost always works better than sigmoid
  • ReLU: a = max(0, z), the default choice for hidden layers
  • Leaky ReLU: a = max(0.01z, z)
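The four activation functions above can be sketched directly from their formulas (a minimal sketch; the 0.01 leak slope is the one quoted in the notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # range (0, 1)

def tanh(z):
    return np.tanh(z)                 # range (-1, 1), centered at 0

def relu(z):
    return np.maximum(0.0, z)         # a = max(0, z)

def leaky_relu(z):
    return np.maximum(0.01 * z, z)    # small slope 0.01 for z < 0

z = np.array([-2.0, 0.0, 2.0])
print(relu(z))        # [0. 0. 2.]
print(leaky_relu(z))  # [-0.02  0.    2.  ]
```

Note how ReLU's gradient is exactly 0 for negative z, which Leaky ReLU avoids by keeping a small slope there.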
