
[Neural Network and Deep Learning] Forward Propagation in a Deep Network

김 정 환 2020. 3. 23. 13:14

This note is based on the Coursera course by Andrew Ng.

(This is just a study note for me. Some sentences may be copied or awkward because I am not a native speaker, but I want to learn Deep Learning in English. So everything will get better and better :))

Let's see how we can perform forward propagation in a deep network. Given a single training example x, here is how we compute the activations of the first layer. 
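
In equations, for a generic layer l this is z[l] = W[l] a[l-1] + b[l] and a[l] = g[l](z[l]), with a[0] = x. Here is a minimal numpy sketch of that loop for a single example (the function name forward_single and the ReLU/sigmoid choice of activations are my own illustration, not fixed by the lecture):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward_single(x, Ws, bs):
    """Forward propagation for a single example x of shape (n[0], 1).

    Ws and bs hold the per-layer parameters:
    Ws[l-1] has shape (n[l], n[l-1]) and bs[l-1] has shape (n[l], 1).
    """
    a = x  # a[0] = x
    L = len(Ws)
    for l in range(L):
        z = Ws[l] @ a + bs[l]                       # z[l] = W[l] a[l-1] + b[l]
        a = sigmoid(z) if l == L - 1 else relu(z)   # a[l] = g[l](z[l])
    return a  # a[L] = y_hat
```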

We have done all this for a single training example. How about doing it in a vectorized way for the whole training set at the same time?
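
A sketch of the vectorized version (again my own illustration, reusing relu and sigmoid from the sketch above): the only change is that the m examples are stacked as columns of X, so the loop body stays the same and every intermediate matrix picks up m columns.

```python
def forward_batch(X, Ws, bs):
    """Vectorized forward propagation; X has shape (n[0], m), one column per example.

    Uses relu and sigmoid as defined in the previous sketch.
    """
    A = X  # A[0] = X
    L = len(Ws)
    for l in range(L):
        Z = Ws[l] @ A + bs[l]                       # Z[l] = W[l] A[l-1] + b[l]
        A = sigmoid(Z) if l == L - 1 else relu(Z)   # A[l] = g[l](Z[l])
    return A  # shape (n[L], m)
```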

When implementing a deep neural network, one of the debugging tools Andrew often uses to check the correctness of his code is to pull out a piece of paper and work through the dimensions of the matrices he is working with. Let's see how to do that. If we implement forward propagation, the first step will be z[1] = W[1]x + b[1]. Let's ignore the bias term b for now and focus on the parameter W, and think about the dimensions of z, W and x. We can define the dimensions of z, W and x as follows.
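
To make the bookkeeping concrete, here is that first step written out with made-up sizes, say n[0] = 2 input features and n[1] = 3 units in layer 1:

z[1] : (n[1], 1) = (3, 1)
W[1] : (n[1], n[0]) = (3, 2)
x    : (n[0], 1) = (2, 1)

The product W[1]x is (3, 2) times (2, 1), which gives (3, 1), matching z[1]. So the number of columns of W[1] is forced by the input dimension, and the number of rows by the layer size.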

So, what we figured out above is that the dimension of W[1] has to be n[1] by n[0]. And more generally, the dimension of W[l] must be n[l] by n[l-1].
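
A small numpy sanity check of this rule (the sizes in layer_dims are arbitrary values I picked for illustration):

```python
import numpy as np

layer_dims = [2, 3, 5, 4, 2, 1]   # [n[0], n[1], ..., n[L]]; arbitrary example sizes

params = {}
for l in range(1, len(layer_dims)):
    # W[l] must be (n[l], n[l-1]) so that W[l] @ a[l-1] comes out (n[l], 1)
    params["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    assert params["W" + str(l)].shape == (layer_dims[l], layer_dims[l - 1])
```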

Now, let's think about the dimension of the vector b. W[1] * x is going to be a (3, 1) vector, so we have to add another (3, 1) vector to it in order to get a (3, 1) vector as the output. So the more general rule is that b[l] should be n[l] by 1 dimensional.
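
Continuing the snippet above, the same kind of check for the bias vectors, plus one forward step on a single example to confirm everything lines up:

```python
for l in range(1, len(layer_dims)):
    # b[l] must be (n[l], 1) so it can be added to W[l] @ a[l-1]
    params["b" + str(l)] = np.zeros((layer_dims[l], 1))
    assert params["b" + str(l)].shape == (layer_dims[l], 1)

x = np.random.randn(layer_dims[0], 1)   # a single example: (n[0], 1)
z1 = params["W1"] @ x + params["b1"]    # (n[1], n[0]) @ (n[0], 1) + (n[1], 1)
assert z1.shape == (layer_dims[1], 1)   # (3, 1) with these example sizes
```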

Now, in a vectorized implementation, we would have Z[1] = W[1]X + b[1]. The dimension of Z[1], instead of being n[1] by 1, ends up being n[1] by m, where m is the size of our training set. The dimension of W[1] stays the same. And X, instead of being a single column x of dimension n[0] by 1, is all our training examples stacked horizontally, so its dimension is n[0] by m. The final detail is that b[1] is still n[1] by 1, but when we take W[1]X and add b[1] to it, then through Python broadcasting b[1] gets duplicated into an n[1] by m matrix and the addition is done element-wise.
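
And continuing the same snippet once more, a quick demonstration of that broadcasting behavior (m = 4 is an arbitrary batch size):

```python
m = 4                                    # arbitrary number of training examples
X = np.random.randn(layer_dims[0], m)    # (n[0], m): examples stacked horizontally
Z1 = params["W1"] @ X + params["b1"]     # b1 is (n[1], 1); broadcasting duplicates it to (n[1], m)
assert Z1.shape == (layer_dims[1], m)    # the addition happens element-wise
```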
