Understanding Contrastive Divergence
31-10-2019
Question
I’m trying to understand, and eventually build, a Restricted Boltzmann Machine. I understand that the update rule - that is, the algorithm used to change the weights - is something called “contrastive divergence”. I looked this up on Wikipedia and found these steps:
- Take a training sample v, compute the probabilities of the hidden units and sample a hidden activation vector h from this probability distribution.
- Compute the outer product of v and h and call this the positive gradient.
- From h, sample a reconstruction v' of the visible units, then resample the hidden activations h' from this. (Gibbs sampling step)
- Compute the outer product of v' and h' and call this the negative gradient.
- ...
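For concreteness, the steps above can be sketched in numpy. This is a minimal CD-1 sketch for a binary RBM, not a reference implementation; the layer sizes, learning rate, and variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical toy sizes for illustration only.
n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)  # visible biases
b_h = np.zeros(n_hidden)   # hidden biases

def cd1_update(v, lr=0.1):
    """One contrastive-divergence (CD-1) update of the weight matrix."""
    # Step 1: hidden probabilities given training sample v, then sample h.
    p_h = sigmoid(v @ W + b_h)
    h = (rng.random(n_hidden) < p_h).astype(float)
    # Step 2: positive gradient = outer product of v and h (here using p_h).
    pos = np.outer(v, p_h)
    # Step 3 (the Gibbs sampling step): reconstruct v' from h,
    # then resample the hidden activations h' from v'.
    p_v = sigmoid(h @ W.T + b_v)
    v_prime = (rng.random(n_visible) < p_v).astype(float)
    p_h_prime = sigmoid(v_prime @ W + b_h)
    # Step 4: negative gradient = outer product of v' and h'.
    neg = np.outer(v_prime, p_h_prime)
    # Move weights toward the data statistics, away from the model's.
    return W + lr * (pos - neg)

v = rng.integers(0, 2, size=n_visible).astype(float)
W_new = cd1_update(v)
```

The point of step 3 is just alternating conditional sampling: given h, sample the visible units; given that reconstruction, sample the hidden units again. That back-and-forth is one step of Gibbs sampling.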
I don’t understand step 3 and I’m struggling to grasp the concept of Gibbs sampling. Would someone explain this simply to me? I have covered neural networks if that helps you.
No correct solution
Source: datascience.stackexchange