The recurrent structure allows the model to feed information forward from past iterations.

*Figure: arrows represent how data flows through the model (gradients flow backwards).*

These networks use a differentiable form of memory to keep track of time-dependent patterns in data. LSTMs, for example, use three different tensors to perform 'erase', 'write', and 'read' operations on a 'memory' tensor: the \(f\), \(i\), and \(o\) tensors respectively, with \(C\) as the memory tensor (more on this). For the purposes of this post, just remember that RNNs are extremely good at modeling sequential data.

Think of Mixture Density Networks as neural networks which can measure their own uncertainty. Their output parameters are \(\mu\), \(\sigma\), and \(\rho\) for several multivariate Gaussian components. They also estimate a parameter \(\pi\) for each of these distributions. Think of \(\pi\) as the probability that the output value was drawn from that particular component's distribution. Last year, I wrote a Jupyter notebook about MDNs.

*Figure: the importance of \(\pi\): what is the probability the red point was drawn from each of the three distributions?*

Since MDNs parameterize probability distributions, they are a great way to capture randomness in the data.
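To make the role of \(\pi\) concrete, here is a minimal sketch of sampling from an MDN-style mixture. The logits, means, and standard deviations are made-up toy values standing in for network outputs (the post's actual model is not shown here), and a 1-D mixture is used instead of the multivariate case for simplicity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw network outputs for a 3-component 1-D mixture.
pi_logits = np.array([2.0, 1.0, 0.1])   # unnormalized mixture weights
mu        = np.array([-1.0, 0.0, 2.5])  # component means
sigma     = np.array([0.5, 1.0, 0.3])   # component std-devs (kept positive)

# pi: the probability that a sample comes from each component (softmax).
pi = np.exp(pi_logits - pi_logits.max())
pi /= pi.sum()

# Sampling: first pick a component according to pi, then draw from it.
k = rng.choice(3, p=pi)
y = rng.normal(mu[k], sigma[k])
```

Because \(\pi\) is a proper probability distribution over components, it directly answers the figure's question: the chance the red point came from component \(k\) is \(\pi_k\) weighted by that component's density at the point.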
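The 'erase', 'write', and 'read' gate operations mentioned above can be sketched in a few lines of numpy. This is a toy single-step LSTM with random weights and no biases, purely to show how \(f\), \(i\), and \(o\) act on the memory tensor \(C\), not the post's trained model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy sizes and random weights; a real LSTM learns these (biases omitted).
rng = np.random.default_rng(1)
n_in, n_hid = 4, 8
Wf, Wi, Wo, Wc = (rng.normal(0.0, 0.1, (n_hid, n_in + n_hid)) for _ in range(4))

def lstm_step(x, h, C):
    z = np.concatenate([x, h])
    f = sigmoid(Wf @ z)               # f: 'erase' (forget) gate
    i = sigmoid(Wi @ z)               # i: 'write' (input) gate
    o = sigmoid(Wo @ z)               # o: 'read' (output) gate
    C = f * C + i * np.tanh(Wc @ z)   # erase, then write to memory C
    h = o * np.tanh(C)                # read the hidden state from memory
    return h, C

h, C = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):  # run a short input sequence
    h, C = lstm_step(x, h, C)
```

Every operation here is differentiable, which is what lets gradients flow backwards through the memory updates during training.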