- "Luckily, there are actually many approaches for this, and one of them are energy-based models. The fundamental idea of energy-based models is that you can turn any function that predicts values larger than zero into a probability distribution by dviding by its volume. Imagine we have a neural network, which has as output a single neuron, like in regression. We can call this network $E_{\\theta}(\\mathbf{x})$, where $\\theta$ are our parameters of the network, and $\\mathbf{x}$ the input data (e.g. an image). The output of $E_{\\theta}$ is a scalar value between $-\\infty$ and $\\infty$. Now, we can use basic probability theory to *normalize* the scores of all possible inputs:\n",