Sunday, October 23, 2022

Plot of the univariate Gaussian showing the mean μ and the standard deviation σ.
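As a rough companion to the plot, the sketch below evaluates the univariate Gaussian density and checks numerically how much probability mass lies within one standard deviation of the mean. The values of `mu` and `sigma` and the helper names `gaussian_pdf` and `integrate` are illustrative assumptions, not taken from the figure.

```python
import math


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of the univariate Gaussian with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Hypothetical parameters chosen only for illustration.
mu, sigma = 1.0, 2.0

# The density peaks at x = mu, with height 1 / (sigma * sqrt(2 * pi)).
peak = gaussian_pdf(mu, mu, sigma)

# Probability mass in the interval [mu - sigma, mu + sigma].
mass_within_one_sigma = integrate(
    lambda x: gaussian_pdf(x, mu, sigma), mu - sigma, mu + sigma
)

print(f"peak density: {peak:.4f}")
print(f"mass within one sigma: {mass_within_one_sigma:.4f}")
```

The mass within one standard deviation does not depend on the particular `mu` and `sigma` chosen, which is why σ is a natural measure of the width of the curve.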

 


The concept of probability for discrete variables can be extended to a probability density p(x) over a continuous variable x, defined such that the probability of x lying in the interval (x, x + δx) is given by p(x)δx in the limit δx → 0. The density is the derivative of the cumulative distribution function P(x), that is, p(x) = dP(x)/dx.
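The relationship between the density and the cumulative distribution function can be checked numerically. The sketch below uses an exponential distribution purely as an assumed example, because both p(x) and P(x) have simple closed forms; the rate `RATE` and the evaluation point `x` are arbitrary choices.

```python
import math

RATE = 1.5  # hypothetical rate parameter, chosen only for illustration


def pdf(x):
    """Density p(x) of the exponential distribution with rate RATE."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def cdf(x):
    """Cumulative distribution function P(x) of the same distribution."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0


x = 0.8

# The exact probability of the interval (x, x + dx) is P(x + dx) - P(x).
# It should approach p(x) * dx as dx shrinks.
rel_errors = []
for dx in (1e-1, 1e-2, 1e-3):
    exact = cdf(x + dx) - cdf(x)
    approx = pdf(x) * dx
    rel_errors.append(abs(exact - approx) / exact)
    print(f"dx={dx:g}  exact={exact:.6e}  p(x)dx={approx:.6e}")

# A central difference of P(x) should recover the density p(x).
h = 1e-5
slope = (cdf(x + h) - cdf(x - h)) / (2.0 * h)
print(f"dP/dx ~ {slope:.6f},  p(x) = {pdf(x):.6f}")
```

The relative error of the approximation p(x)δx shrinks roughly in proportion to δx, which is the sense in which the statement holds "for δx → 0".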

 


An illustration of a distribution over two variables, X, which takes 9 possible values, and Y, which takes two possible values. The top left figure shows a sample of 60 points drawn from a joint probability distribution over these variables. The remaining figures show histogram estimates of the marginal distributions p(X) and p(Y), as well as the conditional distribution p(X|Y = 1) corresponding to the bottom row in the top left figure.
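The histogram estimates described in this caption can be reproduced in spirit with a few lines of code. The sketch below draws 60 points from a made-up joint distribution (the sampling rule in `draw_pair` is an assumption for illustration, not the distribution behind the figure) and then estimates the marginals and the conditional p(X|Y = 1) from counts.

```python
import random
from collections import Counter

random.seed(0)

X_VALUES = range(1, 10)  # X takes 9 possible values
Y_VALUES = (1, 2)        # Y takes two possible values
N = 60                   # sample size, as in the caption


def draw_pair():
    """Draw (x, y) from a hypothetical joint distribution (illustrative only)."""
    y = random.choice(Y_VALUES)
    centre = 3 if y == 1 else 7  # assumed: X tends to be low when Y = 1
    x = min(9, max(1, round(random.gauss(centre, 1.5))))
    return x, y


samples = [draw_pair() for _ in range(N)]
counts = Counter(samples)

# Joint estimate: fraction of points falling in each (x, y) cell.
joint = {(x, y): counts[(x, y)] / N for x in X_VALUES for y in Y_VALUES}

# Marginals via the sum rule: sum the joint over the other variable.
p_x = {x: sum(joint[(x, y)] for y in Y_VALUES) for x in X_VALUES}
p_y = {y: sum(joint[(x, y)] for x in X_VALUES) for y in Y_VALUES}

# Conditional p(X | Y = 1): renormalise the counts in the Y = 1 row.
n_y1 = sum(counts[(x, 1)] for x in X_VALUES)
p_x_given_y1 = {x: counts[(x, 1)] / n_y1 for x in X_VALUES}

for x in X_VALUES:
    print(f"x={x}  p(X)={p_x[x]:.3f}  p(X|Y=1)={p_x_given_y1[x]:.3f}")
```

By construction the estimates satisfy the product rule, p(X, Y = 1) = p(X|Y = 1) p(Y = 1), up to floating-point rounding.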

 


Plots of the solutions obtained by minimizing the sum-of-squares error function with a polynomial of order M = 9 for N = 15 data points (left plot) and N = 100 data points (right plot). We see that increasing the size of the data set reduces the over-fitting problem.
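The effect of data set size on over-fitting can be sketched with a small experiment. The example below assumes the targets are noisy samples of sin(2πx) on [0, 1] with Gaussian noise of standard deviation 0.3. The caption states neither the target function nor the noise level, so both are assumptions. It fits a degree-9 polynomial by least squares and averages the error against the noise-free curve over several random noise draws.

```python
import numpy as np
from numpy.polynomial import Polynomial


def rms_test_error(n_points, degree=9, noise_std=0.3, trials=50, seed=0):
    """Average RMS error of a least-squares polynomial fit against the true curve.

    The target sin(2*pi*x) and the noise level are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.0, 1.0, 200)
    true_test = np.sin(2.0 * np.pi * x_test)
    errors = []
    for _ in range(trials):
        # Evenly spaced inputs with independent Gaussian noise on the targets.
        x = np.linspace(0.0, 1.0, n_points)
        t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, noise_std, n_points)
        # Polynomial.fit solves the least-squares problem, i.e. it minimises
        # the sum of squared residuals over the polynomial coefficients.
        fit = Polynomial.fit(x, t, degree)
        errors.append(np.sqrt(np.mean((fit(x_test) - true_test) ** 2)))
    return float(np.mean(errors))


err_15 = rms_test_error(15)
err_100 = rms_test_error(100)
print(f"N = 15:  average RMS error {err_15:.3f}")
print(f"N = 100: average RMS error {err_100:.3f}")
```

With the model complexity held fixed at M = 9, the larger data set should give a noticeably smaller error against the underlying curve, which matches the behaviour described in the caption.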