Regression Loss Function Duality-Probability Explanation

Learning is a statistical process, and inference is probabilistic. The training samples are drawn from unknown probability distributions Pr(X) and Pr(Y). The output of the neural network is an estimate of the conditional distribution Pr(Y|X). Integrating out X gives the marginal distribution of Y: Pr(Y)=∫ Pr(Y|X)Pr(X) dX.
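To make the marginalization concrete, here is a minimal numerical sketch. The toy model is an assumption chosen for illustration and does not come from the text: X ~ N(0, 1) and Y|X ~ N(aX + b, s²) with a = 2, b = 1, s = 0.5. For this linear-Gaussian case the marginal is known in closed form, Y ~ N(b, a² + s²), so the integral can be checked against it. The names gaussian_pdf and marginal_y are hypothetical helpers.

```python
import numpy as np


def gaussian_pdf(v, mean, std):
    """Density of N(mean, std^2) evaluated at v."""
    return np.exp(-0.5 * ((v - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


# Assumed toy model (illustrative only): X ~ N(0, 1), Y | X ~ N(a*X + b, noise_std^2)
a, b, noise_std = 2.0, 1.0, 0.5

# Grid over X for a simple Riemann-sum approximation of the integral
x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]
p_x = gaussian_pdf(x, 0.0, 1.0)  # Pr(X) on the grid


def marginal_y(y):
    """Approximate Pr(Y=y) = integral of Pr(y | X) Pr(X) dX on the grid."""
    p_y_given_x = gaussian_pdf(y, a * x + b, noise_std)  # Pr(Y=y | X) on the grid
    return np.sum(p_y_given_x * p_x) * dx


y_values = np.array([-2.0, 1.0, 3.0])
numeric = np.array([marginal_y(y) for y in y_values])

# Closed-form marginal for the linear-Gaussian case: Y ~ N(b, a^2 + noise_std^2)
exact = gaussian_pdf(y_values, b, np.sqrt(a**2 + noise_std**2))

print(numeric)
print(exact)
```

The two printed arrays should agree closely. In a real setting Pr(X) is unknown, so the integral would typically be approximated by averaging the model's Pr(Y|X) over the training inputs instead of over a grid.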

When I started to crack machine learning, I read the book “Pattern Recognition and Machine…