What is Logarithmic (Log) Loss in Data Science Projects?

Log Loss is one of the most important classification metrics based on probabilities.

Logarithmic (Log) Loss measures the performance of a classification model whose predictions are probability values between 0 and 1. The goal of our machine learning model is to minimize this value. For a model to be perfect, its log loss should be zero.

The value of log loss increases as the predicted probability diverges from the actual label. Say, for instance, we predicted a probability of 0.01 when the actual observation label is 1. This would be a bad prediction and would result in a high log loss.
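To make "high log loss" concrete, here is a minimal sketch of the loss for a single prediction, assuming the usual definition: the negative natural log of the probability the model gave to the label that actually occurred. The function name `single_log_loss` is made up for this illustration.

```python
import math

def single_log_loss(y_true, p):
    """Log loss for one prediction (hypothetical helper).

    y_true is the actual label (0 or 1); p is the predicted
    probability that the label is 1.
    """
    # Probability the model assigned to the outcome that actually happened.
    p_actual = p if y_true == 1 else 1.0 - p
    return -math.log(p_actual)

bad = single_log_loss(1, 0.01)   # predicted 1% for something that happened
good = single_log_loss(1, 0.99)  # predicted 99% for something that happened

print(round(bad, 2))   # about 4.61
print(round(good, 2))  # about 0.01
```

The confident wrong prediction costs several hundred times more than the confident right one, which is the behavior described above.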

Log loss takes into account the uncertainty of your prediction based on how much it varies from the actual label.

On the other hand, accuracy is the count of predictions where your predicted value equals the actual value. Accuracy is not always a good indicator because of its yes-or-no nature: a prediction is either right or wrong, no matter how confident the model was.
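The difference can be sketched with two made-up models that get every label right but with different confidence. Everything here is an assumption for illustration: the probabilities, the 0.5 cut-off used to turn a probability into a yes or no, the helper names, and reporting accuracy as a fraction and log loss as an average per prediction.

```python
import math

def accuracy(labels, probs, threshold=0.5):
    """Fraction of predictions that match the label (assumed 0.5 cut-off)."""
    predicted = [1 if p >= threshold else 0 for p in probs]
    correct = sum(1 for y, y_hat in zip(labels, predicted) if y == y_hat)
    return correct / len(labels)

def mean_log_loss(labels, probs):
    """Average of -log(probability given to the actual label)."""
    losses = [-math.log(p if y == 1 else 1.0 - p) for y, p in zip(labels, probs)]
    return sum(losses) / len(losses)

labels = [1, 1, 0]
confident = [0.9, 0.8, 0.1]    # hypothetical model A
hesitant = [0.6, 0.55, 0.45]   # hypothetical model B

print(accuracy(labels, confident), accuracy(labels, hesitant))  # 1.0 1.0
print(round(mean_log_loss(labels, confident), 3))  # about 0.145
print(round(mean_log_loss(labels, hesitant), 3))   # about 0.569
```

Accuracy cannot tell the two models apart, while log loss rewards the one that was more sure of the correct answers.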

Example

A model predicts probabilities of [0.7, 0.5, 0.1] for three houses. The first two houses were sold while the third was not sold. So the actual output could be represented numerically as [1, 1, 0].

The first house was sold and the model said that was 70% likely. So the likelihood function after looking at one prediction is 0.7.

The second house was sold and the model said that was 50% likely. There is a rule of probability that the probability of multiple independent events is the product of their individual probabilities. So, we get the combined likelihood from the first two predictions by multiplying their associated probabilities. That is 0.7 * 0.5, which is 0.35.

Now we get to the third prediction, for the house that did not sell. The model said it was 10% likely to sell. That means it was 90% likely not to sell.

So the observed outcome of not selling was 90% likely according to the model. So, we multiply the previous result of 0.35 by 0.9, which gives 0.315.

We could step through all our predictions. Each time we'd find the probability associated with the outcome that actually occurred, and we'd multiply that by the previous result. That's the likelihood.
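The stepping-through described above can be sketched as a short loop over the three houses from the example. The function name `likelihood` is an assumption made for this sketch.

```python
def likelihood(labels, probs):
    """Multiply together the probabilities of the outcomes that occurred."""
    result = 1.0
    for y, p in zip(labels, probs):
        # If the house sold, use p; if it did not, use 1 - p.
        p_actual = p if y == 1 else 1.0 - p
        result *= p_actual
    return result

house_likelihood = likelihood([1, 1, 0], [0.7, 0.5, 0.1])
print(round(house_likelihood, 3))  # 0.315
```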

Each prediction is between 0 and 1. If you multiply enough numbers in this range, the result gets so small that computers can't keep track of it. So, as a clever computational trick, we instead keep track of the log of the likelihood. This is in a range that's easy to keep track of. We multiply this by negative 1 to maintain a common convention that lower loss scores are better.
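Here is a rough sketch of why the log helps, using a made-up run of 2000 predictions of 0.5, followed by the negative log likelihood for the three houses. The helper name `negative_log_likelihood` is an assumption, and the natural log is assumed as the base. `math.prod` needs Python 3.8 or later.

```python
import math

def negative_log_likelihood(labels, probs):
    """Sum the logs of the probabilities of the actual outcomes, then negate."""
    total = 0.0
    for y, p in zip(labels, probs):
        p_actual = p if y == 1 else 1.0 - p
        total += math.log(p_actual)
    return -total

# Multiplying 2000 probabilities of 0.5 underflows to exactly 0.0 ...
raw_product = math.prod([0.5] * 2000)
# ... while adding their logs stays a perfectly ordinary number.
log_sum = sum(math.log(0.5) for _ in range(2000))

print(raw_product)        # 0.0
print(round(log_sum, 2))  # about -1386.29

house_nll = negative_log_likelihood([1, 1, 0], [0.7, 0.5, 0.1])
print(round(house_nll, 3))  # about 1.155
```

Many tools report this total divided by the number of predictions, so the score does not grow just because the dataset is larger. Note that a predicted probability of exactly 0 or 1 for the wrong outcome would make `math.log` fail, so implementations usually nudge probabilities slightly away from those extremes first.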

So now you know what Logarithmic (Log) Loss is in data science projects.
