Kullback–Leibler (KL) divergence, sometimes referred to as relative entropy, is a statistical measure commonly used for concept drift detection. It quantifies how much one probability distribution differs from another. If Q is the distribution of the old data and P is the distribution of the new data, we would like to calculate:

$$KL(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}$$
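* The “||” represents the divergence: KL(P || Q) reads as "the divergence of P from Q". The measure is not symmetric, so KL(P || Q) generally differs from KL(Q || P).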
We can see that if P(x) is high and Q(x) is low, the divergence will be high.
If P(x) is low and Q(x) is high, the distributions still diverge, but this outcome has less influence on the result because each term is weighted by P(x).
If P(x) and Q(x) are similar, the divergence will be low, and it is exactly zero when the two distributions are identical.
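To make the behaviour described above concrete, here is a minimal sketch of the calculation for two discrete distributions. It assumes both distributions are already binned into the same categories and uses the natural logarithm, so the result is in nats. The function name `kl_divergence` and the example histograms `q_old`, `p_new` and `p_same` are hypothetical and chosen only for illustration.

```python
import math


def kl_divergence(p, q):
    """Return KL(P || Q) in nats for two discrete distributions.

    Assumption: p and q are sequences of probabilities over the same bins,
    each summing to 1.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for p_x, q_x in zip(p, q):
        if p_x == 0:
            continue  # by convention, 0 * log(0 / q) contributes nothing
        if q_x == 0:
            return math.inf  # P has mass where Q has none
        total += p_x * math.log(p_x / q_x)
    return total


# Illustrative (made-up) histograms: old data Q and two candidates for new data P.
q_old = [0.5, 0.3, 0.2]
p_new = [0.2, 0.3, 0.5]   # mass has shifted from the first bin to the last
p_same = [0.5, 0.3, 0.2]  # identical to the old distribution

drifted = kl_divergence(p_new, q_old)
unchanged = kl_divergence(p_same, q_old)
print(f"drifted:   {drifted:.4f}")
print(f"unchanged: {unchanged:.4f}")
```

With these example values the drifted case comes out at about 0.27 nats, while the unchanged case is 0. In practice a drift detector would compare such a score against a threshold chosen for the data at hand, which this sketch does not attempt to pick.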