Statistical Methods for Concept Drift Detection

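Statistical methods compare two distributions and quantify how much they differ. Some methods compute a divergence, which measures the dissimilarity between distributions (not every divergence is a true distance metric; Kullback-Leibler divergence, for example, is not symmetric). Others run a hypothesis test that yields a test statistic or a p-value.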
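To make the two families concrete, here is a minimal sketch using only the Python standard library. It computes one divergence (Jensen-Shannon, on binned data) and one test score (the two-sample Kolmogorov-Smirnov statistic). The choice of these two measures, the bin edges, and the synthetic Gaussian samples are all assumptions made for illustration, not something the text above prescribes.

```python
import math
import random
from bisect import bisect_right


def histogram(values, edges):
    """Return bin probabilities for `values` over the bins defined by `edges`."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Clamp out-of-range values into the first or last bin.
        idx = min(max(bisect_right(edges, v) - 1, 0), len(counts) - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2), bounded in [0, 1]."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


# Synthetic data (assumption): one reference sample, one sample from the
# same distribution, and one sample whose mean has shifted by 1.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(2000)]
same = [rng.gauss(0.0, 1.0) for _ in range(2000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(2000)]

edges = [-4 + 0.5 * i for i in range(17)]  # 16 bins covering [-4, 4]
p_ref, p_same, p_shift = (histogram(s, edges) for s in (reference, same, shifted))

print("JS  same vs reference:   ", js_divergence(p_ref, p_same))
print("JS  shifted vs reference:", js_divergence(p_ref, p_shift))
print("KS  same vs reference:   ", ks_statistic(reference, same))
print("KS  shifted vs reference:", ks_statistic(reference, shifted))
```

Both scores should come out noticeably larger for the shifted sample than for the sample drawn from the same distribution. The divergence needs a binning decision, while the KS statistic works directly on the raw values.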

When to use Statistical Methods:

Statistical methods assess how the distribution of one dataset differs from that of another.

We can use these tools to compare data from different timeframes and measure how the behavior of the data changes over time.
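As a rough illustration of comparing timeframes, the sketch below scores each window of a feature against a fixed reference window. It uses the one-dimensional Wasserstein-1 distance, which is a swapped-in choice for this example, and it assumes equal-size windows so the distance reduces to the mean absolute difference of the sorted samples. The drifting feature is synthetic and hypothetical.

```python
import random


def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance between two equal-size samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size windows")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


rng = random.Random(42)
WINDOW = 500

# Hypothetical feature whose mean rises by 0.5 in every timeframe.
windows = [[rng.gauss(0.5 * t, 1.0) for _ in range(WINDOW)] for t in range(5)]

reference = windows[0]  # the first timeframe serves as the baseline
scores = [wasserstein_1d(reference, w) for w in windows]

for t, score in enumerate(scores):
    print(f"timeframe {t}: distance to reference = {score:.3f}")
```

Plotting or alerting on a score like this over time gives a simple view of how far the current data has moved from the baseline period.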

Because these methods need no labels and no additional memory, they give a quick indicator of changes in the model's input features or outputs. That lets us start investigating before any degradation shows up in the model's performance metrics. On the other hand, working without labels and ignoring both past events and the other features can produce false positives if not handled correctly.
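A label-free check of this kind can be sketched as follows. Each feature is compared between a reference window and a current window using the two-sample Kolmogorov-Smirnov statistic and the standard large-sample critical value. Because checking many features at once inflates the chance of a false alarm, the sketch applies a Bonferroni correction; that correction is only one possible way to handle false positives and is an assumption of this example, as are the feature names and the synthetic data.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def ks_threshold(n, m, alpha):
    """Large-sample critical value for the two-sample KS statistic."""
    return math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))


def drifted_features(reference, current, alpha=0.05):
    """Return the features whose KS statistic exceeds the corrected threshold."""
    per_feature_alpha = alpha / len(reference)  # Bonferroni correction
    flagged = []
    for name, ref in reference.items():
        cur = current[name]
        limit = ks_threshold(len(ref), len(cur), per_feature_alpha)
        if ks_statistic(ref, cur) > limit:
            flagged.append(name)
    return flagged


# Hypothetical features; only "income" is shifted in the current window.
rng = random.Random(7)
N = 1000
reference = {name: [rng.gauss(0.0, 1.0) for _ in range(N)]
             for name in ("age", "income", "tenure")}
current = {
    "age": [rng.gauss(0.0, 1.0) for _ in range(N)],
    "income": [rng.gauss(1.0, 1.0) for _ in range(N)],
    "tenure": [rng.gauss(0.0, 1.0) for _ in range(N)],
}

flagged = drifted_features(reference, current)
print("features flagged for drift:", flagged)
```

No labels are used anywhere, and each feature is tested in isolation, which is exactly why a flagged feature is a prompt to investigate and not proof that model performance has degraded.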

To learn more about these concepts, see our articles Concept Drift in Machine Learning 101 and 8 Concept Drift Detection Methods.
