The most advanced ML Observability product in the market
Building an ML platform is nothing like putting together Ikea furniture; obviously, Ikea is way more difficult. However, they both, similarly, include many different parts that help create value when put together. As every organization sets out on a unique path to building its own machine learning platform, taking on the project of building a […]
Start integrating our products and tools.
We’re excited 😁 to share that Forbes has named Aporia a Next Billion-Dollar Company. This recognition comes on the heels of our recent $25 million Series A funding and is a huge testament that Aporia’s mission and the need for trust in AI are more relevant than ever. We are very proud to be listed […]
Concept Drift is a situation in which the statistical properties of a target variable (what the model is trying to predict) changes over time in unforeseen ways.
Visually, we can say that a concept is a way to separate between the blue and green dots in the plot above. The black line represents a concept that separates the blue and green dots.
Types of Drifts in Machine Learning
For the following definitions let’s denote the following parameters:
X- Model’s input population.
ŷ – Model’s prediction.
Y- True label population.
Concept drift: a change in the distribution of p(Y |X), meaning that there was a change in the relationship between the input of the model and the true label.
Prediction drift: a change in the distribution of the predicted label – p(ŷ |X), meaning that there was a change in the relationship between the input of the model and the model’s prediction.
Label drift: a change in the probability of a label p(Y).
Feature drift: a change in the probability of p(X), meaning there was a change in the distribution of the model’s input.
In order to better understand the effects of concept drift, we need to distinguish between two types of concept drift:
Virtual drift: when p(X) changes but p(Y|X)does not change. Meaning that there was a change in the features’ underlying distribution, but the model’s performance hasn’t changed.
Real drift: There was a change in p(Y|X), meaning the performance of the model changed.
Virtual drift vs real drift is illustrated in the following figure.
Learn more about concept drift here: