In models, one often has to make do with subpar labels before production. Once high-quality labels are available, it becomes possible to re-evaluate the model’s performance. This is the most conclusive metric for detecting model degradation. Unfortunately, in most use cases it can take weeks, months, or even years until high-quality labels are available. Waiting for high-quality labels for re-evaluation might result in late detection of issues after serious damage has already been done.