Drift Metrics

Data drift and influence drift metrics compare the values of a feature, or its influence values, across two data splits. TruEra uses 'range' to denote the size of a numerical feature's domain and 'card' to denote the cardinality of a categorical feature's domain. For every metric, a result of 0 indicates no drift.

| Name | Feature Type | Range | Notes |
| --- | --- | --- | --- |
| Wasserstein | Numerical | [0, range] | Also referred to as "Earth Mover's Distance". |
| Wasserstein (unordered) | Categorical | [0, 1] | Same as Total Variation Distance. |
| Wasserstein (ordered) | Categorical | [0, card] | Interprets categorical values as integers arranged on a [0, card] line. |
| Total variation distance | Categorical | [0, 1] | Half of L1 Distance. |
| Jensen-Shannon distance | Both | [0, 1] | |
| Chi-Square test | Categorical | [0, +inf) | |
| Population stability index | Both | [0, +inf) | Numeric version uses base deciles as segments. Not symmetric. |
| L1 | Categorical | [0, 2] | |
| L2 | Categorical | [0, sqrt(2)] | |
| LInfinity | Categorical | [0, 1] | |
| Difference of mean | Numerical | (-inf, +inf) | Not a metric. Not symmetric. Directional and can have cancellation. |
| Energy distance | Numerical | [0, +inf) | |
| Kolmogorov-Smirnov statistic | Numerical | [0, 1] | |
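
To make a few of the listed metrics concrete, the following is a minimal standard-library Python sketch. It is not TruEra's implementation. All function names are hypothetical, the sign convention for difference of mean (compare minus base) is an assumption, and the Wasserstein helper only handles two samples of equal size.

```python
import math
from bisect import bisect_right
from collections import Counter


def category_probs(values, categories):
    """Relative frequency of each category, in a fixed category order."""
    counts = Counter(values)
    return [counts[c] / len(values) for c in categories]


def l1(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def l2(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def linf(p, q):
    return max(abs(a - b) for a, b in zip(p, q))


def total_variation(p, q):
    # Half of the L1 distance between the two probability vectors.
    return 0.5 * l1(p, q)


def difference_of_mean(base, compare):
    # Assumed sign convention: compare minus base.
    return sum(compare) / len(compare) - sum(base) / len(base)


def wasserstein_equal_size(base, compare):
    """1-D Wasserstein distance for two samples of the same size:
    the mean absolute difference between the sorted samples."""
    if len(base) != len(compare):
        raise ValueError("this sketch only handles equal-size samples")
    pairs = zip(sorted(base), sorted(compare))
    return sum(abs(a - b) for a, b in pairs) / len(base)


def ks_statistic(base, compare):
    """Largest gap between the two empirical CDFs."""
    sb, sc = sorted(base), sorted(compare)
    return max(
        abs(bisect_right(sb, x) / len(sb) - bisect_right(sc, x) / len(sc))
        for x in sb + sc
    )


# Categorical feature observed in two splits.
categories = ["a", "b", "c"]
p = category_probs(["a", "a", "b", "c"], categories)  # [0.5, 0.25, 0.25]
q = category_probs(["a", "b", "b", "b"], categories)  # [0.25, 0.75, 0.0]
print(total_variation(p, q), l1(p, q), l2(p, q), linf(p, q))

# Numerical feature: both splits have mean 2.5, so the difference of
# mean cancels to 0 even though the distributions differ.
base_num = [1.0, 2.0, 3.0, 4.0]
comp_num = [0.0, 2.0, 3.0, 5.0]
print(difference_of_mean(base_num, comp_num))      # 0.0
print(wasserstein_equal_size(base_num, comp_num))  # 0.5
print(ks_statistic(base_num, comp_num))            # 0.25
```

The categorical upper bounds are reached when the two splits share no categories: for `p = [1.0, 0.0]` and `q = [0.0, 1.0]`, L1 is 2, L2 is sqrt(2), and both LInfinity and total variation distance are 1. The numerical example shows why difference of mean is flagged as subject to cancellation: it reports 0 while the Wasserstein distance and the Kolmogorov-Smirnov statistic are both non-zero.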

