Data drift algorithm

You are looking at the old Evidently documentation: this API is available with versions 0.6.7 or lower. Check the newer version.

In some Tests and Metrics, Evidently uses a default data drift detection algorithm. It helps detect distribution drift in individual features, the prediction, or the target. This page describes how the default algorithm works.

How it works

Evidently compares the distributions of the values in a given column (or columns) of the two datasets. You should pass these datasets as reference and current. Evidently applies several statistical tests and drift detection methods to detect if the distribution has changed significantly. It returns a "drift detected" or "not detected" result.

There is default logic for choosing the appropriate drift test for each column. It is based on:

  • column type: categorical, numerical, text data or embeddings

  • the number of observations in the reference dataset

  • the number of unique values in the column (n_unique)

Tabular Data

For small data with <= 1000 observations in the reference dataset:

  • For numerical columns (n_unique > 5): two-sample Kolmogorov-Smirnov test.

  • For categorical columns or numerical columns with n_unique <= 5: chi-squared test.

  • For binary categorical features (n_unique <= 2): proportion difference test for independent samples based on Z-score.

All tests use a 0.95 confidence level by default.

For larger data with > 1000 observations in the reference dataset:

  • For numerical columns (n_unique > 5): Wasserstein Distance.

  • For categorical columns or numerical columns with n_unique <= 5: Jensen-Shannon divergence.

All metrics use a threshold = 0.1 by default.

You can always modify this drift detection logic. You can select any of the statistical tests available in the library (including PSI, K-L divergence, Jensen-Shannon distance, Wasserstein distance, etc.), specify custom thresholds, or pass a custom test. You can read more about using data drift parameters and available drift detection methods.
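
As an illustration, here is a minimal sketch using the legacy Python API that this page documents (versions 0.6.7 and lower). The dataset file names and the "price" column are placeholders: the first Report applies the default logic described above to all columns, and the second overrides the statistical test and threshold for a single column.

```python
import pandas as pd

from evidently.report import Report
from evidently.metric_preset import DataDriftPreset
from evidently.metrics import ColumnDriftMetric

# Reference and current datasets must share the same schema (placeholder file names)
reference = pd.read_csv("reference.csv")
current = pd.read_csv("current.csv")

# Default drift detection logic, applied to every column
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html("data_drift_report.html")

# Override the default for a single (hypothetical) column: use PSI with a custom threshold
custom_report = Report(metrics=[
    ColumnDriftMetric(column_name="price", stattest="psi", stattest_threshold=0.2),
])
custom_report.run(reference_data=reference, current_data=current)
```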

Text Data

Evidently detects text content drift using a domain classifier. It trains a binary classification model to discriminate between data from the reference and current distributions.

For small data with <= 1000 observations in the reference dataset, the default detects drift if the ROC AUC of the drift detection classifier is greater than the possible ROC AUC of a random classifier at the 95th percentile.

For larger data with > 1000 observations, the default detects drift if the ROC AUC > 0.55.

Text content drift detection method

For small data, the drift score is the ROC AUC of the domain classifier computed on a validation dataset. The ROC AUC of the obtained classifier is compared to the ROC AUC of a random classifier at a set percentile. To ensure the result is statistically meaningful, we repeat the calculation 1000 times with randomly assigned target class probabilities. This produces a distribution with a mean of 0.5. We then take the 95th percentile (default) of this distribution and compare it to the ROC AUC of the classifier. If the classifier score is higher, we consider data drift to be detected. You can also set a different percentile as a parameter.

For large data, the ROC AUC of the obtained classifier is directly compared against the set ROC AUC threshold.

You can set different thresholds. You can specify a custom threshold as a parameter.
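
For example, a sketch of running the domain-classifier text drift check with the legacy API. The "review_text" column and file names are placeholders; the column has to be declared as a text feature in the column mapping so that the text-specific default applies.

```python
import pandas as pd

from evidently import ColumnMapping
from evidently.report import Report
from evidently.metrics import ColumnDriftMetric

reference = pd.read_csv("reference_texts.csv")  # placeholder file names
current = pd.read_csv("current_texts.csv")

# Declare the raw text column so the domain-classifier default applies to it
column_mapping = ColumnMapping(text_features=["review_text"])

report = Report(metrics=[ColumnDriftMetric(column_name="review_text")])
report.run(reference_data=reference, current_data=current, column_mapping=column_mapping)
report.save_html("text_drift_report.html")
```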

Text Descriptors Drift

The descriptors are treated as tabular features. The default drift detection methods for tabular features apply.

There is also an additional method that detects drift in Text Descriptors (such as text length or the share of OOV words). It is available as part of the Text Overview Preset. You can also include it as TextDescriptorsDriftMetric() in a custom Report, or in a Test Suite accordingly.
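
A sketch of the descriptor drift check with the legacy API, assuming the same placeholder "review_text" column declared as a text feature:

```python
import pandas as pd

from evidently import ColumnMapping
from evidently.report import Report
from evidently.metrics import TextDescriptorsDriftMetric

reference = pd.read_csv("reference_texts.csv")  # placeholder file names
current = pd.read_csv("current_texts.csv")

column_mapping = ColumnMapping(text_features=["review_text"])

# Computes descriptors (text length, share of OOV words, etc.) for the text column
# and checks them for drift using the default tabular drift detection methods
report = Report(metrics=[TextDescriptorsDriftMetric(column_name="review_text")])
report.run(reference_data=reference, current_data=current, column_mapping=column_mapping)
```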

Embeddings

Evidently detects embeddings drift using a classifier. It trains a binary classification model to discriminate between data from the reference and current distributions.

For small data with <= 1000 observations in the reference dataset, the default detects drift if the ROC AUC of the drift detection classifier is greater than the possible ROC AUC of a random classifier at the 95th percentile.

For larger data with > 1000 observations, the default detects drift if the ROC AUC > 0.55.

You can choose other embedding drift detection methods. You can specify custom thresholds and parameters such as dimensionality reduction, and choose from other methods, including Euclidean distance, Cosine Similarity, Maximum Mean Discrepancy, and the share of drifted embeddings. You must specify this as a parameter.

You can also set different thresholds for the default method: specify a custom threshold as a parameter.
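
For illustration, a sketch of the embeddings drift check with the legacy API. The embedding column names and the set name are hypothetical; they are grouped into a named set via the column mapping, and the default classifier-based method is used.

```python
import pandas as pd

from evidently import ColumnMapping
from evidently.report import Report
from evidently.metrics import EmbeddingsDriftMetric

reference = pd.read_csv("reference_embeddings.csv")  # placeholder file names
current = pd.read_csv("current_embeddings.csv")

# Group the (hypothetical) embedding columns emb_0 ... emb_31 into a named set
column_mapping = ColumnMapping(
    embeddings={"text_embeddings": [f"emb_{i}" for i in range(32)]}
)

# Uses the default classifier-based embedding drift detection for the named set
report = Report(metrics=[EmbeddingsDriftMetric("text_embeddings")])
report.run(reference_data=reference, current_data=current, column_mapping=column_mapping)
```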

Dataset-level drift

With Presets like DatasetDriftPreset(), Metrics like DatasetDriftMetric(), or Tests like TestShareOfDriftedColumns(), you can also set a rule on top of the individual feature drift results to detect dataset-level drift.

For example, you can declare dataset drift if 50% of all features (columns) drifted, or if ⅓ of the most important features drifted. The default in DatasetDriftPreset() is 0.5.

Note that by default this includes all columns in the dataset. Suppose your dataset contains the prediction column, and you want to separate it from input drift detection. In that case, you should pre-process your dataset to exclude it or specify a list of columns you want to test for drift, and pass the list as a parameter.
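
As an example, here is a sketch of setting a dataset-level rule with the legacy API. The 0.33 share and the column names are placeholder choices.

```python
import pandas as pd

from evidently.report import Report
from evidently.metric_preset import DataDriftPreset
from evidently.test_suite import TestSuite
from evidently.tests import TestShareOfDriftedColumns

reference = pd.read_csv("reference.csv")  # placeholder file names
current = pd.read_csv("current.csv")

# Declare dataset drift if at least a third of the selected columns drifted,
# limiting the check to explicitly listed input columns (placeholder names)
report = Report(metrics=[
    DataDriftPreset(columns=["feature_1", "feature_2", "feature_3"], drift_share=0.33),
])
report.run(reference_data=reference, current_data=current)

# A similar rule expressed as a Test with an explicit condition
suite = TestSuite(tests=[TestShareOfDriftedColumns(lt=0.33)])
suite.run(reference_data=reference, current_data=current)
```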

Input data requirements

Empty columns

To evaluate data or prediction drift in the dataset, you need to ensure that the columns you test for drift are not empty. If these columns are empty in either reference or current data, Evidently will not calculate distribution drift and will raise an error.

Empty values

If some columns contain empty or infinite values (+/- np.inf), these values will be filtered out when calculating distribution drift in the corresponding column.

By default, drift tests do not react to changes or increases in the number of empty values. Since a high number of nulls can be an important indicator, we recommend grouping the data drift tests (which check for distribution shift) with data integrity tests (which check for the share of nulls). You can choose from several null-related tests and metrics and set a threshold.
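
For example, a sketch of pairing a drift test with a null-related data integrity test in one Test Suite (legacy API; the 10% threshold is a placeholder):

```python
import pandas as pd

from evidently.test_suite import TestSuite
from evidently.tests import TestShareOfDriftedColumns, TestShareOfMissingValues

reference = pd.read_csv("reference.csv")  # placeholder file names
current = pd.read_csv("current.csv")

# Pair a distribution-shift check with a data integrity check for missing values
suite = TestSuite(tests=[
    TestShareOfDriftedColumns(),        # default dataset-level drift condition
    TestShareOfMissingValues(lt=0.1),   # placeholder: fail if 10% or more of the values are missing
])
suite.run(reference_data=reference, current_data=current)
```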

Resources

To build up a better intuition for which tests are better in different kinds of use cases, you can read our in-depth blogs with experimental code:

  • Which test is the best? We compared 5 methods to detect data drift on large datasets

  • Shift happens: how to detect drift in ML embeddings

Additional links:

  • How to interpret data and prediction drift together?

  • Do I need to monitor data drift if I can measure the ML model quality?

  • "My data drifted. What's next?" How to handle ML model drift in production.

  • What is the difference between outlier detection and data drift detection?