# Data drift parameters

{% hint style="info" %}
**You are looking at the old Evidently documentation**: this API is available with versions 0.6.7 or lower. Check the newer docs version [here](https://docs.evidentlyai.com/introduction).
{% endhint %}

**Pre-requisites**:

* You know how to generate Reports or Test Suites with default parameters.
* You know how to pass custom parameters for Reports or Test Suites.
* You know how to use Column Mapping to set the input data type.

## Default

All Presets, Tests, and Metrics that include data or target (prediction) drift evaluation use the default [Data Drift algorithm](/reference/data-drift-algorithm.md). It automatically selects an appropriate drift detection method based on the feature type and volume.
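The defaults listed in the method tables further down this page can be summarized as a small lookup. The sketch below only restates those documented rules; it is not Evidently's implementation, and the helper name `pick_default_stattest` and its arguments are hypothetical.

```python
def pick_default_stattest(column_type: str, n_objects: int, n_labels: int = 0) -> str:
    """Return the default drift method name as listed in the tables on this page.

    Hypothetical helper for illustration only; not part of Evidently.
    column_type: "num", "cat", or "text".
    n_objects: number of objects (rows) in the data.
    n_labels: number of distinct labels, used for categorical columns only.
    """
    small = n_objects <= 1000
    if column_type == "num":
        return "ks" if small else "wasserstein"
    if column_type == "cat":
        if not small:
            return "jensenshannon"
        # Binary columns use the Z-test; more than 2 labels use Chi-Square.
        return "z" if n_labels <= 2 else "chisquare"
    if column_type == "text":
        return "perc_text_content_drift" if small else "abs_text_content_drift"
    raise ValueError(f"Unknown column type: {column_type!r}")


print(pick_default_stattest("num", 500))              # ks
print(pick_default_stattest("num", 5000))             # wasserstein
print(pick_default_stattest("cat", 800, n_labels=2))  # z
print(pick_default_stattest("cat", 800, n_labels=5))  # chisquare
```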

You can override the defaults by passing a custom parameter to the chosen Test, Metric, or Preset. You can define the drift detection method, the threshold, or both.

## Code example

See the how-to notebook for an example of passing custom drift parameters:

{% embed url="https://github.com/evidentlyai/evidently/blob/ad71e132d59ac3a84fce6cf27bd50b12b10d9137/examples/how_to_questions/how_to_specify_stattest_for_a_testsuite.ipynb" %}

## Examples

To set a custom drift method and threshold on the **column level**:

```python
ColumnDriftMetric(column_name='feature1', stattest='wasserstein', stattest_threshold=0.2) 
```

If you have a Preset, Test or Metric that checks for drift in **multiple columns** at the same time, you can set a custom drift method for all columns, all numerical/categorical columns, or for each column individually.

Here is how you set the drift detection method for all categorical columns:

```python
DataDriftPreset(cat_stattest='psi', cat_stattest_threshold=0.05)
```

To set a custom condition for the **dataset drift** (share of drifting columns in the dataset) in the relevant Metrics or Presets:

```python
DatasetDriftMetric(drift_share=0.7)
```
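To make the `drift_share` condition concrete, here is a minimal sketch of the dataset-level decision. It assumes that dataset drift is flagged when the share of drifting columns is at least `drift_share`; check the exact comparison against the library. The helper `dataset_drift` and the column names are hypothetical.

```python
def dataset_drift(column_drift: dict, drift_share: float):
    """Return (share of drifting columns, dataset drift flag).

    Hypothetical helper for illustration; not part of Evidently.
    column_drift maps a column name to True if drift was detected in it.
    Assumption: dataset drift is flagged when share >= drift_share.
    """
    share = sum(column_drift.values()) / len(column_drift)
    return share, share >= drift_share


# Two of four columns drift, so the share of drifting columns is 0.5.
per_column = {"age": True, "income": True, "city": False, "plan": False}

print(dataset_drift(per_column, drift_share=0.7))  # (0.5, False)
print(dataset_drift(per_column, drift_share=0.4))  # (0.5, True)
```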

Note that this works slightly differently for Tests. To set a custom condition for the **dataset drift** when you run a relevant **Test**, you should set a condition for the share of drifted features using standard `lt` and `gt` parameters:

```python
TestShareOfDriftedColumns(lt=0.5)
```

When you set the drift threshold for an individual column Test such as `TestColumnDrift()`, use `stattest_threshold` and the other drift parameters the same way as in Metrics (not `lt` and `gt`).
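The difference between the two styles can be sketched in plain Python. With `lt` and `gt`, the Test outcome is a condition on the share of drifted columns. The helper `share_condition_passes` below is hypothetical and only illustrates how such bounds are read.

```python
from typing import Optional


def share_condition_passes(
    share: float, lt: Optional[float] = None, gt: Optional[float] = None
) -> bool:
    """Return True when the share of drifted columns meets every given bound.

    Hypothetical helper for illustration; not part of Evidently.
    """
    if lt is not None and not share < lt:
        return False
    if gt is not None and not share > gt:
        return False
    return True


# With lt=0.5 the check passes only while fewer than half of the columns drift.
print(share_condition_passes(0.3, lt=0.5))  # True
print(share_condition_passes(0.6, lt=0.5))  # False
```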

## Tabular drift detection

The following methods and parameters apply to **tabular** data (as parsed automatically or specified as numerical or categorical columns in the column mapping).

### Drift parameters - Tabular

The following drift detection parameters are available in the `DataDriftTable()`, `DatasetDriftMetric()`, `ColumnDriftMetric()`, related Tests, and Presets that contain them.

| Parameter                                                                                  | Description                                                                                                                                                                                                               |
| ------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `stattest`                                                                                 | Defines the drift detection method for a given column (if a single column is tested), or all columns in the dataset (if multiple columns are tested).                                                                     |
| `stattest_threshold`                                                                       | <p>Sets the drift threshold in a given column or all columns.<br>The threshold meaning varies based on the drift detection method, e.g., it can be the value of a distance metric or a p-value of a statistical test.</p> |
| `drift_share`                                                                              | Defines the share of drifting columns as a condition for Dataset Drift in `DatasetDriftMetric` or inside a Preset.                                                                                                        |
| <p><code>cat\_stattest</code><br><code>cat\_stattest\_threshold</code></p>                 | Sets the drift method and/or threshold for all categorical columns in the dataset.                                                                                                                                        |
| <p><code>num\_stattest</code><br><code>num\_stattest\_threshold</code></p>                 | Sets the drift method and/or threshold for all numerical columns in the dataset.                                                                                                                                          |
| <p><code>per\_column\_stattest</code><br><code>per\_column\_stattest\_threshold</code></p> | Sets the drift method and/or threshold for the listed columns (accepts a dictionary).                                                                                                                                     |

{% hint style="info" %}
**How to check available parameters.** You can verify which parameters are available for a specific test, metric, or preset in the [All tests](/reference/all-tests.md) or [All metrics](/reference/all-metrics.md) tables, or consult the [API reference](https://docs.evidentlyai.com/reference/api-reference).
{% endhint %}
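The per-column parameters accept dictionaries keyed by column name. The sketch below only assembles such dictionaries in plain Python and checks that every key refers to a known column before the parameters are passed on. The column names, the chosen methods and thresholds, and the helper `find_unknown_columns` are all hypothetical.

```python
# Hypothetical columns and per-column choices, for illustration only.
dataset_columns = ["age", "income", "city"]
per_column_stattest = {"age": "wasserstein", "city": "psi"}
per_column_stattest_threshold = {"age": 0.2, "city": 0.1}


def find_unknown_columns(columns, *param_dicts):
    """Return, in sorted order, dictionary keys that are not dataset columns."""
    known = set(columns)
    unknown = set()
    for params in param_dicts:
        unknown.update(name for name in params if name not in known)
    return sorted(unknown)


unknown = find_unknown_columns(
    dataset_columns, per_column_stattest, per_column_stattest_threshold
)
print(unknown)  # [] because every key is a dataset column

# Keyword arguments as they could then be passed to a Metric or Preset that
# supports these parameters, e.g. DataDriftPreset(**drift_params).
drift_params = {
    "per_column_stattest": per_column_stattest,
    "per_column_stattest_threshold": per_column_stattest_threshold,
}
```

A check like this catches misspelled column names early, before a Report or Test Suite is run.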

### Drift detection methods - Tabular

To use the following drift detection methods, pass them using the `stattest` parameter.

| StatTest                                                         | Applicable to                                                                                                                      | Drift score                                                                                                                               |
| ---------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
| <p><code>ks</code><br>Kolmogorov–Smirnov (K-S) test</p>          | <p>tabular data<br>only numerical<br><br><strong>Default method for numerical data, if <= 1000 objects</strong></p>                | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>chisquare</code><br>Chi-Square test</p>                 | <p>tabular data<br>only categorical<br><br><strong>Default method for categorical with > 2 labels, if <= 1000 objects</strong></p> | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>z</code><br>Z-test</p>                                  | <p>tabular data<br>only categorical<br><br><strong>Default method for binary data, if <= 1000 objects</strong></p>                 | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>wasserstein</code><br>Wasserstein distance (normed)</p> | <p>tabular data<br>only numerical<br><br><strong>Default method for numerical data, if > 1000 objects</strong></p>                 | <p>returns <code>distance</code><br>drift detected when <code>distance</code> >= <code>threshold</code><br>default threshold: 0.1</p>     |
| <p><code>kl\_div</code><br>Kullback-Leibler divergence</p>       | <p>tabular data<br>numerical and categorical</p>                                                                                   | <p>returns <code>divergence</code><br>drift detected when <code>divergence</code> >= <code>threshold</code><br>default threshold: 0.1</p> |
| <p><code>psi</code><br>Population Stability Index (PSI)</p>      | <p>tabular data<br>numerical and categorical</p>                                                                                   | <p>returns <code>psi\_value</code><br>drift detected when <code>psi\_value</code> >= <code>threshold</code><br>default threshold: 0.1</p> |
| <p><code>jensenshannon</code><br>Jensen-Shannon distance</p>     | <p>tabular data<br>numerical and categorical<br><br><strong>Default method for categorical, if > 1000 objects</strong></p>         | <p>returns <code>distance</code><br>drift detected when <code>distance</code> >= <code>threshold</code><br>default threshold: 0.1</p>     |
| <p><code>anderson</code><br>Anderson-Darling test</p>            | <p>tabular data<br>only numerical</p>                                                                                              | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>fisher\_exact</code><br>Fisher's Exact test</p>         | <p>tabular data<br>only categorical</p>                                                                                            | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>cramer\_von\_mises</code><br>Cramer-Von-Mises test</p>  | <p>tabular data<br>only numerical</p>                                                                                              | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>g-test</code><br>G-test</p>                             | <p>tabular data<br>only categorical</p>                                                                                            | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>hellinger</code><br>Hellinger Distance (normed)</p>     | <p>tabular data<br>numerical and categorical</p>                                                                                   | <p>returns <code>distance</code><br>drift detected when <code>distance</code> >= <code>threshold</code><br>default threshold: 0.1</p>     |
| <p><code>mannw</code><br>Mann-Whitney U-rank test</p>            | <p>tabular data<br>only numerical</p>                                                                                              | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>ed</code><br>Energy distance</p>                        | <p>tabular data<br>only numerical</p>                                                                                              | <p>returns <code>distance</code><br>drift detected when <code>distance</code> >= <code>threshold</code><br>default threshold: 0.1</p>     |
| <p><code>es</code><br>Epps-Singleton test</p>                    | <p>tabular data<br>only numerical</p>                                                                                              | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>t\_test</code><br>T-Test</p>                            | <p>tabular data<br>only numerical</p>                                                                                              | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>empirical\_mmd</code><br>Empirical-MMD</p>              | <p>tabular data<br>only numerical</p>                                                                                              | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
| <p><code>TVD</code><br>Total-Variation-Distance</p>              | <p>tabular data<br>only categorical</p>                                                                                            | <p>returns <code>p\_value</code><br>drift detected when <code>p\_value</code> < <code>threshold</code><br>default threshold: 0.05</p>     |
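The threshold is read in opposite directions depending on what the method returns. The sketch below encodes the rules and default thresholds from the table above. It is an illustration rather than Evidently's code, and the names `P_VALUE_METHODS`, `DISTANCE_METHODS`, and `is_drift` are hypothetical.

```python
# Methods that return a p-value: drift when p_value < threshold (default 0.05).
P_VALUE_METHODS = {
    "ks", "chisquare", "z", "anderson", "fisher_exact", "cramer_von_mises",
    "g-test", "mannw", "es", "t_test", "empirical_mmd", "TVD",
}
# Methods that return a distance or divergence: drift when score >= threshold
# (default 0.1).
DISTANCE_METHODS = {
    "wasserstein", "kl_div", "psi", "jensenshannon", "hellinger", "ed",
}


def is_drift(stattest: str, score: float, threshold: float = None) -> bool:
    """Apply the drift rule from the table to a score that was already computed."""
    if stattest in P_VALUE_METHODS:
        limit = 0.05 if threshold is None else threshold
        return score < limit
    if stattest in DISTANCE_METHODS:
        limit = 0.1 if threshold is None else threshold
        return score >= limit
    raise ValueError(f"Unknown method: {stattest!r}")


print(is_drift("ks", 0.01))                            # True: low p-value
print(is_drift("ks", 0.20))                            # False
print(is_drift("wasserstein", 0.15))                   # True: distance >= 0.1
print(is_drift("wasserstein", 0.15, threshold=0.2))    # False with a custom threshold
```

Raising the threshold therefore makes a p-value method more sensitive and a distance method less sensitive.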

## Text drift detection

Text drift detection applies to columns with **raw text data**, as specified in column mapping.

{% hint style="info" %}
**Embedding drift detection.** If you work with embeddings, you can use [Embeddings Drift Detection methods](/user-guide/customization/embeddings-drift-parameters.md).
{% endhint %}

### Drift parameters - Text

The following text drift detection parameters are available in the `DataDriftTable()`, `DatasetDriftMetric()`, `ColumnDriftMetric()`, related Tests and Presets that contain them.

| Parameter                 | Description                                                                                                                                        |
| ------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| `stattest`                | Defines the drift detection method for a given column that contains text data, or for all columns in the dataset if all columns contain text data. |
| `stattest_threshold`      | Sets the drift threshold for the given text column, or for all columns if all columns contain text data.                                           |
| `text_stattest`           | Defines the drift detection method for all text columns in the dataset.                                                                            |
| `text_stattest_threshold` | Sets the drift threshold for all text columns in the dataset.                                                                                      |

### Drift detection methods - Text

To use the following text drift detection methods, pass them using the `stattest` parameter.

| StatTest                                                                                                                      | Description                                                                                                                                                                                        | Drift score                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| ----------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| <p><code>perc\_text\_content\_drift</code><br>Text content drift (domain classifier, with statistical hypothesis testing)</p> | <p>Applies only to text data. Trains a classifier model to distinguish between text in “current” and “reference” datasets.<br><br><strong>Default for text data when <= 1000 objects.</strong></p> | <ul><li>returns <code>roc\_auc</code> of the classifier as a <code>drift\_score</code></li><li>drift detected when <code>roc\_auc</code> > possible ROC AUC of the random classifier at a set percentile</li><li><code>threshold</code> sets the percentile of the possible ROC AUC values of the random classifier to compare against</li><li>default threshold: 0.95 (95th percentile)</li><li><code>roc\_auc</code> values can be 0 to 1 (typically 0.5 to 1); a higher value means more confident drift detection</li></ul> |
| <p><code>abs\_text\_content\_drift</code><br>Text content drift (domain classifier)</p>                                       | <p>Applies only to text data. Trains a classifier model to distinguish between text in “current” and “reference” datasets.<br><br><strong>Default for text data when > 1000 objects.</strong></p>  | <ul><li>returns <code>roc\_auc</code> of the classifier as a <code>drift\_score</code></li><li>drift detected when <code>roc\_auc</code> > <code>threshold</code></li><li><code>threshold</code> sets the ROC AUC threshold</li><li>default threshold: 0.55</li><li><code>roc\_auc</code> values can be 0 to 1 (typically 0.5 to 1); a higher value means more confident drift detection</li></ul>                                                                                                                              |
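The two decision rules in the table can be sketched in plain Python. This is an illustration under stated assumptions, not Evidently's implementation: the baseline for the percentile rule is simulated here by scoring the same labels with random numbers, and the helper names (`roc_auc`, `random_auc_percentile`, `abs_text_drift`, `perc_text_drift`) are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def random_auc_percentile(labels, percentile=0.95, n_runs=200, seed=0):
    """Simulate a random classifier and return its ROC AUC at the percentile."""
    rng = random.Random(seed)
    aucs = sorted(
        roc_auc(labels, [rng.random() for _ in labels]) for _ in range(n_runs)
    )
    index = min(int(percentile * n_runs), n_runs - 1)
    return aucs[index]


def abs_text_drift(auc, threshold=0.55):
    """abs_text_content_drift rule: drift when roc_auc > threshold."""
    return auc > threshold


def perc_text_drift(auc, labels, threshold=0.95):
    """perc_text_content_drift rule: drift when roc_auc beats the random
    classifier at the percentile set by the threshold."""
    return auc > random_auc_percentile(labels, percentile=threshold)


# 0 marks a "reference" text, 1 marks a "current" text.
labels = [0] * 50 + [1] * 50

print(abs_text_drift(0.60))           # True
print(perc_text_drift(0.90, labels))  # True
print(perc_text_drift(0.50, labels))  # False
```

In this sketch, the percentile rule adapts the cut-off to the sample size: with fewer objects, a random classifier reaches higher ROC AUC values by chance, so a higher score is needed to flag drift.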

### Text descriptors drift

You can also check for distribution drift in text descriptors, such as text length.

To use this method, call a separate `TextDescriptorsDriftMetric()`. You can pass any of the tabular drift detection methods as a parameter.

```python
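from evidently.report import Report
from evidently.metrics import TextDescriptorsDriftMetric

# Check drift in the descriptors of the "Review_Text" column.
# reviews_ref, reviews_cur, and column_mapping are assumed to be defined earlier.
report = Report(metrics=[
    TextDescriptorsDriftMetric("Review_Text"),
])

report.run(reference_data=reviews_ref, current_data=reviews_cur, column_mapping=column_mapping)
report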
```

