
# Categorical Target Drift

**TL;DR:** The report explores the changes in the categorical target function (prediction).

* Performs a suitable **statistical test** to compare target (prediction) **distribution**.
* **Plots the relations** between each individual feature and the target (prediction).

## Summary

The **Target Drift** report helps detect and explore changes in the target function and/or model predictions.

The **Categorical Target Drift** report is suitable for problem statements with the categorical target function: binary classification, multi-class classification, etc.

## Requirements

To run this report, you need to have input features, and **target and/or prediction** columns available.

You will need **two** datasets. The **reference** dataset serves as a benchmark. We analyze the change by comparing the **current** production data to the **reference** data.

You can potentially choose any two datasets for comparison. But keep in mind that only the **reference** dataset will be used as a basis for comparison.

## How it works

We estimate the drift for the **target** (actual values) and **predictions** in the same manner. If both columns are passed to the dashboard, we build two sets of plots.

If only one of them (either target or predictions) is provided, we build one set of plots. If neither target nor predictions column is available, you will get an error.

To estimate the **categorical target (prediction) drift**, we compare the distribution of the target (prediction) in the two datasets. There is a default logic to choosing the appropriate statistical test, based on:

* the number of observations in the reference dataset
* the number of unique values in the target (n\_unique)

For **small data with <= 1000 observations** in the reference dataset:

* For categorical target with **n\_unique > 2**: [chi-squared test](https://en.wikipedia.org/wiki/Chi-squared_test).
* For **binary** categorical target (n\_unique <= 2), we use the proportion difference test for independent samples based on Z-score.

All tests use a 0.95 confidence level by default.
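For the binary case, a two-sided proportion difference test based on the Z-score can be sketched with the standard library alone. This is a minimal illustration, not the library's own code: the function name `proportion_z_test`, the pooled standard error, and the sample counts are assumptions made for the example.

```python
import math


def proportion_z_test(positives_ref, n_ref, positives_cur, n_cur):
    """Two-sided z-test for the difference of two independent proportions.

    Hypothetical helper for illustration; uses the pooled standard error.
    """
    p_ref = positives_ref / n_ref
    p_cur = positives_cur / n_cur
    # Pooled share of the positive class under the null hypothesis (no drift)
    p_pool = (positives_ref + positives_cur) / (n_ref + n_cur)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_ref + 1 / n_cur))
    if se == 0:
        # Both samples contain a single, identical class: no difference to test
        return 0.0, 1.0
    z = (p_ref - p_cur) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 200 of 500 positives in reference, 260 of 500 in current
z_score, p_value = proportion_z_test(200, 500, 260, 500)
# A 0.95 confidence level corresponds to a 0.05 significance threshold
drift_detected = p_value < 0.05
print(f"z = {z_score:.3f}, p-value = {p_value:.5f}, drift = {drift_detected}")
```

With a 0.95 confidence level, drift is flagged when the p-value falls below 0.05.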

For **larger data with > 1000 observations** in the reference dataset we use [Jensen–Shannon divergence](https://en.wikipedia.org/wiki/Jensen–Shannon_divergence) with a threshold = 0.1.
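To make the comparison concrete, here is a rough sketch of computing the Jensen-Shannon divergence between the class shares of two label columns. It rests on assumptions: the helper names and sample labels are made up, base-2 logarithms are used so the divergence lies between 0 and 1, and the page does not state the log base or whether the 0.1 threshold applies to the divergence or to its square root (the distance), so the sketch reports both.

```python
import math
from collections import Counter


def class_shares(labels, classes):
    """Share of each class in a list of labels (hypothetical helper)."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, bounded by [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


THRESHOLD = 0.1  # default threshold quoted in the text

# Made-up labels: class "c" appears only in the current data
reference = ["a"] * 600 + ["b"] * 400
current = ["a"] * 300 + ["b"] * 500 + ["c"] * 200

classes = sorted(set(reference) | set(current))
p = class_shares(reference, classes)
q = class_shares(current, classes)

divergence = js_divergence(p, q)
distance = math.sqrt(divergence)  # Jensen-Shannon distance
print(f"divergence = {divergence:.4f}, distance = {distance:.4f}")
print("above threshold:", divergence > THRESHOLD, distance > THRESHOLD)
```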

{% hint style="info" %}
You can modify the drift detection logic by selecting a statistical test already available in the library, including PSI, K–L divergence, Jensen-Shannon distance, Wasserstein distance. See more details about [available tests](/v0.1.57/user-guide/customization/options-for-statistical-tests.md). You can also set a different confidence level or implement a custom test, by defining [custom options](/v0.1.57/user-guide/customization/options-for-data-target-drift.md).
{% endhint %}
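Taken together, the default rules described in this section can be summarized as a small decision function. This is only a sketch of the documented logic: the function name `default_target_drift_test` and the returned labels are hypothetical and are not identifiers from the library.

```python
def default_target_drift_test(n_reference, n_unique):
    """Return the name of the default test described in this section.

    n_reference: number of observations in the reference dataset
    n_unique: number of unique values in the target (prediction)
    """
    if n_reference <= 1000:
        if n_unique > 2:
            return "chi-squared test"
        # Binary target (n_unique <= 2)
        return "proportion difference z-test"
    # Larger data: distance-based check with a 0.1 threshold
    return "Jensen-Shannon divergence"


for n_ref, n_uni in [(500, 3), (500, 2), (5000, 3)]:
    print(n_ref, n_uni, "->", default_target_drift_test(n_ref, n_uni))
```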

## How it looks

The report includes 2 components. All plots are interactive.

### 1. Target (Prediction) Drift

The report first shows the **comparison of target (prediction) distributions** in the current and reference datasets. The result of the statistical test and the p-value are displayed in the title.

For a classification problem with three classes, it can look like this (an example of the extreme target drift with the appearance of a new class):

![](/files/ngpcBReCShx1kdIymO8x)

### 2. Target (Prediction) Behavior By Feature

The report generates an interactive table with the **visualizations of dependencies between the target and each feature**.

![](/files/QShh1I6xb7Rkl21l6vbu)

If you click on any feature, you get a plot that shows the feature distribution for the different target labels.

![](/files/GtoUPjWFTCpPoVSv9wdw)

These plots help analyze how feature values relate to the target labels and identify the differences between the datasets.
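As a rough numeric counterpart to these plots, you could summarize a feature per target label in each dataset and compare the results. The sketch below is an illustration only: the helper `feature_mean_by_label`, the `age` feature, and the rows are made up, and the mean is just one possible summary of a distribution.

```python
from collections import defaultdict
from statistics import mean


def feature_mean_by_label(rows, feature, target):
    """Mean of a numerical feature for each target label (hypothetical helper)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[target]].append(row[feature])
    return {label: mean(values) for label, values in grouped.items()}


# Made-up rows: the "no" class shifts towards higher ages in the current data
reference = [
    {"age": 25, "target": "no"}, {"age": 30, "target": "no"},
    {"age": 50, "target": "yes"}, {"age": 60, "target": "yes"},
]
current = [
    {"age": 45, "target": "no"}, {"age": 55, "target": "no"},
    {"age": 50, "target": "yes"}, {"age": 60, "target": "yes"},
]

ref_summary = feature_mean_by_label(reference, "age", "target")
cur_summary = feature_mean_by_label(current, "age", "target")
for label in sorted(ref_summary):
    print(label, ref_summary[label], "->", cur_summary.get(label))
```

Here the relationship between `age` and the `no` label changes between the datasets while the `yes` label stays the same, which is the kind of difference the per-feature plots are meant to surface.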

We recommend paying attention to the behavior of the most important features since significant changes might confuse the model and cause higher errors.

## Report customization

You can set different [Options for Data / Target drift](/v0.1.57/user-guide/customization/options-for-data-target-drift.md) and [Options for Quality Metrics](/v0.1.57/user-guide/customization/options-for-quality-metrics.md) to modify the report components.

You can also select which components of the reports to display or choose to show the short version of the report: [Select Widgets](/v0.1.57/user-guide/customization/select-widgets-to-display.md).

If you want to create a new plot or metric, you can add [Custom Widgets and Tabs](/v0.1.57/user-guide/customization/add-a-custom-widget-or-tab.md).

## When to use the report

Here are our suggestions on when to use it. It works best when combined with the [Data Drift report](/v0.1.57/reports/data-drift.md).

**1. Before model retraining.** Before feeding fresh data into the model, you might want to verify whether it even makes sense.

**2. When you are debugging the model decay.** If you observe a drop in performance, this report can help see what has changed.

**3. When you are flying blind, and no ground truth is available.** If you do not have immediate feedback, you can use this report to explore the changes in the model output and the relationship between the features and prediction. This can help anticipate [data and concept drift](https://evidentlyai.com/blog/machine-learning-monitoring-data-and-concept-drift).

## JSON Profile

If you choose to generate a JSON profile, it will contain the following information:

```yaml
{
  "cat_target_drift": {
    "name": "cat_target_drift",
    "datetime": "datetime",
    "data": {
      "utility_columns": {
        "date": null,
        "id": null,
        "target": "target",
        "prediction": null
      },
      "cat_feature_names": [],
      "num_feature_names": [],
      "metrics": {
        "target_name": "target",
        "target_type": "cat",
        "target_drift": p_value
      }
    }
  },
  "timestamp": "timestamp"
}
```
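A profile with this structure can be read with the standard `json` module. The snippet below is a sketch under assumptions: the profile is embedded as a string rather than loaded from a file, the `target_drift` value of `0.003` is a made-up number standing in for the `p_value` placeholder, and the 0.05 cut-off corresponds to the default 0.95 confidence level. If a different test or threshold was configured, the value should be interpreted accordingly.

```python
import json

# Illustrative profile; 0.003 is a made-up value replacing the p_value placeholder
profile_json = """
{
  "cat_target_drift": {
    "name": "cat_target_drift",
    "datetime": "datetime",
    "data": {
      "utility_columns": {
        "date": null,
        "id": null,
        "target": "target",
        "prediction": null
      },
      "cat_feature_names": [],
      "num_feature_names": [],
      "metrics": {
        "target_name": "target",
        "target_type": "cat",
        "target_drift": 0.003
      }
    }
  },
  "timestamp": "timestamp"
}
"""

profile = json.loads(profile_json)
metrics = profile["cat_target_drift"]["data"]["metrics"]

p_value = metrics["target_drift"]
# 0.95 confidence level -> flag drift when the p-value is below 0.05
target_drifted = p_value < 0.05
print(f"{metrics['target_name']}: p-value = {p_value}, drift = {target_drifted}")
```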

## Examples

* Browse our [examples](/v0.1.57/examples.md) for sample Jupyter notebooks.

You can also read the initial [release blog](https://evidentlyai.com/blog/evidently-014-target-and-prediction-drift).

