Get a Report
How to generate Reports using the Evidently Python library.
Check the sample notebooks for end-to-end code examples.
After installation, import the `Report` component and the necessary `metric_presets` or `metrics` you plan to use:
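For example, a minimal sketch of typical imports (the specific Presets and Metrics named here match the examples further down this page):

```python
from evidently import ColumnMapping
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset, DataQualityPreset
from evidently.metrics import ColumnDriftMetric, ColumnSummaryMetric
```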
Here is the general flow:
1. Input data. Prepare the data as a Pandas DataFrame. This will be your current data to run evaluations on. For some checks, you may need a second reference dataset. Check the input data requirements.
2. Schema mapping. Define your data schema using Column Mapping. Optional, but highly recommended.
3. Define the Report. Create a `Report` object and list the selected `metrics`.
4. Run the Report. Run the Report on your `current_data`. If applicable, pass the `reference_data` and `column_mapping`.
5. Get the results. View the Report in Jupyter notebook, export the metrics, or upload the Report to Evidently Platform.
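Here is a minimal end-to-end sketch of this flow, using toy data and hypothetical column names:

```python
import pandas as pd

from evidently import ColumnMapping
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Toy data for illustration only: replace with your own DataFrames
reference_data = pd.DataFrame({"age": range(20, 60), "target": [0, 1] * 20})
current_data = pd.DataFrame({"age": range(25, 65), "target": [0, 1] * 20})

# Optional, but recommended: define the data schema
column_mapping = ColumnMapping(target="target", numerical_features=["age"])

# Define the Report and run it on the two datasets
report = Report(metrics=[DataDriftPreset()])
report.run(
    reference_data=reference_data,
    current_data=current_data,
    column_mapping=column_mapping,
)
report  # displays the Report when run in Jupyter notebook or Colab
```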
You can use Metric Presets, which are pre-built Reports that work out of the box, or create a custom Report by selecting Metrics one by one.
To generate a Report using a Metric Preset, simply include the selected Preset in the `metrics` list.
Example 1. To generate the Data Quality Report for a single dataset and get the visual output in Jupyter notebook or Colab:
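A sketch, assuming `current_data` is a Pandas DataFrame you have already prepared:

```python
from evidently.report import Report
from evidently.metric_preset import DataQualityPreset

data_quality_report = Report(metrics=[DataQualityPreset()])
data_quality_report.run(reference_data=None, current_data=current_data)
data_quality_report  # renders the visual Report in the notebook cell
```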
If nothing else is specified, the Report will run with the default parameters for all columns in the dataset.
Example 2. You can include multiple Presets in a Report. To combine Data Drift and Data Quality and run them over two datasets, including a reference dataset necessary for data drift evaluation:
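A sketch, assuming `reference_data` and `current_data` are prepared DataFrames with the same schema:

```python
from evidently.metric_preset import DataDriftPreset, DataQualityPreset

drift_and_quality_report = Report(metrics=[DataDriftPreset(), DataQualityPreset()])
drift_and_quality_report.run(reference_data=reference_data, current_data=current_data)
drift_and_quality_report
```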
It will display the combined Report in Jupyter notebook or Colab.
Example 3. To get the values computed inside the Report, export it as a Python dictionary.
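For instance, reusing the Report object from Example 2:

```python
# Returns a nested dictionary with all computed metric values
report_dict = drift_and_quality_report.as_dict()
```

You can then inspect or log individual values from the resulting dictionary.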
Example 4. You can customize some of the Metrics inside the Preset. For example, set a custom decision threshold (instead of default 0.5) when computing classification quality metrics:
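A sketch of overriding the decision threshold via a Preset parameter. The `probas_threshold` argument is assumed here, together with a classification dataset that contains probabilistic predictions:

```python
from evidently.metric_preset import ClassificationPreset

# probas_threshold overrides the default 0.5 decision threshold
classification_report = Report(metrics=[ClassificationPreset(probas_threshold=0.7)])
classification_report.run(reference_data=reference_data, current_data=current_data)
```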
Example 5. You can pass a list of columns to the Preset, so column-specific Metrics are generated only for those columns, not the entire dataset.
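A sketch, assuming a `columns` argument on the Preset and hypothetical column names:

```python
from evidently.metric_preset import DataQualityPreset

# Column-specific Metrics are generated only for "age" and "education"
report = Report(metrics=[DataQualityPreset(columns=["age", "education"])])
report.run(reference_data=None, current_data=current_data)
```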
While Presets are a great starting point, you may want to customize the Report by choosing Metrics or adjusting their parameters even more. To do this, create a custom Report.
First, define which Metrics you want to include in your custom Report. Metrics can be either dataset-level or column-level.
Dataset-level metrics. Some Metrics evaluate the entire dataset. For example, a Metric that checks for data drift across the whole dataset or calculates accuracy.
To create a custom Report with dataset-level Metrics, create a `Report` object and list the `metrics`:
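A sketch with a few dataset-level Metrics; the Metric names used here are illustrative picks from the Evidently metrics module:

```python
from evidently.report import Report
from evidently.metrics import (
    DatasetDriftMetric,
    DatasetMissingValuesMetric,
    DatasetSummaryMetric,
)

report = Report(metrics=[
    DatasetSummaryMetric(),
    DatasetMissingValuesMetric(),
    DatasetDriftMetric(),
])
report.run(reference_data=reference_data, current_data=current_data)
```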
Column-level Metrics. Some Metrics focus on individual columns, like evaluating distribution drift or summarizing specific columns. To include column-level Metrics, pass the name of the column to each such Metric:
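A sketch, assuming a numerical column named "age":

```python
from evidently.metrics import ColumnDriftMetric, ColumnQuantileMetric, ColumnSummaryMetric

report = Report(metrics=[
    ColumnSummaryMetric(column_name="age"),
    ColumnDriftMetric(column_name="age"),
    ColumnQuantileMetric(column_name="age", quantile=0.25),
])
report.run(reference_data=reference_data, current_data=current_data)
```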
Combining Metrics and Presets. You can mix Metrics Presets and individual Metrics in the same Report, and also combine column-level and dataset-level Metrics.
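For instance, a sketch mixing all three in one Report:

```python
from evidently.metric_preset import DataDriftPreset
from evidently.metrics import ColumnSummaryMetric, DatasetMissingValuesMetric

report = Report(metrics=[
    DataDriftPreset(),                       # a Preset
    DatasetMissingValuesMetric(),            # a dataset-level Metric
    ColumnSummaryMetric(column_name="age"),  # a column-level Metric
])
report.run(reference_data=reference_data, current_data=current_data)
```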
Metrics can have optional or required parameters. For example, the data drift detection algorithm selects a statistical test automatically, but you can override this by specifying your preferred method (optional parameter). To calculate the number of values matching a regular expression, you must always define the expression (required parameter).
Example 1. How to specify a regular expression (required parameter):
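A sketch, assuming a column named "relationship" and a hypothetical pattern:

```python
from evidently.metrics import ColumnRegExpMetric

report = Report(metrics=[
    # reg_exp is required: counts values matching the pattern
    ColumnRegExpMetric(column_name="relationship", reg_exp=r".*child.*"),
])
report.run(reference_data=None, current_data=current_data)
```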
Example 2. How to specify a custom Data Drift test (optional parameter):
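A sketch, assuming a `stattest` parameter and the Wasserstein distance as the chosen method:

```python
from evidently.metrics import ColumnDriftMetric

report = Report(metrics=[
    # stattest is optional: if omitted, the drift method is selected automatically
    ColumnDriftMetric(column_name="age", stattest="wasserstein"),
])
report.run(reference_data=reference_data, current_data=current_data)
```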
Available Presets. There are other Presets: for example, `DataDriftPreset`, `RegressionPreset`, and `ClassificationPreset`. Check the reference to see which Metrics each Preset includes and to browse all available Presets.
Raw data in visuals. Visuals in the Reports are aggregated by default. This reduces load time and Report size for larger datasets, even with millions of rows. If you work with small datasets or samples, you can change this behavior and plot raw data points instead.
There are more output formats! You can also export Report results in formats like HTML, JSON, dataframe, and more. Refer to the docs on output formats for details.
Refer to the reference table to see the defaults and available parameters that you can pass for each Preset.
Available Metrics: See the reference table. For a preview, check the sample notebooks.
Row-level evals: To generate row-level scores for text data, check the section on text Descriptors.
Generating multiple column-level Metrics: You can use a helper function to generate multiple column-level Metrics for a list of columns at once, as shown in the sketch below. See the docs page on metric generators.
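A minimal sketch, assuming the `generate_column_metrics` helper and hypothetical column names:

```python
from evidently.metrics import ColumnSummaryMetric
from evidently.metrics.base_metric import generate_column_metrics

report = Report(metrics=[
    # Expands into one ColumnSummaryMetric per listed column
    generate_column_metrics(ColumnSummaryMetric, columns=["age", "education"]),
])
report.run(reference_data=None, current_data=current_data)
```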
Reference: The available parameters for each Metric are listed in the reference table.