Evidently and DVCLive

Log Evidently metrics with DVC via DVCLive.



You are looking at the old Evidently documentation: this API is available with versions 0.6.7 or lower. Check the newer docs version here.

This is a community-contributed integration. Author: Francesco Calcavecchia.

TL;DR: You can use Evidently to calculate metrics, and DVCLive to log and view the results.

Jupyter notebook with an example: evidently/examples/integrations/dvclive_logging/dvclive_integration.ipynb in the evidentlyai/evidently repository.

Overview

DVCLive can be used to track the results of Evidently. In the following, we demonstrate it through an example.

How it works

Evidently calculates a rich set of metrics and statistical tests. You can choose any of the pre-built reports or combine individual metrics to define what you want to measure. For example, you can evaluate prediction drift and feature drift together.

You can then generate the calculation output in a Python dictionary format. You should explicitly define which parts of the output to send to DVCLive Tracking.
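Here is a minimal sketch of this pattern, assuming the dictionary layout of the DataDriftPreset output in these Evidently versions; reference_df and current_df are placeholder DataFrames, and the full, runnable version is built step by step below:

from dvclive import Live
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# reference_df and current_df are placeholder pandas DataFrames
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference_df, current_data=current_df)

# Pick the values you care about out of the dictionary output...
share_drifted = report.as_dict()["metrics"][0]["result"]["share_of_drifted_columns"]

# ...and send only those to DVCLive
with Live() as live:
    live.log_metric("share_of_drifted_columns", share_drifted)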

Step 1. Install DVCLive and Evidently

Install DVC, DVCLive, Evidently, and pandas (to handle the data) in your Python environment:

$ pip install dvc dvclive evidently pandas

Step 2. Load the data

Load the data from UCI repository and save it locally. For demonstration purposes, we treat this data as the input data for a live model. To use with production models, you should make your prediction logs available.

$ wget https://archive.ics.uci.edu/static/public/275/bike+sharing+dataset.zip
$ unzip bike+sharing+dataset.zip -d raw_data

import pandas as pd

df = pd.read_csv("raw_data/day.csv", header=0, sep=',', parse_dates=['dteday'])
df.head()

Step 3. Define column mapping

You should specify the categorical and numerical features so that Evidently performs the correct statistical test for each of them. While Evidently can parse the data structure automatically, manually specifying the column type can minimize errors.

from evidently.pipeline.column_mapping import ColumnMapping

data_columns = ColumnMapping()
data_columns.numerical_features = ['weathersit', 'temp', 'atemp', 'hum', 'windspeed']
data_columns.categorical_features = ['holiday', 'workingday']
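If your data has a timestamp column, you can optionally record it in the mapping too; this is not required for the rest of the tutorial, and datetime is ColumnMapping's standard attribute for it:

# Optional: point Evidently at the timestamp column
data_columns.datetime = 'dteday'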

Step 4. Define what to log

Specify which metrics you want to calculate. In this case, you can generate the Data Drift report and log the drift score for each feature.

from evidently.report import Report
from evidently.metric_preset import DataDriftPreset


def eval_drift(reference, production, column_mapping):
    data_drift_report = Report(metrics=[DataDriftPreset()])
    data_drift_report.run(
        reference_data=reference, current_data=production, column_mapping=column_mapping
    )
    report = data_drift_report.as_dict()
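    # In the DataDriftPreset output, metrics[0] is the dataset-level summary
    # and metrics[1] is the per-column drift table used below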

    drifts = []

    for feature in (
        column_mapping.numerical_features + column_mapping.categorical_features
    ):
        drifts.append(
            (
                feature,
                report["metrics"][1]["result"]["drift_by_columns"][feature][
                    "drift_score"
                ],
            )
        )

    return drifts

You can adapt what you want to calculate by selecting a different Preset or Metric from those available in Evidently.
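For example, here is a sketch of the same lookup with a single Metric instead of the Preset; ColumnDriftMetric is one of the individual metrics available in these Evidently versions, and the result keys follow its dictionary output:

from evidently.report import Report
from evidently.metrics import ColumnDriftMetric

# reference and production are pandas DataFrames, as in eval_drift above
report = Report(metrics=[ColumnDriftMetric(column_name="temp")])
report.run(reference_data=reference, current_data=production, column_mapping=data_columns)

# A single metric sits at index 0 of the "metrics" list
temp_drift_score = report.as_dict()["metrics"][0]["result"]["drift_score"]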

Step 5. Define the comparison windows

Specify the period that is considered reference: Evidently will use it as the base for the comparison. Then, you should choose the periods to treat as experiments. This emulates the production model runs.

#set reference dates
reference_dates = ('2011-01-01 00:00:00','2011-01-28 23:00:00')

#set experiment batches dates
experiment_batches = [
    ('2011-01-01 00:00:00','2011-01-29 23:00:00'),
    ('2011-01-29 00:00:00','2011-02-07 23:00:00'),
    ('2011-02-07 00:00:00','2011-02-14 23:00:00'),
    ('2011-02-15 00:00:00','2011-02-21 23:00:00'),
]
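Each tuple defines a date window over the dteday column. As a quick sanity check, you can slice out one batch the same way the logging loop below does:

# Rows that fall into the first experiment batch
start, end = experiment_batches[0]
batch_df = df.loc[df.dteday.between(start, end)]
print(batch_df.shape)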

Step 6. Run and log experiments in DVC

There are two ways to track the results of Evidently with DVCLive:

  1. you can save the results of each item in the batch in one single experiment (each experiment corresponds to a git commit), in separate steps

  2. or you can save the result of each item in the batch as a separate experiment

We will demonstrate both and show how to inspect the results regardless of your IDE. However, if you are using VS Code, we recommend using the DVC extension for VS Code to inspect the results.

Option 1. One single experiment

from dvclive import Live

with Live() as live:
    for date in experiment_batches:
        live.log_param("begin", date[0])
        live.log_param("end", date[1])

        metrics = eval_drift(
            df.loc[df.dteday.between(reference_dates[0], reference_dates[1])],
            df.loc[df.dteday.between(date[0], date[1])],
            column_mapping=data_columns,
        )

        for feature in metrics:
            live.log_metric(feature[0], round(feature[1], 3))

        live.next_step()

You can then inspect the results by running

$ dvc plots show

and inspecting the resulting dvc_plots/index.html.

In a Jupyter notebook environment, you can show the plots as a cell output simply by using Live(report="notebook").
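A minimal sketch (the metric name is just a placeholder):

from dvclive import Live

# Render the DVCLive report inline as notebook cell output
with Live(report="notebook") as live:
    live.log_metric("demo_metric", 0.5)
    live.next_step()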

Option 2. Multiple experiments

from dvclive import Live

for step, date in enumerate(experiment_batches):
    with Live() as live:
        live.log_param("begin", date[0])
        live.log_param("end", date[1])
        live.log_param("step", step)

        metrics = eval_drift(
            df.loc[df.dteday.between(reference_dates[0], reference_dates[1])],
            df.loc[df.dteday.between(date[0], date[1])],
            column_mapping=data_columns,
        )

        for feature in metrics:
            live.log_metric(feature[0], round(feature[1], 3))

You can then inspect the results using

$ dvc exp show
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  Experiment                 Created    weathersit    temp   atemp     hum   windspeed   holiday   workingday   step   begin                 end
 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  workspace                  -               0.231       0       0   0.062       0.012     0.275        0.593   3      2011-02-15 00:00:00   2011-02-21 23:00:00
  master                     10:02 AM            -       -       -       -           -         -            -   -      -                     -
  ├── a96b45c [muggy-rand]   10:02 AM        0.231       0       0   0.062       0.012     0.275        0.593   3      2011-02-15 00:00:00   2011-02-21 23:00:00
  ├── 78c6668 [pawky-arcs]   10:02 AM        0.155   0.399   0.537   0.684       0.611     0.588        0.699   2      2011-02-07 00:00:00   2011-02-14 23:00:00
  ├── c1dd720 [joint-wont]   10:02 AM        0.779   0.098   0.107    0.03       0.171     0.545        0.653   1      2011-01-29 00:00:00   2011-02-07 23:00:00
  └── d0ddb8d [osmic-impi]   10:02 AM        0.985       1       1       1           1      0.98        0.851   0      2011-01-01 00:00:00   2011-01-29 23:00:00
 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
In a Jupyter notebook environment, you can access the experiment results using the Python DVC api:

import dvc.api

dvc.api.exp_show()
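If you prefer a tabular view, you can wrap the result in a DataFrame; this is a sketch assuming exp_show() returns one record per experiment:

import dvc.api
import pandas as pd

# One row per experiment, one column per logged param/metric
experiments = pd.DataFrame(dvc.api.exp_show())
print(experiments.head())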
