OSS Quickstart - Data and ML monitoring
Run your first evaluation using Evidently open-source, for tabular data.
It's best to run this example in Jupyter Notebook or Google Colab so that you can render HTML Reports directly in a notebook cell.
Installation
Install Evidently using the pip package manager:
!pip install evidently
Imports
Import the Evidently components and a toy “Iris” dataset:
import pandas as pd
from sklearn import datasets
from evidently.test_suite import TestSuite
from evidently.test_preset import DataStabilityTestPreset
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset
iris_data = datasets.load_iris(as_frame=True)
iris_frame = iris_data.frame
Run a Test Suite
Split the data into two batches: a current batch and a reference batch. Run a set of pre-built data quality Tests to evaluate the quality of the current_data:
data_stability = TestSuite(tests=[
    DataStabilityTestPreset(),
])
data_stability.run(current_data=iris_frame.iloc[:60], reference_data=iris_frame.iloc[60:], column_mapping=None)
data_stability
This automatically generates tests on the share of nulls, out-of-range values, and so on, with test conditions derived from the "reference" dataset.
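To make the idea of reference-derived test conditions concrete, here is a minimal pure-Python sketch. It is a simplified illustration and not Evidently's implementation: the helper names (null_share, derive_conditions, check_current) and the exact pass/fail rules are assumptions made up for this example, and the preset may apply different checks or tolerances.

```python
# Simplified illustration only: NOT Evidently's implementation.
# All function names and pass/fail rules here are hypothetical.

def null_share(values):
    """Return the fraction of entries that are None."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def derive_conditions(reference):
    """Learn simple expectations for one column from the reference batch."""
    present = [v for v in reference if v is not None]
    return {
        "max_null_share": null_share(reference),
        "min": min(present),
        "max": max(present),
    }


def check_current(current, conditions):
    """Check one column of the current batch against the learned conditions."""
    present = [v for v in current if v is not None]
    out_of_range = [
        v for v in present
        if not (conditions["min"] <= v <= conditions["max"])
    ]
    return {
        "null_share_ok": null_share(current) <= conditions["max_null_share"],
        "in_range_ok": len(out_of_range) == 0,
    }


# Toy column values, invented for the example.
reference = [5.1, 4.9, 6.3, 5.8, 7.1]
current = [5.0, None, 7.9, 6.1]

conditions = derive_conditions(reference)
results = check_current(current, conditions)
print(conditions)
print(results)
```

In this toy case both checks fail: the current batch contains a missing value while the reference has none, and 7.9 lies outside the range observed in the reference.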
Get a Report
Get a Data Drift Report to see whether the data distributions shifted between the two batches:
data_drift_report = Report(metrics=[
    DataDriftPreset(),
])
data_drift_report.run(current_data=iris_frame.iloc[:60], reference_data=iris_frame.iloc[60:], column_mapping=None)
data_drift_report
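For intuition about what "distribution shift" means for a single numerical column, the sketch below computes a two-sample Kolmogorov-Smirnov statistic in pure Python. This is one common way to compare two samples, shown here as an assumption for illustration. Which statistical tests and thresholds the Data Drift preset actually applies is not covered by this page, and the function name ks_statistic is hypothetical.

```python
# Illustration of one possible drift measure; not taken from Evidently.
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the empirical CDFs of two numeric samples.

    0.0 means the empirical distributions coincide;
    1.0 means the two samples do not overlap at all.
    """
    ref = sorted(reference)
    cur = sorted(current)
    largest_gap = 0.0
    # The maximum gap between two step functions occurs at a data point.
    for x in ref + cur:
        ecdf_ref = bisect_right(ref, x) / len(ref)
        ecdf_cur = bisect_right(cur, x) / len(cur)
        largest_gap = max(largest_gap, abs(ecdf_ref - ecdf_cur))
    return largest_gap


# Toy samples, invented for the example.
same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
partial = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
disjoint = ks_statistic([1, 2, 3, 4], [5, 6, 7, 8])
print(same, partial, disjoint)
```

A larger statistic indicates a larger difference between the two samples. Turning it into a drift decision requires a significance test or a threshold, which the sketch deliberately leaves out.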
What's next?
Want more details on Reports and Test Suites? See an in-depth tutorial.
Tutorial - Reports and Tests
Want to set up monitoring? Send the evaluation results to Evidently Cloud for analysis and tracking. See the Quickstart:
Quickstart - LLM evaluations
Working with LLMs? Check the Quickstart:
Quickstart - LLM evaluations
Need help? Ask in our Discord community.