Collector service

How to send data in near real-time using the Collector service.



Overview

In this scenario, you deploy an Evidently Collector service for near real-time monitoring.

Evidently Collector is a service that allows you to collect online events into batches, create Reports or TestSuites over batches of data, and save them as snapshots to your Workspace.

You need to POST the predictions from your ML service to the Evidently Collector service. You can POST data for every single prediction or in mini-batches. The Collector service then computes monitoring snapshots asynchronously, based on the provided configuration.

You can also pass the path to the optional reference dataset.

If you receive delayed ground truth, you can later compute and log the model quality to the same Project. You can run it as a separate process or a batch job.
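
For example, once the true labels arrive, you can compute model quality in a separate batch job and log the snapshot to the same Project. A minimal sketch, assuming an Evidently Cloud Workspace, a classification model, and hypothetical ref_data / labeled_data DataFrames:

from evidently.metric_preset import ClassificationPreset
from evidently.report import Report
from evidently.ui.workspace.cloud import CloudWorkspace

# Connect to the same Workspace and Project that the collector writes to
ws = CloudWorkspace(token="YOUR_API_TOKEN", url="https://app.evidently.cloud")

# Compute model quality once the delayed ground truth is available
report = Report(metrics=[ClassificationPreset()])
report.run(reference_data=ref_data, current_data=labeled_data)

# Log the snapshot to the same Project
ws.add_report("YOUR_PROJECT_ID", report)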

Code example

Refer to the example in the Evidently repository: https://github.com/evidentlyai/evidently/tree/main/examples/integrations/collector_service

Collector configuration

Before sending events, you must configure the collector and start the service.

You can choose either of the two options:

  • Create configuration via code, save it to a JSON file, and run the service using it.

  • Run the service first and create configuration via API.

The collector service can simultaneously run multiple “collectors” that compute and save snapshots to different Workspaces or Projects. Each one is represented by a CollectorConfig object.
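
For instance, a single service can run two collectors that save snapshots to different Projects. A sketch, assuming drift_report_config and quality_test_config are ReportConfig objects created as described below:

from evidently.collector.config import CollectorConfig, CollectorServiceConfig, IntervalTrigger

config = CollectorServiceConfig(collectors={
    # Hourly data drift snapshots for one Project
    "drift": CollectorConfig(
        trigger=IntervalTrigger(interval=60 * 60),
        report_config=drift_report_config,
        project_id="PROJECT_ID_1",
        api_url="http://localhost:8000",
    ),
    # Daily data quality checks for another Project
    "quality": CollectorConfig(
        trigger=IntervalTrigger(interval=24 * 60 * 60),
        report_config=quality_test_config,
        project_id="PROJECT_ID_2",
        api_url="http://localhost:8000",
    ),
})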

CollectorConfig Object

You can configure the following parameters:

| Parameter | Type | Description |
|---|---|---|
| trigger | CollectorTrigger | Defines when to create a new snapshot from the current batch. |
| report_config | ReportConfig | Configures the contents of the snapshot: the Report or Test Suite computed for each batch of data. |
| reference_path | Optional[str] | Local path to a .parquet file with the reference dataset. |
| cache_reference | bool | Defines whether to cache the reference data or re-read it each time. |
| api_url | str | URL where the Evidently UI Service runs and snapshots will be saved to. For Evidently Cloud, use api_url="https://app.evidently.cloud". |
| api_secret | Optional[str] | Evidently UI Service secret. |
| project_id | str | ID of the Project to save snapshots to. |

You can create a ReportConfig object from Report or TestSuite objects. You must run them first so that all Metrics and Tests are collected (including when you use Presets or Test/Metric generators).

from evidently.collector.config import ReportConfig
from evidently.report import Report
from evidently.test_suite import TestSuite

# Build and run a Report, then turn it into a ReportConfig
report = Report(...)
report.run(...)
report_config = ReportConfig.from_report(report)

# Or do the same with a Test Suite
test_suite = TestSuite(...)
test_suite.run(...)
report_config = ReportConfig.from_test_suite(test_suite)
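
To save snapshots to Evidently Cloud instead of a self-hosted UI Service, point the config at the Cloud API. A sketch based on the parameters above; passing your Evidently Cloud API token as api_secret and the placeholder values are assumptions:

from evidently.collector.config import CollectorConfig, IntervalTrigger

cloud_collector = CollectorConfig(
    trigger=IntervalTrigger(interval=60 * 60),
    report_config=report_config,
    api_url="https://app.evidently.cloud",
    api_secret="YOUR_EVIDENTLY_CLOUD_TOKEN",  # assumption: Cloud API token as the secret
    project_id="YOUR_PROJECT_ID",
)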

CollectorTrigger

Currently, there are two options available:

  • IntervalTrigger: triggers the snapshot calculation at set intervals (in seconds).

  • RowsCountTrigger: triggers the snapshot calculation when a specific row count is reached.

Note: we are also working on CronTrigger and other triggers. Would you like to see additional scenarios? Please open a GitHub issue with your suggestions.
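
For example, you can configure either trigger as follows (a brief sketch; the rows_count parameter name for RowsCountTrigger is an assumption):

from evidently.collector.config import IntervalTrigger, RowsCountTrigger

# Compute a snapshot every 5 minutes
interval_trigger = IntervalTrigger(interval=300)

# Compute a snapshot after every 500 received rows
rows_trigger = RowsCountTrigger(rows_count=500)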

Setup via file

You can define the configuration and save it as a JSON file. Example:

from evidently.collector.config import (
    CollectorConfig,
    CollectorServiceConfig,
    IntervalTrigger,
    ReportConfig,
)

config = CollectorServiceConfig(collectors={
    "main": CollectorConfig(
        trigger=IntervalTrigger(interval=60 * 60),  # compute a snapshot every hour
        report_config=ReportConfig.from_report(report),
        reference_path="reference_data.parquet",
        project_id="834ec9a0-ee58-4e64-816b-c593b0b6c45c",
        api_url="http://localhost:8000",
    )
})

config.save("collector_config.json")

Then, run the following command:

evidently collector --config-path collector_config.json

Setup via API

First, run the collector service:

evidently collector

Then, use the CollectorClient to add a new collector config:

from evidently.collector.config import CollectorConfig, IntervalTrigger, ReportConfig

config = CollectorConfig(
    trigger=IntervalTrigger(interval=60 * 60),
    report_config=ReportConfig.from_report(report),
    reference_path="reference_data.parquet",
    project_id="834ec9a0-ee58-4e64-816b-c593b0b6c45c",
    api_url="http://localhost:8000",
)
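
Then register the config with the running service. A sketch, assuming the collector service listens on port 8001 and that CollectorClient exposes a create_collector method; the collector id ("main") is the name you will use later when sending data and setting the reference:

from evidently.collector.client import CollectorClient

client = CollectorClient("http://localhost:8001")
client.create_collector("main", config)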

Update reference via API

To set or update the reference dataset for a running collector:

import pandas as pd
from evidently.collector.client import CollectorClient

reference: pd.DataFrame = ...  # reference dataset as a pandas DataFrame
client = CollectorClient("http://localhost:8001")
client.set_reference("main", reference)

Send events via API

To send events from your ML service:

import pandas as pd
from evidently.collector.client import CollectorClient

client = CollectorClient("http://localhost:8001")

events: pd.DataFrame = ...  # batch of new prediction events
client.send_data("main", events)

Send events via curl

To send data with curl:

curl -X POST '.../<collector config id>/data' \
  -H 'evidently-secret: ...' \
  -H 'Content-Type: application/json' \
  -d '{"column1": {"0": 7.0, "1": 5.0}, "column2": {"0": "a", "1": "b"}}'

Example:

curl -d '{"column1": {"0": 7.0, "1": 5.0}, "column2": {"0": "a", "1": "b"}}' -H 'Content-Type: application/json' http://0.0.0.0:8001/default/data

In the Terminal, you can see the data being sent and then received by the collector service.
