Feature importance in data drift

How to show feature importance in Data Drift evaluations.



You are looking at the old Evidently documentation: this API is available in versions 0.6.7 and lower. Check the newer docs version.

You can add feature importances to the dataset-level data drift Tests and Metrics:

  • DataDriftTable

  • TestShareOfDriftedColumns

Code example

Notebook example on showing feature importance: how_to_add_feature_importances_to_drift.ipynb in the examples/how_to_questions folder of the evidentlyai/evidently GitHub repository.

Compute feature importances

By default, the feature importance column is not shown. To display it, set the feature_importance parameter to True.

from evidently.report import Report
from evidently.metrics import DataDriftTable

report = Report(metrics=[
    DataDriftTable(feature_importance=True)
])
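
The same parameter works for the dataset-level drift Test. A minimal Test Suite sketch, assuming the same setup:

from evidently.test_suite import TestSuite
from evidently.tests import TestShareOfDriftedColumns

tests = TestSuite(tests=[
    TestShareOfDriftedColumns(feature_importance=True)
])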

If you do not specify anything else, Evidently will train a random forest model using the provided dataset and derive the feature importances.

Notes:

  • This is only possible if your dataset contains the target column.

  • If you have both current and reference datasets, two different models will be trained. You will have two columns with feature importance: one for reference and one for current data.

  • If your dataset also contains the prediction column, you should clearly label it using Column Mapping to avoid it being treated as a feature.
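
Putting these notes together: a minimal sketch of a run where Evidently computes the importances itself. The reference and current dataframes and the "target" and "prediction" column names are placeholders for your own data:

from evidently import ColumnMapping
from evidently.report import Report
from evidently.metrics import DataDriftTable

# Label the target and prediction explicitly so the prediction
# column is not treated as a regular feature.
column_mapping = ColumnMapping(target="target", prediction="prediction")

report = Report(metrics=[
    DataDriftTable(feature_importance=True)
])

# With no importances passed, Evidently trains a random forest
# on each dataset and derives the importances from those models.
report.run(reference_data=reference, current_data=current, column_mapping=column_mapping)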

Pass your own importances

You can also pass feature importances derived during model training. This is the recommended option.

In this case, pass them as a dictionary that maps column names to importance values, using the additional_data parameter when you run the Report.

report = Report(metrics=[
    DataDriftTable(feature_importance=True)
])
report.run(
    reference_data=reference,
    current_data=current.loc['2011-01-29 00:00:00':'2011-02-07 23:00:00'],
    column_mapping=column_mapping,
    # Map each feature name to its importance value
    additional_data={
        'current_feature_importance': dict(
            zip(numerical_features + categorical_features, regressor.feature_importances_)
        )
    },
)

If you pass only the current_feature_importance, a single column will appear. You can also optionally pass the reference_feature_importance.
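
For context, one way such an importance dictionary could be produced, assuming a scikit-learn model. The regressor, numerical_features, and categorical_features names mirror the placeholders in the example above:

from sklearn.ensemble import RandomForestRegressor

features = numerical_features + categorical_features

# Train (or reuse) the model whose importances you want to display.
regressor = RandomForestRegressor(random_state=0)
regressor.fit(reference[features], reference["target"])

# Map each feature name to its learned importance value.
current_importances = dict(zip(features, regressor.feature_importances_))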
