evidently.calculations.stattests

Available statistical tests. For detailed information about each statistical test, see the corresponding submodule documentation below.
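The test names listed below can be passed either as objects or as strings via DataDriftOptions, as shown in the per-module examples, or through the drift presets used in Reports. A minimal sketch of the latter, assuming the Report API with DataDriftPreset and its num_stattest / cat_stattest parameters (see the Data drift parameters guide); reference_df and current_df are hypothetical pandas DataFrames:

>>> from evidently.report import Report
>>> from evidently.metric_preset import DataDriftPreset
>>> report = Report(metrics=[DataDriftPreset(num_stattest="wasserstein", cat_stattest="psi")])
>>> report.run(reference_data=reference_df, current_data=current_df)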

Submodules

anderson_darling_stattest module

Anderson-Darling test of two samples.

Name: “anderson”

Import:

>>> from evidently.calculations.stattests import anderson_darling_test

Properties:

  • only for numerical features

  • returns p-value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import anderson_darling_test
>>> options = DataDriftOptions(all_features_stattest=anderson_darling_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="anderson")

chisquare_stattest module

Chi-square test of two samples.

Name: “chisquare”

Import:

>>> from evidently.calculations.stattests import chi_stat_test

Properties:

  • only for categorical features

  • returns p-value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import chi_stat_test
>>> options = DataDriftOptions(all_features_stattest=chi_stat_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="chisquare")

cramer_von_mises_stattest module

Cramér-von Mises test of two samples.

Name: “cramer_von_mises”

Import:

>>> from evidently.calculations.stattests import cramer_von_mises

Properties:

  • only for numerical features

  • returns p-value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import cramer_von_mises
>>> options = DataDriftOptions(all_features_stattest=cramer_von_mises)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="cramer_von_mises")

class CramerVonMisesResult(statistic, pvalue)

Bases: object

energy_distance module

Energy-distance test of two samples.

Name: “ed”

Import:

>>> from evidently.calculations.stattests import energy_dist_test

Properties:

  • only for numerical features

  • returns p-value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import energy_dist_test
>>> options = DataDriftOptions(all_features_stattest=energy_dist_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="ed")

epps_singleton_stattest module

Epps-Singleton test of two samples.

Name: “es”

Import:

>>> from evidently.calculations.stattests import epps_singleton_test

Properties:

  • only for numerical features

  • returns p-value

  • default threshold 0.05

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import epps_singleton_test
>>> options = DataDriftOptions(all_features_stattest=epps_singleton_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="es")

fisher_exact_stattest module

Fisher’s exact test of two samples.

Name: “fisher_exact”

Import:

>>> from evidently.calculations.stattests import fisher_exact_test

Properties:

  • only for categorical features

  • returns p-value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import fisher_exact_test
>>> options = DataDriftOptions(all_features_stattest=fisher_exact_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="fisher_exact")

g_stattest module

G-test of two samples.

Name: “g_test”

Import:

>>> from evidently.calculations.stattests import g_test

Properties:

  • only for categorical features

  • returns p-value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import g_test
>>> options = DataDriftOptions(all_features_stattest=g_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="g_test")

hellinger_distance module

Hellinger distance of two samples.

Name: “hellinger”

Import:

>>> from evidently.calculations.stattests import hellinger_stat_test

Properties:

  • for both categorical and numerical features

  • returns distance

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import hellinger_stat_test
>>> options = DataDriftOptions(all_features_stattest=hellinger_stat_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="hellinger")

jensenshannon module

Jensen-Shannon distance of two samples.

Name: “jensenshannon”

Import:

>>> from evidently.calculations.stattests import jensenshannon_stat_test

Properties:

  • for both categorical and numerical features

  • returns distance

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import jensenshannon_stat_test
>>> options = DataDriftOptions(all_features_stattest=jensenshannon_stat_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="jensenshannon")

kl_div module

Kullback-Leibler divergence of two samples.

Name: “kl_div”

Import:

>>> from evidently.calculations.stattests import kl_div_stat_test

Properties:

  • for both categorical and numerical features

  • returns divergence

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import kl_div_stat_test
>>> options = DataDriftOptions(all_features_stattest=kl_div_stat_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="kl_div")

ks_stattest module

Kolmogorov-Smirnov test of two samples.

Name: “ks”

Import:

>>> from evidently.calculations.stattests import ks_stat_test

Properties:

  • only for numerical features

  • returns p-value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import ks_stat_test
>>> options = DataDriftOptions(all_features_stattest=ks_stat_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="ks")

mann_whitney_urank_stattest module

Mann-Whitney U-rank test of two samples.

Name: “mannw”

Import:

>>> from evidently.calculations.stattests import mann_whitney_u_stat_test

Properties:

  • only for numerical features

  • returns p-value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import mann_whitney_u_stat_test
>>> options = DataDriftOptions(all_features_stattest=mann_whitney_u_stat_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="mannw")

psi module

Population Stability Index (PSI) of two samples.

Name: “psi”

Import:

>>> from evidently.calculations.stattests import psi_stat_test

Properties:

  • for both categorical and numerical features

  • returns PSI value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import psi_stat_test
>>> options = DataDriftOptions(all_features_stattest=psi_stat_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="psi")

registry module

class StatTest(name: str, display_name: str, func: Callable[[pandas.core.series.Series, pandas.core.series.Series, str, float], Tuple[float, bool]], allowed_feature_types: List[str], default_threshold: float = 0.05)

Bases: object

Attributes:

allowed_feature_types : List[str]

default_threshold : float = 0.05

display_name : str

func : Callable[[Series, Series, str, float], Tuple[float, bool]]

name : str

exception StatTestInvalidFeatureTypeError(stattest_name: str, feature_type: str)

Bases: ValueError

exception StatTestNotFoundError(stattest_name: str)

Bases: ValueError

class StatTestResult(drift_score: float, drifted: bool, actual_threshold: float)

Bases: object

Attributes:

actual_threshold : float

drift_score : float

drifted : bool

get_stattest(reference_data: Series, current_data: Series, feature_type: str, stattest_func: Optional[Union[str, Callable[[Series, Series, str, float], Tuple[float, bool]], StatTest]])

register_stattest(stat_test: StatTest)
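get_stattest resolves a test passed by name, by raw function, or as a StatTest object into a StatTest suitable for the given feature type, and register_stattest adds a new StatTest to the registry so it can later be referenced by name. A minimal sketch of defining and registering a custom test, assuming SciPy is available; the wrapper function, test name, and display name below are hypothetical:

>>> from typing import Tuple
>>> import pandas as pd
>>> from scipy.stats import mannwhitneyu
>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests.registry import StatTest, register_stattest
>>> def _my_stattest(reference_data: pd.Series, current_data: pd.Series, feature_type: str, threshold: float) -> Tuple[float, bool]:
...     # return (drift_score, drifted): here the Mann-Whitney p-value and whether it is below the threshold
...     p_value = mannwhitneyu(reference_data, current_data)[1]
...     return p_value, p_value < threshold
>>> my_stat_test = StatTest(
...     name="my_test",  # hypothetical name, used for lookup by string
...     display_name="my custom test (p-value)",
...     func=_my_stattest,
...     allowed_feature_types=["num"],  # assuming "num" marks numerical features, as in the built-in tests
...     default_threshold=0.05,
... )
>>> register_stattest(my_stat_test)
>>> options = DataDriftOptions(all_features_stattest="my_test")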

t_test module

T-test of two samples.

Name: “t_test”

Import:

>>> from evidently.calculations.stattests import t_test

Properties:

  • only for numerical features

  • returns p-value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import t_test
>>> options = DataDriftOptions(all_features_stattest=t_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="t_test")

tvd_stattest module

Total variation distance of two samples.

Name: “TVD”

Import:

>>> from evidently.calculations.stattests import tvd_test

Properties:

  • only for categorical features

  • returns distance

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import tvd_test
>>> options = DataDriftOptions(all_features_stattest=tvd_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="TVD")

utils module

generate_fisher2x2_contingency_table(reference_data: Series, current_data: Series)

Generate a 2x2 contingency matrix for the Fisher exact test.

  • Parameters

    reference_data – reference data

    current_data – current data

  • Raises

    ValueError – if reference_data and current_data are not of equal length

  • Returns

    contingency_matrix for binary data

  • Return type

    contingency_matrix
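A minimal usage sketch, assuming two equal-length, binary-valued categorical series; the sample data is hypothetical:

>>> import pandas as pd
>>> from evidently.calculations.stattests.utils import generate_fisher2x2_contingency_table
>>> ref = pd.Series(["a", "b", "a", "a"])
>>> curr = pd.Series(["b", "b", "a", "b"])
>>> table = generate_fisher2x2_contingency_table(ref, curr)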

get_binned_data(reference_data: Series, current_data: Series, feature_type: str, n: int, feel_zeroes: bool = True)

Split a variable into n buckets based on reference quantiles.

  • Parameters

    reference_data – reference data

    current_data – current data

    feature_type – feature type

    n – number of quantiles

  • Returns

    reference_percents – % of records in each bucket for reference

    current_percents – % of records in each bucket for current

get_unique_not_nan_values_list_from_series(current_data: Series, reference_data: Series)

Get unique values from current and reference series, drop NaNs

permutation_test(reference_data, current_data, observed, test_statistic_func, iterations=100)

Perform a two-sided permutation test.

  • Parameters

    reference_data – reference data

    current_data – current data

    observed – observed value

    test_statistic_func – the test statistic function

    iterations – number of times to permute

  • Returns

    two-sided p_value

  • Return type

    p_value
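These helpers back several of the distance-based tests above. A minimal sketch of binning two numerical samples with get_binned_data, assuming "num" is the feature_type string used for numerical features and 20 buckets; the sample data is hypothetical:

>>> import numpy as np
>>> import pandas as pd
>>> from evidently.calculations.stattests.utils import get_binned_data
>>> reference = pd.Series(np.random.normal(0, 1, 1000))
>>> current = pd.Series(np.random.normal(0.2, 1, 1000))
>>> reference_percents, current_percents = get_binned_data(reference, current, "num", 20)
>>> # each result holds the share of records per bucket, ready for a PSI- or distance-style comparison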

wasserstein_distance_norm module

Wasserstein distance of two samples.

Name: “wasserstein”

Import:

>>> from evidently.calculations.stattests import wasserstein_stat_test

Properties:

  • only for numerical features

  • returns distance

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import wasserstein_stat_test
>>> options = DataDriftOptions(all_features_stattest=wasserstein_stat_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="wasserstein")

z_stattest module

Z-test of two samples.

Name: “z”

Import:

>>> from evidently.calculations.stattests import z_stat_test

Properties:

  • only for categorical features

  • returns p-value

Example

Using the object:

>>> from evidently.options import DataDriftOptions
>>> from evidently.calculations.stattests import z_stat_test
>>> options = DataDriftOptions(all_features_stattest=z_stat_test)

Using the name:

>>> from evidently.options import DataDriftOptions
>>> options = DataDriftOptions(all_features_stattest="z")

proportions_diff_z_stat_ind(ref: DataFrame, curr: DataFrame)

proportions_diff_z_test(z_stat, alternative='two-sided')
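proportions_diff_z_stat_ind computes the z-statistic for the difference in proportions between two independent samples, and proportions_diff_z_test converts that statistic into a p-value for the chosen alternative. A minimal sketch, assuming both inputs hold binary (0/1) values and can be passed as pandas objects; the sample data is hypothetical:

>>> import pandas as pd
>>> from evidently.calculations.stattests.z_stattest import (
...     proportions_diff_z_stat_ind,
...     proportions_diff_z_test,
... )
>>> ref = pd.Series([0, 1, 1, 0, 1, 1, 0, 1])
>>> curr = pd.Series([0, 0, 1, 0, 0, 1, 0, 0])
>>> z_stat = proportions_diff_z_stat_ind(ref, curr)
>>> p_value = proportions_diff_z_test(z_stat, alternative="two-sided")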
