
evidently.calculations


Subpackages

  • evidently.calculations.stattests package
    • anderson_darling_stattest module
    • chisquare_stattest module
    • cramer_von_mises_stattest module
      • CramerVonMisesResult
    • energy_distance module
    • epps_singleton_stattest module
    • fisher_exact_stattest module
    • g_stattest module
    • hellinger_distance module
    • jensenshannon module
    • kl_div module
    • ks_stattest module
    • mann_whitney_urank_stattest module
    • psi module
    • registry module
      • StatTest
      • StatTestInvalidFeatureTypeError
      • StatTestNotFoundError
      • StatTestResult
      • get_stattest()
      • register_stattest()
    • t_test module
    • tvd_stattest module
    • utils module
      • generate_fisher2x2_contingency_table()
      • get_binned_data()
      • get_unique_not_nan_values_list_from_series()
      • permutation_test()
    • wasserstein_distance_norm module
    • z_stattest module
      • proportions_diff_z_stat_ind()
      • proportions_diff_z_test()

Submodules

classification_performance module

class ConfusionMatrix(labels: Sequence[Union[str, int]], values: list)

Bases: object

Attributes:

labels : Sequence[Union[str, int]]

values : list

class DatasetClassificationQuality(accuracy: float, precision: float, recall: float, f1: float, roc_auc: Optional[float] = None, log_loss: Optional[float] = None, tpr: Optional[float] = None, tnr: Optional[float] = None, fpr: Optional[float] = None, fnr: Optional[float] = None, rate_plots_data: Optional[Dict] = None, plot_data: Optional[Dict] = None)

Bases: object

Attributes:

accuracy : float

f1 : float

fnr : Optional[float] = None

fpr : Optional[float] = None

log_loss : Optional[float] = None

plot_data : Optional[Dict] = None

precision : float

rate_plots_data : Optional[Dict] = None

recall : float

roc_auc : Optional[float] = None

tnr : Optional[float] = None

tpr : Optional[float] = None

class PredictionData(predictions: pandas.core.series.Series, prediction_probas: Optional[pandas.core.frame.DataFrame], labels: List[Union[str, int]])

Bases: object

Attributes:

labels : List[Union[str, int]]

prediction_probas : Optional[DataFrame]

predictions : Series

calculate_confusion_by_classes(confusion_matrix: ndarray, class_names: Sequence[Union[str, int]])

Calculate the following metrics for each class from a confusion matrix:

  • TP (true positive)

  • TN (true negative)

  • FP (false positive)

  • FN (false negative)

Returns a dict like:

{
    "class_1_name": {
        "tp": 1,
        "tn": 5,
        "fp": 0,
        "fn": 3,
    },
    "class_2_name": {
        "tp": 1,
        "tn": 5,
        "fp": 0,
        "fn": 3,
    },
}
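
As a usage sketch (the toy labels and the scikit-learn call are illustrative assumptions, not from this page):

from sklearn.metrics import confusion_matrix

from evidently.calculations.classification_performance import calculate_confusion_by_classes

# Toy binary task: two classes, five observations.
y_true = ["cat", "dog", "cat", "cat", "dog"]
y_pred = ["cat", "cat", "cat", "dog", "dog"]
labels = ["cat", "dog"]

# Build the matrix with the same label order that is passed as class_names.
matrix = confusion_matrix(y_true, y_pred, labels=labels)

by_class = calculate_confusion_by_classes(matrix, labels)
print(by_class["cat"])  # {"tp": 2, "tn": 1, "fp": 1, "fn": 1} for this toy data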

calculate_matrix(target: Series, prediction: Series, labels: List[Union[str, int]])

calculate_metrics(column_mapping: ColumnMapping, confusion_matrix: ConfusionMatrix, target: Series, prediction: PredictionData)

calculate_pr_table(binded)

collect_plot_data(prediction_probas: DataFrame)

get_prediction_data(data: DataFrame, data_columns: DatasetColumns, pos_label: Optional[Union[str, int]], threshold: float = 0.5)

Get predicted values and optional prediction probabilities from source data. The threshold value is also taken into account: if a probability is less than the threshold, it is not used.

Returns an object with the predicted values and optional prediction probabilities.

k_probability_threshold(prediction_probas: DataFrame, k: Union[int, float])

threshold_probability_labels(prediction_probas: DataFrame, pos_label: Union[str, int], neg_label: Union[str, int], threshold: float)

Get prediction labels from probabilities with the threshold applied
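
A hedged sketch of that call; the spam/ham frame is made up, and the behavior assumed here (pos_label wins when its probability reaches the threshold) follows the signature rather than documented semantics:

import pandas as pd

from evidently.calculations.classification_performance import threshold_probability_labels

# One probability column per class label (toy data).
probas = pd.DataFrame({"spam": [0.9, 0.4, 0.7], "ham": [0.1, 0.6, 0.3]})

labels = threshold_probability_labels(
    probas, pos_label="spam", neg_label="ham", threshold=0.6
)
print(labels.tolist())  # expected: ["spam", "ham", "spam"]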

data_drift module

Methods and types for data drift calculations.

class ColumnDataDriftMetrics(column_name: str, column_type: str, stattest_name: str, drift_score: float, drift_detected: bool, threshold: float, current_distribution: Distribution, reference_distribution: Distribution, current_small_distribution: Optional[list] = None, reference_small_distribution: Optional[list] = None, current_scatter: Optional[Dict[str, list]] = None, x_name: Optional[str] = None, plot_shape: Optional[Dict[str, float]] = None, current_correlations: Optional[Dict[str, float]] = None, reference_correlations: Optional[Dict[str, float]] = None)

Bases: object

One column drift metrics.

Attributes:

column_name : str

column_type : str

current_correlations : Optional[Dict[str, float]] = None

current_distribution : Distribution

current_scatter : Optional[Dict[str, list]] = None

current_small_distribution : Optional[list] = None

drift_detected : bool

drift_score : float

plot_shape : Optional[Dict[str, float]] = None

reference_correlations : Optional[Dict[str, float]] = None

reference_distribution : Distribution

reference_small_distribution : Optional[list] = None

stattest_name : str

threshold : float

x_name : Optional[str] = None

class DatasetDrift(number_of_drifted_columns: int, dataset_drift_score: float, dataset_drift: bool)

Bases: object

Dataset drift calculation results

Attributes:

dataset_drift : bool

dataset_drift_score : float

number_of_drifted_columns : int

class DatasetDriftMetrics(number_of_columns: int, number_of_drifted_columns: int, share_of_drifted_columns: float, dataset_drift: bool, drift_by_columns: Dict[str, ColumnDataDriftMetrics], options: DataDriftOptions, dataset_columns: DatasetColumns)

Bases: object

Attributes:

dataset_columns : DatasetColumns

dataset_drift : bool

drift_by_columns : Dict[str, ColumnDataDriftMetrics]

number_of_columns : int

number_of_drifted_columns : int

options : DataDriftOptions

share_of_drifted_columns : float

ensure_prediction_column_is_string(*, prediction_column: Optional[Union[str, Sequence]], current_data: DataFrame, reference_data: DataFrame, threshold: float = 0.5)

Update the dataset according to the prediction type:

  • if the prediction column is None or a string, the dataset is not changed

  • (binary classification) if predictions is a list of length 2, set the predicted_labels column using the threshold

  • (multi-label classification) if predictions is a list of length greater than 2, set predicted_labels from the probability values in the prediction columns

Returns the prediction column name.

get_dataset_drift(drift_metrics, drift_share=0.5)

get_drift_for_columns(*, current_data: DataFrame, reference_data: DataFrame, dataset_columns: DatasetColumns, data_drift_options: DataDriftOptions, drift_share_threshold: Optional[float] = None, columns: Optional[List[str]] = None)

get_one_column_drift(*, current_data: DataFrame, reference_data: DataFrame, column_name: str, options: DataDriftOptions, dataset_columns: DatasetColumns, column_type: Optional[str] = None)
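
A hedged end-to-end sketch for get_drift_for_columns. The toy frames are made up, and process_columns (assumed here to live in evidently.utils.data_operations) is used to build the DatasetColumns argument from a ColumnMapping:

import pandas as pd

from evidently import ColumnMapping
from evidently.calculations.data_drift import get_drift_for_columns
from evidently.options import DataDriftOptions
from evidently.utils.data_operations import process_columns

# Reference vs. current samples with a deliberate shift (toy data).
reference = pd.DataFrame({"feature": [1.0, 2.0, 3.0, 4.0] * 25})
current = pd.DataFrame({"feature": [5.0, 6.0, 7.0, 8.0] * 25})

# DatasetColumns describes column roles; the default mapping treats
# "feature" as a numerical feature.
dataset_columns = process_columns(reference, ColumnMapping())

drift = get_drift_for_columns(
    current_data=current,
    reference_data=reference,
    dataset_columns=dataset_columns,
    data_drift_options=DataDriftOptions(),
)
print(drift.dataset_drift, drift.number_of_drifted_columns)

The result is the DatasetDriftMetrics object described above, with per-column details in drift_by_columns.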

data_integration module

get_number_of_all_pandas_missed_values(dataset: DataFrame)

Calculate the number of missing values (nulls, as detected by pandas) in a dataset

get_number_of_almost_constant_columns(dataset: DataFrame, threshold: float)

Calculate the number of almost constant columns in a dataset

get_number_of_almost_duplicated_columns(dataset: DataFrame, threshold: float)

Calculate the number of almost duplicated columns in a dataset

get_number_of_constant_columns(dataset: DataFrame)

Calculate the number of constant columns in a dataset

get_number_of_duplicated_columns(dataset: DataFrame)

Calculate the number of duplicated columns in a dataset

get_number_of_empty_columns(dataset: DataFrame)

Calculate the number of empty columns in a dataset
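
For instance, the column-counting helpers can be exercised like this; the frame is made up, and the expected counts in the comments are my reading of the helper names rather than documented outputs:

import pandas as pd

from evidently.calculations.data_integration import (
    get_number_of_constant_columns,
    get_number_of_duplicated_columns,
    get_number_of_empty_columns,
)

# Toy frame: "b" duplicates "a", "c" is constant, "d" is empty.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [1, 2, 3],
    "c": [7, 7, 7],
    "d": [None, None, None],
})

print(get_number_of_duplicated_columns(df))  # expected: 1 ("b")
print(get_number_of_constant_columns(df))    # expected: at least 1 ("c"; "d" may also count)
print(get_number_of_empty_columns(df))       # expected: 1 ("d")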

data_quality module

Methods for overall dataset quality calculations: rows count, counts of specific values, etc.

class ColumnCorrelations(column_name: str, kind: str, values: Distribution)

Bases: object

Attributes:

column_name : str

kind : str

values : Distribution

class DataQualityGetPlotData()

Bases: object

Methods:

calculate_data_by_target(curr: DataFrame, ref: Optional[DataFrame], feature_name: str, feature_type: str, target_name: str, target_type: str, merge_small_cat: Optional[int] = 5)

calculate_data_in_time(curr: DataFrame, ref: Optional[DataFrame], feature_name: str, feature_type: str, datetime_name: str, merge_small_cat: Optional[int] = 5)

calculate_main_plot(curr: DataFrame, ref: Optional[DataFrame], feature_name: str, feature_type: str, merge_small_cat: Optional[int] = 5)

class DataQualityPlot(bins_for_hist: Dict[str, pandas.core.frame.DataFrame])

Bases: object

Attributes:

bins_for_hist : Dict[str, DataFrame]

class DataQualityStats(rows_count: int, num_features_stats: Optional[Dict[str, FeatureQualityStats]] = None, cat_features_stats: Optional[Dict[str, FeatureQualityStats]] = None, datetime_features_stats: Optional[Dict[str, FeatureQualityStats]] = None, target_stats: Optional[Dict[str, FeatureQualityStats]] = None, prediction_stats: Optional[Dict[str, FeatureQualityStats]] = None)

Bases: object

Attributes:

cat_features_stats : Optional[Dict[str, FeatureQualityStats]] = None

datetime_features_stats : Optional[Dict[str, FeatureQualityStats]] = None

num_features_stats : Optional[Dict[str, FeatureQualityStats]] = None

prediction_stats : Optional[Dict[str, FeatureQualityStats]] = None

rows_count : int

target_stats : Optional[Dict[str, FeatureQualityStats]] = None

Methods:

get_all_features()

class FeatureQualityStats(feature_type: str, number_of_rows: int = 0, count: int = 0, infinite_count: Optional[int] = None, infinite_percentage: Optional[float] = None, missing_count: Optional[int] = None, missing_percentage: Optional[float] = None, unique_count: Optional[int] = None, unique_percentage: Optional[float] = None, percentile_25: Optional[float] = None, percentile_50: Optional[float] = None, percentile_75: Optional[float] = None, max: Optional[Union[int, float, bool, str]] = None, min: Optional[Union[int, float, bool, str]] = None, mean: Optional[float] = None, most_common_value: Optional[Union[int, float, bool, str]] = None, most_common_value_percentage: Optional[float] = None, std: Optional[float] = None, most_common_not_null_value: Optional[Union[int, float, bool, str]] = None, most_common_not_null_value_percentage: Optional[float] = None, new_in_current_values_count: Optional[int] = None, unused_in_current_values_count: Optional[int] = None)

Bases: object

Class that stores data quality metrics for all features.

The type of the feature is stored in the feature_type field. The concrete set of stats depends on the feature type. If a metric is not applicable, its value is left as None.

Metrics for all feature types:

- feature_type - cat for category, num for numeric, datetime for datetime features

- count - the number of meaningful values (NaN values are not counted)

- missing_count - the number of missing (NaN) values

- missing_percentage - the percentage of missing values

- unique_count - the number of unique values

- unique_percentage - the percentage of unique values

- max - maximum value (not applicable for category features)

- min - minimum value (not applicable for category features)

- most_common_value - the most common value in the feature values

- most_common_value_percentage - the percentage of the most common value

- most_common_not_null_value - if most_common_value is NaN, the next most common value; otherwise None

- most_common_not_null_value_percentage - the percentage of most_common_not_null_value, if it is defined; None otherwise

Metrics for numeric features only:

- infinite_count - the number of infinite values

- infinite_percentage - the percentage of infinite values

- percentile_25 - the 25th percentile of the meaningful values

- percentile_50 - the 50th percentile of the meaningful values

- percentile_75 - the 75th percentile of the meaningful values

- mean - the sum of the meaningful values divided by the number of the meaningful values

- std - the standard deviation of the values

Metrics for category features only:

- new_in_current_values_count - the number of new values in the current dataset that are not present in the reference. Defined only when a reference dataset is provided.

- unused_in_current_values_count - the number of values in the reference dataset that are not present in the current. Defined only when a reference dataset is provided.

Attributes:

count : int = 0

feature_type : str

infinite_count : Optional[int] = None

infinite_percentage : Optional[float] = None

max : Optional[Union[int, float, bool, str]] = None

mean : Optional[float] = None

min : Optional[Union[int, float, bool, str]] = None

missing_count : Optional[int] = None

missing_percentage : Optional[float] = None

most_common_not_null_value : Optional[Union[int, float, bool, str]] = None

most_common_not_null_value_percentage : Optional[float] = None

most_common_value : Optional[Union[int, float, bool, str]] = None

most_common_value_percentage : Optional[float] = None

new_in_current_values_count : Optional[int] = None

number_of_rows : int = 0

percentile_25 : Optional[float] = None

percentile_50 : Optional[float] = None

percentile_75 : Optional[float] = None

std : Optional[float] = None

unique_count : Optional[int] = None

unique_percentage : Optional[float] = None

unused_in_current_values_count : Optional[int] = None

Methods:

as_dict()

is_category()

Checks whether the object stores stats for a category feature

is_datetime()

Checks whether the object stores stats for a datetime feature

is_numeric()

Checks whether the object stores stats for a numeric feature

calculate_category_column_correlations(column_name: str, dataset: DataFrame, columns: List[str])

For category columns, calculate the Cramér's V (cramer_v) correlation

calculate_column_distribution(column: Series, column_type: str)

calculate_correlations(dataset: DataFrame, columns: DatasetColumns)

calculate_cramer_v_correlation(column_name: str, dataset: DataFrame, columns: List[str])

calculate_data_quality_stats(dataset: DataFrame, columns: DatasetColumns, task: Optional[str])

calculate_numerical_column_correlations(column_name: str, dataset: DataFrame, columns: List[str])

get_features_stats(feature: Series, feature_type: str)
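
A small sketch of get_features_stats on a numeric series; the "num" feature-type string follows the cat/num/datetime convention described for FeatureQualityStats above:

import pandas as pd

from evidently.calculations.data_quality import get_features_stats

# Numeric series with one missing value (toy data).
values = pd.Series([1.0, 2.0, 2.0, None, 5.0])

stats = get_features_stats(values, feature_type="num")

# Fields of the returned FeatureQualityStats; see the attribute list above.
print(stats.count, stats.missing_count)  # expected: 4 1
print(stats.percentile_50, stats.most_common_value)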

get_pairwise_correlation(df, func: Callable[[Series, Series], float])

Compute pairwise correlation of columns.

Parameters:

df : initial data frame

func : function for computing pairwise correlation

Returns the correlation matrix.
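
Since func can be any Callable[[Series, Series], float], a SciPy statistic can be plugged in directly; the frame below is made up:

import pandas as pd
from scipy import stats

from evidently.calculations.data_quality import get_pairwise_correlation

# Toy frame: "y" follows "x", "z" runs against it.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [4.0, 3.0, 2.0, 1.0],
})

# Spearman rank correlation as the pairwise function.
corr = get_pairwise_correlation(df, lambda a, b: stats.spearmanr(a, b)[0])
print(corr)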

get_rows_count(data: Union[DataFrame, Series])

Count the number of rows in a dataset

regression_performance module

class ErrorWithQuantiles(error, quantile_top, quantile_other)

Bases: object

class FeatureBias(feature_type: str, majority: float, under: float, over: float, range: float)

Bases: object

Attributes:

feature_type : str

majority : float

over : float

range : float

under : float

Methods:

as_dict(prefix)

class RegressionPerformanceMetrics(mean_error: float, mean_abs_error: float, mean_abs_perc_error: float, error_std: float, abs_error_max: float, abs_error_std: float, abs_perc_error_std: float, error_normality: dict, underperformance: dict, error_bias: dict)

Bases: object

Attributes:

abs_error_max : float

abs_error_std : float

abs_perc_error_std : float

error_bias : dict

error_normality : dict

error_std : float

mean_abs_error : float

mean_abs_perc_error : float

mean_error : float

underperformance : dict

calculate_regression_performance(dataset: DataFrame, columns: DatasetColumns, error_bias_prefix: str)

error_bias_table(dataset, err_quantiles, num_feature_names, cat_feature_names)

error_with_quantiles(dataset, prediction_column, target_column, quantile: float)
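
A hedged sketch for calculate_regression_performance; as in the drift example above, process_columns is an assumed helper for building DatasetColumns, and the error_bias_prefix string is an arbitrary choice:

import pandas as pd

from evidently import ColumnMapping
from evidently.calculations.regression_performance import calculate_regression_performance
from evidently.utils.data_operations import process_columns

# Toy regression data with explicit target and prediction columns.
df = pd.DataFrame({
    "target": [3.0, 5.0, 2.5, 7.0, 4.5, 6.0],
    "prediction": [2.5, 5.0, 4.0, 6.5, 5.0, 5.5],
})

columns = process_columns(df, ColumnMapping(target="target", prediction="prediction"))
metrics = calculate_regression_performance(df, columns, error_bias_prefix="error_bias_")

# Fields of RegressionPerformanceMetrics, listed above.
print(metrics.mean_error, metrics.mean_abs_error)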

