# evidently.metrics.data\_integrity

## Submodules

## column\_missing\_values\_metric module <a href="#module-evidently.metrics.data_integrity.column_missing_values_metric" id="module-evidently.metrics.data_integrity.column_missing_values_metric"></a>

### class ColumnMissingValues(number\_of\_rows: int, different\_missing\_values: Dict\[Any, int], number\_of\_different\_missing\_values: int, number\_of\_missing\_values: int, share\_of\_missing\_values: float)

Bases: `object`

Statistics about missing values in a column

#### Attributes:

&#x20;    **different\_missing\_values : Dict\[Any, int]**

&#x20;    **number\_of\_different\_missing\_values : int**

&#x20;    **number\_of\_missing\_values : int**

&#x20;    **number\_of\_rows : int**

&#x20;    **share\_of\_missing\_values : float**

### class ColumnMissingValuesMetric(column\_name: str, missing\_values: Optional\[list] = None, replace: bool = True)

Bases: [`Metric`](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/PNL2TuiArYt5TU085k7F#evidently.metrics.base_metric.Metric)\[`ColumnMissingValuesMetricResult`]

Count missing values in a column.

A missing value is a null or NaN value.

The metric calculates the number of distinct kinds of missing values and the count for each kind.\
NA types such as numpy.NaN and pandas.NaT are counted as one kind.

You can set your own list of missing values with the missing\_values parameter.\
The value None in the list means that Pandas null values are included in the calculation.

If the replace parameter is False, the default values are added to the user’s list.\
If the replace parameter is True, only the values from the missing\_values list are used.
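The interaction between `missing_values` and `replace` can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the helpers `resolve_missing_values`, `is_null`, and `count_missing` are hypothetical names, and grouping every null-like value under the key `None` is an assumption made to mirror the "counted as one kind" rule above.

```python
import math
from collections import Counter

# Mirrors the DEFAULT_MISSING_VALUES attribute listed for this class.
DEFAULT_MISSING_VALUES = ["", math.inf, -math.inf, None]


def resolve_missing_values(missing_values=None, replace=True):
    """Hypothetical helper: build the effective set of missing-value markers."""
    if missing_values is None:
        return frozenset(DEFAULT_MISSING_VALUES)
    if replace:
        # replace=True: use only the user's list.
        return frozenset(missing_values)
    # replace=False: the defaults are added to the user's list.
    return frozenset(DEFAULT_MISSING_VALUES) | frozenset(missing_values)


def is_null(value):
    """Assumption: None and float NaN are treated as the same null kind."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_missing(values, markers):
    """Return per-kind counts; all null-like values are grouped under None."""
    counts = Counter()
    for value in values:
        if is_null(value):
            if None in markers:
                counts[None] += 1
        elif value in markers:
            counts[value] += 1
    return dict(counts)


column = [1.0, "", None, float("nan"), math.inf, "n/a", 3.5]
markers = resolve_missing_values(["n/a"], replace=False)
different = count_missing(column, markers)
number_of_missing = sum(different.values())
share = number_of_missing / len(column)
```

With `replace=False`, the custom marker `"n/a"` is counted alongside the defaults. With `replace=True`, the same call would count only `"n/a"`, and nulls would be ignored unless `None` is in the list.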

#### Attributes:

&#x20;    **DEFAULT\_MISSING\_VALUES = \['', inf, -inf, None]**

&#x20;    **column\_name : str**

&#x20;    **missing\_values : frozenset**

#### Methods:

&#x20;    **calculate(data:** [**InputData**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/PNL2TuiArYt5TU085k7F#evidently.metrics.base_metric.InputData)**)**

### class ColumnMissingValuesMetricRenderer(color\_options: Optional\[[ColorOptions](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ZW3IQh73672cpd3xySyJ#evidently.options.color_scheme.ColorOptions)] = None)

Bases: [`MetricRenderer`](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/M9iMLhGad2yNU1Kd04Dv#evidently.renderers.base_renderer.MetricRenderer)

#### Attributes:

&#x20;    **color\_options :** [**ColorOptions**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ZW3IQh73672cpd3xySyJ#evidently.options.color_scheme.ColorOptions)

#### Methods:

&#x20;    **render\_html(obj: ColumnMissingValuesMetric)**

&#x20;    **render\_json(obj: ColumnMissingValuesMetric)**

### class ColumnMissingValuesMetricResult(column\_name: str, current: ColumnMissingValues, reference: Optional\[ColumnMissingValues] = None)

Bases: `object`

#### Attributes:

&#x20;    **column\_name : str**

&#x20;    **current : ColumnMissingValues**

&#x20;    **reference : Optional\[ColumnMissingValues] = None**

## column\_regexp\_metric module <a href="#module-evidently.metrics.data_integrity.column_regexp_metric" id="module-evidently.metrics.data_integrity.column_regexp_metric"></a>

### class ColumnRegExpMetric(column\_name: str, reg\_exp: str, top: int = 10)

Bases: [`Metric`](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/PNL2TuiArYt5TU085k7F#evidently.metrics.base_metric.Metric)\[`DataIntegrityValueByRegexpMetricResult`]

Count the number of values in a column that match or do not match a regular expression (regexp).
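A rough sketch of how such counts could be produced with the standard `re` module is shown here. It is not the library's code: `regexp_stats` is a hypothetical helper, whole-string matching via `fullmatch` is an assumption, and the returned keys simply reuse the field names of `DataIntegrityValueByRegexpStat`.

```python
import re
from collections import Counter


def regexp_stats(values, reg_exp, top=10):
    """Hypothetical helper: count matched and unmatched values in one column."""
    pattern = re.compile(reg_exp)
    matched = Counter()
    not_matched = Counter()
    for value in values:
        text = str(value)
        # Assumption: a value matches only when the whole string fits the pattern.
        if pattern.fullmatch(text):
            matched[text] += 1
        else:
            not_matched[text] += 1
    return {
        "number_of_matched": sum(matched.values()),
        "number_of_not_matched": sum(not_matched.values()),
        "number_of_rows": len(values),
        # Keep only the `top` most frequent values in each table.
        "table_of_matched": dict(matched.most_common(top)),
        "table_of_not_matched": dict(not_matched.most_common(top)),
    }


stats = regexp_stats(["a1", "b2", "a1", "zz", "c"], r"[a-z]\d", top=1)
```

Here `top` limits each table to the most frequent values, while the two totals still cover every row.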

#### Attributes:

&#x20;    **column\_name : str**

&#x20;    **reg\_exp : str**

&#x20;    **top : int**

#### Methods:

&#x20;    **calculate(data:** [**InputData**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/PNL2TuiArYt5TU085k7F#evidently.metrics.base_metric.InputData)**)**

### class ColumnRegExpMetricRenderer(color\_options: Optional\[[ColorOptions](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ZW3IQh73672cpd3xySyJ#evidently.options.color_scheme.ColorOptions)] = None)

Bases: [`MetricRenderer`](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/M9iMLhGad2yNU1Kd04Dv#evidently.renderers.base_renderer.MetricRenderer)

#### Attributes:

&#x20;    **color\_options :** [**ColorOptions**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ZW3IQh73672cpd3xySyJ#evidently.options.color_scheme.ColorOptions)

#### Methods:

&#x20;    **render\_html(obj: ColumnRegExpMetric)**

&#x20;    **render\_json(obj: ColumnRegExpMetric)**

### class DataIntegrityValueByRegexpMetricResult(column\_name: str, reg\_exp: str, top: int, current: DataIntegrityValueByRegexpStat, reference: Optional\[DataIntegrityValueByRegexpStat] = None)

Bases: `object`

#### Attributes:

&#x20;    **column\_name : str**

&#x20;    **current : DataIntegrityValueByRegexpStat**

&#x20;    **reference : Optional\[DataIntegrityValueByRegexpStat] = None**

&#x20;    **reg\_exp : str**

&#x20;    **top : int**

### class DataIntegrityValueByRegexpStat(number\_of\_matched: int, number\_of\_not\_matched: int, number\_of\_rows: int, table\_of\_matched: Dict\[str, int], table\_of\_not\_matched: Dict\[str, int])

Bases: `object`

Statistics about values in a column matched by a regular expression, for one dataset.

#### Attributes:

&#x20;    **number\_of\_matched : int**

&#x20;    **number\_of\_not\_matched : int**

&#x20;    **number\_of\_rows : int**

&#x20;    **table\_of\_matched : Dict\[str, int]**

&#x20;    **table\_of\_not\_matched : Dict\[str, int]**

## column\_summary\_metric module <a href="#module-evidently.metrics.data_integrity.column_summary_metric" id="module-evidently.metrics.data_integrity.column_summary_metric"></a>

### class CategoricalCharacteristics(number\_of\_rows: int, count: int, unique: Optional\[int], unique\_percentage: Optional\[float], most\_common: Optional\[object], most\_common\_percentage: Optional\[float], missing: Optional\[int], missing\_percentage: Optional\[float], new\_in\_current\_values\_count: Optional\[int] = None, unused\_in\_current\_values\_count: Optional\[int] = None)

Bases: `object`

#### Attributes:

&#x20;    **count : int**

&#x20;    **missing : Optional\[int]**

&#x20;    **missing\_percentage : Optional\[float]**

&#x20;    **most\_common : Optional\[object]**

&#x20;    **most\_common\_percentage : Optional\[float]**

&#x20;    **new\_in\_current\_values\_count : Optional\[int] = None**

&#x20;    **number\_of\_rows : int**

&#x20;    **unique : Optional\[int]**

&#x20;    **unique\_percentage : Optional\[float]**

&#x20;    **unused\_in\_current\_values\_count : Optional\[int] = None**

### class ColumnSummary(column\_name: str, column\_type: str, reference\_characteristics: Union\[NumericCharacteristics, CategoricalCharacteristics, DatetimeCharacteristics, NoneType], current\_characteristics: Union\[NumericCharacteristics, CategoricalCharacteristics, DatetimeCharacteristics], plot\_data: DataQualityPlot)

Bases: `object`

#### Attributes:

&#x20;    **column\_name : str**

&#x20;    **column\_type : str**

&#x20;    **current\_characteristics : Union\[NumericCharacteristics, CategoricalCharacteristics, DatetimeCharacteristics]**

&#x20;    **plot\_data : DataQualityPlot**

&#x20;    **reference\_characteristics : Optional\[Union\[NumericCharacteristics, CategoricalCharacteristics, DatetimeCharacteristics]]**

### class ColumnSummaryMetric(column\_name: str)

Bases: [`Metric`](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/PNL2TuiArYt5TU085k7F#evidently.metrics.base_metric.Metric)\[`ColumnSummary`]

#### Methods:

&#x20;    **calculate(data:** [**InputData**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/PNL2TuiArYt5TU085k7F#evidently.metrics.base_metric.InputData)**)**

&#x20;    **static map\_data(stats:** [**FeatureQualityStats**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ICxNS1GschXEUCV60dnZ#evidently.calculations.data_quality.FeatureQualityStats)**)**

### class ColumnSummaryMetricRenderer(color\_options: Optional\[[ColorOptions](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ZW3IQh73672cpd3xySyJ#evidently.options.color_scheme.ColorOptions)] = None)

Bases: [`MetricRenderer`](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/M9iMLhGad2yNU1Kd04Dv#evidently.renderers.base_renderer.MetricRenderer)

#### Attributes:

&#x20;    **color\_options :** [**ColorOptions**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ZW3IQh73672cpd3xySyJ#evidently.options.color_scheme.ColorOptions)

#### Methods:

&#x20;    **render\_html(obj: ColumnSummaryMetric)**

&#x20;    **render\_json(obj: ColumnSummaryMetric)**

### class DataByTarget(data\_for\_plots: Dict\[str, Dict\[str, Union\[list, pandas.core.frame.DataFrame]]], target\_name: str, target\_type: str)

Bases: `object`

#### Attributes:

&#x20;    **data\_for\_plots : Dict\[str, Dict\[str, Union\[list, DataFrame]]]**

&#x20;    **target\_name : str**

&#x20;    **target\_type : str**

### class DataInTime(data\_for\_plots: Dict\[str, pandas.core.frame.DataFrame], freq: str, datetime\_name: str)

Bases: `object`

#### Attributes:

&#x20;    **data\_for\_plots : Dict\[str, DataFrame]**

&#x20;    **datetime\_name : str**

&#x20;    **freq : str**

### class DataQualityPlot(bins\_for\_hist: Dict\[str, pandas.core.frame.DataFrame], data\_in\_time: Optional\[DataInTime], data\_by\_target: Optional\[DataByTarget], counts\_of\_values: Optional\[Dict\[str, pandas.core.frame.DataFrame]])

Bases: `object`

#### Attributes:

&#x20;    **bins\_for\_hist : Dict\[str, DataFrame]**

&#x20;    **counts\_of\_values : Optional\[Dict\[str, DataFrame]]**

&#x20;    **data\_by\_target : Optional\[DataByTarget]**

&#x20;    **data\_in\_time : Optional\[DataInTime]**

### class DatetimeCharacteristics(number\_of\_rows: int, count: int, unique: Optional\[int], unique\_percentage: Optional\[float], most\_common: Optional\[object], most\_common\_percentage: Optional\[float], missing: Optional\[int], missing\_percentage: Optional\[float], first: Optional\[str], last: Optional\[str])

Bases: `object`

#### Attributes:

&#x20;    **count : int**

&#x20;    **first : Optional\[str]**

&#x20;    **last : Optional\[str]**

&#x20;    **missing : Optional\[int]**

&#x20;    **missing\_percentage : Optional\[float]**

&#x20;    **most\_common : Optional\[object]**

&#x20;    **most\_common\_percentage : Optional\[float]**

&#x20;    **number\_of\_rows : int**

&#x20;    **unique : Optional\[int]**

&#x20;    **unique\_percentage : Optional\[float]**

### class NumericCharacteristics(number\_of\_rows: int, count: int, mean: Union\[float, int, NoneType], std: Union\[float, int, NoneType], min: Union\[float, int, NoneType], p25: Union\[float, int, NoneType], p50: Union\[float, int, NoneType], p75: Union\[float, int, NoneType], max: Union\[float, int, NoneType], unique: Optional\[int], unique\_percentage: Optional\[float], missing: Optional\[int], missing\_percentage: Optional\[float], infinite\_count: Optional\[int], infinite\_percentage: Optional\[float], most\_common: Union\[float, int, NoneType], most\_common\_percentage: Optional\[float])

Bases: `object`

#### Attributes:

&#x20;    **count : int**

&#x20;    **infinite\_count : Optional\[int]**

&#x20;    **infinite\_percentage : Optional\[float]**

&#x20;    **max : Optional\[Union\[float, int]]**

&#x20;    **mean : Optional\[Union\[float, int]]**

&#x20;    **min : Optional\[Union\[float, int]]**

&#x20;    **missing : Optional\[int]**

&#x20;    **missing\_percentage : Optional\[float]**

&#x20;    **most\_common : Optional\[Union\[float, int]]**

&#x20;    **most\_common\_percentage : Optional\[float]**

&#x20;    **number\_of\_rows : int**

&#x20;    **p25 : Optional\[Union\[float, int]]**

&#x20;    **p50 : Optional\[Union\[float, int]]**

&#x20;    **p75 : Optional\[Union\[float, int]]**

&#x20;    **std : Optional\[Union\[float, int]]**

&#x20;    **unique : Optional\[int]**

&#x20;    **unique\_percentage : Optional\[float]**

## dataset\_missing\_values\_metric module <a href="#module-evidently.metrics.data_integrity.dataset_missing_values_metric" id="module-evidently.metrics.data_integrity.dataset_missing_values_metric"></a>

### class DatasetMissingValues(different\_missing\_values: Dict\[Any, int], number\_of\_different\_missing\_values: int, different\_missing\_values\_by\_column: Dict\[str, Dict\[Any, int]], number\_of\_different\_missing\_values\_by\_column: Dict\[str, int], number\_of\_missing\_values: int, share\_of\_missing\_values: float, number\_of\_missing\_values\_by\_column: Dict\[str, int], share\_of\_missing\_values\_by\_column: Dict\[str, float], number\_of\_rows: int, number\_of\_rows\_with\_missing\_values: int, share\_of\_rows\_with\_missing\_values: float, number\_of\_columns: int, columns\_with\_missing\_values: List\[str], number\_of\_columns\_with\_missing\_values: int, share\_of\_columns\_with\_missing\_values: float)

Bases: `object`

Statistics about missing values in a dataset.

#### Attributes:

&#x20;    **columns\_with\_missing\_values : List\[str]**

&#x20;    **different\_missing\_values : Dict\[Any, int]**

&#x20;    **different\_missing\_values\_by\_column : Dict\[str, Dict\[Any, int]]**

&#x20;    **number\_of\_columns : int**

&#x20;    **number\_of\_columns\_with\_missing\_values : int**

&#x20;    **number\_of\_different\_missing\_values : int**

&#x20;    **number\_of\_different\_missing\_values\_by\_column : Dict\[str, int]**

&#x20;    **number\_of\_missing\_values : int**

&#x20;    **number\_of\_missing\_values\_by\_column : Dict\[str, int]**

&#x20;    **number\_of\_rows : int**

&#x20;    **number\_of\_rows\_with\_missing\_values : int**

&#x20;    **share\_of\_columns\_with\_missing\_values : float**

&#x20;    **share\_of\_missing\_values : float**

&#x20;    **share\_of\_missing\_values\_by\_column : Dict\[str, float]**

&#x20;    **share\_of\_rows\_with\_missing\_values : float**

### class DatasetMissingValuesMetric(missing\_values: Optional\[list] = None, replace: bool = True)

Bases: [`Metric`](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/PNL2TuiArYt5TU085k7F#evidently.metrics.base_metric.Metric)\[`DatasetMissingValuesMetricResult`]

Count missing values in a dataset.

A missing value is a null or NaN value.

The metric calculates the number of distinct kinds of missing values and the count for each kind.\
NA types such as numpy.NaN and pandas.NaT are counted as one kind.

You can set your own list of missing values with the missing\_values parameter.\
The value None in the list means that Pandas null values are included in the calculation.

If the replace parameter is False, the default values are added to the user’s list.\
If the replace parameter is True, only the values from the missing\_values list are used.
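The dataset-level fields of `DatasetMissingValues` (per-column counts, rows with missing values, and the corresponding shares) can be illustrated with a small standalone sketch. It assumes that only `None` and NaN count as missing and that a share is a count divided by the relevant total; `is_missing` and `dataset_missing_stats` are hypothetical helpers, not part of the library.

```python
import math


def is_missing(value):
    """Assumption: only None and float NaN count as missing in this sketch."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def dataset_missing_stats(columns):
    """Hypothetical helper; `columns` maps a column name to an equal-length list."""
    number_of_columns = len(columns)
    number_of_rows = len(next(iter(columns.values()))) if columns else 0
    by_column = {
        name: sum(is_missing(v) for v in values) for name, values in columns.items()
    }
    # A row counts once, no matter how many of its cells are missing.
    rows_with_missing = sum(
        any(is_missing(values[i]) for values in columns.values())
        for i in range(number_of_rows)
    )
    total_cells = number_of_rows * number_of_columns
    total_missing = sum(by_column.values())
    with_missing = [name for name, n in by_column.items() if n > 0]
    return {
        "number_of_missing_values": total_missing,
        "share_of_missing_values": total_missing / total_cells if total_cells else 0.0,
        "number_of_missing_values_by_column": by_column,
        "number_of_rows_with_missing_values": rows_with_missing,
        "share_of_rows_with_missing_values": (
            rows_with_missing / number_of_rows if number_of_rows else 0.0
        ),
        "columns_with_missing_values": with_missing,
        "share_of_columns_with_missing_values": (
            len(with_missing) / number_of_columns if number_of_columns else 0.0
        ),
    }


data = {
    "age": [31, None, 45, float("nan")],
    "city": ["Oslo", "Rome", None, "Lima"],
    "id": [1, 2, 3, 4],
}
result = dataset_missing_stats(data)
```

In this toy dataset three of twelve cells are missing, but three of four rows are affected, which is why the cell-level and row-level shares are reported separately.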

#### Attributes:

&#x20;    **DEFAULT\_MISSING\_VALUES = \['', inf, -inf, None]**

&#x20;    **missing\_values : frozenset**

#### Methods:

&#x20;    **calculate(data:** [**InputData**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/PNL2TuiArYt5TU085k7F#evidently.metrics.base_metric.InputData)**)**

### class DatasetMissingValuesMetricRenderer(color\_options: Optional\[[ColorOptions](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ZW3IQh73672cpd3xySyJ#evidently.options.color_scheme.ColorOptions)] = None)

Bases: [`MetricRenderer`](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/M9iMLhGad2yNU1Kd04Dv#evidently.renderers.base_renderer.MetricRenderer)

#### Attributes:

&#x20;    **color\_options :** [**ColorOptions**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ZW3IQh73672cpd3xySyJ#evidently.options.color_scheme.ColorOptions)

#### Methods:

&#x20;    **render\_html(obj: DatasetMissingValuesMetric)**

&#x20;    **render\_json(obj: DatasetMissingValuesMetric)**

### class DatasetMissingValuesMetricResult(current: DatasetMissingValues, reference: Optional\[DatasetMissingValues] = None)

Bases: `object`

#### Attributes:

&#x20;    **current : DatasetMissingValues**

&#x20;    **reference : Optional\[DatasetMissingValues] = None**

## dataset\_summary\_metric module <a href="#module-evidently.metrics.data_integrity.dataset_summary_metric" id="module-evidently.metrics.data_integrity.dataset_summary_metric"></a>

### class DatasetSummary(target: Optional\[str], prediction: Optional\[Union\[str, Sequence\[str]]], date\_column: Optional\[str], id\_column: Optional\[str], number\_of\_columns: int, number\_of\_rows: int, number\_of\_missing\_values: int, number\_of\_categorical\_columns: int, number\_of\_numeric\_columns: int, number\_of\_datetime\_columns: int, number\_of\_constant\_columns: int, number\_of\_almost\_constant\_columns: int, number\_of\_duplicated\_columns: int, number\_of\_almost\_duplicated\_columns: int, number\_of\_empty\_rows: int, number\_of\_empty\_columns: int, number\_of\_duplicated\_rows: int, columns\_type: dict, nans\_by\_columns: dict, number\_uniques\_by\_columns: dict)

Bases: `object`

Information about the columns in a dataset.

#### Attributes:

&#x20;    **columns\_type : dict**

&#x20;    **date\_column : Optional\[str]**

&#x20;    **id\_column : Optional\[str]**

&#x20;    **nans\_by\_columns : dict**

&#x20;    **number\_of\_almost\_constant\_columns : int**

&#x20;    **number\_of\_almost\_duplicated\_columns : int**

&#x20;    **number\_of\_categorical\_columns : int**

&#x20;    **number\_of\_columns : int**

&#x20;    **number\_of\_constant\_columns : int**

&#x20;    **number\_of\_datetime\_columns : int**

&#x20;    **number\_of\_duplicated\_columns : int**

&#x20;    **number\_of\_duplicated\_rows : int**

&#x20;    **number\_of\_empty\_columns : int**

&#x20;    **number\_of\_empty\_rows : int**

&#x20;    **number\_of\_missing\_values : int**

&#x20;    **number\_of\_numeric\_columns : int**

&#x20;    **number\_of\_rows : int**

&#x20;    **number\_uniques\_by\_columns : dict**

&#x20;    **prediction : Optional\[Union\[str, Sequence\[str]]]**

&#x20;    **target : Optional\[str]**

### class DatasetSummaryMetric(almost\_duplicated\_threshold: float = 0.95, almost\_constant\_threshold: float = 0.95)

Bases: [`Metric`](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/PNL2TuiArYt5TU085k7F#evidently.metrics.base_metric.Metric)\[`DatasetSummaryMetricResult`]

Common characteristics of the columns/features in the dataset(s).
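The two threshold parameters are not described further on this page. One plausible reading, sketched here purely as an assumption, is that a column is "almost constant" when its most frequent value covers at least the threshold share of rows, and two columns are "almost duplicated" when they agree on at least the threshold share of rows. Both helper functions are hypothetical.

```python
from collections import Counter


def is_almost_constant(values, threshold=0.95):
    """Assumption: the share of the most frequent value meets the threshold."""
    if not values:
        return False
    most_common_count = Counter(values).most_common(1)[0][1]
    return most_common_count / len(values) >= threshold


def is_almost_duplicated(first, second, threshold=0.95):
    """Assumption: the share of row positions with equal values meets the threshold."""
    if not first or len(first) != len(second):
        return False
    equal = sum(a == b for a, b in zip(first, second))
    return equal / len(first) >= threshold


flag = [0] * 99 + [1]              # 99% of rows share one value
copy_of_flag = [0] * 98 + [1, 1]   # differs from `flag` in one position
balanced = [0, 1] * 50             # most frequent value covers only 50%

flag_is_almost_constant = is_almost_constant(flag)
balanced_is_almost_constant = is_almost_constant(balanced)
columns_are_almost_duplicated = is_almost_duplicated(flag, copy_of_flag)
```

Under this reading, a threshold below 1.0 lets a column with a few outlying rows still be flagged.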

#### Attributes:

&#x20;    **almost\_constant\_threshold : float**

&#x20;    **almost\_duplicated\_threshold : float**

#### Methods:

&#x20;    **calculate(data:** [**InputData**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/PNL2TuiArYt5TU085k7F#evidently.metrics.base_metric.InputData)**)**

### class DatasetSummaryMetricRenderer(color\_options: Optional\[[ColorOptions](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ZW3IQh73672cpd3xySyJ#evidently.options.color_scheme.ColorOptions)] = None)

Bases: [`MetricRenderer`](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/M9iMLhGad2yNU1Kd04Dv#evidently.renderers.base_renderer.MetricRenderer)

#### Attributes:

&#x20;    **color\_options :** [**ColorOptions**](https://docs-old.evidentlyai.com/reference/api-reference/evidently.metrics/pages/ZW3IQh73672cpd3xySyJ#evidently.options.color_scheme.ColorOptions)

#### Methods:

&#x20;    **render\_html(obj: DatasetSummaryMetric)**

&#x20;    **render\_json(obj: DatasetSummaryMetric)**

### class DatasetSummaryMetricResult(almost\_duplicated\_threshold: float, current: DatasetSummary, reference: Optional\[DatasetSummary] = None)

Bases: `object`

#### Attributes:

&#x20;    **almost\_duplicated\_threshold : float**

&#x20;    **current : DatasetSummary**

&#x20;    **reference : Optional\[DatasetSummary] = None**

