Add a custom text descriptor

How to add custom text descriptors.

You are looking at the old Evidently documentation: this API is available with versions 0.6.7 or lower. Check the newer docs version here.

You can implement custom row-level evaluations for text data that you will later use just like any other descriptor across Metrics and Tests. You can implement descriptors that use a single column or two columns.

Note that if you want to use LLM-based evaluations, you can write custom prompts using LLM judge templates.

Code example

Refer to the "Custom descriptors" How-to example.

Imports:

Single column descriptor

You can create a custom descriptor that will take a single column from your dataset and run a certain evaluation for each row.

Implement your evaluation as a Python function. It will take a pandas Series as input and return a transformed Series.

Here, the is_empty_string_callable function takes a column of strings and returns an "EMPTY" or "NON EMPTY" outcome for each.
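A minimal sketch of such a function, assuming "empty" means exactly the string `""` (whitespace-only or missing values are labeled "NON EMPTY" here). The sample data is illustrative only.

```python
import pandas as pd


def is_empty_string_callable(values: pd.Series) -> pd.Series:
    """Label each string in the column as "EMPTY" or "NON EMPTY"."""
    labels = ["EMPTY" if value == "" else "NON EMPTY" for value in values]
    # Reuse the input index so each label lines up with its original row.
    return pd.Series(labels, index=values.index)


# Illustrative data, not taken from a real dataset.
sample = pd.Series(["hello", "", "world"])
labels = is_empty_string_callable(sample)
```

Returning a Series with the same index as the input keeps the row-level results aligned with the dataset.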

Create a custom descriptor. Create an instance of the CustomColumnEval class to wrap the evaluation logic into an object that you can later apply to a specific column in your dataset.

Where:

  • func: Callable[[pd.Series], pd.Series] is a function that returns a transformed pandas Series.

  • display_name: str is the new descriptor's name that will appear in Reports and Test Suites.

  • feature_type is the type of descriptor that the function returns (cat for categorical, num for numerical).
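The parameters above could be put together as in the sketch below. It assumes that `CustomColumnEval` is importable from `evidently.descriptors` in the legacy releases (0.6.7 or lower); the guarded import only keeps the snippet runnable where that API is unavailable. The display name "Empty response" is a hypothetical choice.

```python
import pandas as pd

try:
    # Assumption: import path for legacy Evidently (0.6.7 or lower).
    from evidently.descriptors import CustomColumnEval
except ImportError:
    # Evidently is not installed, or the installed release lacks this legacy API.
    CustomColumnEval = None


def is_empty_string_callable(values: pd.Series) -> pd.Series:
    """Label each string in the column as "EMPTY" or "NON EMPTY"."""
    labels = ["EMPTY" if value == "" else "NON EMPTY" for value in values]
    return pd.Series(labels, index=values.index)


empty_string = None
if CustomColumnEval is not None:
    empty_string = CustomColumnEval(
        func=is_empty_string_callable,  # the row-level evaluation
        display_name="Empty response",  # hypothetical name shown in Reports
        feature_type="cat",             # the function returns categorical labels
    )
```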

Apply the new descriptor. To create a Report with a new Descriptor, pass it as a column_name to the ColumnSummaryMetric. This will compute the new descriptor for all rows in the specified column and summarize its distribution:

Run the Report on your df dataframe as usual:
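An end-to-end sketch of the single-column flow, under stated assumptions: the import paths and the `.on("response")` call follow the legacy Evidently API (0.6.7 or lower) as far as it can be reconstructed here, and the `df` contents and the `response` column name are hypothetical. The Evidently part is skipped when the legacy API is not available.

```python
import pandas as pd

try:
    # Assumption: import paths for legacy Evidently (0.6.7 or lower).
    from evidently.descriptors import CustomColumnEval
    from evidently.metrics import ColumnSummaryMetric
    from evidently.report import Report
    HAS_LEGACY_EVIDENTLY = True
except ImportError:
    HAS_LEGACY_EVIDENTLY = False


def is_empty_string_callable(values: pd.Series) -> pd.Series:
    """Label each string in the column as "EMPTY" or "NON EMPTY"."""
    labels = ["EMPTY" if value == "" else "NON EMPTY" for value in values]
    return pd.Series(labels, index=values.index)


# Hypothetical dataset with a single text column.
df = pd.DataFrame({"response": ["Hi there", "", "See the docs", ""]})

report = None
if HAS_LEGACY_EVIDENTLY:
    empty_string = CustomColumnEval(
        func=is_empty_string_callable,
        display_name="Empty response",
        feature_type="cat",
    )
    # Bind the descriptor to the "response" column and summarize its values.
    report = Report(metrics=[
        ColumnSummaryMetric(column_name=empty_string.on("response")),
    ])
    report.run(reference_data=None, current_data=df)
```

In a notebook, the computed Report can then be displayed or exported in the same way as any other Report.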

Double column descriptor

You can create a custom descriptor that will take two columns from your dataset and run a certain evaluation for each row, for example, to implement pairwise evaluators.

Implement your evaluation as a Python function. Here, the exact_match_callable function takes two columns and checks whether each pair of values is the same, returning "MATCH" if they are equal and "MISMATCH" if they are not.
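A minimal sketch of such a function, assuming both inputs come from the same dataframe and therefore share an index (pandas refuses to compare Series with different labels). The sample values are illustrative only.

```python
import pandas as pd


def exact_match_callable(first: pd.Series, second: pd.Series) -> pd.Series:
    """Return "MATCH" where the paired values are equal, "MISMATCH" otherwise."""
    # Element-wise comparison; both Series must be identically labeled.
    is_equal = first == second
    labels = ["MATCH" if equal else "MISMATCH" for equal in is_equal]
    return pd.Series(labels, index=first.index)


# Illustrative data, not taken from a real dataset.
responses = pd.Series(["Paris", "4", "blue"])
answers = pd.Series(["Paris", "5", "blue"])
outcome = exact_match_callable(responses, answers)
```

Note that the comparison is strict: values that differ only in case or whitespace are reported as "MISMATCH".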

Create a custom descriptor. Create an instance of the CustomPairColumnEval class to wrap the evaluation logic into an object that you can later use to process two named columns in a dataset.

Where:

  • func: Callable[[pd.Series, pd.Series], pd.Series] is a function that returns a transformed pandas Series after evaluating two columns.

  • first_column: str is the name of the first column to be passed into the function.

  • second_column: str is the name of the second column to be passed into the function.

  • display_name: str is the new descriptor's name that will appear in Reports and Test Suites.

  • feature_type is the type of descriptor that the function returns (cat for categorical, num for numerical).
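The parameters above could be combined as in the following sketch. It assumes that `CustomPairColumnEval` is importable from `evidently.descriptors` in the legacy releases (0.6.7 or lower); the guarded import only keeps the snippet runnable where that API is unavailable. The column names `response` and `question` and the display name are hypothetical.

```python
import pandas as pd

try:
    # Assumption: import path for legacy Evidently (0.6.7 or lower).
    from evidently.descriptors import CustomPairColumnEval
except ImportError:
    # Evidently is not installed, or the installed release lacks this legacy API.
    CustomPairColumnEval = None


def exact_match_callable(first: pd.Series, second: pd.Series) -> pd.Series:
    """Return "MATCH" where the paired values are equal, "MISMATCH" otherwise."""
    labels = ["MATCH" if equal else "MISMATCH" for equal in first == second]
    return pd.Series(labels, index=first.index)


exact_match = None
if CustomPairColumnEval is not None:
    exact_match = CustomPairColumnEval(
        func=exact_match_callable,
        first_column="response",   # hypothetical column name
        second_column="question",  # hypothetical column name
        display_name="Exact match between response and question",
        feature_type="cat",        # the function returns categorical labels
    )
```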

Apply the new descriptor. To create a Report with a new Descriptor, pass it as a column_name to the ColumnSummaryMetric. This will compute the new descriptor for all rows in the dataset and summarize its distribution:

Run the Report on your df dataframe as usual:
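An end-to-end sketch of the two-column flow, under stated assumptions: the import paths follow the legacy Evidently API (0.6.7 or lower), and `as_column()` is assumed to be the way a pair descriptor is converted into a value for `column_name`. The `df` contents and column names are hypothetical. The Evidently part is skipped when the legacy API is not available.

```python
import pandas as pd

try:
    # Assumption: import paths for legacy Evidently (0.6.7 or lower).
    from evidently.descriptors import CustomPairColumnEval
    from evidently.metrics import ColumnSummaryMetric
    from evidently.report import Report
    HAS_LEGACY_EVIDENTLY = True
except ImportError:
    HAS_LEGACY_EVIDENTLY = False


def exact_match_callable(first: pd.Series, second: pd.Series) -> pd.Series:
    """Return "MATCH" where the paired values are equal, "MISMATCH" otherwise."""
    labels = ["MATCH" if equal else "MISMATCH" for equal in first == second]
    return pd.Series(labels, index=first.index)


# Hypothetical dataset with two text columns.
df = pd.DataFrame({
    "question": ["What is 2 + 2?", "Repeat: hello", "Capital of France?"],
    "response": ["4", "Repeat: hello", "Paris"],
})

report = None
if HAS_LEGACY_EVIDENTLY:
    exact_match = CustomPairColumnEval(
        func=exact_match_callable,
        first_column="response",
        second_column="question",
        display_name="Exact match between response and question",
        feature_type="cat",
    )
    # Assumption: as_column() exposes the pair descriptor as a column reference.
    report = Report(metrics=[
        ColumnSummaryMetric(column_name=exact_match.as_column()),
    ])
    report.run(reference_data=None, current_data=df)
```

Unlike the single-column case, the column names are fixed when the descriptor is created, so nothing else needs to be bound when it is passed to the Metric.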
