Feature importance in data drift
How to show feature importance in Data Drift evaluations.
You can add feature importances to the dataset-level data drift Tests and Metrics:
DataDriftTable
TestShareOfDriftedColumns
Code example
Notebook example on showing feature importance:
Compute feature importances
By default, the feature importance column is not shown. To display it, set the feature_importance parameter to True.
If you do not specify anything else, Evidently will train a random forest model using the provided dataset and derive the feature importances.
Notes:
This is only possible if your dataset contains the target column.
If you have both current and reference datasets, two different models will be trained. You will have two columns with feature importance: one for reference and one for current data.
If your dataset also contains the prediction column, you should clearly label it using Column Mapping to avoid it being treated as a feature.
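The default path described above could look roughly like the sketch below. It assumes an Evidently version that exposes `Report`, `DataDriftTable`, and `ColumnMapping` under the import paths shown. The function name `build_drift_report` and the default column name `target` are placeholders introduced for illustration.

```python
def build_drift_report(reference, current, target="target", prediction=None):
    """Run a data drift Report with feature importances computed by Evidently.

    `reference` and `current` are pandas DataFrames that both contain the
    target column. Column names here are placeholders for this sketch.
    """
    # Imported lazily; assumes an Evidently release with these module paths.
    from evidently import ColumnMapping
    from evidently.metrics import DataDriftTable
    from evidently.report import Report

    # Label the target (and the prediction column, if there is one) so that
    # neither is treated as an input feature.
    column_mapping = ColumnMapping(target=target, prediction=prediction)

    # feature_importance=True requests the feature importance column(s).
    report = Report(metrics=[DataDriftTable(feature_importance=True)])
    report.run(
        reference_data=reference,
        current_data=current,
        column_mapping=column_mapping,
    )
    return report
```

Calling `build_drift_report(reference_df, current_df, prediction="prediction")` would return a Report that can then be displayed or saved in the usual way.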
Pass your own importances
You can also pass the list of feature importances derived during the model training process. This is a recommended option.
In this case, pass it as a list using the additional_data parameter when running the Report.
You can pass the current_feature_importance: a single column will appear in this case. You can also optionally pass reference_feature_importance.
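As a hedged sketch of how the importances might be prepared, the helpers below build the `additional_data` payload from a fitted model. The text above describes a list; this sketch assumes instead that each entry maps a feature name to its importance, which should be checked against the installed Evidently version. The helper names, the feature names, and the stand-in model are hypothetical.

```python
from types import SimpleNamespace


def importances_from_model(model, feature_names):
    """Map feature names to the importances of a fitted model.

    Assumes the model exposes `feature_importances_` (as scikit-learn tree
    ensembles do), ordered the same way as `feature_names`.
    """
    values = list(model.feature_importances_)
    if len(values) != len(feature_names):
        raise ValueError("feature_names and importances differ in length")
    return {name: float(value) for name, value in zip(feature_names, values)}


def build_additional_data(current_importance, reference_importance=None):
    """Assemble the payload for the `additional_data` parameter."""
    payload = {"current_feature_importance": current_importance}
    if reference_importance is not None:
        payload["reference_feature_importance"] = reference_importance
    return payload


# Stand-in for a fitted model, used only to keep the sketch self-contained.
model = SimpleNamespace(feature_importances_=[0.7, 0.2, 0.1])
additional_data = build_additional_data(
    importances_from_model(model, ["age", "income", "tenure"])
)
```

The resulting dictionary would then be passed when running the Report, for example `report.run(reference_data=reference, current_data=current, additional_data=additional_data)`, presumably with `feature_importance=True` still set on the metric.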