Text Evals

You are looking at the old Evidently documentation: this API is available with versions 0.6.7 or lower. Check the newer version here.

TL;DR: You can explore and compare text datasets.

  • Report: for visual analysis or metrics export, use the TextEvals preset.

Text Evals Report

To visually explore the descriptive properties of text data, you can create a new Report object and generate the TextEvals preset for the column containing the text data. It's best to define your own set of descriptors by passing them as a list to the TextEvals preset. For more details, see how descriptors work.

If you don’t specify descriptors, the Preset will use default statistics.

Code example

from evidently.report import Report
from evidently.metric_preset import TextEvals

text_overview_report = Report(metrics=[
    TextEvals(column_name="Review_Text")
])

# ref and cur are pandas DataFrames that contain the "Review_Text" column
text_overview_report.run(reference_data=ref, current_data=cur)
text_overview_report  # displays the report in a Jupyter notebook cell

Note that to calculate some text-related metrics, you may also need to import additional libraries:

import nltk
nltk.download('words')
nltk.download('wordnet')
nltk.download('omw-1.4')

Data Requirements

  • You can pass one or two datasets. Evidently will compute descriptors both for the current production data and the reference data. If you pass a single dataset, there will be no comparison (see the snippet after this list).

  • To run this preset, you must have text columns in your dataset. Additional features and prediction/target are optional. Pass them if you want to analyze the correlations with text descriptors.
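
For instance, a minimal sketch of profiling a single dataset, reusing the text_overview_report object and the cur DataFrame from the code example above:

# no reference dataset: descriptors are computed for the current data only,
# so the report shows distributions without a side-by-side comparison
text_overview_report.run(reference_data=None, current_data=cur)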

How it looks

The report includes 5 components. All plots are interactive.

Text Descriptors Distribution

The report generates several features that describe different text properties and shows the distributions of these text descriptors:

  • Text length. Shows the distribution of text length.
  • Non-letter characters. Shows the share of non-letter characters.
  • Out-of-vocabulary words. Shows the share of out-of-vocabulary words.
  • Sentiment. Shows the distribution of text sentiment (from -1 negative to 1 positive).
  • Sentence Count. Shows the sentence count.

Metrics output

You can also get the report output as JSON or a Python dictionary.
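
For example, a minimal sketch that exports the output of the text_overview_report object from the code example above:

# export the computed metrics instead of rendering the visual report
report_json = text_overview_report.json()     # JSON string
report_dict = text_overview_report.as_dict()  # Python dictionary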

Report customization

  • You can create a different report or test suite from scratch, taking this one as an inspiration.

Column mapping. Specify the columns that contain text features in column mapping.
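
For example, a minimal sketch of a column mapping that marks the text column used in the code example above (adjust the column name to match your data):

from evidently import ColumnMapping

column_mapping = ColumnMapping(
    text_features=["Review_Text"]  # columns to treat as raw text
)

text_overview_report.run(reference_data=ref, current_data=cur, column_mapping=column_mapping)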

Aggregated visuals in plots. Starting from v0.3.2, all visuals in Evidently Reports are aggregated by default. This helps decrease the load time and report size for larger datasets. If you work with smaller datasets or samples, you can pass an option to generate plots with raw data. You can choose whether to enable it based on the size of your dataset.
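
For instance, a sketch of passing the raw data render option when creating the Report, following the approach described in the "Show raw data in Reports" guide (use it only for smaller datasets):

# plot individual data points instead of aggregated visuals
text_overview_report = Report(
    metrics=[TextEvals(column_name="Review_Text")],
    options={"render": {"raw_data": True}},
)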

You can choose your own descriptors.
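
For example, a sketch that passes an explicit list of built-in descriptors to the TextEvals preset (this particular selection is only an illustration):

from evidently.descriptors import OOV, NonLetterCharacterPercentage, SentenceCount, Sentiment, TextLength

text_overview_report = Report(metrics=[
    TextEvals(column_name="Review_Text", descriptors=[
        TextLength(),
        OOV(),
        NonLetterCharacterPercentage(),
        Sentiment(),
        SentenceCount(),
    ])
])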

You can use a different color schema for the report.
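
For example, a sketch of a custom color scheme built with the ColorOptions object (the specific colors are only an illustration; see the "Change color schema" guide for all available fields):

from evidently.options import ColorOptions

color_scheme = ColorOptions(
    primary_color="#5a86ad",
    fill_color="#fff4f2",
    zero_line_color="#016795",
    current_data_color="#c292a1",
    reference_data_color="#017b92",
)

text_overview_report = Report(
    metrics=[TextEvals(column_name="Review_Text")],
    options=[color_scheme],
)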

Examples

Head to an example how-to notebook to see the Text Overview preset and other metrics and tests for text data in action.