Configure a MetricStore
Metric storage is an experimental feature.
A MetricStore is a Store that stores Metrics computed during Validation. A MetricStore tracks the run_id of the Validation and the Expectation Suite name in addition to the Metric name and Metric kwargs.
Saving Metrics during Validation lets you construct a new data series based on observed dataset characteristics computed by Great Expectations. A data series can serve as the source for a dashboard or for overall data quality metrics.
Prerequisites
- A Great Expectations instance
- Completion of the Quickstart
- A configured Data Context
Add a MetricStore
To define a MetricStore, add a Metric Store configuration to the stores section of your great_expectations.yml. The configuration must include the following keys:
- class_name - Enter MetricStore. This key determines which class is instantiated to create the StoreBackend. Other fields are passed through to the StoreBackend class on instantiation. The only backend Store under test for use with a MetricStore is the DatabaseStoreBackend with Postgres.
- store_backend - Defines how your metrics are persisted.
To use an SQL Database such as Postgres, add the following fields and values:
- class_name - Enter DatabaseStoreBackend.
- credentials - Point to the credentials defined in your config_variables.yml, or define them inline.
The following is an example of how the MetricStore configuration appears in great_expectations.yml:
stores:
  # ...
  metric_store:  # You can choose any name as the key for your metric store
    class_name: MetricStore
    store_backend:
      class_name: DatabaseStoreBackend
      credentials: ${my_store_credentials}
      # alternatively, define credentials inline:
      # credentials:
      #   username: my_username
      #   password: my_password
      #   port: 1234
      #   host: xxxx
      #   database: my_database
      #   driver: postgresql
The next time your Data Context is loaded, it will connect to the database and initialize a table to store metrics if one has not already been created.
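As a quick sanity check before loading the Data Context, the required keys listed above can be verified against the configuration once it has been parsed into a dictionary. The following is a minimal sketch only: missing_metric_store_keys is a hypothetical helper and not part of Great Expectations, and the dictionary mirrors the example configuration by hand rather than reading great_expectations.yml.

```python
from typing import Any, Dict, List

# Hypothetical helper for illustration; not part of Great Expectations.
REQUIRED_STORE_KEYS = ("class_name", "store_backend")
REQUIRED_BACKEND_KEYS = ("class_name", "credentials")


def missing_metric_store_keys(store_config: Dict[str, Any]) -> List[str]:
    """Return the dotted names of required keys absent from a MetricStore config."""
    missing = [key for key in REQUIRED_STORE_KEYS if key not in store_config]
    backend = store_config.get("store_backend")
    if isinstance(backend, dict):
        missing += [
            f"store_backend.{key}"
            for key in REQUIRED_BACKEND_KEYS
            if key not in backend
        ]
    return missing


# Mirrors the metric_store entry from the example above.
metric_store = {
    "class_name": "MetricStore",
    "store_backend": {
        "class_name": "DatabaseStoreBackend",
        "credentials": "${my_store_credentials}",
    },
}

print(missing_metric_store_keys(metric_store))  # []
```

An empty list means every key named in this section is present; it does not confirm that the credentials are valid or that the database is reachable.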
Configure a Validation Action
When a MetricStore is available, add a StoreMetricsAction validation Action to your Checkpoint to save Metrics during Validation. The validation Action must include the following fields:
- class_name - Enter StoreMetricsAction. Determines which class is instantiated to execute the Action.
- target_store_name - Enter the key for the MetricStore you added in your great_expectations.yml. In the previous example, the key is metric_store. It determines which Store backend is used when persisting the metrics.
- requested_metrics - Identify the Expectation Suites and Metrics you want to store.
Add the following entry to great_expectations.yml to generate Validation Result statistics:
expectation_suite_name:
  - statistics.<statistic name>
Add the following entry to great_expectations.yml to generate values from a specific Expectation result field:
expectation_suite_name:
  - column:
      <column name>:
        - <expectation name>.result.<value name>
To indicate that any Expectation Suite can be used to generate values, use the wildcard "*".
If you use an Expectation Suite name as a key, Metrics are only added to the MetricStore when the Expectation Suite runs. When you use the wildcard "*", Metrics are added to the MetricStore for each Expectation Suite that runs in the Checkpoint.
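The matching rule described above can be sketched in a few lines of Python. This is an illustration of the rule only, not Great Expectations' own implementation: metrics_for_suite is a hypothetical function, and requested_metrics is assumed to have already been parsed into a dictionary keyed by Expectation Suite name.

```python
from typing import Any, Dict, List


def metrics_for_suite(
    requested_metrics: Dict[str, List[Any]], suite_name: str
) -> List[Any]:
    """Collect the entries listed under the suite's own key and under "*"."""
    # Entries keyed by an exact suite name apply only when that suite runs.
    selected = list(requested_metrics.get(suite_name, []))
    # Entries under the wildcard apply to every suite that runs.
    selected += requested_metrics.get("*", [])
    return selected


requested = {
    "public.taxi_data.warning": ["statistics.successful_expectations"],
    "*": ["statistics.success_percent"],
}

print(metrics_for_suite(requested, "public.taxi_data.warning"))
# ['statistics.successful_expectations', 'statistics.success_percent']
print(metrics_for_suite(requested, "some_other_suite"))
# ['statistics.success_percent']
```

A suite that has its own key receives both its specific entries and the wildcard entries, while any other suite receives only the wildcard entries.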
The following example YAML configuration adds a StoreMetricsAction for the taxi_data dataset:
action_list:
  # ...
  - name: store_metrics
    action:
      class_name: StoreMetricsAction
      target_store_name: metric_store  # This should match the name of the store configured above
      requested_metrics:
        public.taxi_data.warning:  # match a particular expectation suite
          - column:
              passenger_count:
                - expect_column_values_to_not_be_null.result.element_count
                - expect_column_values_to_not_be_null.result.partial_unexpected_list
          - statistics.successful_expectations
        "*":  # wildcard to match any expectation suite
          - statistics.evaluated_expectations
          - statistics.success_percent
          - statistics.unsuccessful_expectations
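To see how the nesting under a suite key pairs each Metric name with its Metric kwargs, the list can be flattened into (name, kwargs) tuples. The sketch below is one way to read the structure, not the code Great Expectations runs internally: flatten_metrics is a hypothetical function, and suite_metrics reproduces the public.taxi_data.warning entry from the example by hand.

```python
from typing import Any, Dict, Iterator, List, Optional, Tuple


def flatten_metrics(
    specs: List[Any], kwargs: Optional[Dict[str, str]] = None
) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (metric_name, metric_kwargs) pairs from a nested metrics list."""
    kwargs = kwargs or {}
    for spec in specs:
        if isinstance(spec, str):
            # A plain string is a metric name; it inherits the enclosing kwargs.
            yield spec, dict(kwargs)
        else:
            # A mapping such as {"column": {"passenger_count": [...]}} adds one kwarg.
            for kwarg_name, by_value in spec.items():
                for kwarg_value, nested in by_value.items():
                    yield from flatten_metrics(
                        nested, {**kwargs, kwarg_name: kwarg_value}
                    )


suite_metrics = [
    {
        "column": {
            "passenger_count": [
                "expect_column_values_to_not_be_null.result.element_count",
                "expect_column_values_to_not_be_null.result.partial_unexpected_list",
            ]
        }
    },
    "statistics.successful_expectations",
]

for name, metric_kwargs in flatten_metrics(suite_metrics):
    print(name, metric_kwargs)
```

The two Expectation result values carry the kwargs {"column": "passenger_count"}, while the statistics entry carries no kwargs because it sits at the top level of the list.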
Test your MetricStore and StoreMetricsAction
Run the following Python code to run your Checkpoint and test StoreMetricsAction:
import great_expectations as gx
context = gx.get_context()
checkpoint_name = "your checkpoint name here"
context.run_checkpoint(checkpoint_name=checkpoint_name)