Deploy Great Expectations in hosted environments without a file system
The components in the great_expectations.yml file define the Validation Results Stores, Datasource connections, and Data Docs hosts for a Data Context. In hosted environments such as Databricks, Amazon EMR, and Google Cloud Composer, there might be no file system from which to read this file, so these components might be inaccessible. This page points to the resources you need to use Great Expectations in those environments.
Configure your Data Context
To use code to create a Data Context, see How to instantiate an Ephemeral Data Context.
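As a rough illustration of what creating a Data Context in code can look like, the sketch below builds one whose stores live entirely in memory. It assumes a Great Expectations release in which EphemeralDataContext, DataContextConfig, and InMemoryStoreBackendDefaults are importable from the paths shown (this has varied between versions, so check the linked guide for your release). The function name make_ephemeral_context is hypothetical.

```python
def make_ephemeral_context():
    """Build a Data Context that keeps all of its stores in memory.

    Assumption: these import paths match the installed Great Expectations
    release. The imports sit inside the function so that this module can
    still be loaded on a cluster where the library is installed later.
    """
    from great_expectations.data_context import EphemeralDataContext
    from great_expectations.data_context.types.base import (
        DataContextConfig,
        InMemoryStoreBackendDefaults,
    )

    # In-memory defaults replace the stores that great_expectations.yml
    # would otherwise point at on disk.
    project_config = DataContextConfig(
        store_backend_defaults=InMemoryStoreBackendDefaults()
    )
    return EphemeralDataContext(project_config=project_config)
```

Anything stored in a context built this way is lost when the process ends, so it suits exploratory runs better than long-lived deployments.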
To configure a Data Context for a specific environment, see one of the following resources:
- How to instantiate a Data Context on an EMR Spark cluster
- How to use Great Expectations in Databricks
Create Expectation Suites and add Expectations
To add a Datasource and an Expectation Suite, see How to connect to a PostgreSQL database.
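To add Expectations to your Suite individually, call the expectation methods on a Validator and then save the Suite. The following code assumes that validator is a Validator for your data:
# Add a single Expectation to the Suite held by the Validator.
validator.expect_column_values_to_not_be_null("my_column")
# Save the Suite, keeping Expectations that failed against the current data.
validator.save_expectation_suite(discard_failed_expectations=False)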
To configure your Expectation store to load a Suite at a later time, see one of the following resources:
- How to configure an Expectation store to use Amazon S3
- How to configure an Expectation store to use Azure Blob Storage
- How to configure an Expectation store to use GCS
- How to configure an Expectation store to use a filesystem
- How to configure an Expectation store to use PostgreSQL
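Without a great_expectations.yml file, the store settings that these guides express in YAML can be held in a Python dictionary instead. The following is a minimal sketch for the Amazon S3 case, assuming the ExpectationsStore and TupleS3StoreBackend class names used in the S3 guide. The function name, store name, bucket, and prefix are placeholders, not values from this page.

```python
def s3_expectations_store_config(bucket: str, prefix: str) -> dict:
    """Return Data Context settings for an S3-backed Expectations Store.

    The keys mirror the `stores` and `expectations_store_name` entries
    that would otherwise appear in great_expectations.yml.
    """
    store_name = "expectations_S3_store"  # arbitrary label for the store
    return {
        "expectations_store_name": store_name,
        "stores": {
            store_name: {
                "class_name": "ExpectationsStore",
                "store_backend": {
                    "class_name": "TupleS3StoreBackend",
                    "bucket": bucket,   # placeholder bucket name
                    "prefix": prefix,   # placeholder folder within the bucket
                },
            }
        },
    }


# Example with made-up values.
config = s3_expectations_store_config("my-gx-bucket", "expectations")
```

How these settings are passed to the Data Context depends on your Great Expectations version and on how the context is created, so treat the dictionary as a starting point rather than a complete configuration.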
Run validation
To use an Expectation Suite you've created to validate data, see How to validate data without a Checkpoint.
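To give a sense of what validating without a Checkpoint involves, the sketch below calls validate() on a Validator and condenses the outcome. It assumes that the returned object exposes a success flag and a results list whose items each carry their own success flag. The helper name summarize_validation is hypothetical.

```python
def summarize_validation(validator) -> dict:
    """Run the Validator's Suite against its data and summarize the outcome.

    Assumption: validator.validate() returns an object with a `success`
    attribute and a `results` list of per-Expectation results.
    """
    result = validator.validate()
    # Count the Expectations that did not pass.
    failed = sum(1 for r in result.results if not r.success)
    return {
        "success": bool(result.success),
        "evaluated": len(result.results),
        "failed": failed,
    }
```

In a scheduled job on a hosted platform, the success value could decide whether downstream tasks run, for example by raising an exception when it is False.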
Use Data Docs
To build and view Data Docs in your environment, see Options for hosting Data Docs.