How to configure credentials
This guide will explain how to configure your great_expectations.yml
project config to populate credentials from either a YAML file or a secret manager.
If your Great Expectations deployment is in an environment without a file system, refer to How to instantiate a Data Context without a yml file for credential configuration examples.
- YAML
- Secret Manager
Prerequisites: This how-to guide assumes you have:
- Completed the Getting Started Tutorial
- A working installation of Great Expectations
Steps
1. Save credentials and config
Decide where you would like to save the desired credentials or config values - in a YAML file, environment variables, or a combination - then save the values.
In most cases, we suggest using a config variables YAML file. YAML files make variables more visible, easily editable, and allow for modularization (e.g. one file for dev, another for prod).
- In the great_expectations.yml config file, environment variables take precedence over variables defined in a config variables YAML file.
- Environment variable substitution is supported in both the great_expectations.yml and config_variables.yml config files.
If using a YAML file, save desired credentials or config values to great_expectations/uncommitted/config_variables.yml
or another YAML file of your choosing:
my_postgres_db_yaml_creds:
  drivername: postgresql
  host: localhost
  port: 5432
  username: postgres
  password: ${MY_DB_PW}
  database: postgres
- If you wish to store values that include the dollar sign character $, please escape them using a backslash \ so substitution is not attempted. For example, in the Postgres credentials above you could set password: pa\$sword if your password is pa$sword. Say that 5 times fast, and also please choose a more secure password!
- When you save values via the CLI, they are automatically escaped if they contain the $ character.
- You can also have multiple substitutions for the same item, e.g. database_string: ${USER}:${PASSWORD}@${HOST}:${PORT}/${DATABASE}
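The substitution rules in the notes above can be illustrated with a small Python sketch. This is not the Great Expectations implementation; the `substitute` function and its regular expression are hypothetical names written for this example. It only shows the behaviour described here: `${NAME}` placeholders are replaced, several placeholders may appear in one value, and `\$` yields a literal dollar sign.

```python
import re

# Matches either an escaped dollar sign (\$) or a ${NAME} placeholder.
_TOKEN = re.compile(r"\\\$|\$\{(\w+)\}")


def substitute(template: str, variables: dict) -> str:
    """Replace ${NAME} placeholders and turn an escaped \\$ into a literal $."""

    def _replace(match: re.Match) -> str:
        if match.group(0) == "\\$":
            return "$"  # escaped dollar sign: no substitution is attempted
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"No value provided for ${{{name}}}")
        return str(variables[name])

    return _TOKEN.sub(_replace, template)


# Example values are made up for illustration.
values = {
    "USER": "postgres",
    "PASSWORD": "secret",
    "HOST": "localhost",
    "PORT": 5432,
    "DATABASE": "postgres",
}
database_string = substitute(
    "${USER}:${PASSWORD}@${HOST}:${PORT}/${DATABASE}", values
)
escaped_password = substitute("pa\\$sword", values)
```

Here `database_string` combines five substitutions in one value, and `escaped_password` comes back with a plain `$` because the backslash prevented a lookup.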
If using environment variables, set values by entering export ENV_VAR_NAME=env_var_value
in the terminal or adding the commands to your ~/.bashrc
file:
export POSTGRES_DRIVERNAME=postgresql
export POSTGRES_HOST=localhost
export POSTGRES_PORT=5432
export POSTGRES_USERNAME=postgres
export POSTGRES_PW=
export POSTGRES_DB=postgres
export MY_DB_PW=password
2. Set config_variables_file_path
If using a YAML file, set the config_variables_file_path
key in your great_expectations.yml
or leave the default.
config_variables_file_path: uncommitted/config_variables.yml
3. Replace credentials with placeholders
Replace credentials or other values in your great_expectations.yml with ${}-wrapped variable names (i.e. ${ENVIRONMENT_VARIABLE} or ${YAML_KEY}).
datasources:
  my_postgres_db:
    class_name: Datasource
    module_name: great_expectations.datasource
    execution_engine:
      module_name: great_expectations.execution_engine
      class_name: SqlAlchemyExecutionEngine
      credentials: ${my_postgres_db_yaml_creds}
    data_connectors:
      default_inferred_data_connector_name:
        class_name: InferredAssetSqlDataConnector
  my_other_postgres_db:
    class_name: Datasource
    module_name: great_expectations.datasource
    execution_engine:
      module_name: great_expectations.execution_engine
      class_name: SqlAlchemyExecutionEngine
      credentials:
        drivername: ${POSTGRES_DRIVERNAME}
        host: ${POSTGRES_HOST}
        port: ${POSTGRES_PORT}
        username: ${POSTGRES_USERNAME}
        password: ${POSTGRES_PW}
        database: ${POSTGRES_DB}
    data_connectors:
      default_inferred_data_connector_name:
        class_name: InferredAssetSqlDataConnector
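To make the two placeholder styles in this step concrete, the following Python sketch resolves a nested config against a config variables mapping and an environment mapping. It is an illustration under stated assumptions, not the library's code: `resolve` is a hypothetical helper, plain dictionaries stand in for the parsed YAML files and for `os.environ`, and escaped dollar signs are not handled. It shows that environment variables are consulted first, and that a value consisting of exactly one placeholder can expand to a whole mapping.

```python
import re

_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def resolve(value, config_variables: dict, environ: dict):
    """Resolve ${NAME} placeholders in a nested config structure."""
    if isinstance(value, dict):
        return {k: resolve(v, config_variables, environ) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, config_variables, environ) for v in value]
    if not isinstance(value, str):
        return value  # numbers, booleans, None are left untouched

    def lookup(name):
        # Environment variables take precedence over the config variables file.
        if name in environ:
            return environ[name]
        return config_variables[name]

    whole = _PLACEHOLDER.fullmatch(value)
    if whole:
        # A value that is exactly one placeholder may expand to a mapping,
        # as with `credentials: ${my_postgres_db_yaml_creds}`.
        return resolve(lookup(whole.group(1)), config_variables, environ)
    return _PLACEHOLDER.sub(lambda m: str(lookup(m.group(1))), value)


# Stand-ins for config_variables.yml and the process environment.
config_variables = {
    "my_postgres_db_yaml_creds": {
        "drivername": "postgresql",
        "host": "localhost",
        "port": 5432,
        "username": "postgres",
        "password": "${MY_DB_PW}",
        "database": "postgres",
    }
}
environ = {"MY_DB_PW": "password", "POSTGRES_HOST": "localhost"}

resolved = resolve(
    {
        "first": {"credentials": "${my_postgres_db_yaml_creds}"},
        "second": {"credentials": {"host": "${POSTGRES_HOST}"}},
    },
    config_variables,
    environ,
)
```

In a real project the two mappings would come from the parsed config variables file and from `os.environ`.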
Additional Notes
- The default config_variables.yml file located at great_expectations/uncommitted/config_variables.yml applies to deployments created using great_expectations init.
- To view the full script used in this page, see it on GitHub: how_to_configure_credentials.py
Choose which secret manager you are using:
- AWS Secrets Manager
- GCP Secret Manager
- Azure Key Vault
This guide will explain how to configure your great_expectations.yml
project config to substitute variables from AWS Secrets Manager.
Prerequisites: This how-to guide assumes you have:
- Completed the Getting Started Tutorial
- A working installation of Great Expectations
- Configured a secret manager and secrets in the cloud with AWS Secrets Manager
Secrets store substitution is applied to the configurations in your great_expectations.yml project config after all other types of substitution (from environment variables or from the config_variables.yml config file) have been applied.
Secrets store substitution works based on keywords. It tries to retrieve secrets from the secrets store for the following values:
- AWS: values starting with secret|arn:aws:secretsmanager
If the values you provide don't match the keywords above, they won't be substituted.
Setup
To use AWS Secrets Manager, you may need to install the great_expectations
package with its aws_secrets
extra requirement:
pip install great_expectations[aws_secrets]
To substitute a value with a secret from AWS Secrets Manager, provide the ARN of the secret, like this one:
secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret-1zAyu6
The last 7 characters of the ARN are automatically generated by AWS and are not required to retrieve the secret, so secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret
will retrieve the same secret.
You will get the latest version of the secret by default.
You can get a specific version of the secret you want to retrieve by specifying its version UUID like this: secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret:00000000-0000-0000-0000-000000000000
If your secret value is a JSON string, you can retrieve a specific value like this:
secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret|key
Or like this:
secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret:00000000-0000-0000-0000-000000000000|key
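The reference formats above can be broken into their parts with a short Python sketch. The pattern, the `AwsSecretRef` type, and the helper names are assumptions made for this illustration and are not taken from the Great Expectations source; the actual matching rules in the library may differ. The sketch separates the secret identifier, the optional version UUID, and the optional JSON key, then picks the key out of a JSON secret string.

```python
import json
import re
from typing import NamedTuple, Optional

# Hypothetical pattern written for this sketch, not copied from the library.
_AWS_REF = re.compile(
    r"^secret\|(?P<secret_id>arn:aws:secretsmanager:[a-z0-9\-]+:[0-9]{12}:secret:[^:|]+)"
    r"(?::(?P<version>[0-9a-f\-]{36}))?"  # optional version UUID
    r"(?:\|(?P<key>[^|]+))?$"  # optional key inside a JSON secret
)


class AwsSecretRef(NamedTuple):
    secret_id: str
    version: Optional[str]
    key: Optional[str]


def parse_aws_ref(value: str) -> Optional[AwsSecretRef]:
    """Return the parts of a secret reference, or None if it does not match."""
    match = _AWS_REF.match(value)
    if match is None:
        return None  # not a secret reference: the value is left as-is
    return AwsSecretRef(match["secret_id"], match["version"], match["key"])


def pick_key(secret_string: str, key: Optional[str]):
    """Return the whole secret, or one value if the secret is a JSON object."""
    if key is None:
        return secret_string
    return json.loads(secret_string)[key]


# With boto3, the secret string would typically be fetched through
# client.get_secret_value(SecretId=ref.secret_id) and read from the
# "SecretString" field of the response; here a literal stands in for it.
ref = parse_aws_ref(
    "secret|arn:aws:secretsmanager:region-name-1:123456789012:secret:my_secret|host"
)
host = pick_key('{"host": "localhost", "port": 5432}', ref.key)
```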
Example great_expectations.yml:
datasources:
  dev_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|drivername
      host: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|host
      port: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|port
      username: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|username
      password: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|password
      database: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:dev_db_credentials|database
  prod_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_DRIVERNAME
      host: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_HOST
      port: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_PORT
      username: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_USERNAME
      password: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_PASSWORD
      database: secret|arn:aws:secretsmanager:${AWS_REGION}:${ACCOUNT_ID}:secret:PROD_DB_CREDENTIALS_DATABASE
This guide will explain how to configure your great_expectations.yml
project config to substitute variables from GCP Secret Manager.
Prerequisites: This how-to guide assumes you have:
- Completed the Getting Started Tutorial
- A working installation of Great Expectations
- Configured a secret manager and secrets in the cloud with GCP Secret Manager
Secrets store substitution is applied to the configurations in your great_expectations.yml project config after all other types of substitution (from environment variables or from the config_variables.yml config file) have been applied.
Secrets store substitution works based on keywords. It tries to retrieve secrets from the secrets store for the following values:
- GCP: values matching the following regex
^secret\|projects\/[a-z0-9\_\-]{6,30}\/secrets
If the values you provide don't match the keywords above, they won't be substituted.
Setup
To use GCP Secret Manager, you may need to install the great_expectations
package with its gcp
extra requirement:
pip install great_expectations[gcp]
To substitute a value with a secret from GCP Secret Manager, provide the name of the secret, like this one:
secret|projects/project_id/secrets/my_secret
You will get the latest version of the secret by default.
You can get a specific version of the secret you want to retrieve by specifying its version id like this: secret|projects/project_id/secrets/my_secret/versions/1
If your secret value is a JSON string, you can retrieve a specific value like this:
secret|projects/project_id/secrets/my_secret|key
Or like this:
secret|projects/project_id/secrets/my_secret/versions/1|key
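As a rough illustration of the "latest by default" behaviour, the Python sketch below turns a reference into a fully qualified secret version name plus an optional JSON key. Only the project part of the pattern comes from the regex quoted in this guide; the rest of the pattern and the `gcp_version_name` helper are assumptions made for this example, so the library's real rules may be stricter or looser.

```python
import re
from typing import Optional, Tuple

# The project part mirrors the regex quoted in this guide; the secret name,
# version, and key parts are assumptions made for this sketch.
_GCP_REF = re.compile(
    r"^secret\|(?P<name>projects/[a-z0-9_\-]{6,30}/secrets/[a-zA-Z0-9_\-]+)"
    r"(?:/versions/(?P<version>[a-z0-9]+))?"  # optional version id
    r"(?:\|(?P<key>[^|]+))?$"  # optional key inside a JSON secret
)


def gcp_version_name(value: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (secret version resource name, JSON key) or None if no match."""
    match = _GCP_REF.match(value)
    if match is None:
        return None  # not a secret reference: the value is left as-is
    # Without an explicit version, fall back to the "latest" alias.
    version = match["version"] or "latest"
    return f"{match['name']}/versions/{version}", match["key"]


default_version = gcp_version_name("secret|projects/project_id/secrets/my_secret")
pinned_with_key = gcp_version_name(
    "secret|projects/project_id/secrets/my_secret/versions/1|key"
)
```

The resulting resource name has the `projects/*/secrets/*/versions/*` shape that the Secret Manager API expects when a secret version is accessed.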
Example great_expectations.yml:
datasources:
  dev_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|drivername
      host: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|host
      port: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|port
      username: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|username
      password: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|password
      database: secret|projects/${PROJECT_ID}/secrets/dev_db_credentials|database
  prod_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DRIVERNAME
      host: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_HOST
      port: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PORT
      username: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_USERNAME
      password: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_PASSWORD
      database: secret|projects/${PROJECT_ID}/secrets/PROD_DB_CREDENTIALS_DATABASE
This guide will explain how to configure your great_expectations.yml
project config to substitute variables from Azure Key Vault.
Prerequisites: This how-to guide assumes you have:
- Completed the Getting Started Tutorial
- A working installation of Great Expectations
- Set up a working deployment of Great Expectations
- Configured a secret manager and secrets in the cloud with Azure Key Vault
Secrets store substitution is applied to the configurations in your great_expectations.yml project config after all other types of substitution (from environment variables or from the config_variables.yml config file) have been applied.
Secrets store substitution works based on keywords. It tries to retrieve secrets from the secrets store for the following values:
- Azure: values matching the following regex
^secret\|https:\/\/[a-zA-Z0-9\-]{3,24}\.vault\.azure\.net
If the values you provide don't match the keywords above, they won't be substituted.
Setup
To use Azure Key Vault, you may need to install the great_expectations
package with its azure_secrets
extra requirement:
pip install great_expectations[azure_secrets]
To substitute a value with a secret from Azure Key Vault, provide the name of the secret, like this one:
secret|https://my-vault-name.vault.azure.net/secrets/my-secret
You will get the latest version of the secret by default.
You can get a specific version of the secret you want to retrieve by specifying its version id (32 lowercase alphanumeric characters) like this: secret|https://my-vault-name.vault.azure.net/secrets/my-secret/a0b00aba001aaab10b111001100a11ab
If your secret value is a JSON string, you can retrieve a specific value like this:
secret|https://my-vault-name.vault.azure.net/secrets/my-secret|key
Or like this:
secret|https://my-vault-name.vault.azure.net/secrets/my-secret/a0b00aba001aaab10b111001100a11ab|key
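The Azure reference formats can be split into vault URL, secret name, optional version, and optional JSON key, as in the Python sketch below. The vault part of the pattern follows the regex quoted in this guide; everything after it, along with the `AzureSecretRef` type and `parse_azure_ref` helper, is an assumption made for this illustration rather than the library's own code.

```python
import re
from typing import NamedTuple, Optional

# The vault part mirrors the regex quoted in this guide; the remaining groups
# are assumptions made for this sketch and are deliberately permissive.
_AZURE_REF = re.compile(
    r"^secret\|(?P<vault_url>https://[a-zA-Z0-9\-]{3,24}\.vault\.azure\.net)"
    r"/secrets/(?P<name>[^/|]+)"
    r"(?:/(?P<version>[a-z0-9]{32}))?"  # optional 32-character version id
    r"(?:\|(?P<key>[^|]+))?$"  # optional key inside a JSON secret
)


class AzureSecretRef(NamedTuple):
    vault_url: str
    name: str
    version: Optional[str]
    key: Optional[str]


def parse_azure_ref(value: str) -> Optional[AzureSecretRef]:
    """Return the parts of a Key Vault reference, or None if it does not match."""
    match = _AZURE_REF.match(value)
    if match is None:
        return None  # not a secret reference: the value is left as-is
    return AzureSecretRef(
        match["vault_url"], match["name"], match["version"], match["key"]
    )


# With the azure-keyvault-secrets package, these parts map onto
# SecretClient(vault_url=..., credential=...).get_secret(name, version).
latest = parse_azure_ref("secret|https://my-vault-name.vault.azure.net/secrets/my-secret")
pinned = parse_azure_ref(
    "secret|https://my-vault-name.vault.azure.net/secrets/my-secret/"
    "a0b00aba001aaab10b111001100a11ab|key"
)
```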
Example great_expectations.yml:
datasources:
  dev_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|drivername
      host: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|host
      port: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|port
      username: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|username
      password: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|password
      database: secret|https://${VAULT_NAME}.vault.azure.net/secrets/dev_db_credentials|database
  prod_postgres_db:
    class_name: SqlAlchemyDatasource
    data_asset_type:
      class_name: SqlAlchemyDataset
      module_name: great_expectations.dataset
    module_name: great_expectations.datasource
    credentials:
      drivername: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_DRIVERNAME
      host: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_HOST
      port: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_PORT
      username: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_USERNAME
      password: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_PASSWORD
      database: secret|https://${VAULT_NAME}.vault.azure.net/secrets/PROD_DB_CREDENTIALS_DATABASE