Before we begin using EQL and the Proxy, we'll need to do some setup to get the necessary keys and configuration.
-
Create an account.
-
Install the CLI:
brew install cipherstash/tap/stash
- Login:
stash login
stash datasets create eql-test
# grab dataset ID and export CS_DATASET_ID=
stash clients create eql-test --dataset-id $CS_DATASET_ID
- Create an access key for CipherStash Proxy:
stash workspaces
# grab the workspace ID and export CS_WORKSPACE_ID=
stash access-keys create --workspace-id $CS_WORKSPACE_ID eql-test
- Go to the [EQL playground](../../playground) and copy over the example
.envrc
file:
cd playground
cp .envrc.example .envrc
Update the .envrc
file with these environment variables CS_WORKSPACE_ID
, CS_CLIENT_ACCESS_KEY
, CS_ENCRYPTION__CLIENT_ID
, CS_ENCRYPTION__CLIENT_KEY
and CS_DATASET_ID
:
source .envrc
- Start PostgreSQL and CipherStash Proxy and install EQL:
docker compose up
This will:
- spin up a docker container for the CipherStash Proxy and Postgres
- install EQL
- Check Postgres and the Proxy are running:
docker ps
You should see 2 containers running, postgres_proxy
and eql-playground-pg
.
These examples will show how EQL works using raw SQL.
Prerequisites:
- PostgreSQL and CipherStash Proxy are running in docker containers.
Let's step through an example of how we go from a plaintext text field to an encrypted text field.
This guide will include:
- How to setup your database
- How to add indexes
- How to encrypt existing plaintext data
- How to insert data
- How to query data
Connect to your postgres docker container:
docker exec -it eql-playground-pg bash
Start psql
:
PGPASSWORD=postgres PGUSER=postgres psql
We will use a users
table with an email field for this example.
In psql, run:
CREATE TABLE IF NOT EXISTS users (
id serial PRIMARY KEY NOT NULL,
email VARCHAR(100)
);
Our users
schema looks like this:
Column | Type | Nullable |
---|---|---|
email |
character varying(100) |
Seed plaintext data into the users table:
INSERT INTO users (email) VALUES
('adalovelace@example.com'),
('gracehopper@test.com'),
('edithclarke@email.com');
In the previous step we:
- setup a basic users table with a plaintext email (text) field.
- seeded the db with plaintext emails.
In this part we will add a new column to store our encrypted email data.
When we add the column we use a Type
of cs_encrypted_v1
.
This type will enforce constraints on the field to ensure that:
- the payload is in the format EQL and CipherStash Proxy expects.
- the payload has been encrypted before inserting.
If there are issues with the payload being inserted into a field with a type of cs_encrypted_v1
, an error will be returned describing what the issue with the payload is.
To add a new column called email_encrypted
with a type of cs_encrypted_v1
:
ALTER TABLE users ADD email_encrypted cs_encrypted_v1;
Our users
schema now looks like this:
Column | Type | Nullable |
---|---|---|
email |
character varying(100) |
|
email_encrypted |
cs_encrypted_v1 |
We now have our database schema setup to store encrypted data.
In this part we will learn about why we need to add indexes and how to add them.
When you install EQL, a table called cs_configuration_v1
is created in your database.
Adding indexes updates this table with the details and configuration needed for CipherStash Proxy to know how to encrypt your data, and what types of queries are able to be performed
We will also need to add the relevant native database indexes to be able to perform these queries.
In this example, we want to be able to execute these types of queries on our email_encrypted
field:
- free text search
- equality
- order by
- comparison
This means that we need to add the below indexes for our new email_encrypted
field.
For free text queries (e.g LIKE
, ILIKE
) we add a match
index and a GIN index:
SELECT cs_add_index_v1('users', 'email_encrypted', 'match', 'text');
CREATE INDEX ON users USING GIN (cs_match_v1(email_encrypted));
For equality queries we add a unique
index:
SELECT cs_add_index_v1('users', 'email_encrypted', 'unique', 'text', '{"token_filters": [{"kind": "downcase"}]}');
CREATE UNIQUE INDEX ON users(cs_unique_v1(email_encrypted));
For ordering or comparison queries we add an ore
index:
SELECT cs_add_index_v1('users', 'email_encrypted', 'ore', 'text');
CREATE INDEX ON users (cs_ore_64_8_v1(email_encrypted));
After adding these indexes, our cs_configuration_v1
table will look like this:
id | 1
state | pending
data | {"v": 1, "tables": {"users": {"email_encrypted": {"cast_as": "text", "indexes": {"ore": {}, "match": {"k": 6, "m": 2048, "tokenizer": {"kind": "ngram", "token_length": 3}, "token_filters": [{"kind": "downcase"}], "include_original": true}, "unique": {"token_filters": [{"kind": "downcase"}]}}}}}}
The initial state
will be set as pending.
To activate this configuration run:
SELECT cs_encrypt_v1();
SELECT cs_activate_v1();
The cs_configured_v1
table will now have a state of active
.
id | 1
state | active
data | {"v": 1, "tables": {"users": {"email_encrypted": {"cast_as": "text", "indexes": {"ore": {}, "match": {"k": 6, "m": 2048, "tokenizer": {"kind": "ngram", "token_length": 3}, "token_filters": [{"kind": "downcase"}], "include_original": true}, "unique": {"token_filters": [{"kind": "downcase"}]}}}}}}
Prerequisites:
Ensure CipherStash Proxy has the most up to date configuration from the cs_configuration_v1
table.
CipherStash Proxy pings the database every 60 seconds to refresh the configuration but we can force the refresh by running:
SELECT cs_refresh_encrypt_config();
Bundled in with the CipherStash Proxy is a migrator tool.
This tool encrypts the plaintext data from the plaintext email
field, and inserts it into the encrypted field, email_encrypted
.
We access the migrator tool by requesting a shell inside the CipherStash Proxy container.
docker exec -it postgres_proxy bash
Run:
cipherstash-migrator --columns email=email_encrypted --table users --database-name postgres --username postgres --password postgres
We now have encrypted data in our email_encrypted
field that we can query.
Drop the plaintext email column:
ALTER TABLE users DROP COLUMN email;
Note: In production ensure data is backed up before dropping any columns
Before inserting or querying any records, we need to connect to our database via the Proxy.
We do this so our data is encrypted and decrypted.
In another terminal run:
PGPASSWORD=postgres psql -h localhost -p 6432 -U postgres -d postgres
When inserting data into the encrypted column we need to wrap the plaintext in an EQL payload.
The reason for this is that the CipherStash Proxy expects the EQL payload to be able to encrypt the data, and to be able to decrypt the data.
These statements must be run through the CipherStash Proxy in order to encrypt the data.
For a plaintext of test@test.com
.
An EQL payload will look like this:
{
"k": "pt", // The kind of EQL payload. The client will always send through plaintext "pt"
"p": "test@test.com", // The plaintext data
"i": {
"t": "users", // The table
"c": "email_encrypted" // The encrypted column
},
"v": 1,
"q": null // Used in queries only.
}
Example:
A query to insert an email into the plaintext email
field in the users
table looks like this:
INSERT INTO users (email) VALUES ('test@test.com');
The equivalent of this query to insert a plaintext email and encrypt it into the email_encrypted
column using EQL:
INSERT INTO users (email_encrypted) VALUES ('{"v":1,"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"}}');
What is happening?
The CipherStash Proxy takes this EQL payload and encrypts the plaintext data.
It creates an EQL payload that looks similar to this and inserts this into the encrypted field in the database.
{
"k": "ct", // The kind of EQL payload. The Proxy will insert a json payload of a ciphertext or "ct".
"c": "encrypted test@test.com", // The source ciphertext of the plaintext email.
"e": {
"t": "users", // Table
"c": "email_encrypted" // Encrypted column
},
"m": [42], // The ciphertext used for free text queries i.e match index
"u": "unique ciphertext", // The ciphertext used for unique queries. i.e unique index
"o": ["a", "b", "c"], // The ciphertext used for order or comparison queries. i.e ore index
"v": 1
}
This is what is stored in the email_encrypted
column.
In this part we will step through how to read our encrypted data.
We will cover:
- simple queries
- free text search queries
- exact/unique queries
- order by and comparison queries
If we don't need to execute any searchable operations (free text, exact) on the encrypted field.
The query will look similar to a plaintext query except we will use the encrypted column.
A plaintext query to select all emails from the users table would look like this:
SELECT email FROM users;
The EQL equivalent of this query is:
SELECT email_encrypted FROM users;
Returns:
email_encrypted
-------------------------------------------------------------------------------------------------
{"k":"pt","p":"adalovelace@example.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null}
{"k":"pt","p":"gracehopper@test.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null}
{"k":"pt","p":"edithclarke@email.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null}
{"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null}
What is happening?
The json stored in the database looks similar to this:
{
"k": "ct", // The kind of EQL payload. The Proxy will insert a json payload of a ciphertext or "ct".
"c": "encrypted test@test.com", // The source ciphertext of the plaintext email.
"e": {
"t": "users", // Table
"c": "email_encrypted" // Encrypted column
},
"m": [42], // The ciphertext used for free text queries i.e match index
"u": "unique ciphertext", // The ciphertext used for unique queries. i.e unique index
"o": ["a", "b", "c"], // The ciphertext used for order or comparison queries. i.e ore index
"v": 1
}
The Proxy decrypts the json above and returns a plaintext json payload that looks like this:
{
"k": "pt",
"p": "test@test.com", // The returned plaintext data
"i": {
"t": "users",
"c": "email_encrypted"
},
"v": 1,
"q": null
}
When working with EQL in an application you would likely be using an ORM.
We are currently building out packages and examples to make it easier to work with EQL json payloads.
EQL provides specialized functions to be able to interact with encrypted data and to support operations like equality checks, comparison queries, and unique constraints.
Prerequsites:
- A match index is needed on the encrypted column to support this operation.
- Connected to the database via the Proxy.
EQL function to use: cs_match_v1(val JSONB)
EQL query payload for a match query:
{
"k": "pt",
"p": "grace", // The text we want to use for search
"i": {
"t": "users",
"c": "email_encrypted"
},
"v": 1,
"q": "match" // This field is required on queries. This specifies the type of query we are executing.
}
A plaintext query, to search for any records that have an email like grace
, looks like this:
SELECT * FROM users WHERE email LIKE '%grace%';
The EQL equivalent of this query is:
SELECT * FROM users WHERE cs_match_v1(email_encrypted) @> cs_match_v1(
'{"v":1,"k":"pt","p":"grace","i":{"t":"users","c":"email_encrypted"},"q":"match"}'
);
This query returns:
id | email_encrypted |
---|---|
2 | {"k":"pt","p":"gracehopper@test.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null} |
Prerequsites:
- A unique index is needed on the encrypted column to support this operation.
EQL function to use: cs_unique_v1(val JSONB)
EQL query payload for a match query:
{
"k": "pt",
"p": "adalovelace@example.com", // The text we want to use for the equality query
"i": {
"t": "users",
"c": "email_encrypted"
},
"v": 1,
"q": "unique" // This field is required on queries. This specifies the type of query we are executing.
}
A plaintext query to search for any records that equal adalovelace@example.com
looks like this:
SELECT * FROM users WHERE email = 'adalovelace@example.com';
The EQL equivalent of this query is:
SELECT * FROM users WHERE cs_unique_v1(email_encrypted) = cs_unique_v1(
'{"v":1,"k":"pt","p":"adalovelace@example.com","i":{"t":"users","c":"email_encrypted"},"q":"unique"}'
);
This query returns:
id | email_encrypted |
---|---|
1 | {"k":"pt","p":"adalovelace@example.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null} |
Prerequsites:
- An ore index is needed on the encrypted column to support this operation.
EQL function to use: cs_ore_64_8_v1(val JSONB)
.
A plaintext query order by email looks like this:
SELECT * FROM users ORDER BY email ASC;
The EQL equivalent of this query is:
SELECT * FROM users ORDER BY cs_ore_64_8_v1(email_encrypted) ASC;
This query returns:
id | email_encrypted |
---|---|
1 | {"k":"pt","p":"adalovelace@example.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null} |
3 | {"k":"pt","p":"edithclarke@email.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null} |
2 | {"k":"pt","p":"gracehopper@test.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null} |
4 | {"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null} |
Prerequsites:
- A unique index is needed on the encrypted column to support this operation.
EQL function to use: cs_ore_64_8_v1(val JSONB)
.
EQL query payload for a comparison query:
{
"k": "pt",
"p": "gracehopper@test.com", // The text we want to use for the equality query
"i": {
"t": "users",
"c": "email_encrypted"
},
"v": 1,
"q": "ore" // This field is required on queries. This specifies the type of query we are executing.
}
A plaintext text query to compare email values looks like this:
SELECT * FROM users WHERE email > 'gracehopper@test.com';
The EQL equivalent of this query is:
SELECT * FROM users WHERE cs_ore_64_8_v1(email_encrypted) > cs_ore_64_8_v1(
'{"v":1,"k":"pt","p":"gracehopper@test.com","i":{"t":"users","c":"email_encrypted"},"q":"ore"}'
);
This query returns:
id | email_encrypted |
---|---|
4 | {"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"},"v":1,"q":null} |
This tutorial showed how we can go from a plaintext text field to an encrypted field and how to query the encrypted fields.
We have some examples here of what this would look like if you are using an ORM.