Google Cloud Storage output plugin for Embulk.
- Plugin type: file output
- Load all or nothing: no
- Resume supported: yes
- Cleanup supported: no
- The connector does not retry when a problem occurs on the streaming channel. In that case, the job must be run again.
- bucket: Google Cloud Storage bucket name (string, required)
- path_prefix: Prefix of output keys (string, required)
- file_ext: Extension of output file (string, required)
- sequence_format: Format of the sequence number of the output files (string, default value is ".%03d.%02d")
- content_type: Content type of output file (string, optional, default value is "application/octet-stream")
- auth_method: Authentication method, one of private_key, json_key, or compute_engine (string, optional, default value is "private_key")
- service_account_email: Google Cloud Platform service account email (string, required when auth_method is private_key)
- p12_keyfile: Full path of the private key file (P12) of the Google Cloud Platform service account (string, required when auth_method is private_key)
- json_keyfile: Full path of the JSON key file (string, required when auth_method is json_key)
- application_name: Application name, anything you like (string, optional, default value is "embulk-output-gcs")
- max_connection_retry: Number of connection retries to GCS (number, default value is 10)
- delete_in_advance: Delete Bucket/Prefix matched files in advance (boolean, default value is false)
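The options `path_prefix`, `sequence_format`, and `file_ext` together determine the output object names. As a rough sketch, assuming the key is built as `path_prefix` + formatted sequence + `file_ext`, and assuming the two numbers in `sequence_format` are a task index and a file index (neither is stated above), shell `printf` can reproduce the default naming. The helper name `gcs_output_key` is hypothetical.

```shell
# Hypothetical helper: build an output key the way the options suggest.
# Assumption: key = path_prefix + sequence_format(task_index, file_index) + file_ext.
gcs_output_key() {
  path_prefix=$1; file_ext=$2; task_index=$3; file_index=$4
  # ".%03d.%02d" is the documented default value of sequence_format.
  printf '%s.%03d.%02d%s\n' "$path_prefix" "$task_index" "$file_index" "$file_ext"
}

gcs_output_key logs/out .csv 0 0    # logs/out.000.00.csv
gcs_output_key logs/out .csv 12 3   # logs/out.012.03.csv
```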
out:
  type: gcs
  bucket: your-gcs-bucket-name
  path_prefix: logs/out
  file_ext: .csv
  auth_method: private_key # default
  service_account_email: ''
  p12_keyfile: '/path/to/private/key.p12'
  formatter:
    type: csv
    encoding: UTF-8
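As noted in the overview, the plugin does not retry when the streaming channel fails, so the job has to be run again. One way to automate that is a small shell wrapper. This is only a sketch: the function name `run_with_retry` and the `RETRY_DELAY` variable are hypothetical, and it assumes that rerunning the whole job is acceptable.

```shell
# Hypothetical wrapper: rerun a command until it succeeds or attempts run out.
run_with_retry() {
  max_attempts=$1; shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    echo "retrying (attempt $attempt of $max_attempts)" >&2
    # RETRY_DELAY is an assumed knob; defaults to 5 seconds between attempts.
    sleep "${RETRY_DELAY:-5}"
  done
}

# Example (assumes embulk is on PATH and config.yml exists):
# run_with_retry 3 embulk run config.yml
```

Since loading is not all-or-nothing, a failed attempt may leave partial output behind; setting `delete_in_advance: true` may be worth considering before relying on automatic reruns.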
Three methods are supported for fetching an access token for the service account.
- Public-private key pair of a GCP (Google Cloud Platform) service account
- JSON key of a GCP (Google Cloud Platform) service account
- Pre-defined access token (Google Compute Engine only)
You first need to create a service account (client ID), download its private key, and deploy the key with Embulk.
out:
  type: gcs
  auth_method: private_key
  p12_keyfile: /path/to/p12_keyfile.p12
You first need to create a service account (client ID), download its JSON key, and deploy the key with Embulk.
out:
  type: gcs
  auth_method: json_key
  json_keyfile: /path/to/json_keyfile.json
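Before deploying a JSON key, a quick sanity check on the file may save a failed run. The sketch below only looks for the three field names that appear in this README's embedded key example (`private_key_id`, `private_key`, `client_email`); it does not validate the key itself, and the function name `check_json_key` is hypothetical.

```shell
# Hypothetical check: does the file mention the fields a service account
# JSON key is expected to contain? This is a text search, not a JSON parse.
check_json_key() {
  keyfile=$1
  for field in private_key_id private_key client_email; do
    if ! grep -q "\"$field\"" "$keyfile"; then
      echo "missing field: $field" >&2
      return 1
    fi
  done
}

# Example: check_json_key /path/to/json_keyfile.json && echo "key file looks plausible"
```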
You can also embed the contents of json_keyfile in config.yml.
out:
  type: gcs
  auth_method: json_key
  json_keyfile:
    content: |
      {
        "private_key_id": "123456789",
        "private_key": "-----BEGIN PRIVATE KEY-----\nABCDEF",
        "client_email": "..."
      }
On the other hand, you don't need to explicitly create a service account for Embulk when you run it on Google Compute Engine. With this third authentication method, add the API scope "" to the scope list of your Compute Engine VM instance (see "Setting the scope of service account access for instances" in the Compute Engine documentation), then configure Embulk like this.
out:
  type: gcs
  auth_method: compute_engine
$ ./gradlew gem
$ ./gradlew test # -t to watch change of files and rebuild continuously
To run the unit tests, configure the following environment variables.
When the environment variables are not set, almost all test cases are skipped.
GCP_EMAIL
GCP_P12_KEYFILE
GCP_JSON_KEYFILE
GCP_BUCKET
GCP_BUCKET_DIRECTORY (optional, if needed)
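Because most test cases are skipped when the variables are missing, it can help to check the environment before running `./gradlew test`. The following is a minimal sketch; the variable names are taken from the `launchctl setenv` lines later in this README, `GCP_BUCKET_DIRECTORY` is left out because it is optional, and the function name `missing_gcp_vars` is hypothetical.

```shell
# Hypothetical helper: print the name of each required test variable
# that is unset or empty in the current shell.
missing_gcp_vars() {
  for name in GCP_EMAIL GCP_P12_KEYFILE GCP_JSON_KEYFILE GCP_BUCKET; do
    # Indirect lookup of the variable named by $name (POSIX sh compatible).
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Example: warn before running the tests.
# [ -z "$(missing_gcp_vars)" ] || echo "unset: $(missing_gcp_vars | tr '\n' ' ')" >&2
```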
If you're using Mac OS X El Capitan with GUI applications (such as an IDE), set the variables as follows.
$ vi ~/Library/LaunchAgents/environment.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "">
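<plist version="1.0">
<dict>
  <key>Label</key><string>my.startup</string> <!-- placeholder label -->
  <key>ProgramArguments</key>
  <array><string>sh</string><string>-c</string><string>
    launchctl setenv GCP_EMAIL
    launchctl setenv GCP_P12_KEYFILE /path/to/p12_keyfile.p12
    launchctl setenv GCP_JSON_KEYFILE /path/to/json_keyfile.json
    launchctl setenv GCP_BUCKET my-bucket
    launchctl setenv GCP_BUCKET_DIRECTORY unittests
  </string></array>
  <key>RunAtLoad</key><true/>
</dict>
</plist>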
$ launchctl load ~/Library/LaunchAgents/environment.plist
$ launchctl getenv GCP_EMAIL # check that the value is set
Then start your applications.