VDI Stack Environment Configuration

Environment Variables

The following is an index of all environment variables that may be used to configure the VDI service stack.

Examples

Minimal Example

This example is the minimal configuration necessary to spin up the service.

Important

This configuration does not include connection details for any application databases, nor any plugin handler service configuration.

DATASET_DIRECTORY_SOURCE_PATH=/var/www/Common/userDatasets/vdi_datasets_feat_s/
DATASET_DIRECTORY_TARGET_PATH=/datasets

AUTH_SECRET_KEY=
ADMIN_AUTH_TOKEN=
LDAP_SERVER=
ORACLE_BASE_DN=ou=applications,dc=apidb,dc=org

USER_DB_TNS_NAME=apicommn
USER_DB_USER=
USER_DB_PASS=
USER_DB_POOL_SIZE=5

GLOBAL_RABBIT_USERNAME=someUser
GLOBAL_RABBIT_PASSWORD=somePassword
GLOBAL_RABBIT_HOST=rabbit-external
GLOBAL_RABBIT_VDI_EXCHANGE_NAME=vdi-bucket-notifications
GLOBAL_RABBIT_VDI_QUEUE_NAME=vdi-bucket-notifications
GLOBAL_RABBIT_VDI_ROUTING_KEY=vdi-bucket-notifications

KAFKA_SERVERS=kafka:9092
KAFKA_PRODUCER_CLIENT_ID=vdi-event-router
KAFKA_CONSUMER_GROUP_ID=vdi-kafka-consumers
KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092

S3_HOST=minio-external
S3_PORT=9000
S3_USE_HTTPS=true
S3_ACCESS_TOKEN=someToken
S3_SECRET_KEY=someSecretKey
S3_BUCKET_NAME=some-other-bucket

CACHE_DB_USERNAME=someUser
CACHE_DB_PASSWORD=somePassword
CACHE_DB_NAME=vdi
CACHE_DB_HOST=cache-db

SITE_BUILD=build-65

PLUGIN_HANDLER_NOOP_NAME=noop
PLUGIN_HANDLER_NOOP_DISPLAY_NAME="Example Plugin"
PLUGIN_HANDLER_NOOP_VERSION=1.0
PLUGIN_HANDLER_NOOP_ADDRESS=plugin-example:80

DB_CONNECTION_ENABLED_{SOME_PROJECT}=true
DB_CONNECTION_NAME_{SOME_PROJECT}=ProjectDB
DB_CONNECTION_LDAP_{SOME_PROJECT}=dbTnsName
DB_CONNECTION_PASS_{SOME_PROJECT}=someDBPass
DB_CONNECTION_DATA_SCHEMA_{SOME_PROJECT}=vdi_datasets_dev_n
DB_CONNECTION_CONTROL_SCHEMA_{SOME_PROJECT}=vdi_control_dev_n

Modules

REST Service

Name / Type / Description

SERVER_PORT

uint16

Port exposed and used by the VDI REST API service.

LDAP_SERVER

List<HostAddress>

LDAP server(s) to query when resolving database connection details by TNS name.

ORACLE_BASE_DN

String

Base distinguished name under which database TNS entries are looked up in the LDAP server(s).

AUTH_SECRET_KEY

String

Secret key value used to decode and validate WDK user tokens for user authentication.

ADMIN_AUTH_TOKEN

String

Auth token value used to authenticate requests to administration endpoints.

ENABLE_CORS

boolean

Enables cross-origin requests (used for development).

MAX_UPLOAD_FILE_SIZE

uint64

Max file size allowed for a single upload in bytes.

USER_UPLOAD_QUOTA

uint64

Quota cap for an individual user’s total uploads in bytes.
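
For example, to cap single uploads at 1 GiB and total per-user uploads at 10 GiB (illustrative values, not defaults):

# 1 GiB per file, 10 GiB total per user
MAX_UPLOAD_FILE_SIZE=1073741824
USER_UPLOAD_QUOTA=10737418240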

Hard Delete Trigger Handler

Name / Type / Description

HARD_DELETE_HANDLER_WORKER_POOL_SIZE

uint8

Number of workers to use while processing hard-delete events.

HARD_DELETE_HANDLER_WORK_QUEUE_SIZE

uint16

Maximum size the worker pool's job queue may reach before job submission blocks.

HARD_DELETE_HANDLER_KAFKA_CONSUMER_CLIENT_ID

String

Kafka client ID for the KafkaConsumer that will be used to receive messages from the VDI Kafka instance.

THIS VALUE MUST BE UNIQUE ACROSS ALL KAFKA CLIENT IDS

Import Trigger Handler

Name / Type / Description

IMPORT_HANDLER_WORKER_POOL_SIZE

uint8

Number of workers to use while processing import events.

IMPORT_HANDLER_WORK_QUEUE_SIZE

uint16

Maximum size the worker pool's job queue may reach before job submission blocks.

IMPORT_HANDLER_KAFKA_CONSUMER_CLIENT_ID

String

Kafka client ID for the KafkaConsumer that will be used to receive messages from the VDI Kafka instance.

THIS VALUE MUST BE UNIQUE ACROSS ALL KAFKA CLIENT IDS

Install Data Trigger Handler

Name / Type / Description

INSTALL_DATA_HANDLER_WORKER_POOL_SIZE

uint8

Number of workers to use while processing install-data events.

INSTALL_DATA_HANDLER_WORK_QUEUE_SIZE

uint16

Maximum size the worker pool's job queue may reach before job submission blocks.

INSTALL_DATA_HANDLER_KAFKA_CONSUMER_CLIENT_ID

String

Kafka client ID for the KafkaConsumer that will be used to receive messages from the VDI Kafka instance.

THIS VALUE MUST BE UNIQUE ACROSS ALL KAFKA CLIENT IDS

Pruner

Name / Type / Description

DATASET_PRUNING_DELETION_THRESHOLD

Duration

Age at which a soft-deleted dataset becomes a candidate for pruning from the VDI system.

DATASET_PRUNING_INTERVAL

Duration

Frequency at which the pruner will run automatically.

DATASET_PRUNING_WAKEUP_INTERVAL

Duration

Frequency at which the pruner module will wake up and check for a service shutdown signal.
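
A sketch of a pruner configuration; the values below are illustrative, not shipped defaults:

DATASET_PRUNING_DELETION_THRESHOLD=30d
DATASET_PRUNING_INTERVAL=6h
DATASET_PRUNING_WAKEUP_INTERVAL=5s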

Dataset Reconciler

Name / Type / Description

RECONCILER_FULL_ENABLED

boolean

Whether the full dataset reconciliation process is enabled.

RECONCILER_FULL_RUN_INTERVAL

Duration

Interval at which the full reconciliation process will run.

RECONCILER_SLIM_RUN_INTERVAL

Duration

Interval at which the slim reconciliation process will run.

RECONCILER_DELETES_ENABLED

boolean

Whether the reconciler should perform delete operations.
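
A sketch of a reconciler configuration; the values below are illustrative, not shipped defaults:

RECONCILER_FULL_ENABLED=true
RECONCILER_FULL_RUN_INTERVAL=6h
RECONCILER_SLIM_RUN_INTERVAL=5m
RECONCILER_DELETES_ENABLED=true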

Reconciliation Event Handler

Name / Type / Description

RECONCILIATION_HANDLER_WORKER_POOL_SIZE

uint8

Number of workers to use while processing reconciliation events.

RECONCILIATION_HANDLER_WORK_QUEUE_SIZE

uint16

Maximum size the worker pool's job queue may reach before job submission blocks.

RECONCILIATION_HANDLER_KAFKA_CONSUMER_CLIENT_ID

String

Kafka client ID for the KafkaConsumer that will be used to receive messages from the VDI Kafka instance.

THIS VALUE MUST BE UNIQUE ACROSS ALL KAFKA CLIENT IDS

Share Trigger Handler

Name / Type / Description

SHARE_HANDLER_WORKER_POOL_SIZE

uint8

Number of workers to use while processing share events.

SHARE_HANDLER_WORK_QUEUE_SIZE

uint16

Maximum size the worker pool's job queue may reach before job submission blocks.

SHARE_HANDLER_KAFKA_CONSUMER_CLIENT_ID

String

Kafka client ID for the KafkaConsumer that will be used to receive messages from the VDI Kafka instance.

THIS VALUE MUST BE UNIQUE ACROSS ALL KAFKA CLIENT IDS

Soft Delete Trigger Handler

Name / Type / Description

SOFT_DELETE_HANDLER_WORKER_POOL_SIZE

uint8

Number of workers to use while processing soft-delete events.

SOFT_DELETE_HANDLER_WORK_QUEUE_SIZE

uint16

Maximum size the worker pool's job queue may reach before job submission blocks.

SOFT_DELETE_HANDLER_KAFKA_CONSUMER_CLIENT_ID

String

Kafka client ID for the KafkaConsumer that will be used to receive messages from the VDI Kafka instance.

THIS VALUE MUST BE UNIQUE ACROSS ALL KAFKA CLIENT IDS

Update Meta Trigger Handler

Name / Type / Description

UPDATE_META_HANDLER_WORKER_POOL_SIZE

uint8

Number of workers to use while processing update-meta events.

UPDATE_META_HANDLER_WORK_QUEUE_SIZE

uint16

Maximum size the worker pool's job queue may reach before job submission blocks.

UPDATE_META_HANDLER_KAFKA_CONSUMER_CLIENT_ID

String

Kafka client ID for the KafkaConsumer that will be used to receive messages from the VDI Kafka instance.

THIS VALUE MUST BE UNIQUE ACROSS ALL KAFKA CLIENT IDS

Components

Cache DB

Name / Type / Description

CACHE_DB_HOST

String

Hostname of the cache db instance.

CACHE_DB_PORT

uint16

Port number for the cache db instance.

CACHE_DB_NAME

String

Name of the postgres database in the cache db instance to use.

CACHE_DB_USERNAME

String

Database credentials username.

CACHE_DB_PASSWORD

String

Database credentials password.

CACHE_DB_POOL_SIZE

uint8

Database connection pool size.

Kafka

Name / Type / Description

KAFKA_SERVERS

List<HostAddress>

Kafka server(s) to connect to for publishing and consuming message topics.

Consumer Client

Kafka consumer client tuning and configuration.

Name / Type / Description

KAFKA_CONSUMER_AUTO_COMMIT_INTERVAL

Duration

The frequency at which consumer offsets are auto-committed to Kafka when KAFKA_CONSUMER_ENABLE_AUTO_COMMIT is set to true.

KAFKA_CONSUMER_AUTO_OFFSET_RESET

"earliest"
"latest"
"none"

What to do when there is no initial offset in Kafka, or if the current offset does not exist anymore on the server.

  • earliest = Automatically reset the offset to the earliest offset.

  • latest = Automatically reset the offset to the latest offset.

  • none = Throw an exception if no previous offset is found for the consumer’s group.

KAFKA_CONSUMER_CONNECTIONS_MAX_IDLE

Duration

Close idle connections after this duration.

KAFKA_CONSUMER_DEFAULT_API_TIMEOUT

Duration

Specifies the timeout for client APIs. This configuration is used as the default timeout for all client operations that do not specify a timeout parameter.

KAFKA_CONSUMER_ENABLE_AUTO_COMMIT

boolean

If true, the consumer’s offset will be periodically committed in the background.
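
As the commit interval only applies when auto-commit is enabled, the two settings are typically configured together (illustrative values):

KAFKA_CONSUMER_ENABLE_AUTO_COMMIT=true
KAFKA_CONSUMER_AUTO_COMMIT_INTERVAL=5s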

KAFKA_CONSUMER_FETCH_MAX_BYTES

uint32

The maximum amount of data the server should return for a fetch request. Records are fetched in batches by the consumer, and if the first record batch in the first non-empty partition of the fetch is larger than this value, the record batch will still be returned to ensure that the consumer can make progress. As such, this is not an absolute maximum. Note that the consumer performs multiple fetches in parallel.

KAFKA_CONSUMER_FETCH_MIN_BYTES

uint32

The minimum amount of data the server should return for a fetch request. If insufficient data is available the request will wait for that much data to accumulate before answering the request. The default setting of 1 byte means that fetch requests are answered as soon as a single byte of data is available or the fetch request times out waiting for data to arrive. Setting this to something greater than 1 will cause the server to wait for larger amounts of data to accumulate which can improve server throughput a bit at the cost of some additional latency.

KAFKA_CONSUMER_GROUP_ID

String

A unique string that identifies the consumer group this consumer belongs to.

KAFKA_CONSUMER_GROUP_INSTANCE_ID

String

A unique identifier of the consumer instance provided by the end user. Only non-empty strings are permitted. If set, the consumer is treated as a static member, which means that only one instance with this ID is allowed in the consumer group at any time. This can be used in combination with a larger session timeout to avoid group rebalances caused by transient unavailability (e.g. process restarts). If not set, the consumer will join the group as a dynamic member, which is the traditional behavior.

KAFKA_CONSUMER_HEARTBEAT_INTERVAL

Duration

The expected time between heartbeats to the consumer coordinator when using Kafka’s group management facilities. Heartbeats are used to ensure that the consumer’s session stays active and to facilitate rebalancing when new consumers join or leave the group. The value must be set lower than KAFKA_CONSUMER_SESSION_TIMEOUT, but typically should be set no higher than 1/3 of that value. It can be adjusted even lower to control the expected time for normal rebalances.
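
For example, keeping the heartbeat interval at one third of the session timeout (illustrative values):

KAFKA_CONSUMER_SESSION_TIMEOUT=45s
KAFKA_CONSUMER_HEARTBEAT_INTERVAL=15s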

KAFKA_CONSUMER_MAX_POLL_INTERVAL

Duration

The maximum delay between invocations of poll() when using consumer group management. This places an upper bound on the amount of time that the consumer can be idle before fetching more records. If poll() is not called before expiration of this timeout, then the consumer is considered failed and the group will rebalance in order to reassign the partitions to another member. For consumers using a non-null KAFKA_CONSUMER_GROUP_INSTANCE_ID which reach this timeout, partitions will not be immediately reassigned. Instead, the consumer will stop sending heartbeats and partitions will be reassigned after expiration of KAFKA_CONSUMER_SESSION_TIMEOUT. This mirrors the behavior of a static consumer which has shutdown.

KAFKA_CONSUMER_MAX_POLL_RECORDS

uint32

The maximum number of records returned in a single call to poll(). Note that this value does not impact the underlying fetching behavior. The consumer will cache the records from each fetch request and return them incrementally from each poll.

KAFKA_CONSUMER_POLL_DURATION

Duration

The amount of time to block waiting for input.

KAFKA_CONSUMER_RECEIVE_BUFFER_SIZE_BYTES

uint32

The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used.

KAFKA_CONSUMER_RECONNECT_BACKOFF_MAX_TIME

Duration

The maximum amount of time to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms.

KAFKA_CONSUMER_RECONNECT_BACKOFF_TIME

Duration

The base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.

KAFKA_CONSUMER_REQUEST_TIMEOUT

Duration

The configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted.

KAFKA_CONSUMER_RETRY_BACKOFF_TIME

Duration

The amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.

KAFKA_CONSUMER_SEND_BUFFER_SIZE_BYTES

uint32

The size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used.

KAFKA_CONSUMER_SESSION_TIMEOUT

Duration

The timeout used to detect worker failures. The worker sends periodic heartbeats to indicate its liveness to the broker. If no heartbeats are received by the broker before the expiration of this session timeout, then the broker will remove the worker from the group and initiate a rebalance. Note that the value must be in the allowable range as configured in the broker configuration by group.min.session.timeout.ms and group.max.session.timeout.ms.

Producer Client

Kafka message producer client tuning and configuration.

Name / Type / Description

KAFKA_PRODUCER_BATCH_SIZE

uint32

The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This helps performance on both the client and the server. This configuration controls the default batch size in bytes.

No attempt will be made to batch records larger than this size.

Requests sent to brokers will contain multiple batches, one for each partition with data available to be sent.

A small batch size will make batching less common and may reduce throughput (a batch size of zero will disable batching entirely). A very large batch size may use memory a bit more wastefully as we will always allocate a buffer of the specified batch size in anticipation of additional records.

Note: This setting gives the upper bound of the batch size to be sent. If we have fewer than this many bytes accumulated for a partition, we will 'linger' for the KAFKA_PRODUCER_LINGER_TIME duration waiting for more records to show up. KAFKA_PRODUCER_LINGER_TIME defaults to 0, which means we'll immediately send out a record even if the accumulated batch size is under this KAFKA_PRODUCER_BATCH_SIZE setting.
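
For example, a modest batch size paired with a short linger window (illustrative values):

KAFKA_PRODUCER_BATCH_SIZE=16384
KAFKA_PRODUCER_LINGER_TIME=5ms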

KAFKA_PRODUCER_BUFFER_MEMORY_BYTES

uint32

The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server the producer will block for KAFKA_PRODUCER_MAX_BLOCKING_TIMEOUT after which it will throw an exception.

This setting should correspond roughly to the total memory the producer will use, but is not a hard bound since not all memory the producer uses is used for buffering. Some additional memory will be used for compression (if compression is enabled) as well as for maintaining in-flight requests.

KAFKA_PRODUCER_CLIENT_ID

String

An id string to pass to the server when making requests. The purpose of this is to be able to track the source of requests beyond just ip/port by allowing a logical application name to be included in server-side request logging.

KAFKA_PRODUCER_COMPRESSION_TYPE

none
gzip
snappy
lz4
zstd

The compression type for all data generated by the producer. The default is none (i.e. no compression). Valid values are none, gzip, snappy, lz4, or zstd. Compression is of full batches of data, so the efficacy of batching will also impact the compression ratio (more batching means better compression).

KAFKA_PRODUCER_CONNECTIONS_MAX_IDLE

Duration

Close idle connections after this duration.

KAFKA_PRODUCER_DELIVERY_TIMEOUT

Duration

An upper bound on the time to report success or failure after a call to send() returns. This limits the total time that a record will be delayed prior to sending, the time to await acknowledgement from the broker (if expected), and the time allowed for retriable send failures. The producer may report failure to send a record earlier than this config if either an unrecoverable error is encountered, the retries have been exhausted, or the record is added to a batch which reached an earlier delivery expiration deadline. The value of this config should be greater than or equal to the sum of KAFKA_PRODUCER_REQUEST_TIMEOUT and KAFKA_PRODUCER_LINGER_TIME.
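
For example, a configuration satisfying the constraint above (illustrative values; 2m is comfortably greater than 30s + 5ms):

KAFKA_PRODUCER_REQUEST_TIMEOUT=30s
KAFKA_PRODUCER_LINGER_TIME=5ms
KAFKA_PRODUCER_DELIVERY_TIMEOUT=2m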

KAFKA_PRODUCER_LINGER_TIME

Duration

The producer groups together any records that arrive in between request transmissions into a single batched request. Normally this occurs only under load when records arrive faster than they can be sent out. However, in some circumstances the client may want to reduce the number of requests even under moderate load. This setting accomplishes this by adding a small amount of artificial delay; that is, rather than immediately sending out a record, the producer will wait for up to the given delay to allow other records to be sent so that the sends can be batched together. This can be thought of as analogous to Nagle's algorithm in TCP. This setting gives the upper bound on the delay for batching: once we get KAFKA_PRODUCER_BATCH_SIZE worth of records for a partition it will be sent immediately regardless of this setting, however if we have fewer than this many bytes accumulated for this partition we will 'linger' for the specified time waiting for more records to show up. This setting defaults to 0 (i.e. no delay). Setting KAFKA_PRODUCER_LINGER_TIME=5ms, for example, would have the effect of reducing the number of requests sent but would add up to 5ms of latency to records sent in the absence of load.

KAFKA_PRODUCER_MAX_BLOCKING_TIMEOUT

Duration

The configuration controls how long the KafkaProducer's send(), partitionsFor(), initTransactions(), sendOffsetsToTransaction(), commitTransaction() and abortTransaction() methods will block. For send() this timeout bounds the total time waiting for both metadata fetch and buffer allocation (blocking in the user-supplied serializers or partitioner is not counted against this timeout). For partitionsFor() this timeout bounds the time spent waiting for metadata if it is unavailable. The transaction-related methods always block, but may time out if the transaction coordinator could not be discovered or did not respond within the timeout.

KAFKA_PRODUCER_MAX_REQUEST_SIZE_BYTES

uint32

The maximum size of a request in bytes. This setting will limit the number of record batches the producer will send in a single request to avoid sending huge requests. This is also effectively a cap on the maximum uncompressed record batch size. Note that the server has its own cap on the record batch size (after compression if compression is enabled) which may be different from this.

KAFKA_PRODUCER_RECEIVE_BUFFER_SIZE_BYTES

uint32

The size of the TCP receive buffer (SO_RCVBUF) to use when reading data. If the value is -1, the OS default will be used.

KAFKA_PRODUCER_RECONNECT_BACKOFF_MAX_TIME

Duration

The maximum amount of time to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms.

KAFKA_PRODUCER_RECONNECT_BACKOFF_TIME

Duration

The base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.

KAFKA_PRODUCER_REQUEST_TIMEOUT

Duration

The configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted. This should be larger than replica.lag.time.max.ms (a broker configuration) to reduce the possibility of message duplication due to unnecessary producer retries.

KAFKA_PRODUCER_RETRY_BACKOFF_TIME

Duration

The amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.

KAFKA_PRODUCER_SEND_BUFFER_SIZE_BYTES

uint32

The size of the TCP send buffer (SO_SNDBUF) to use when sending data. If the value is -1, the OS default will be used.

KAFKA_PRODUCER_SEND_RETRIES

uint32

Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Produce requests will be failed before the number of retries has been exhausted if the timeout configured by KAFKA_PRODUCER_DELIVERY_TIMEOUT expires first, before successful acknowledgement. Users should generally prefer to leave this config unset and instead use KAFKA_PRODUCER_DELIVERY_TIMEOUT to control retry behavior.

Enabling idempotence requires this config value to be greater than 0. If conflicting configurations are set and idempotence is not explicitly enabled, idempotence is disabled.

Trigger Topics

Names of the topics that various trigger events will be published to.

Name / Type / Description

KAFKA_TOPIC_HARD_DELETE_TRIGGERS

String

Name of the hard-delete trigger topic that messages will be routed to for object hard-delete events from MinIO.

A hard-delete event is the removal of a VDI dataset object in MinIO. Presently these events do not trigger any behavior in the VDI service.

KAFKA_TOPIC_IMPORT_TRIGGERS

String

Name of the import trigger topic that messages will be routed to for import events from MinIO.

An import event is the creation or overwriting of a user upload object in MinIO. These events will trigger a call to the plugin handler server to process the user upload to prepare it for installation.

KAFKA_TOPIC_INSTALL_TRIGGERS

String

Name of the install-data trigger topic that messages will be routed to for data installation triggers from MinIO.

An install-data event is the creation or overwriting of a VDI dataset data object in MinIO. These events will trigger a call to the plugin handler server to install the data that has just landed in MinIO.

KAFKA_TOPIC_SHARE_TRIGGERS

String

Name of the share trigger topic that messages will be routed to for share events from MinIO.

A share event is the creation or overwriting of a "share" object in MinIO. These events will trigger an update to the share/visibility configuration for the target dataset.

KAFKA_TOPIC_SOFT_DELETE_TRIGGERS

String

Name of the soft-delete trigger topic that messages will be routed to for soft-delete events from MinIO.

A soft-delete event is the creation or overwriting of a soft-delete flag object in MinIO. These events will trigger a call to the plugin handler server to uninstall the data from the target application databases.

KAFKA_TOPIC_UPDATE_META_TRIGGERS

String

Name of the update-meta trigger topic that messages will be routed to for metadata update events from MinIO.

An update-meta event is the creation or overwriting of the dataset metadata object in MinIO. These events will trigger a call to the plugin handler server to install or update the metadata for the dataset in the target application databases.

KAFKA_TOPIC_RECONCILIATION_TRIGGERS

String

Name of the reconciliation trigger topic that messages will be routed to for events fired by the dataset reconciler.
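
A sketch of a trigger topic configuration; the topic names below are hypothetical and need only be used consistently across the stack:

KAFKA_TOPIC_IMPORT_TRIGGERS=import-triggers
KAFKA_TOPIC_INSTALL_TRIGGERS=install-triggers
KAFKA_TOPIC_UPDATE_META_TRIGGERS=update-meta-triggers
KAFKA_TOPIC_SHARE_TRIGGERS=share-triggers
KAFKA_TOPIC_SOFT_DELETE_TRIGGERS=soft-delete-triggers
KAFKA_TOPIC_HARD_DELETE_TRIGGERS=hard-delete-triggers
KAFKA_TOPIC_RECONCILIATION_TRIGGERS=reconciliation-triggers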

Message Keys

Names of the message key values that events will be keyed on when published to the various Kafka topics. Event messages that are not keyed on the appropriate value will be ignored by the VDI service.

Name / Type / Description

KAFKA_MESSAGE_KEY_HARD_DELETE_TRIGGERS

String

Message key for hard-delete trigger events.

KAFKA_MESSAGE_KEY_IMPORT_TRIGGERS

String

Message key for import trigger events.

KAFKA_MESSAGE_KEY_INSTALL_TRIGGERS

String

Message key for install-data trigger events.

KAFKA_MESSAGE_KEY_SHARE_TRIGGERS

String

Message key for share trigger events.

KAFKA_MESSAGE_KEY_SOFT_DELETE_TRIGGERS

String

Message key for soft-delete trigger events.

KAFKA_MESSAGE_KEY_UPDATE_META_TRIGGERS

String

Message key for update-meta trigger events.

KAFKA_MESSAGE_KEY_RECONCILIATION_TRIGGERS

String

Message key for reconciliation trigger events.

Rabbit

Name / Type / Description

GLOBAL_RABBIT_CONNECTION_NAME

String

Optional name of the connection to the RabbitMQ service. This value will show in the RabbitMQ logs and in the management console to identify the VDI service’s connection.

GLOBAL_RABBIT_HOST

String

Hostname of the global RabbitMQ instance that the VDI service will connect to.

GLOBAL_RABBIT_PORT

uint16

Port to use when connecting to the global RabbitMQ instance.

GLOBAL_RABBIT_USERNAME

String

Credentials username used to authenticate with the global RabbitMQ instance.

GLOBAL_RABBIT_PASSWORD

String

Credentials password used to authenticate with the global RabbitMQ instance.

GLOBAL_RABBIT_VDI_POLLING_INTERVAL

Duration

Frequency at which the global RabbitMQ instance will be polled for new messages from MinIO.

GLOBAL_RABBIT_USE_TLS

boolean

Whether the connection to the target RabbitMQ instance should use TLS.

Defaults to false.

GLOBAL_RABBIT_CONNECTION_TIMEOUT

Duration

TCP connection timeout.

Exchange Config

Name / Type / Description

GLOBAL_RABBIT_VDI_EXCHANGE_NAME

String

Name of the target RabbitMQ exchange that will be declared by both the MinIO instance and the VDI service.

GLOBAL_RABBIT_VDI_EXCHANGE_TYPE

direct
fanout
topic
match

Exchange type as declared by the MinIO connection to the global RabbitMQ instance.

GLOBAL_RABBIT_VDI_EXCHANGE_AUTO_DELETE

boolean

Whether the exchange should be auto deleted when the connections from MinIO and the VDI service are closed.

GLOBAL_RABBIT_VDI_EXCHANGE_DURABLE

boolean

Whether the exchange should be durable (persisted to disk).

This value must align with the exchange configuration as set by MinIO.

GLOBAL_RABBIT_VDI_EXCHANGE_ARGUMENTS

Map<String, String>

Additional arguments to pass to the exchange declaration.

Queue Config

Name / Type / Description

GLOBAL_RABBIT_VDI_QUEUE_NAME

String

Name of the RabbitMQ queue to declare.

This value must align with the queue name as configured in MinIO.

GLOBAL_RABBIT_VDI_QUEUE_AUTO_DELETE

boolean

Whether the queue should be auto deleted when the connections from MinIO and the VDI service are closed.

GLOBAL_RABBIT_VDI_QUEUE_EXCLUSIVE

boolean

Whether the queue should be exclusive to the VDI service.

See: Exclusive Queues

GLOBAL_RABBIT_VDI_QUEUE_DURABLE

boolean

Whether the queue should be durable (persisted to disk).

This value must align with the queue configuration as set by MinIO.

GLOBAL_RABBIT_VDI_QUEUE_ARGUMENTS

Map<String, String>

Additional arguments to pass to the queue declaration.

Routing

Name / Type / Description

GLOBAL_RABBIT_VDI_ROUTING_KEY

String

Routing key used to bind the VDI queue to the exchange. This value must align with the routing key as configured in MinIO.

GLOBAL_RABBIT_VDI_ROUTING_ARGUMENTS

Map<String, String>

Additional arguments to pass to the queue binding declaration.

S3 (MinIO)

Name / Type / Description

S3_HOST

String

MinIO hostname.

S3_PORT

uint16

MinIO connection port.

S3_USE_HTTPS

boolean

Whether HTTPS should be used when connecting to the MinIO instance.

S3_BUCKET_NAME

String

Name of the MinIO bucket that will be used by the VDI service.

S3_ACCESS_TOKEN

String

MinIO username/access token to use when authenticating with the MinIO instance.

S3_SECRET_KEY

String

MinIO password/secret key to use when authenticating with the MinIO instance.

General Plugin Variables

Environment variables used by all plugins.

Name / Type / Description

SITE_BUILD

String

Site build number string (e.g. "build-65")

DATASET_INSTALL_ROOT

String

Mount path in the plugin containers for the dataset install directory tree.

Wildcard Environment Variables

Plugin Handler Environment Key Components

These variables register a VDI plugin with the service.

PLUGIN_HANDLER_<NAME>_NAME
PLUGIN_HANDLER_<NAME>_DISPLAY_NAME
PLUGIN_HANDLER_<NAME>_VERSION
PLUGIN_HANDLER_<NAME>_ADDRESS
PLUGIN_HANDLER_<NAME>_PROJECT_IDS
PLUGIN_HANDLER_<NAME>_CUSTOM_PATH
PLUGIN_HANDLER_<NAME>_SERVER_PORT
PLUGIN_HANDLER_<NAME>_SERVER_HOST
PLUGIN_HANDLER_<NAME>_IMPORT_SCRIPT_PATH
PLUGIN_HANDLER_<NAME>_IMPORT_SCRIPT_MAX_DURATION
PLUGIN_HANDLER_<NAME>_CHECK_COMPAT_SCRIPT_PATH
PLUGIN_HANDLER_<NAME>_CHECK_COMPAT_SCRIPT_MAX_DURATION
PLUGIN_HANDLER_<NAME>_INSTALL_DATA_SCRIPT_PATH
PLUGIN_HANDLER_<NAME>_INSTALL_DATA_SCRIPT_MAX_DURATION
PLUGIN_HANDLER_<NAME>_INSTALL_META_SCRIPT_PATH
PLUGIN_HANDLER_<NAME>_INSTALL_META_SCRIPT_MAX_DURATION
PLUGIN_HANDLER_<NAME>_UNINSTALL_SCRIPT_PATH
PLUGIN_HANDLER_<NAME>_UNINSTALL_SCRIPT_MAX_DURATION

Unlike most of the other environment keys defined here, these are wildcard keys: they may be specified with any arbitrary <NAME> value between the defined prefix and one of the defined suffixes.

The environment variables set using the prefix and suffixes defined below must appear in groups that contain, at minimum, the required suffixes. For example, given the <NAME> value "RNASEQ", the following environment variables must be present:

PLUGIN_HANDLER_RNASEQ_NAME
PLUGIN_HANDLER_RNASEQ_DISPLAY_NAME
PLUGIN_HANDLER_RNASEQ_VERSION
PLUGIN_HANDLER_RNASEQ_ADDRESS
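
Expanding such a group with some of the optional suffixes might look like the following; all values here are hypothetical:

PLUGIN_HANDLER_RNASEQ_NAME=rnaseq
PLUGIN_HANDLER_RNASEQ_DISPLAY_NAME="RNA-Seq"
PLUGIN_HANDLER_RNASEQ_VERSION=1.0
PLUGIN_HANDLER_RNASEQ_ADDRESS=plugin-rnaseq:80
PLUGIN_HANDLER_RNASEQ_PROJECT_IDS=PlasmoDB,ToxoDB
PLUGIN_HANDLER_RNASEQ_IMPORT_SCRIPT_MAX_DURATION=1h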

Name / Type / Description

PLUGIN_HANDLER_<NAME>_NAME

String

Name of the plugin handler. This will typically be the type name of the dataset type that the plugin handles.

PLUGIN_HANDLER_<NAME>_DISPLAY_NAME

String

Display name for the plugin handler. This will be shown to the end users as the type of their datasets.

PLUGIN_HANDLER_<NAME>_VERSION

String

Version for the plugin handler.

PLUGIN_HANDLER_<NAME>_ADDRESS

HostAddress

Address and port of the plugin handler service.

PLUGIN_HANDLER_<NAME>_PROJECT_IDS

List<String>

List of project IDs for which the plugin is relevant. If this value is omitted or set to a blank value, the plugin will be considered relevant to all projects.

PLUGIN_HANDLER_<NAME>_CUSTOM_PATH

String

Custom $PATH variable additions to pass to plugin scripts.

PLUGIN_HANDLER_<NAME>_SERVER_PORT

uint16

Port the plugin handler HTTP server will bind to.

PLUGIN_HANDLER_<NAME>_SERVER_HOST

String

Address the plugin handler HTTP server will bind to.

PLUGIN_HANDLER_<NAME>_IMPORT_SCRIPT_PATH

String

Path to the import script or binary in the plugin container.

PLUGIN_HANDLER_<NAME>_IMPORT_SCRIPT_MAX_DURATION

Duration

Max duration the import script will be permitted to run before being killed.

PLUGIN_HANDLER_<NAME>_CHECK_COMPAT_SCRIPT_PATH

String

Path to the compatibility check script or binary in the plugin container.

PLUGIN_HANDLER_<NAME>_CHECK_COMPAT_SCRIPT_MAX_DURATION

Duration

Max duration the compatibility check script will be permitted to run before being killed.

PLUGIN_HANDLER_<NAME>_INSTALL_DATA_SCRIPT_PATH

String

Path to the data install script or binary in the plugin container.

PLUGIN_HANDLER_<NAME>_INSTALL_DATA_SCRIPT_MAX_DURATION

Duration

Max duration the data install script will be permitted to run before being killed.

PLUGIN_HANDLER_<NAME>_INSTALL_META_SCRIPT_PATH

String

Path to the metadata install script or binary in the plugin container.

PLUGIN_HANDLER_<NAME>_INSTALL_META_SCRIPT_MAX_DURATION

Duration

Max duration the metadata install script will be permitted to run before being killed.

PLUGIN_HANDLER_<NAME>_UNINSTALL_SCRIPT_PATH

String

Path to the uninstall script or binary in the plugin container.

PLUGIN_HANDLER_<NAME>_UNINSTALL_SCRIPT_MAX_DURATION

Duration

Max duration the uninstall script will be permitted to run before being killed.

Application Database Key Components

DB_CONNECTION_ENABLED_<NAME>
DB_CONNECTION_NAME_<NAME>
DB_CONNECTION_USER_<NAME>
DB_CONNECTION_PASS_<NAME>
DB_CONNECTION_DATA_SCHEMA_<NAME>
DB_CONNECTION_CONTROL_SCHEMA_<NAME>
DB_CONNECTION_POOL_SIZE_<NAME>

# optional: restrict this connection to specific dataset (plugin) types
DB_CONNECTION_DATA_TYPES_<NAME>

# for LDAP
DB_CONNECTION_LDAP_<NAME>
# else, for manual connection
DB_CONNECTION_HOST_<NAME>
DB_CONNECTION_PORT_<NAME>
DB_CONNECTION_PLATFORM_<NAME>

Unlike most of the other environment keys defined here, these are wildcard keys: they may be specified with any arbitrary <NAME> value following one of the defined prefixes.

The environment variables set using the prefixes defined below must appear in groups that contain all of the required prefixes. For example, given the <NAME> value "PLASMO", the following environment variables must all be present:

DB_CONNECTION_ENABLED_PLASMO
DB_CONNECTION_NAME_PLASMO
DB_CONNECTION_LDAP_PLASMO
DB_CONNECTION_USER_PLASMO
DB_CONNECTION_PASS_PLASMO
DB_CONNECTION_DATA_SCHEMA_PLASMO
DB_CONNECTION_CONTROL_SCHEMA_PLASMO
DB_CONNECTION_POOL_SIZE_PLASMO

Database connection detail sets MUST provide either an LDAP lookup name for the database OR a host and port, as sketched below.
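
For example, the connection details for a hypothetical PLASMO detail set could be provided either way (all values, including the platform identifier, are hypothetical):

# Option 1: LDAP lookup
DB_CONNECTION_LDAP_PLASMO=plasmoDbTnsName

# Option 2: manual connection details
DB_CONNECTION_HOST_PLASMO=db.example.org
DB_CONNECTION_PORT_PLASMO=1521
DB_CONNECTION_PLATFORM_PLASMO=oracle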

Variable Definitions

Name / Type / Description

DB_CONNECTION_ENABLED_<NAME>

boolean

Whether the database connection should be enabled for use by VDI.

DB_CONNECTION_NAME_<NAME>

String

Name for the connection, typically the project ID or identifier for the application database.

DB_CONNECTION_LDAP_<NAME>

String

LDAP distinguished name for the database connection OrclNetDesc entry containing the connection details for the target database.

DB_CONNECTION_HOST_<NAME>

String

Host URI to use when providing manual database connection details.

WARNING: This variable will be ignored if DB_CONNECTION_LDAP is provided.

WARNING: This variable is required if DB_CONNECTION_LDAP is absent.

DB_CONNECTION_PORT_<NAME>

uint16

Port to use when providing manual database connection details.

WARNING: This variable will be ignored if DB_CONNECTION_LDAP is provided.

WARNING: This variable is required if DB_CONNECTION_LDAP is absent.

DB_CONNECTION_USER_<NAME>

String

Database credentials username.

DB_CONNECTION_PASS_<NAME>

String

Database credentials password.

DB_CONNECTION_DATA_SCHEMA_<NAME>

String

Database schema into which user dataset data is installed.

DB_CONNECTION_CONTROL_SCHEMA_<NAME>

String

Database schema into which the VDI control tables are installed.

DB_CONNECTION_POOL_SIZE_<NAME>

uint8

Connection pool size for the JDBC DataSource.

DB_CONNECTION_DATA_TYPES_<NAME>

List<String>

Dataset type names that align with plugins registered in the VDI environment configuration.

If provided, VDI will only use this connection for datasets whose type name matches an item in the given list.

If omitted, VDI will use this connection for all datasets whose type name does not match another DB connection with declared types.

One-to-Many Project-to-DB Configs

A single project may have multiple target databases registered provided that each connection has a unique (name, dataset type) pairing.

By default, if a database connection detail set does not contain a dataset type list via DB_CONNECTION_DATA_TYPES_<NAME>, it will be used as a fallback for all dataset types that do not match a configured connection.

If multiple database connection detail sets for the same project omit the data type restriction variable, or if multiple detail sets specify the same data type, VDI will refuse to start.

Example
# This DB connection is a fallback because it provides no data type list.
# It is NOT used for datasets of type foo and bar.
DB_CONNECTION_ENABLED_PLASMO_1=true
DB_CONNECTION_NAME_PLASMO_1=PlasmoDB
#DB_CONNECTION_DATA_TYPES_PLASMO_1='*'  # Wildcard is implied by omission

# This DB connection is only used for datasets of type foo and bar.
DB_CONNECTION_ENABLED_PLASMO_2=true
DB_CONNECTION_NAME_PLASMO_2=PlasmoDB
DB_CONNECTION_DATA_TYPES_PLASMO_2=foo,bar

Environment Variable Types

Duration

Durations are string representations of a time interval, written as one or more numeric values each followed by a shorthand notation of the time unit.

Time Unit Notations:

  ns = nanoseconds (e.g. 5ns)
  us = microseconds (e.g. 5us)
  ms = milliseconds (e.g. 5ms)
  s = seconds (e.g. 5s)
  m = minutes (e.g. 5m)
  h = hours (e.g. 5h)
  d = days (e.g. 5d)

Durations may also combine multiple values, such as 1d 12h or 1h 0m 30.340s.

Important

Only the last segment of a duration may have a fractional part.
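
Expressed as environment variables (illustrative values; depending on how the environment is provided, multi-segment values containing spaces may need quoting):

RECONCILER_SLIM_RUN_INTERVAL=5m
GLOBAL_RABBIT_CONNECTION_TIMEOUT="1m 30s"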

HostAddress

A HostAddress is a hostname and port pair in the form {host}:{port}, for example google.com:443.
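
For example, the plugin handler address from the minimal example above is a HostAddress, and KAFKA_SERVERS takes a list of them:

PLUGIN_HANDLER_NOOP_ADDRESS=plugin-example:80
KAFKA_SERVERS=kafka:9092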

List<T>

A list is a comma-separated set of values of any type that does not itself contain a comma; for example, a list may contain Durations or HostAddresses.

Example:

SOME_VARIABLE=item1,item2,item3

Map<K, V>

A map is a list of key/value pairs with the keys separated from values by a colon and the pairs separated by commas. Keys may only be simple types, and values may be of any type that does not contain a comma.

Example:

SOME_VARIABLE=key1:value,key2:value,key3:value