Service Descriptor Language Reference
The Service Descriptor Language (SDL) is used in the service.sdl file to describe the CSD service to Cloudera Manager. It is worth first looking at the CSD primer to get an idea of how all the CSD components fit together.
Below is a description of each section of the SDL in more detail. There are some structures that are shared between sections. For example, the parameters under service are the same structures as the parameters under roles. Those structures will be described at the end.
This is the top level section that describes the service as a whole.
{
"name" : "ECHO",
"label" : "Echo",
"description" : "The echo service",
"version" : "1",
"runAs" : {
"user" : "echoservice",
"group" : "echoservice"
},
"maxInstances" : 1,
"icon" : "images/icon.png",
"compatibility" : {
"generation" : 2,
"cdhVersion" : {
"min" : 4,
"max" : 5
}
},
"parcel" : {
"repoUrl" : "http://mywebsite.com",
"requiredTags" : [ "echo" ],
"optionalTags" : [ "echo-plugin" ]
},
"serviceDependencies" : [
{
"name" : "ZOOKEEPER"
},
{
"name" : "HDFS",
"required" : "true"
}
],
"serviceInit" : {
"preStartSteps" : [
{
"commandName" : "CreateHomeDirCommand"
}
],
"postStartSteps" : [
{
"commandName" : "CreateParamDirCommand",
"failureAllowed" : true
}
]
},
"stopRunner" : {
"relevantRoleTypes" : ["ECHO_WEBSERVER"],
"runner" : {
"program" : "scripts/control.sh",
"args" : ["admin", "stopAll"]
},
"timeout" : 180000,
"masterRole" : "ECHO_MASTER"
},
"inExpressWizard" : true,
"rolesWithExternalLinks" : ["ECHO_MASTER_SERVER"],
"hdfsDirs" : [
{
"name" : "CreateUserHomeDirCommand",
"label" : "Create Echo User Home Dir",
"description" : "Creates the Echo user directory in HDFS",
"directoryDescription" : "Echo HDFS user directory",
"path" : "/user/${user}",
"permissions" : "0750"
}
],
"commands" : [
{
"name" : "service_cmd1",
"label" : "Service Cmd1 Label",
"description" : "Service Cmd1 Help",
"roleName" : "ECHO_WEBSERVER",
"roleCommand" : "role_cmd1",
"runMode" : "all"
}
],
"parameters" : "..."
}
This is the logical name of the service. In CM it is the service type. Cloudera Manager validates that this name is globally unique.
Required?: Yes
{
"name" : "ECHO"
}
Note: must be uppercase and only contain letters, numbers and underscores
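The naming rule in the note above can be checked before a CSD is packaged. The following is a minimal sketch, not Cloudera Manager's own validation: the function name is_valid_type_name is hypothetical, and it assumes the rule means ASCII uppercase letters, digits, and underscores only.

```python
import re

# Assumption: "uppercase ... letters, numbers and underscores" means ASCII only.
_TYPE_NAME = re.compile(r"[A-Z0-9_]+")


def is_valid_type_name(name: str) -> bool:
    """Return True if name uses only uppercase letters, digits, and underscores."""
    return _TYPE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("ECHO", "ECHO_WEBSERVER", "Echo", "ECHO-2"):
        print(candidate, is_valid_type_name(candidate))
```

The same check applies to role type names, which follow the same rule.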
The user-facing text, e.g. in the "Add Service" wizard.
Required?: Yes
{
"label" : "Echo"
}
The description of the service - shown in the wizards.
Required?: Yes
{
"description" : "The echo service"
}
The version of the CSD.
- Updated in Cloudera Manager 5.14: the CSD versioning logic was improved. The version of a CSD is now checked at CSD load time (during Cloudera Manager startup): only the latest version for a CSD service type is loaded, and earlier CSD versions are not. The version number takes precedence over the CDH compatibility range. Note that the loaded (newest) CSD version might not be compatible with an older cluster, even if an older, compatible CSD version is still present in the CSD repository. Version numbers are recommended to be in the format "major.minor.micro".
Required?: Yes
{
"version" : "1"
}
The default user and group to run all commands as. If the user/group does not exist, the script will fail to run. The user/group should have been created by Cloudera Manager, the system packages or parcels. The CSD system will not create users and groups. The runAs can be overridden by a CM administrator in the configuration page.
The optional principal field lets you specify the Kerberos user that the daemon and all of its processes run as. More precisely, this field specifies the default value of a parameter that is added for configuring the Kerberos principal; that value replaces ${principal} in configurations. On non-secure clusters, ${principal} has the same value as ${user}.
Required?: Yes
{
"runAs" : {
"user" : "echoservice",
"group" : "echoservice",
"principal" : "echoservice" // this is optional
}
}
The number of service instances this service type can have. For example, CM only allows one instance of HDFS to exist per cluster.
Required?: No, defaults to no limit
{
"maxInstances" : 1
}
The path and file name to the icon that shows up beside the service. The icon must be a 16x16 PNG image.
Required?: No, a default service icon exists in Cloudera Manager
{
"icon" : "images/icon.png"
}
Describes compatibility requirements for this CSD. It is further discussed in detail here.
Required?: No, by default compatibility checks are not enforced.
{
"compatibility" : {
"generation" : 2,
"cdhVersion" : {
"min" : "4",
"max" : "5"
}
}
}
Key | Description | Required? |
---|---|---|
generation | An integer used to communicate compatibility between different CSDs of the same name. Compatibility is intentionally independent of CSD versions. | yes |
cdhVersion | The range of CDH cluster versions compatible with this CSD. Defaults to all versions. min is inclusive, but max is exclusive. If only the major version is specified for max, it is interpreted as everything up to but not including the next major version. Therefore the example above means everything starting from 4.0.0 inclusive, up to but not including 6.0.0. | no |
- Changed in Cloudera Manager 5.4.0: added support for "major.minor.micro" syntax, e.g. "min" : "5.3.0"
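The range semantics of cdhVersion (min inclusive, max exclusive, major-only max extended to the next major version) can be sketched as follows. This is an illustration under stated assumptions, not Cloudera Manager's code: the function names are hypothetical, and a max given as "major.minor" is assumed to be padded with zeros and treated as an exclusive bound, which the text above does not specify.

```python
def _parse(version) -> tuple:
    """Turn "5", "5.3" or "5.3.0" into a 3-part tuple, padding with zeros."""
    parts = [int(p) for p in str(version).split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def cdh_version_in_range(version, min_version, max_version) -> bool:
    """Check lower <= version < upper, where max is exclusive."""
    lower = _parse(min_version)
    if "." not in str(max_version):
        # Major-only max: everything up to, but not including, the next major.
        upper = (int(max_version) + 1, 0, 0)
    else:
        # Assumption: a more specific max is itself the exclusive bound.
        upper = _parse(max_version)
    return lower <= _parse(version) < upper
```

With "min" : "4" and "max" : "5", this accepts 4.0.0 and 5.9.1 and rejects 6.0.0, matching the example in the table.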
A structure that describes the interaction of the CSD with parcels.
Required?: No
{
"parcel" : {
"repoUrl" : "http://mywebsite.com",
"additionalRepoUrls": [ "http://anotherparcelsource.com" ],
"requiredTags" : [ "echo" ],
"optionalTags" : [ "echo-plugin" ]
}
}
Key | Description | Required? |
---|---|---|
repoUrl | Automatically adds the parcel repository url to the list in Cloudera Manager. This makes deploying the associated parcel easier since the user doesn't have to manually add a parcel url. | no |
additionalRepoUrls | A list of URLs which is added to the parcel repository list in Cloudera Manager, the same as 'repoUrl'. This is combined with 'repoUrl', and all unique URL strings will be added to the repository list. (Since: Cloudera Manager 6.0.0) | no |
requiredTags | The tags that must exist in the active parcels on the cluster. | no |
optionalTags | Any parcel that has one of these optional tags will have its environment scripts sourced when commands are run. | no |
For more information, see "The Parcel provides tags" and "Interaction with Parcels".
A list of service types that this CSD depends on. An instance of each of these services needs to exist on the cluster before a service of this CSD can be added. Declaring a dependency on a service means that the client configs for all dependencies will also be deployed to the process directory.
Required?: No
{
"serviceDependencies" : [
{
"name" : "ZOOKEEPER"
},
{
"name" : "HDFS",
"required" : "true"
}
]
}
Key | Description | Required? |
---|---|---|
name | The service type. | yes |
required | Whether an error should surface if the dependency is not met. Defaults to false. | no |
Note: If you add a dependency on the ZooKeeper service, then any process in that service (e.g. role daemon process, command process, client config deployment process) will get the ZooKeeper quorum in the ZK_QUORUM environment variable. This can be used in the control script to add configuration properties for the ZooKeeper quorum.
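As a rough illustration of using ZK_QUORUM from a control script, the sketch below builds a configuration line from the variable. It assumes the control script delegates to a Python helper, which is not something this reference prescribes. The property key echo.zookeeper.quorum and the function name are made up for the example, and the value is passed through unchanged because its format is not described here.

```python
import os


def zookeeper_property(environ=os.environ, key="echo.zookeeper.quorum"):
    """Build a "key=value" line from ZK_QUORUM, or None if it is not set.

    The property key is a made-up example; use whatever your service expects.
    """
    quorum = environ.get("ZK_QUORUM")
    if not quorum:
        return None
    return f"{key}={quorum}"


if __name__ == "__main__":
    line = zookeeper_property()
    if line is not None:
        print(line)
```

A control script could append the printed line to a properties file in the process directory before starting the daemon.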
In the wizard, when the service is added, it might be necessary to run some service commands before starting the service and right after starting the service. For example, creating a directory in HDFS.
Required?: No, by default only the start service commands are run
{
"serviceInit" : {
"preStartSteps" : [
{
"commandName" : "CreateHomeDirCommand"
}
],
"postStartSteps" : [
{
"commandName" : "SomeCommand",
"failureAllowed" : true
}
]
}
}
Key | Description | Required? |
---|---|---|
preStartSteps | A list of service commands to execute before the service starts. | no |
postStartSteps | A list of service commands to execute after the service starts. | no |
failureAllowed | For each service command in preStartSteps or postStartSteps, indicates whether the command is allowed to fail. Defaults to false. | no |
Defines a custom runner to gracefully bring down the service. After the runner completes, any roles that are still running will be stopped abruptly. Something similar is used for HBase and Accumulo: the runner notifies the master to shut down the service, ensuring that the worker roles shut down cleanly.
Required?: No, when the service is stopped, each role gets a SIGTERM
{
"stopRunner" : {
"relevantRoleTypes" : ["ECHO_WEBSERVER"],
"runner" : {
"program" : "scripts/control.sh",
"args" : ["admin", "stopAll"]
},
"timeout" : 180000,
"masterRole" : "ECHO_MASTER"
}
}
Key | Description | Required? |
---|---|---|
relevantRoleTypes | The types of roles to be stopped by the runner. These roles are set as stopped after the runner runs. If not specified, all roles are expected to be stopped. | no |
masterRole | The master role name. This is used to determine when the stop runner is completed. When one of the master roles is stopped then the runner is considered done. If defined, the runner will only expect the script to bring down one running master, among possibly multiple masters. | no |
runner | The script to gracefully bring down the roles. See Script Runner. | yes |
timeout | The duration to wait for the runner to complete. If the runner does not complete by then, the runner will be aborted and the processes will be stopped abruptly. Default is no timeout. | no |
Set to true if this service should show up in the Express and Add Cluster wizards. Only services that can be fully configured from inside the wizards should be added. Otherwise the initial cluster setup will fail.
Required?: No, defaults to false
{
"inExpressWizard" : true
}
Specifies a list of roles that have external links that should be shown on the service main page.
Required?: No
{
"rolesWithExternalLinks" : ["ECHO_MASTER_SERVER"]
}
A common requirement of services is that directories exist in HDFS with specific permissions. In order for these directories to exist, the user running the command needs to have elevated HDFS privileges. To aid with this, Cloudera Manager can create the needed directories and set the proper permissions.
Required?: No
{
"hdfsDirs" : [
{
"name" : "CreateUserHomeDirCommand",
"label" : "Create Echo User Home Dir",
"description" : "Creates the Echo user directory in HDFS",
"directoryDescription" : "Echo HDFS user directory",
"path" : "/user/${user}",
"permissions" : "0750"
}
]
}
Key | Description | Required? |
---|---|---|
name | The name of the command. | yes |
label | The user facing name in the "Actions" dropdown list. | yes |
description | The help string for the command. | yes |
directoryDescription | The description for this directory. | yes |
path | The path in HDFS. This can have standard substitutions. | yes |
permissions | Permissions for the directory. | yes |
A list of service commands. A command needs to execute on a host. Since a service is an abstract entity made up of roles, it does not live on a specific host. Service commands work by pointing to a role command on a specific role type. runMode can be used to specify how this command should be run.
Required?: No
{
"commands" : [
{
"name" : "service_cmd1",
"label" : "Service Cmd1 Label",
"description" : "Service Cmd1 Help",
"roleCommand" : "role_cmd1",
"roleName" : "ECHO_WEBSERVER",
"runMode" : "all"
}
]
}
Key | Description | Required? |
---|---|---|
name | The name of the command. | yes |
label | The user facing name in the "Actions" dropdown list. | yes |
description | The help string for the command. | yes |
roleCommand | The role command the service command will execute. | yes |
roleName | The role type the role command lives on. | yes |
runMode | Enum of "all" or "single". See runMode. | yes |
all
- The service command will run on all the roles in the service with the required role state (specified in the role command description). The command will wait until all the roles have completed before completing the service command.
single
- The service command will pick a single random role that is in the required role state (specified in the role command description) to run the role command. The service command will wait until the role has completed the command. This type of command is often needed if the command should only be run once, for example an "Initialization" type of command.
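The difference between the two run modes can be sketched as a role selection step. This is only an illustration: roles are modeled as simple (name, state) pairs, and select_target_roles is a hypothetical name, not a Cloudera Manager API.

```python
import random


def select_target_roles(roles, required_state, run_mode, rng=random):
    """Pick the roles a service command would run on.

    roles: list of (role_name, state) pairs, a simplified stand-in for CM's model.
    """
    eligible = [name for name, state in roles if state == required_state]
    if run_mode == "all":
        return eligible
    if run_mode == "single":
        # One random eligible role; empty list if none qualifies.
        return [rng.choice(eligible)] if eligible else []
    raise ValueError(f"unknown runMode: {run_mode!r}")
```

In both modes only roles in the required role state are candidates; "single" then narrows the candidates to one.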
A list of parameters used to configure the service. See Parameters.
Required?: No
{
"parameters" : "..."
}
List of external Kerberos principals used by the service. Cloudera Manager will not manage these principals, but this can be used to refer to any external principals in configuration.
Whether Kerberos authentication is used. A service will require Kerberos authentication if any of the following is true:
- Any dependency of the service requires Kerberos authentication
- This field returns a value of "true" or "kerberos" (case-insensitive)
HDFS Encryption needs to talk to a Key Management Server. If your CSD is implementing a Hadoop Key Management Server interface, then you need to include providesKms.
Required?: No
{
"providesKms" : {
"roleName" : "MY_KMS_ROLE",
"insecureUrl" : "http://${host}:${kms_port}/kms",
"secureUrl" : "https://${host}:${kms_ssl_port}/kms",
"loadBalancerUrl" : "${kms_load_balancer}"
}
}
Key | Description | Required? |
---|---|---|
roleName | Name of the role that provides the KMS interface. | Usually. Can omit if your CSD always configures a load balancer. |
insecureUrl | URL of KMS when SSL is not enabled and the load balancer is not in use. | Usually. Can omit if your CSD always configures a load balancer. |
secureUrl | URL of KMS when SSL is enabled and the load balancer is not in use. | Usually. Can omit if your CSD always configures a load balancer, or if it never supports SSL. |
loadBalancerUrl | URL of load balancer. Must be specified when there are multiple relevant roles. Note that you can only reference service-level parameters in substitutions for loadBalancerUrl. | If your CSD can be configured with multiple roles of the relevant type (see topology), then this must be present. |
In addition, if roleName is provided and the matching role defines a configGenerator for “core-site.xml”, then core-site.xml will automatically get everything in the core-site.xml from HDFS client configuration. If parameters registered with this configGenerator have keys that conflict with contents in HDFS core-site.xml, the values from HDFS will be overridden.
Note that SSL enablement is determined by sslServer for the relevant role.
The KEYTRUSTEE CSD serves as a complete example of how to create a custom KMS that HDFS can use for encryption.
If the service supports rolling restart, the steps can be specified here. See Rolling Restart for more details.
{
"rollingRestart" : {
"nonWorkerSteps" : [{
"roleName" : "NON_WORKER_ROLE",
"bringDownCommands" : [ "Stop" ],
"bringUpCommands" : [ "Start", "role_cmd" ]
}],
"workerSteps" : {
"roleName" : "WORKER_ROLE",
"bringDownCommands" : [ "service_cmd", "Stop" ],
"bringUpCommands" : [ "Start" ]
}
}
}
A service can have a list of role types. Each role of a specific role type is associated with a process on a host.
{
"roles" : [
{
"name" : "ECHO_WEBSERVER",
"label" : "Web Server",
"pluralLabel" : "Web Servers",
"startRunner" : {
"program" : "scripts/control.sh",
"args" : [ "start" ],
"environmentVariables" : {
"WEBSERVER_PORT" : "${port_num}"
}
},
"stopRunner" : {
"timeout" : "90000",
"runner" : {
"program" : "scripts/stop_echows.sh",
"args" : ["cmdlineArg1"],
"environmentVariables" : {
"FOO_VAR" : "bar"
}
}
},
"externalLink" : {
"name" : "webserver_web_ui",
"label" : "Web Server Web UI",
"url" : "http://${host}:${webserver_webui_port}"
},
"additionalExternalLinks" : [
{
"name" : "webserver_web_ui2",
"label" : "Web Server WebUI2",
"url" : "http://${host}:${webserver_webui_port}/ui2"
}
],
"topology" : {
"minInstances" : "2",
"maxInstances" : "10",
"softMinInstances" : "3",
"softMaxInstances" : "6",
"requiresOddNumberOfInstances" : "false"
},
"logging" : {
"dir" : "/var/log/echo",
"filename" : "webserver.log",
"modifiable" : true,
"configName" : "log.dir",
"loggingType" : "log4j"
},
"commands" : [
{
"name" : "role_cmd1",
"label" : "Role Cmd1 Label",
"description" : "Role Cmd1 Help",
"expectedExitCodes" : [0, 1, 2],
"requiredRoleState" : "running",
"commandRunner" : {
"program" : "scripts/control.sh",
"args" : ["cmd1"]
}
}
],
"configWriter" : "....",
"parameters" : "....",
"cgroup" : "...."
}
]
}
This is the logical name of the role. In CM it is the role type. A current limitation is that the role type needs to be globally unique. Because of this, it is suggested that the service type be prepended to the role type to make it scoped to this service.
Required?: Yes
{
"name" : "ECHO_WEBSERVER"
}
Note: must be uppercase and only contain letters, numbers and underscores
The user-facing name, e.g. in the "Add Service" wizard.
Required?: Yes
{
"label" : "Web Server"
}
The plural name that is user facing. Shows up in the home screen.
Required?: Yes
{
"pluralLabel" : "Web Servers"
}
- New in Cloudera Manager 5.3.0
The script to run that starts this role.
Required?: Yes
{
"startRunner" : {
"program" : "scripts/control.sh",
"args" : [ "start" ],
"environmentVariables" : {
"WEBSERVER_PORT" : "${port_num}"
}
}
}
The startRunner structure is of type Script Runner.
This section provides two capabilities: custom stop behavior and graceful role shutdown. When this descriptor is not specified, the standard behavior applies: the process will receive a SIGTERM and will be allowed to complete its shutdown in a certain amount of time (currently hardcoded). If the process cannot terminate in time or report the expected exit code it will be forcefully killed (group SIGKILL). This is mutually exclusive with using a stopRunner on the service-level descriptor.
Required?: No
{
"stopRunner" : {
"timeout" : "90000",
"runner" : {
"program" : "scripts/stop_echows.sh",
"args" : ["cmdlineArg1"],
"environmentVariables" : {
"FOO_VAR" : "bar"
}
}
}
}
- timeout: the default timeout in milliseconds. Any positive non-zero value enables the graceful role shutdown feature. The user can configure this value in Cloudera Manager; the value specified here is the default. A value of zero (0) means wait indefinitely for the process to terminate.
- When a runner is specified (of type Script Runner), a custom script performs the stop operation for the role. The PID of the currently running process is passed via the Cloudera Manager-provided environment variable PID_TO_STOP. When runner is not specified, the standard behavior applies, but the graceful timeout is respected before a forced kill.
- New in Cloudera Manager 5.11.1
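A custom stop script receives the target process through PID_TO_STOP, as described above. The sketch below shows one way such a script might use it. It is an assumption-laden example: the example descriptor points at a shell script, whereas this is Python; stop_role is a hypothetical name; and sending SIGTERM is just a placeholder for whatever graceful shutdown your service needs.

```python
import os
import signal


def stop_role(environ=os.environ, kill=os.kill):
    """Send SIGTERM to the process named by PID_TO_STOP and return its pid."""
    raw = environ.get("PID_TO_STOP")
    if raw is None:
        raise RuntimeError("PID_TO_STOP is not set")
    pid = int(raw)
    # A real stop script would typically do service-specific work here
    # (for example, asking the daemon to drain) before or instead of SIGTERM.
    kill(pid, signal.SIGTERM)
    return pid


if __name__ == "__main__" and "PID_TO_STOP" in os.environ:
    stop_role()
```

The kill parameter exists only so the function can be exercised without signalling a real process.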
Specifies an external link that shows up in the status page of the role. This is often a Web UI for the role. If this role is listed in the service's rolesWithExternalLinks, this external link is the one used in the service status page.
Required?: No
{
"externalLink" : {
"name" : "webserver_web_ui",
"label" : "Web Server Web UI",
"url" : "http://${host}:${webserver_webui_port}",
"secureUrl" : "https://${host}:${webserver_webui_ssl_port}"
}
}
Key | Description | Required? |
---|---|---|
name | The internal identifier for the link. | yes |
label | The user-facing name. Shows up on the Status Page. | yes |
url | The url to the status page. This can have standard substitutions. | yes |
secureUrl | The url to the status page when SSL is enabled. This can have standard substitutions. If secureUrl is not specified, then url is used instead. SSL enablement is determined by sslServer in the same role. | no |
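The fallback from secureUrl to url, followed by substitution, can be sketched as below. This is not how Cloudera Manager implements it: resolve_external_link is a hypothetical name, and Python's string.Template is used only because it happens to accept the same ${name} form as the SDL substitutions.

```python
from string import Template


def resolve_external_link(link: dict, ssl_enabled: bool, values: dict) -> str:
    """Pick secureUrl when SSL is on and it is defined, else url; then substitute."""
    if ssl_enabled and "secureUrl" in link:
        raw = link["secureUrl"]
    else:
        raw = link["url"]
    # string.Template understands the ${name} form used in the SDL examples.
    return Template(raw).substitute(values)
```

Here values stands in for the substitution variables (such as host and the port parameters) that Cloudera Manager would supply.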
Specifies additional external links. These links will show up alongside the externalLink on the role process page. These links will not show up in the main status page of the role. Ideally, an externalLink should be specified before specifying additional links.
Required?: No
{
"additionalExternalLinks" : [
{
"name" : "webserver_web_ui2",
"label" : "Web Server WebUI2",
"url" : "http://${host}:${webserver_webui_port}/ui2",
"secureUrl" : "https://${host}:${webserver_webui_ssl_port}"
}
]
}
Key | Description | Required? |
---|---|---|
name | The internal identifier for the link. | yes |
label | The user facing name. | yes |
url | The url to the status page. This can have standard substitutions. | yes |
secureUrl | The url to the status page when SSL is enabled. This can have standard substitutions. If secureUrl is not specified, then url is used instead. SSL enablement is determined by sslServer in the same role. | no |
Provides restrictions on the number of instances of this role type that should exist on the cluster. For example, a singleton type role like a master can have minInstances = 1 and maxInstances = 1.
Required?: No
{
"topology" : {
"minInstances" : "2",
"maxInstances" : "10",
"softMinInstances" : "3",
"softMaxInstances" : "6",
"requiresOddNumberOfInstances" : "false",
"placementRules" : [ ]
}
}
Key | Description | Required? |
---|---|---|
minInstances | The minimum number of instances. Defaults to 1. | no |
maxInstances | The maximum number of instances. Default is no limit. | no |
softMinInstances | The recommended minimum number of instances. A warning will be displayed on the service's Instances page in Cloudera Manager when fewer than this number of roles are configured. By default, there is no recommended minimum. | no |
softMaxInstances | The recommended maximum number of instances. A warning will be displayed on the service's Instances page in Cloudera Manager when more than this number of roles are configured. By default, there is no recommended maximum. | no |
requiresOddNumberOfInstances | This should be set to true in cases where only an odd number of instances is allowed for the role type. This might be important for some HA architectures. Defaults to false. | no |
placementRules | List of special placementRules limiting where a role may be placed. Default is no special rules. | no |
- New in Cloudera Manager 5.5.0, added placementRules
- New in Cloudera Manager 5.8.0, added softMinInstances and softMaxInstances
- New in Cloudera Manager 5.14.0, added requiresOddNumberOfInstances
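The topology keys can be read as a set of checks on the number of role instances. The following sketch applies them to a count. It is illustrative only: check_instance_count is a hypothetical name, values are converted from strings because the example above quotes them, and treating an even count as an error (rather than a warning) when requiresOddNumberOfInstances is true is an assumption.

```python
def check_instance_count(topology: dict, count: int):
    """Return (errors, warnings) for a role count against a topology block."""
    errors, warnings = [], []

    def as_int(key, default=None):
        value = topology.get(key, default)
        return None if value is None else int(value)  # example uses strings

    min_i, max_i = as_int("minInstances", 1), as_int("maxInstances")
    soft_min, soft_max = as_int("softMinInstances"), as_int("softMaxInstances")
    odd = str(topology.get("requiresOddNumberOfInstances", "false")).lower() == "true"

    if count < min_i:
        errors.append(f"fewer than minInstances ({min_i})")
    if max_i is not None and count > max_i:
        errors.append(f"more than maxInstances ({max_i})")
    if odd and count % 2 == 0:
        errors.append("an odd number of instances is required")
    if soft_min is not None and count < soft_min:
        warnings.append(f"fewer than softMinInstances ({soft_min})")
    if soft_max is not None and count > soft_max:
        warnings.append(f"more than softMaxInstances ({soft_max})")
    return errors, warnings
```

With the example topology, a count of 4 produces no errors or warnings, while a count of 8 produces only a softMaxInstances warning.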
Placement Rules allow special restrictions for where a role may be placed.
alwaysWith
{
"type" : "alwaysWith",
"roleType" : "PRIMARY_ROLE_NAME"
}
When this rule is present, the current role must always be placed on the same host as wherever the role with name "PRIMARY_ROLE_NAME" is placed. The current role will no longer appear in the wizard when adding this service. Instead, one instance of this role will automatically be placed on any host that has the primary role. If the user ever assigns roles in a way that violates this placement rule, the service will have a configuration error and fail to start.
alwaysWithAny
{
"type" : "alwaysWithAny",
"roleTypes" : [ "PRIMARY_ROLE_NAME_1", "PRIMARY_ROLE_NAME_2"]
}
When this rule is present, the current role must always be placed on the same host as wherever the roles with the name "PRIMARY_ROLE_NAME_1", "PRIMARY_ROLE_NAME_2" are placed. The current role will no longer appear in the wizard when adding this service. Instead, one instance of this role will automatically be placed on any host that has at least one of the primary roles. If more than one of the primary roles are themselves placed in the same host, then only one instance of this role will be automatically placed on that host.
An alwaysWithAny rule should define at least two unique primary roles. The alwaysWithAny rule is mutually exclusive with the alwaysWith rule: the two should not be defined together for the same role. If the user ever assigns roles in a way that violates this placement rule, the service will have a configuration error and fail to start.
neverWith
{
"type" : "neverWith",
"roleTypes" : [ "INCOMPATIBLE_ROLE_NAME_1", "INCOMPATIBLE_ROLE_NAME_2" ]
}
When this rule is present, the current role must never be placed on the same host as wherever the listed incompatible roles are placed. If the user ever assigns roles in a way that violates this placement rule, the service will have a configuration error and fail to start.
Starting with Cloudera Manager 5.13.0, all the placement rules for a role should be of a unique “type”. That is, a role can have a maximum of one placement rule per type.
- Placement Rules (alwaysWith and neverWith) new in Cloudera Manager 5.5.0
- Placement Rule (alwaysWithAny) new in Cloudera Manager 5.13.0
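The structural constraints on a role's placementRules list (one rule per type, alwaysWith and alwaysWithAny not combined, at least two unique primary roles for alwaysWithAny) can be checked with a short routine. This is a sketch of those stated constraints only, with a hypothetical function name; it does not check actual host assignments.

```python
def validate_placement_rules(rules: list) -> list:
    """Return a list of problems found in a role's placementRules list."""
    problems = []
    types = [rule.get("type") for rule in rules]

    # At most one rule per type (stated for Cloudera Manager 5.13.0 and later).
    for rule_type in sorted(set(types), key=str):
        if types.count(rule_type) > 1:
            problems.append(f"more than one rule of type {rule_type!r}")

    # alwaysWith and alwaysWithAny must not be combined on the same role.
    if "alwaysWith" in types and "alwaysWithAny" in types:
        problems.append("alwaysWith and alwaysWithAny are mutually exclusive")

    # alwaysWithAny needs at least two unique primary roles.
    for rule in rules:
        if rule.get("type") == "alwaysWithAny":
            if len(set(rule.get("roleTypes", []))) < 2:
                problems.append("alwaysWithAny needs at least two unique roleTypes")
    return problems
```

An empty result means the list satisfies these three constraints; it says nothing about whether a given role-to-host assignment honors the rules.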
Health Aggregation allows reporting health of the service based on the health of one or more role(s). A role can define one of the following health aggregation types, depending on the topology of the role, but not both of them.
singleton
A type of health aggregation where the service health is reported based on the health of the singleton role. The service health has a 1:1 mapping to the health of the singleton role. For example, if the role is "Good" (i.e. GREEN), then the service-level health check reports "Good". A “singleton” health aggregation type should only be used for a singleton role. A role is considered to be a singleton role if its topology has “maxInstances” = 1.
{
"healthAggregation" : {
"type" : "singleton"
}
}
nonSingleton
A type of health aggregation where the service health is reported based on the health of all the roles of this type. A “nonSingleton” health aggregation type should be only used with a non-singleton role. A role is considered to be a non-singleton role if its topology has “maxInstances” > 1.
{
"healthAggregation" : {
"type" : "nonSingleton",
"percentGreenForGreen" : 95.0,
"percentYellowGreenForYellow" : 90.0
}
}
percentGreenForGreen: A double value. The associated service health will report "Good" (i.e. GREEN) if the total percentage of "Good" roles is strictly greater than this value. If this condition evaluates to true, then "percentYellowGreenForYellow" is not evaluated.
percentYellowGreenForYellow: A double value. The associated service health will report "Concerning" (i.e. YELLOW) if the total percentage of "Good" and "Concerning" roles is strictly greater than this value. If this condition evaluates to false, then the service health is reported as "Bad" (i.e. RED). Note that these rules imply that the health check will never return “Bad” (i.e. RED) unless at least one role is “Bad”.
- Health Aggregation new in Cloudera Manager 5.13.0.
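The nonSingleton thresholds can be worked through with a small function. This is a sketch of the rules as described, not Cloudera Manager's implementation: the function name and the "GOOD" / "CONCERNING" / "BAD" labels are assumptions, and the case of zero roles is not covered by the description, so it is rejected here.

```python
def aggregate_health(role_healths, percent_green_for_green, percent_yellow_green_for_yellow):
    """Map a list of role healths ("GOOD", "CONCERNING", "BAD") to a service health."""
    if not role_healths:
        raise ValueError("at least one role is needed")  # empty case is an open question
    total = len(role_healths)
    good = sum(1 for h in role_healths if h == "GOOD")
    concerning = sum(1 for h in role_healths if h == "CONCERNING")

    # Both comparisons are strict, as described above.
    if 100.0 * good / total > percent_green_for_green:
        return "GOOD"
    if 100.0 * (good + concerning) / total > percent_yellow_green_for_yellow:
        return "CONCERNING"
    return "BAD"
```

With the example thresholds (95.0 and 90.0) and 100 roles, 96 good roles give "GOOD". Exactly 95 good roles plus 5 concerning ones give "CONCERNING", because 95 is not strictly greater than 95.0.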
Instructs Cloudera Manager where to look for the role log file. By specifying this structure, the CSD can also participate in log collection when a support bundle is requested. In addition, if the loggingType is set to log4j, then a log4j.properties file is generated from user-provided configuration and sent to the agent's process directory. The start script can then place the log4j.properties file in the appropriate location where the role process can read it.
Since: Cloudera Manager 5.4.0, it is also possible to specify logback as the loggingType. By default, a logback.xml file is generated from user-provided configuration when this logging type is chosen, with support for appending XML snippets into the generated logback configuration file.
Since: Cloudera Manager 5.5.0, it is possible to specify glog as the loggingType. This is intended for services that use glog for logging. Environment variables for glog (prefixed with "GLOG_") will be emitted into the role's environment.
Required?: No
Log4j example
{
"logging" : {
"dir" : "/var/log/echo",
"filename" : "webserver.log",
"modifiable" : true,
"configName" : "log.dir",
"loggingType" : "log4j",
"additionalConfigs" : [
{
"key" : "additional.log.key",
"value" : "additional.log.value"
}
]
}
}
Logback example
{
"logging" : {
"dir" : "/var/log/echo",
"filename" : "webserver.log",
"modifiable" : true,
"configName" : "log.dir",
"loggingType" : "logback",
"additionalConfigs" : [
{
"key" : "extraLoggerConfig",
"value" : "<logger name=\"org.apache.commons.beanutils\" level=\"ERROR\"/>"
}
]
}
}
glog example
{
"logging" : {
"dir" : "/var/log/echo",
"filename" : "webserver.INFO",
"modifiable" : true,
"loggingType" : "glog"
}
}
Key | Description | Required? |
---|---|---|
dir | The location on disk where the logs are written. This directory gets created automatically by the CM agent. | yes |
filename | The log file name. This can have standard substitutions. | yes |
configFilename | The name of the configuration file. Defaults to log4j.properties (log4j) or logback.xml (logback) | no |
modifiable | Whether the directory should be exposed in CM configuration UI for modification. Default is false. | no |
configName | The name of the configuration key to output when being written to a config file. Default is "log_dir". | no |
loggingType | Enum of "log4j", "logback", "glog", or "other". Defaults to "other". See loggingType. | no |
additionalConfigs | List of ConfigEntry to add near the end of the file. Only works with log4j or logback files. In the case of log4j, these additional configs will go after all regular parameters, but before any Advanced Configuration Snippets. In the case of logback, the "value" field of each of the additional configs will be added to the generated configuration XML. (Not allowed for Gateway roles until CM 5.3.2) | no |
kerberosPrincipals | List of Kerberos Principal Config Entries to add to config file. | no |
- New in Cloudera Manager 5.2.0, added additionalConfigs for any non-gateway role. Gateway support added in 5.3.2
log4j
- Cloudera Manager auto-generates the following parameters for the role:
  - Log Threshold
  - Max file size
  - Max backup index size
  - Log4j safety valve
- A log4j.properties file is generated and deployed to the process directory.
logback (Since: Cloudera Manager 5.4.0)
- Cloudera Manager auto-generates the following parameters for the role:
  - Log Threshold
  - Max file size
  - Max backup index size
  - Logback XML override
- A logback.xml file is generated and deployed to the process directory.
glog (Since: Cloudera Manager 5.5.0)
- Cloudera Manager auto-generates some parameters for a role. Each one will automatically be emitted into the environment. The parameters are listed below with their corresponding environment variable names in parentheses:
  - Log directory (GLOG_log_dir)
  - Minimum log level (GLOG_minloglevel)
  - Maximum log level to buffer (GLOG_logbuflevel)
  - Minimum log verbosity (GLOG_v)
  - Maximum log size (GLOG_max_log_size)
- When using glog-based logging, the filename field must end in ".INFO".
other
- Cloudera Manager doesn't do anything special for this logging type.
A list of commands that can be run on the role. A command has a few pieces of metadata, but essentially it is a script that gets executed on the host of the role. The command script should return one of a predefined list of exit codes to be considered successful. Role commands are the building blocks for service commands.
Required?: No
{
"commands" : [
{
"name" : "role_cmd1",
"label" : "Role Cmd1 Label",
"description" : "Role Cmd1 Help",
"expectedExitCodes" : [0, 1, 2],
"requiredRoleState" : "running",
"commandRunner" : {
"program" : "scripts/control.sh",
"args" : ["cmd1"]
}
}
]
}
Key | Description | Required? |
---|---|---|
name | The internal name of the role command. | yes |
label | The user facing name in the "Actions" dropdown list. | yes |
description | The help text for the command. | yes |
expectedExitCodes | The exit codes from the command that are expected. If the command returns an exit code that is not in this list, the command will be flagged as failed. | yes |
requiredRoleState | The state the role needs to be in for the command to be available. Enum of "stopped" and "running". | yes |
commandRunner | The script to execute for this command. The structure is of type Script Runner. | yes |
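The role of expectedExitCodes can be illustrated by running a process and comparing its exit code against the list. This is a stand-alone sketch with a hypothetical function name; a small Python child process stands in for scripts/control.sh, and nothing here reflects how the Cloudera Manager agent actually launches commands.

```python
import subprocess
import sys


def run_role_command(argv, expected_exit_codes):
    """Run argv and report (exit_code, succeeded) against the expected codes."""
    completed = subprocess.run(argv)
    return completed.returncode, completed.returncode in expected_exit_codes


if __name__ == "__main__":
    # Stand-in for scripts/control.sh: a child process that exits with code 2.
    demo = [sys.executable, "-c", "import sys; sys.exit(2)"]
    print(run_role_command(demo, [0, 1, 2]))
    print(run_role_command(demo, [0]))
```

The same exit code counts as success in the first call and as failure in the second, because only the expected list differs.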
Specifies what configuration files should be written out to the process directory when commands get run for this role. This includes the start command and additional role commands.
Required?: No
{
"configWriter" : "..."
}
The configWriter structure is of type Configuration Writer.
A list of parameters used to configure the role. See Parameters. Roles inherit all the parameters that are specified in the service.
Required?: No
{
"parameters" : "..."
}
Use when this role acts as an SSL Server. This automatically generates the relevant role parameters and helps CM know when SSL is relevant, such as whether it should use regular or secure URLs.
Required?: No
Java Keystore Format (JKS)
{
"sslServer" : {
"keyIdentifier" : "echo_master",
"enabledConfigName" : "echo.ssl.enabled",
"keystoreLocationConfigName" : "echo.ssl.keystore.location",
"keystorePasswordConfigName" : "echo.ssl.keystore.password.script",
"keystorePasswordCredentialProviderCompatible" : false,
"keystorePasswordScriptBased" : true,
"keyPasswordOptionality" : "required",
"keystoreKeyPasswordConfigName" : "echo.ssl.keystore.key.password.script",
"keystoreKeyPasswordCredentialProviderCompatible" : false,
"keystoreKeyPasswordScriptBased" : true
}
}
Key | Description | Required? |
---|---|---|
keystoreFormat | The format of the ssl server, either "jks" or "pem". If not specified, defaults to "jks". The rest of this section assumes the JKS format is configured. | no |
keyIdentifier | The alias / identifier for this key in the keystore | yes |
enabledConfigName | Config name to emit when ssl_enabled is used in a config file. If null, ssl_enabled will not be emitted into config files, and can only be used in substitutions like ${ssl_enabled}. | no |
keystoreLocationConfigName | Config name to emit when ssl_server_keystore_location is used in a config file. If null, ssl_server_keystore_location will not be emitted into config files, and can only be used in substitutions like ${ssl_server_keystore_location}. | no |
keystorePasswordConfigName | Config name to emit when ssl_server_keystore_password is used in a config file. If null, ssl_server_keystore_password will not be emitted into config files, and can only be used in substitutions like ${ssl_server_keystore_password}. You must set this in order to use keystorePasswordCredentialProviderCompatible or keystorePasswordScriptBased. | no |
keystorePasswordCredentialProviderCompatible | Defaults to false. Whether ssl_server_keystore_password can use the Credential Provider, a Hadoop mechanism that allows for the encrypting of sensitive items in an encrypted store. Has no effect on substitutions like ${ssl_server_keystore_password}. Requires keystorePasswordConfigName to be set. Mutually exclusive with keystorePasswordScriptBased. | no |
keystorePasswordScriptBased | Defaults to false. If true, the following things happen when used in a configFile (not through substitutions like ${ssl_server_keystore_password}): 1) The regular password for the keystore is no longer emitted. 2) In its place, CM will emit the full path to a script, and that script will echo the value of this desired password to stdout. Requires keystorePasswordConfigName to be set. Mutually exclusive with keystorePasswordCredentialProviderCompatible. Has no effect on substitutions like ${ssl_server_keystore_password}. For this functionality to be useful, your code must run the script in the parameter to get the real password. | no |
keyPasswordOptionality | Whether to expose the ssl_server_keystore_keypassword parameter, and whether it is required or optional. When not specified, no parameter is emitted for ssl_server_keystore_keypassword. See Parameter Optionality | no |
keystoreKeyPasswordConfigName | Config name to emit when ssl_server_keystore_keypassword is used in a config file. If null, ssl_server_keystore_keypassword will not be emitted into config files, and can only be used in substitutions like ${ssl_server_keystore_keypassword}. You must set this in order to use keystoreKeyPasswordCredentialProviderCompatible or keystoreKeyPasswordScriptBased. | no |
keystoreKeyPasswordCredentialProviderCompatible | Defaults to false. Whether ssl_server_keystore_keypassword can use the Credential Provider, a Hadoop mechanism that allows for the encrypting of sensitive items in an encrypted store. Has no effect on substitutions like ${ssl_server_keystore_keypassword}. Requires keystoreKeyPasswordConfigName to be set. Mutually exclusive with keystoreKeyPasswordScriptBased. | no |
keystoreKeyPasswordScriptBased | Defaults to false. If true, the following things happen when used in a configFile (not through substitutions like ${ssl_server_keystore_keypassword}): 1) The regular password for the key is no longer emitted. 2) In its place, CM will emit the full path to a script, and that script will echo the value of this desired password to stdout. Requires keystoreKeyPasswordConfigName to be set. Mutually exclusive with keystoreKeyPasswordCredentialProviderCompatible. Has no effect on substitutions like ${ssl_server_keystore_keypassword}. For this functionality to be useful, your code must run the script in the parameter to get the real password. | no |
- New in Cloudera Manager 5.2.0, initial SSL Server support
- New in Cloudera Manager 5.5.0, format type, custom config names, and script-based passwords for generated parameters.
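With the script-based options, the consuming code has to run the emitted script to obtain the real password. A minimal sketch of that consumer side, assuming the script path has already been read from the config file (for example from the echo.ssl.keystore.password.script entry) and that the script is executable. The function name and the stripping of the trailing newline are illustrative choices, not documented CM behavior.

```python
import subprocess

def read_password_from_script(script_path):
    """Run the password script and return what it echoes to stdout.

    script_path is assumed to be the full path that CM emitted in place
    of the password. A non-zero exit raises CalledProcessError.
    """
    result = subprocess.run(
        [script_path], capture_output=True, text=True, check=True
    )
    # Drop the newline that echo adds; keep any other whitespace intact.
    return result.stdout.rstrip("\n")
```

The password is kept in memory only; writing it back to disk would defeat the purpose of the script indirection.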
Automatically generated parameters can be used like any other role parameters, commonly as substitutions in config files via additionalConfigs or in environment variables. If using script-based passwords, then these parameters are commonly emitted directly into a config file.
Users will see these parameters in the normal Configuration page as well as when adding this service.
Parameter name | Type | Description |
---|---|---|
ssl_enabled | boolean | Enable SSL Requests to this role. |
ssl_server_keystore_location | path | Path to the keystore file containing the server certificate and private key used for SSL. Used when this role is acting as an SSL server. Keystore must be in JKS format. |
ssl_server_keystore_password | password | Password for the JKS keystore used when this role is acting as an SSL server. |
ssl_server_keystore_keypassword | password | Password that protects the private key contained in the JKS keystore used when this role is acting as an SSL server. |
- New in Cloudera Manager 5.2.0, initial SSL Server support
- New in Cloudera Manager 5.5.0, SSL configs appear in wizards.
PEM Certificate Format
{
"sslServer" : {
"keystoreFormat" : "pem",
"enabledConfigName" : "echo.ssl.enabled",
"privateKeyLocationConfigName" : "echo.ssl.keystore.location",
"privateKeyPasswordConfigName" : "echo.ssl.keystore.password.script",
"privateKeyPasswordCredentialProviderCompatible" : false,
"privateKeyPasswordScriptBased" : true
}
}
Key | Description | Required? |
---|---|---|
keystoreFormat | The format of the ssl server, either "jks" or "pem". If not specified, defaults to "jks". The rest of this section assumes this was configured to "pem" | no (yes for PEM format) |
enabledConfigName | Config name to emit when ssl_enabled is used in a config file. If null, ssl_enabled will not be emitted into config files, and can only be used in substitutions like ${ssl_enabled}. | no |
privateKeyLocationConfigName | Config name to emit when ssl_server_privatekey_location is used in a config file. If null, ssl_server_privatekey_location will not be emitted into config files, and can only be used in substitutions like ${ssl_server_privatekey_location}. | no |
privateKeyPasswordConfigName | Config name to emit when ssl_server_privatekey_password is used in a config file. If null, ssl_server_privatekey_password will not be emitted into config files, and can only be used in substitutions like ${ssl_server_privatekey_password}. You must set this in order to use privateKeyPasswordCredentialProviderCompatible or privateKeyPasswordScriptBased. | no |
privateKeyPasswordCredentialProviderCompatible | Defaults to false. Whether ssl_server_privatekey_password can use the Credential Provider, a Hadoop mechanism that allows for the encrypting of sensitive items in an encrypted store. Has no effect on substitutions like ${ssl_server_privatekey_password}. Requires privateKeyPasswordConfigName to be set. Mutually exclusive with privateKeyPasswordScriptBased. | no |
privateKeyPasswordScriptBased | Defaults to false. If true, the following things happen when used in a configFile (not through substitutions like ${ssl_server_privatekey_password}): 1) The regular password for the private key is no longer emitted. 2) In its place, CM will emit the full path to a script, and that script will echo the value of this desired password to stdout. Requires privateKeyPasswordConfigName to be set. Mutually exclusive with privateKeyPasswordCredentialProviderCompatible. Has no effect on substitutions like ${ssl_server_privatekey_password}. For this functionality to be useful, your code must run the script in the parameter to get the real password. | no |
certificateLocationConfigName | Optional. Config name to emit when ssl_server_certificate_location is used in a config file. If null, ssl_server_certificate_location will not be emitted into config files, and can only be used in substitutions like ${ssl_server_certificate_location}. | no |
certificateLocationDefault | Optional. Default value for ssl_server_certificate_location. | no |
caCertificateLocationConfigName | Optional. Config name to emit when ssl_server_ca_certificate_location is used in a config file. If null, ssl_server_ca_certificate_location will not be emitted into config files, and can only be used in substitutions like ${ssl_server_ca_certificate_location}. | no |
caCertificateLocationDefault | Optional. Default value for ssl_server_ca_certificate_location. | no |
- New in Cloudera Manager 5.5.0, PEM format SSL Server support
Automatically generated parameters can be used like any other role parameters, commonly as substitutions in config files via additionalConfigs or in environment variables. If using script-based passwords, then these parameters are commonly emitted directly into a config file.
Users will see these parameters in the normal Configuration page as well as when adding this service.
Parameter name | Type | Description |
---|---|---|
ssl_enabled | boolean | Enable SSL Requests to this role. |
ssl_server_keystore_location | path | The path to the TLS/SSL file containing the server certificate and private key used for TLS/SSL. Used when this role is acting as a TLS/SSL server. The certificate file must be in PEM format. This file can be created by concatenating the certificate.pem file with the private key.pem file. |
ssl_server_keystore_password | password | The password for the private key in this role's TLS/SSL Server Certificate and Private Key file. If left blank, this indicates that the private key is not protected by a password. |
- New in Cloudera Manager 5.5.0, PEM format SSL Server support
If this role acts as an SSL Client of some SSL Server, use sslClient. This will automatically generate role parameters related to SSL client configuration.
Required?: No
Java Truststore Format (JKS)
{
"sslClient" : {
"truststoreLocationConfigName" : "echo.ssl.truststore.location",
"truststorePasswordConfigName" : "echo.ssl.truststore.password.script",
"truststorePasswordCredentialProviderCompatible" : false,
"truststorePasswordScriptBased" : true
}
}
Key | Description | Required? |
---|---|---|
truststoreFormat | The format of the ssl truststore, either "jks" or "pem". If not specified, defaults to "jks". | no |
truststoreLocationConfigName | Config name to emit when ssl_client_truststore_location is used in a config file. If null, ssl_client_truststore_location will not be emitted into config files, and can only be used in substitutions like ${ssl_client_truststore_location}. | no |
truststorePasswordConfigName | Config name to emit when ssl_client_truststore_password is used in a config file. If null, ssl_client_truststore_password will not be emitted into config files, and can only be used in substitutions like ${ssl_client_truststore_password}. You must set this in order to use truststorePasswordCredentialProviderCompatible or truststorePasswordScriptBased. | no |
truststorePasswordCredentialProviderCompatible | Defaults to false. Whether ssl_client_truststore_password can use the Credential Provider, a Hadoop mechanism that allows for the encrypting of sensitive items in an encrypted store. Has no effect on substitutions like ${ssl_client_truststore_password}. Requires truststorePasswordConfigName to be set. Mutually exclusive with truststorePasswordScriptBased. | no |
truststorePasswordScriptBased | Defaults to false. If true, the following things happen when used in a configFile (not through substitutions like ${ssl_client_truststore_password}): 1) The regular password for the truststore is no longer emitted. 2) In its place, CM will emit the full path to a script, and that script will echo the value of this desired password to stdout. Requires truststorePasswordConfigName to be set. Mutually exclusive with truststorePasswordCredentialProviderCompatible. Has no effect on substitutions like ${ssl_client_truststore_password}. For this functionality to be useful, your code must run the script in the parameter to get the real password. | no |
- New in Cloudera Manager 5.2.0, basic SSL Client support
- New in Cloudera Manager 5.5.0, custom config names and script-based truststore password
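The constraints between the password options in the table (a config name is required, and the two modes are mutually exclusive) can be checked before packaging a CSD. The sketch below is a hypothetical helper, not part of Cloudera Manager or its validator; it only encodes the two rules stated in the table.

```python
def validate_password_options(config_name,
                              credential_provider_compatible=False,
                              script_based=False):
    """Return a list of problems for one set of password options."""
    problems = []
    if credential_provider_compatible and script_based:
        problems.append(
            "CredentialProviderCompatible and ScriptBased are mutually exclusive"
        )
    if (credential_provider_compatible or script_based) and not config_name:
        problems.append(
            "a ConfigName must be set to use CredentialProviderCompatible or ScriptBased"
        )
    return problems

# The sslClient example above, written as a Python dict.
ssl_client = {
    "truststorePasswordConfigName": "echo.ssl.truststore.password.script",
    "truststorePasswordCredentialProviderCompatible": False,
    "truststorePasswordScriptBased": True,
}
problems = validate_password_options(
    ssl_client.get("truststorePasswordConfigName"),
    ssl_client.get("truststorePasswordCredentialProviderCompatible", False),
    ssl_client.get("truststorePasswordScriptBased", False),
)
print(problems)  # [] for the example above
```

The same function applies to the keystore and private key option sets of sslServer, since they follow the same pattern.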
Automatically generated parameters can be used like any other role parameters, commonly as substitutions in config files via additionalConfigs or in environment variables. If using script-based passwords, then these parameters are commonly emitted directly into a config file.
Users will see these parameters in the normal Configuration page as well as when adding this service.
Parameter name | Type | Description |
---|---|---|
ssl_client_truststore_location | path | Path to the client truststore file used for SSL. Used when this role is acting as an SSL client. Truststore must be in JKS format. |
ssl_client_truststore_password | password | Password for the JKS truststore file used when this role is acting as an SSL client. |
- New in Cloudera Manager 5.2.0, initial SSL Client support
- New in Cloudera Manager 5.5.0, SSL configs appear in wizards.
PEM Certificate(s) format
{
"sslClient" : {
"truststoreFormat" : "pem",
"truststoreLocationConfigName" : "echo.ssl.truststore.location"
}
}
Key | Description | Required? |
---|---|---|
truststoreFormat | The format of the ssl truststore, either "jks" or "pem". If not specified, defaults to "jks". The rest of this section assumes "pem" was configured. | no (yes for PEM format) |
truststoreLocationConfigName | Config name to emit when ssl_client_truststore_location is used in a config file. If null, ssl_client_truststore_location will not be emitted into config files, and can only be used in substitutions like ${ssl_client_truststore_location}. | no |
- New in Cloudera Manager 5.5.0, PEM format for sslClient
Automatically generated parameters can be used like any other role parameters, commonly as substitutions in config files via additionalConfigs or in environment variables. If using script-based passwords, then these parameters are commonly emitted directly into a config file.
Users will see these parameters in the normal Configuration page as well as when adding this service.
Parameter name | Type | Description |
---|---|---|
ssl_client_truststore_location | path | The location on disk of the trust store, in .pem format, used to confirm the authenticity of TLS/SSL servers that this role might connect to. This is used when this role is the client in a TLS/SSL connection. This trust store must contain the certificate(s) used to sign the service(s) being connected to. If this parameter is not provided, the default list of well-known certificate authorities is used instead. |
- New in Cloudera Manager 5.5.0, PEM format for sslClient
Can be used to indicate which String parameters are unique identifiers for the role. If specified, Cloudera Manager will initialize the parameters to a unique value at role creation.
- New in Cloudera Manager 5.2.0
List of Kerberos principals used by the role. If this is specified, a keytab file containing all the principals will be added to the role's configuration whenever the role is started or when a command is run on the role. The name of the keytab file is serviceType.toLowerCase() + ".keytab". Kerberos principals are also added to the environment of the role's processes, where the variable name is the name of the Kerberos principal and the value is the principal itself.
Describes if and how cgroup parameters belonging to roles of this type should be automatically configured during static service pool configuration. In general, this section is only relevant to roles that:
- Are present in large numbers across many hosts in the cluster, and
- Consume a non-trivial amount of resources.
For example, the Datanode is the only role within HDFS with automatic configuration for cgroups, because it's the only resource-consuming, multi-instance role.
More information about Linux cgroups, how they're used to implement static service pools, and how Cloudera Manager configures them can be found here.
Required?: No
{
"cgroup" : {
"cpu" : {
"autoConfigured" : true
},
"memory" : {
"autoConfigured" : true,
"autoConfiguredMin" : 1073741824
},
"blkio" : {
"autoConfigured" : true
}
}
}
Key | Description | Required? |
---|---|---|
cpu | The cpu cgroup subsystem. Includes the cpu.shares resource control. | no |
memory | The memory cgroup subsystem. Includes the memory.limit_in_bytes resource control. | no |
blkio | The blkio cgroup subsystem. Includes the blkio.weight resource control. | no |
autoConfigured | Whether the resource controls of this cgroup subsystem should be automatically configured by CM when setting up static service pools. If false, these controls will be left at their defaults. | yes |
autoConfiguredMin | For memory.limit_in_bytes, the absolute minimum memory amount (in bytes) that a role can be given. If unset, defaults to 0. | no |
Indicates whether the process associated with a role is a Java or other JVM-based process.
Required?: No. Defaults to false if omitted.
{
"roles" : [
{
"name" : "MY_JAVA_ROLE",
"jvmBased" : "true"
}
]
}
Setting jvmBased to true enables a number of features for the role in Cloudera Manager:
Out of memory handling
- This feature allows the user to configure the JVM's behavior when an OutOfMemoryError occurs.
- When jvmBased is true, Cloudera Manager auto-generates the following parameters for the role:
  - Dump Heap When Out of Memory
  - Heap Dump Directory
  - Kill When Out of Memory
- When the Dump Heap When Out of Memory parameter is checked, Cloudera Manager monitors the amount of free space available on the filesystem hosting the heap dump directory, and incorporates this information into the health checks performed for the role.
Periodic Stacks collection
- Periodic stacks collection allows the user to enable and configure the periodic collection of thread stack traces in Cloudera Manager.
- When jvmBased is true, Cloudera Manager auto-generates the following parameters for the role:
  - Stacks Collection Enabled
  - Stacks Collection Directory
  - Stacks Collection Frequency
  - Stacks Collection Data Retention
  - Stacks Collection Method
JVM-related role commands
- Cloudera Manager defines the following commands for the role when jvmBased is true:
  - Collect Stack Traces (jstack)
  - Heap Dump (jmap)
  - Heap Histogram (jmap -histo)
- These commands are accessible via the Actions menu on the role instance page, as well as through the Cloudera Manager API.
Changes required to CSD control script
When jvmBased is true, a new environment variable, CSD_JAVA_OPTS, is defined in the environment of the role's process. This variable contains options that must be passed when starting up the JVM for the role. In order for the features described above to work, the CSD control script must be modified to include the value of CSD_JAVA_OPTS on the command line that launches the JVM.
Typically, you can add CSD_JAVA_OPTS to an existing variable that defines JVM options. For example, the Spark CSD control script defines a variable called SPARK_DAEMON_JAVA_OPTS. The control script includes the following code to add CSD_JAVA_OPTS to these options:
export SPARK_DAEMON_JAVA_OPTS="$CSD_JAVA_OPTS $SPARK_DAEMON_JAVA_OPTS"
- jvmBased new in Cloudera Manager 5.7.0
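For a launcher that is not a shell script, the same idea (put the contents of CSD_JAVA_OPTS on the JVM command line ahead of the role's own options) might look like the sketch below. The function name, the main class, and the option values are all hypothetical; the real contents of CSD_JAVA_OPTS are decided by Cloudera Manager.

```python
import os
import shlex

def build_jvm_command(main_class, role_java_opts="", environ=os.environ):
    """Build a java argv with CSD_JAVA_OPTS placed before the role's options."""
    # shlex.split turns the option string into separate arguments,
    # much as the shell does for an unquoted variable.
    csd_opts = shlex.split(environ.get("CSD_JAVA_OPTS", ""))
    return ["java"] + csd_opts + shlex.split(role_java_opts) + [main_class]

# Made-up environment and options, for illustration only.
example_env = {"CSD_JAVA_OPTS": "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps"}
print(build_jvm_command("com.example.EchoServer", "-Xmx1g", example_env))
```

If CSD_JAVA_OPTS is unset (for example when jvmBased is false), the sketch simply launches the JVM with the role's own options.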
The gateway structure is used to describe the client configuration of the service. Client configuration can be deployed from the service "Actions" menu. Once the "Deploy Client Configuration" command is run, the following steps occur:
- Cloudera Manager sends the configuration files specified in the gateway config writer to each gateway role host.
- If a scriptRunner exists, it is executed. This gives the CSD a hook to modify the client configuration before it is deployed.
- The agent expects client configurations to exist in a subdirectory of the process directory with the same name as the alternatives name. For example: /var/run/cloudera-scm-agent/process/111-deploy-client-config/echo-conf.
- After the scriptRunner is run (or not, if there isn't one), the agent copies the client config subdirectory to the alternatives linkRoot and creates the system alternatives.
{
"gateway" : {
"alternatives" : {
"name" : "echo-conf",
"linkRoot" : "/etc/echo",
"priority" : 50
},
"scriptRunner" : {
"program" : "scripts/cc.sh",
"args" : ["deploy"]
},
"parameters" : "...",
"configWriter" : "...",
"logging" : "..."
}
}
Describes how to install the deployed files into alternatives for the client configuration.
Required?: Yes
{
"alternatives" : {
"name" : "echo-conf",
"linkRoot" : "/etc/echo",
"priority" : 50
}
}
Key | Description | Required? |
---|---|---|
name | The logical name for the link group in alternatives. It will also serve as the subdirectory name within the process directory for all the generated configuration files. | yes |
linkRoot | The symbolic link to be used by clients that internally points to the alternatives managed locations. The files will be deployed to a subdirectory called conf. For example, if link root is /etc/service, the complete link would be /etc/service/conf. | yes |
priority | Default priority when installed into alternatives. The value can be changed later via the CM configuration UI. Default is 0. | no |
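The two paths implied by the alternatives structure can be derived as in the sketch below: the subdirectory of the process directory where generated files are expected, and the conf link that clients use. The helper name is hypothetical and the code only restates the naming rules from the table; it does not touch the filesystem or the alternatives system.

```python
import posixpath

def client_config_paths(process_dir, alternatives):
    """Return (generated_dir, client_link) for an alternatives structure."""
    # Generated files live in a subdirectory named after alternatives.name.
    generated_dir = posixpath.join(process_dir, alternatives["name"])
    # Clients read the configuration through <linkRoot>/conf.
    client_link = posixpath.join(alternatives["linkRoot"], "conf")
    return generated_dir, client_link

alternatives = {"name": "echo-conf", "linkRoot": "/etc/echo", "priority": 50}
process_dir = "/var/run/cloudera-scm-agent/process/111-deploy-client-config"
print(client_config_paths(process_dir, alternatives))
```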
A script to run under the agent's process directory alongside the files generated by configWriter. The generated files will be in the script's current working directory. The environment variable $CONF will be available for the root directory of the deploy process. If environment variables are defined for the script, it is highly recommended that they be namespaced with a unique prefix to avoid conflict with the other environment variables that CM injects.
Required?: No
{
"scriptRunner" : {
"program" : "scripts/cc.sh",
"args" : ["deploy"]
}
}
The scriptRunner structure is of type Script Runner.
Specifies what configuration files should be written out to the process directory when the "Deploy Client Configuration" command is run.
Required?: yes
{
"configWriter" : "..."
}
The configWriter structure is of type Configuration Writer.
A list of parameters used to configure the client configuration. See Parameters.
Required?: No
{
"parameters" : "..."
}
Instructs Cloudera Manager where to look for the gateway role log file. When logging is configured for a gateway, a log4j properties file or logback XML configuration file is generated, just as it is for regular roles.
Generated gateway logging configuration files currently are limited to logging to console, but this behavior can be overridden using the supported Log4J safety valve or the Logback XML override configuration.
Required?: No
Log4j example
{
"logging" : {
"configFilename" : "log4j-1.properties",
"loggingType" : "log4j",
"additionalConfigs" : [
{
"key" : "additional.log.key",
"value" : "additional.log.value"
}
]
}
}
Logback example
{
"logging" : {
"configFilename" : "logback-test.xml",
"loggingType" : "logback",
"additionalConfigs" : [
{
"key" : "extraLoggerConfig",
"value" : "<logger name=\"org.apache.commons.beanutils\" level=\"ERROR\"/>"
}
]
}
}
Key | Description | Required? |
---|---|---|
configFilename | The name of the configuration file. Defaults to log4j.properties (log4j) or logback.xml (logback) | no |
loggingType | Enum of "log4j", "logback" or "other". Defaults to "other". See loggingType. | no |
additionalConfigs | List of ConfigEntry to add near the end of the file. Only works with log4j or logback files. In the case of log4j, these additional configs will go after all regular parameters, but before any Advanced Configuration Snippets. In the case of logback, the "value" field of each of the additional configs will be added to the generated configuration XML. | no |
log4j
- Cloudera Manager auto-generates the following parameters for the role:
- Log Threshold
- Log4j safety valve
- A log4j.properties file is generated and deployed to the process directory.
logback
- Cloudera Manager auto-generates the following parameters for the role:
- Log Threshold
- Logback XML override
- A logback.xml file is generated and deployed to the process directory.
other
- Cloudera Manager doesn't do anything special for this logging type.
Configuration writers provide a way to create custom configuration files for processes/commands controlled by the service. Configuration writers can be associated with a role or a gateway. Cloudera Manager bundles all the configuration files together and sends them to the agent via the heartbeat and writes them to the process directory.
{
"configWriter" : {
"generators" : [
{
"filename" : "sample_xml_file.xml",
"configFormat" : "hadoop_xml",
"excludedParams" : ["service_var1", "role_var3"],
"includedParams" : ["service_var1", "role_var3"]
}
],
"peerConfigGenerators" : [
{
"filename" : "sample_role_peer_file.properties",
"params" : ["service_var1", "role_var3"],
"roleName" : "ECHO_MASTER_SERVER"
}
],
"auxConfigGenerators" : [
{
"filename" : "sample_aux_file.properties",
"sourceFilename" : "aux/some_aux_file.properties"
}
]
}
}
There are three types of configWriters: generators, peerConfigGenerators, and auxConfigGenerators.
A generator allows the CSD author to write out all or a subset of the parameters in a file automatically. The advantage is that when a new parameter is added to the service descriptor, it will automatically be written out in the format specified. The disadvantage is that there is a limited number of supported formats: hadoop_xml, java properties, and gflags. Generated configuration might also be useful for passing a large number of parameters to the command scripts. In addition, every generator will have a safety valve that shows up in the CM configuration UI.
{
"generators" : [
{
"filename" : "sample_xml_file.xml",
"refreshable" : false,
"configFormat" : "hadoop_xml",
"excludedParams" : ["service_var1", "role_var3"],
"includedParams" : ["service_var1", "role_var3"],
"additionalConfigs" : [
{
"key" : "additional.config.key",
"value" : "additional.config.value"
}
]
}
]
}
Key | Description | Required? |
---|---|---|
filename | The configuration filename. | yes |
refreshable | Whether this file can be "refreshed", which means it can be replaced without stopping the role. No control script is run during this refresh. In order for this to be useful, the underlying role must be able to re-read the file automatically. Defaults to false. | no |
configFormat | The format of the configuration file. Enum of "hadoop_xml", "properties", or "gflags". See configFormat. | yes |
includedParams | A list of all the parameters, by name, to include in the configuration file. By default all parameters are included. | no |
excludedParams | A list of all the parameters, by name, to exclude. This list takes precedence over the include list. By default no parameters are excluded. | no |
additionalConfigs | List of ConfigEntry to add near the end of the file. Only works with log4j files. These additional configs will go after all regular parameters, but before any Advanced Configuration Snippets. | no |
- New in Cloudera Manager 5.2.0, added additionalConfigs for any non-gateway role. Gateway support added in 5.3.2
- New in Cloudera Manager 5.5.0, added refreshable and support for gflags configFormat
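The interaction between includedParams and excludedParams can be sketched as a small filter: everything is included by default, and the exclude list wins when a name appears in both. This is an illustration of the rules in the table, with a hypothetical function name and made-up parameter values, not CM's generator code.

```python
def select_params(all_params, included=None, excluded=None):
    """Pick the parameters a generator would write out.

    included=None means "include everything"; excluded takes precedence.
    """
    names = list(all_params) if included is None else [
        name for name in included if name in all_params
    ]
    excluded = set(excluded or [])
    return {name: all_params[name] for name in names if name not in excluded}

params = {"service_var1": "a", "role_var2": "b", "role_var3": "c"}
print(select_params(params))                               # all three
print(select_params(params, included=["service_var1"]))    # only service_var1
print(select_params(params,
                    included=["service_var1", "role_var3"],
                    excluded=["service_var1", "role_var3"]))  # {}
```

The last call mirrors the example above, where the same two names appear in both lists: because exclusion takes precedence, neither is written.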
Below are examples of the format. If a service needs to do some complex munging of configuration variables or needs a different configuration format, that work can be done in the command script.
hadoop_xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://nightly-1.ent.cloudera.com:8020</value>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>simple</value>
</property>
</configuration>
properties
fs.default.name=hdfs://nightly-1.ent.cloudera.com:8020
hadoop.security.authentication=simple
gflags
--enable_foo
--num_roles=3
--service_name=my_service
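To make the shape of the hadoop_xml and properties formats concrete, the sketch below renders a dictionary of parameters into each. It is only an approximation of the output shown above, written for illustration: the function names are hypothetical, Java properties escaping is not handled, and gflags is left out because its rendering of booleans is not specified here.

```python
from xml.sax.saxutils import escape

def to_properties(params):
    """Render name=value lines (no Java properties escaping)."""
    return "".join(f"{name}={value}\n" for name, value in params.items())

def to_hadoop_xml(params):
    """Render a <configuration> document with one <property> per parameter."""
    lines = ["<configuration>"]
    for name, value in params.items():
        lines += [
            "  <property>",
            f"    <name>{escape(name)}</name>",
            f"    <value>{escape(str(value))}</value>",
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines) + "\n"

params = {
    "fs.default.name": "hdfs://nightly-1.ent.cloudera.com:8020",
    "hadoop.security.authentication": "simple",
}
print(to_hadoop_xml(params))
print(to_properties(params))
```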
A peer configuration generator is used to distribute a properties file that contains a list of all the hostnames that share the same role type, plus any parameters needed from each host. This is a general solution for distributing a host list and port number for each role of the same role type. The generated file is a Java properties file with entries of the following format: <hostname>:<parameter config name>=<value>
{
"peerConfigGenerators" : [
{
"filename" : "hostlist.properties",
"refreshable" : false,
"params" : ["dataNodeDataDir", "webPortAddress"],
"roleName" : "ECHO_MASTER_SERVER"
}
]
}
Key | Description | Required? |
---|---|---|
filename | The configuration filename. | yes |
refreshable | Whether this file can be "refreshed", which means it can be replaced without stopping the role. No control script is run during this refresh. In order for this to be useful, the underlying role must be able to re-read the file automatically. Defaults to false. | no |
params | A list of parameters to include for each hostname. | yes |
roleName | If specified, instead of using current role type, it uses the role type specified by roleName. This allows for example, a Datanode to get parameters from a Namenode. Only role types defined within the service can be used here. For example, the HDFS service cannot reference the Regionserver role type. By default, the current role is used. | no |
- New in Cloudera Manager 5.5.0, added refreshable
Example of hostlist.properties
hostname1.mycompany.com:dfs.datanode.data.dir=/foo/bar
hostname1.mycompany.com:webPortAddress=6060
hostname2.mycompany.com:dfs.datanode.data.dir=/foo/bar/foo
hostname2.mycompany.com:webPortAddress=6060
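A control script or service process has to parse this file itself. A minimal sketch of such a parser, assuming the file looks exactly like the example above (one entry per line, no backslash escapes or line continuations); the function name is hypothetical.

```python
def parse_peer_config(text):
    """Parse '<hostname>:<config name>=<value>' lines into {host: {name: value}}."""
    peers = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")      # split at the first '='
        host, _, config_name = key.partition(":")  # split at the first ':'
        peers.setdefault(host, {})[config_name] = value
    return peers

hostlist = """\
hostname1.mycompany.com:dfs.datanode.data.dir=/foo/bar
hostname1.mycompany.com:webPortAddress=6060
hostname2.mycompany.com:dfs.datanode.data.dir=/foo/bar/foo
hostname2.mycompany.com:webPortAddress=6060
"""
peers = parse_peer_config(hostlist)
print(sorted(peers))
print(peers["hostname2.mycompany.com"]["dfs.datanode.data.dir"])
```

Splitting at the first '=' keeps values that themselves contain '=' or ':' intact, such as URLs.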
An aux config generator allows the CSD author to copy a static configuration file to a different location in the process directory. In addition, a safety valve will be created for it in Cloudera Manager. This generator is useful if the service has a config file in a format not natively supported by the configGenerators, for example a config.yml YAML config file or a service_env.sh environment script. One possible way to deal with this is:
- Add the config.yml file template to the aux/ directory. When the service starts, all of the contents of the aux directory are copied to the agent process directory.
- Create an auxConfigGenerators structure for the config.yml and specify where the file should be copied in the process directory.
- The start script can perform the necessary variable substitutions before starting the service.
The benefit is that a safety valve will show up in Cloudera Manager for the config.yml file. An administrator can use this safety valve to add additional configuration that may not have been modeled in the service.sdl. This behavior exists for all of the first-party services in Cloudera Manager. The contents, if any, of the safety valve will simply be appended to the configuration file.
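The variable substitution step performed by the start script could be sketched as below. Real CSD start scripts are usually shell; Python's string.Template is used here only to illustrate the idea, and the ${...} placeholder syntax, the template contents, and the value names are all assumptions made for this example.

```python
from string import Template

def render_config(template_text, values):
    """Fill ${name} placeholders; unknown placeholders are left untouched."""
    return Template(template_text).safe_substitute(values)

# Hypothetical config.yml template copied into the process directory.
template_text = "listen_port: ${web_port}\nlog_dir: ${log_dir}\n"
rendered = render_config(template_text, {"web_port": 8080, "log_dir": "/var/log/echo"})
print(rendered)
```

Using safe_substitute means that any text appended by the safety valve which happens to contain an unknown placeholder does not cause the start script to fail.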
{
"auxConfigGenerators" : [
{
"filename" : "sample_aux_file.properties",
"sourceFilename" : "aux/some_aux_file.properties"
}
]
}
Key | Description | Required? |
---|---|---|
filename | The configuration filename. | yes |
sourceFilename | The configuration file to copy over to filename. By default, if there is no sourceFilename specified, an empty file is created. The contents of the safety valves will be appended to the configuration file. | no |
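The copy-then-append behavior described above can be sketched in Python. This is only an illustration of the documented semantics, not Cloudera Manager's implementation: the helper name `generate_aux_config`, its arguments, and the temporary directory layout are all assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def generate_aux_config(process_dir: Path, filename: str,
                        source: Optional[Path], safety_valve: str) -> Path:
    """Hypothetical helper that mimics the documented aux config behavior."""
    target = process_dir / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    if source is not None:
        shutil.copyfile(source, target)  # sourceFilename given: copy it over
    else:
        target.write_text("")            # no sourceFilename: start from an empty file
    if safety_valve:
        with target.open("a") as out:    # safety valve contents are appended
            out.write(safety_valve)
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "aux" / "some_aux_file.properties"
    src.parent.mkdir()
    src.write_text("a=1\n")

    copied = generate_aux_config(root / "process", "sample_aux_file.properties",
                                 src, "b=2\n")
    copied_text = copied.read_text()

    empty = generate_aux_config(root / "process", "other.properties",
                                None, "c=3\n")
    empty_text = empty.read_text()
```

In the first call the static file is copied and the safety valve text follows it; in the second call the file contains only the safety valve text.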
The type of ConfigEntry. The possible values are:
- simple: the config value is evaluated using standard substitutions.
- auth_to_local: the config value is computed from the Auth-to-Local rules configuration properties of the DFS service that is a dependency/dependent of the service. The value field is ignored in this case.
- New in Cloudera Manager 5.5.0
A ConfigEntry describes custom entries in a config file. Both key and value support standard substitutions.
{
"key" : "additional.config.key",
"value" : "additional.config.value",
"type" : "config.entry.type"
}
key
Property name to be used in the config file.
value
Value to be emitted in the config file for the given property. This field is ignored if type is auth_to_local.
type
Type of the config entry. Default is simple if none is specified.
Required?: No
- type is new in Cloudera Manager 5.5.0
A Kerberos principal config entry describes an entry in a config file that references a Kerberos principal.
{
"external" : "false",
"peerRoleType" : "PEER_ROLE_TYPE",
"principalName" : "kerberos_principal_name",
"propertyName" : "kerberos.principal.property.name",
"instanceWildcard" : "_HOST"
}
external
Should be set to true if the principal refers to an external principal of the service.
peerRoleType
If the principal belongs to a peer role type, this field should be used to specify that role type. If this is set, the principal from an arbitrary role of that role type is used. If both external and peerRoleType are specified, external takes precedence.
principalName
Name of the principal (as specified in [principalName](#principalName)) to be emitted in the config file.
propertyName
Property name to be used while emitting the principal in the config file.
instanceWildcard
Optional wildcard string that will be used to replace the instance part of the principal while emitting it in a config file. For example, hdfs/${host}@REALM will be emitted as hdfs/_HOST@REALM if the instance wildcard is _HOST.
- New in Cloudera Manager 5.2.0
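As a rough illustration of the instanceWildcard behavior, the Python sketch below replaces the instance part of an already resolved principal. The function name `apply_instance_wildcard` and the string-splitting approach are assumptions for the example; the source does not describe how Cloudera Manager performs the replacement internally.

```python
from typing import Optional


def apply_instance_wildcard(principal: str, wildcard: Optional[str]) -> str:
    """Hypothetical helper: swap the instance part of primary/instance@REALM."""
    if not wildcard:
        return principal                      # no wildcard configured
    name, at, realm = principal.partition("@")
    primary, slash, _instance = name.partition("/")
    if not slash:
        return principal                      # principal has no instance part
    return f"{primary}/{wildcard}{at}{realm}"


with_instance = apply_instance_wildcard("hdfs/hostname1.mycompany.com@REALM", "_HOST")
without_instance = apply_instance_wildcard("hdfs@REALM", "_HOST")
no_wildcard = apply_instance_wildcard("hdfs/hostname1.mycompany.com@REALM", None)
```

Here `with_instance` becomes `hdfs/_HOST@REALM`, while the other two principals are returned unchanged.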
Determines whether a particular parameter should be exposed and/or required:
- NOT_EXPOSED - do not expose this parameter
- OPTIONAL - expose this parameter as an optional parameter
- REQUIRED - expose this parameter as a required parameter
The script runner structure provides all the information needed to run a script. When a command is run, Cloudera Manager deploys the entire scripts directory present in the CSD to the agent process directory.
Note: CSDs won't install third-party dependencies for scripts. If a script requires a third-party dependency (for example a python module), the CSD author needs to verify that the dependency is included as part of the system package or parcel.
{
"program" : "scripts/control.sh",
"args" : [ "cmd" ],
"environmentVariables" : {
"ENV_VAR1" : "${parameter_name}",
"ENV_VAR2" : "23"
}
}
Key | Description | Required? |
---|---|---|
program | The path to the script to run, relative to the process directory. | yes |
args | A list of arguments to pass to the program. These can have standard substitutions. | no |
environmentVariables | A map of environment variables set before the script is run. These can have standard substitutions. | no |
The scripts that will be packaged with the CSD are just glue between the agent and the real program binaries/scripts that ship as part of the standard program deliverable, whether that is packages, parcels, or tarballs. Although the service.sdl runner can just call the shipped binary/script directly using arguments and environment variables, most likely some adapting is necessary. It is the responsibility of the script bundled with the CSD to do this work before the real service binary can run.
Both the service and roles have parameters defined. These parameters show up in the "Configuration" pages of Cloudera Manager. They can be consumed via configWriters or used in substitutions. Each parameter is typed and depending on the type there might be more required fields to be set. See parameter types.
{
"name" : "server_timeout",
"label" : "Server Timeout",
"description" : "The Server Timeout",
"configName" : "server.timeout",
"required" : true,
"configurableInWizard" : true,
"default" : 20,
"invalidValues": [0],
"type" : "long",
"unit" : "seconds"
}
Key | Description | Required? |
---|---|---|
name | The logical name for the parameter. This is the name that will be referenced either during variable substitution or config generation. Must be unique within a service. The convention is to use only lowercase letters separated by underscores. | yes |
label | The user friendly name. | yes |
description | The description shown in the "Configuration" page. | yes |
configName | The name of the key that will be written to a config file. By default the name of the parameter is used. | no |
required | True if this parameter is required to be set. Defaults to false. | no |
default | The default value of the parameter. | no |
configurableInWizard | True if the configuration option should be configurable in the wizard before the service is started. This should be used sparingly: only for parameters whose values cannot be known a priori and are very difficult to change after the fact. Defaults to false. | no |
sensitive | True if this parameter holds sensitive information. Defaults to false. | no |
type | The type of the parameter. See parameter types. | yes |
- New in Cloudera Manager 5.1.0, added "sensitive" field
Some types may require additional keys in the parameter structure.
boolean
double
Key | Description | Required? |
---|---|---|
softMin | Recommended minimum double value. By default there is no recommended minimum. | no |
softMax | Recommended maximum double value. By default there is no recommended maximum. | no |
min | Absolute minimum double value. By default there is no minimum. | no |
max | Absolute maximum double value. By default there is no maximum. | no |
unit | Unit of the value. See units. | no |
invalidValues | A list of invalid values. If this parameter is set to one of these invalid values, it will cause a validation error. | no |
- New in Cloudera Manager 6.0.0, added "invalidValues" field
long
Key | Description | Required? |
---|---|---|
softMin | Recommended minimum long value. By default there is no recommended minimum. | no |
softMax | Recommended maximum long value. By default there is no recommended maximum. | no |
min | Absolute minimum long value. By default there is no minimum. | no |
max | Absolute maximum long value. By default there is no maximum. | no |
unit | Unit of the value. See units. | no |
invalidValues | A list of invalid values. If this parameter is set to one of these invalid values, it will cause a validation error. | no |
- New in Cloudera Manager 6.0.0, added "invalidValues" field
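To make the difference between the absolute and recommended bounds concrete, here is a minimal Python sketch of a validator for a long parameter. It assumes that hard violations (min, max, invalidValues) are reported as errors and soft violations (softMin, softMax) as warnings; the function name `validate_long`, the message format, and the bound values in `spec` are hypothetical.

```python
from typing import Dict, List


def validate_long(value: int, spec: Dict) -> List[str]:
    """Hypothetical validator: returns messages for a long parameter value."""
    problems: List[str] = []
    # Absolute bounds and invalidValues are treated as validation errors.
    if "min" in spec and value < spec["min"]:
        problems.append(f"ERROR: {value} is below min {spec['min']}")
    if "max" in spec and value > spec["max"]:
        problems.append(f"ERROR: {value} is above max {spec['max']}")
    if value in spec.get("invalidValues", []):
        problems.append(f"ERROR: {value} is listed in invalidValues")
    # Recommended bounds only produce warnings in this sketch (an assumption).
    if "softMin" in spec and value < spec["softMin"]:
        problems.append(f"WARNING: {value} is below softMin {spec['softMin']}")
    if "softMax" in spec and value > spec["softMax"]:
        problems.append(f"WARNING: {value} is above softMax {spec['softMax']}")
    return problems


# Illustrative spec; min and softMax values are made up for the example.
spec = {"type": "long", "min": 0, "softMax": 600, "invalidValues": [0]}

ok = validate_long(20, spec)
invalid = validate_long(0, spec)
too_high = validate_long(900, spec)
too_low = validate_long(-5, spec)
```

With this spec, 20 passes cleanly, 0 is rejected because it is in invalidValues, 900 only triggers a softMax warning, and -5 violates the absolute minimum.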
memory
Key | Description | Required? |
---|---|---|
softMin | Recommended minimum memory value. By default there is no recommended minimum. | no |
softMax | Recommended maximum memory value. By default there is no recommended maximum. | no |
min | Absolute minimum memory value. By default there is no minimum. | no |
max | Absolute maximum memory value. By default there is no maximum. | no |
unit | Unit of the value. See units. Must be a byte quantity. | yes |
scaleFactor | Factor used in memory consumption calculation to account for any inherent overhead in the memory quantity. Defaults to 1.0. See [resource management](Resource-management-support-for-csds#cooperative-memory-limits) for more details. | no |
autoConfigShare | Dictates the percentage of the role's overall memory allotment that should be set aside for this memory quantity during autoconfiguration for resource management. If unset, the parameter is not autoconfigured for RM. See [resource management](Resource-management-support-for-csds#cooperative-memory-limits) for more details. | no |
invalidValues | A list of invalid values. If this parameter is set to one of these invalid values, it will cause a validation error. | no |
- New in Cloudera Manager 6.0.0, added "invalidValues" field
port
Key | Description | Required? |
---|---|---|
softMin | Recommended minimum port value. By default there is no recommended minimum. | no |
softMax | Recommended maximum port value. By default there is no recommended maximum. | no |
min | Absolute minimum port value. By default there is no minimum. | no |
max | Absolute maximum port value. By default there is no maximum. | no |
outbound | True if the port is outbound. Defaults to false. | no |
zeroAllowed | True if 0 can be specified. Defaults to false. | no |
negativeOneAllowed | True if -1 can be specified. Defaults to false. | no |
invalidValues | A list of invalid values. If this parameter is set to one of these invalid values, it will cause a validation error. | no |
- New in Cloudera Manager 6.0.0, added "invalidValues" field
string_enum
Key | Description | Required? |
---|---|---|
validValues | An array of valid strings. | yes |
string
Key | Description | Required? |
---|---|---|
conformRegex | A regular expression the string needs to conform to. By default, all strings are valid. | no |
initType | An Enum of "randomBase64". Initializes the parameter on creation of the owning entity. | no |
- New in Cloudera Manager 5.4.0, added initType
password
A string type used for passwords so that they are masked upon entry in the Cloudera Manager UI. This type is considered sensitive by default. Note that this does not prevent the password from being displayed in the "Process" UI, the "Config diff" UI, or on disk on the host where the service is running. In order to encrypt the passwords in those locations, please see "credentialProviderCompatible" or "alternateScriptParameterName" below.
Key | Description | Required? |
---|---|---|
conformRegex | A regular expression the string needs to conform to. By default, all strings are valid. | no |
initType | An Enum of "randomBase64". Initializes the parameter on creation of the owning entity. | no |
credentialProviderCompatible | Whether this parameter can use the Credential Provider, a Hadoop mechanism that allows for the encrypting of sensitive items in an encrypted store. This is mutually exclusive with alternateScriptParameterName. Has no effect on substitutions like ${parameter_name}, which will always get the raw password. | no |
alternateScriptParameterName | If specified, the following things happen: 1) The "configName" of this parameter is no longer emitted 2) In its place, a parameter with the name specified by the "alternateScriptParameterName" is emitted. This parameter contains the full path to a script, and that script will echo the value of the desired password to stdout. For this functionality to be useful, your code must accept the parameter specified in "alternateScriptParameterName" as a parameter that replaces the "configName" and is known to point to the full path of a script that will print the desired password to stdout. This is mutually exclusive with credentialProviderCompatible. Has no effect on substitutions like ${parameter_name}, which will always get the raw password. | no |
- New in Cloudera Manager 5.1.0
- New in Cloudera Manager 5.4.0, added initType
- New in Cloudera Manager 5.5.0, added credentialProviderCompatible and alternateScriptParameterName
string_array
Key | Description | Required? |
---|---|---|
separator | When the array is serialized to a string, what separator should be used. By default comma is used. | no |
minLength | The minimum length the array can be. By default there is no lower bound. | no |
maxLength | The maximum length the array can be. By default there is no upper bound. | no |
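The following Python sketch shows how the separator, minLength and maxLength keys could interact when an array is serialized. It is an illustration only: the function name `serialize_array` and the choice to raise `ValueError` on a length violation are assumptions, not documented Cloudera Manager behavior.

```python
from typing import List, Optional


def serialize_array(values: List[str], separator: str = ",",
                    min_length: Optional[int] = None,
                    max_length: Optional[int] = None) -> str:
    """Hypothetical helper: check length bounds, then join with the separator."""
    if min_length is not None and len(values) < min_length:
        raise ValueError(f"array needs at least {min_length} entries")
    if max_length is not None and len(values) > max_length:
        raise ValueError(f"array allows at most {max_length} entries")
    return separator.join(values)


default_sep = serialize_array(["alpha", "beta"])            # comma by default
custom_sep = serialize_array(["alpha", "beta"], separator=":")

try:
    serialize_array([], min_length=1)
    too_short_rejected = False
except ValueError:
    too_short_rejected = True
```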
path
Key | Description | Required? |
---|---|---|
conformRegex | A regular expression the path needs to conform to. By default, all paths are valid. | no |
pathType | An Enum of "localDataDir", "localDataFile" or "serviceSpecific". For "localDataDir", CM will create the path for you with the mode specified. For "serviceSpecific" and "localDataFile" the path will not be created. | yes |
mode | The mode of the path. By default it is 0755. | yes |
path_array
Key | Description | Required? |
---|---|---|
separator | When the array is serialized to a string, what separator should be used. By default comma is used. | no |
minLength | The minimum length the array can be. By default there is no lower bound. | no |
maxLength | The maximum length the array can be. By default there is no upper bound. | no |
conformRegex | A regular expression the path needs to conform to. By default, all paths are valid. | no |
pathType | An Enum of "localDataDir", "localDataFile" or "serviceSpecific". For both "localDataDir" and "localDataFile", CM will create the path for you with the mode specified. For "serviceSpecific" the path will not be created. | yes |
mode | The mode of the path. By default it is 0755. | yes |
uri
Key | Description | Required? |
---|---|---|
conformRegex | A regular expression the URI needs to conform to. By default, all URIs are valid. | no |
opaque | True if the URI is opaque. Defaults to false. | no |
allowedSchemas | A list of allowed schemes for this URI. By default all schemes are allowed. | no |
uri_array
Key | Description | Required? |
---|---|---|
separator | When the array is serialized to a string, what separator should be used. By default comma is used. | no |
minLength | The minimum length the array can be. By default there is no lower bound. | no |
maxLength | The maximum length the array can be. By default there is no upper bound. | no |
conformRegex | A regular expression the URI needs to conform to. By default, all URIs are valid. | no |
opaque | True if the URI is opaque. Defaults to false. | no |
allowedSchemas | A list of allowed schemes for this URI. By default all schemes are allowed. | no |
The units can be one of:
- milliseconds
- seconds
- minutes
- hours
- bytes
- kilobytes
- megabytes
- gigabytes
- percent
- pages
- times
- lines
For some strings in the service.sdl it is necessary to add substitutions. For example, script runner arguments and environment variables are more useful if, instead of passing hardcoded values, we can pass in the materialized values of parameters or the host the role is running on. To facilitate this, the SDL supports Ant-style placeholders: ${<variable_name>}. There are various types of substitutions, each supporting a specific set of variables:
- Any parameter available to the context: role parameters if applicable and inherited service parameters. This is used in the form ${<parameter_name>}, where parameter_name is the name (not the configName) of the parameter.
- The user: ${user}. Note that in single user mode, this evaluates to the user running all the processes in the cluster.
- The group: ${group}
- The host: ${host}, if available.
- The Kerberos principal: ${principal}. This evaluates to the Kerberos principal of the role in a secure cluster, otherwise it has the same value as ${user}.
The value of the placeholder is generated after CM has processed the configuration hierarchy for the role and has produced a flat map of configuration keys and values. Because of this, we do not require scoping for role parameters to account for overrides or config groups. This is also a reason why we need unique parameter names throughout the SDL file.
- New in Cloudera Manager 5.4.0, added ${principal} placeholder
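A minimal Python sketch of placeholder resolution against a flat map of keys and values is shown below. The `substitute` function, the regular expression, and the decision to raise `KeyError` for an unknown placeholder are assumptions made for illustration; the source does not say how Cloudera Manager treats unresolved placeholders.

```python
import re
from typing import Dict

# Matches Ant-style placeholders such as ${user} or ${server_timeout}.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def substitute(text: str, variables: Dict[str, object]) -> str:
    """Hypothetical helper: replace each ${name} with its value from a flat map."""
    def replace(match: "re.Match") -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"unknown placeholder: {name}")  # assumed behavior
        return str(variables[name])
    return PLACEHOLDER.sub(replace, text)


# Flat map as it might look after the configuration hierarchy has been processed.
flat = {
    "server_timeout": 20,
    "user": "echoservice",
    "group": "echoservice",
    "host": "hostname1.mycompany.com",
}

home_dir = substitute("/user/${user}", flat)
env = {key: substitute(value, flat)
       for key, value in {"ENV_VAR1": "${server_timeout}", "ENV_VAR2": "23"}.items()}
```

Because the map is flat, the lookup needs no scoping, which is why parameter names have to be unique throughout the SDL file.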
It is important to separate the concept of a CSD's version and its compatibility.
{
"version" : "1.23",
"compatibility" : {
"generation" : 2,
"cdhVersion" : {
"min" : 4,
"max" : 5
}
}
}
When we refer to a CSD's version, it is a string that lives in both the service descriptor as version 1.23 and in the filename SPARK-1.23.jar. Apart from requiring the version to match the CSD file name, the infrastructure does not derive any more semantics from the string. The compatibility structure, on the other hand, is used by the CSD infrastructure to verify preconditions before installing the CSD.
The compatibility section has a generation number used to communicate compatibility between different CSD versions. The generation is scoped to the CSD name and should increase monotonically when there is a breaking change between service descriptors. When Cloudera Manager installs a CSD it looks at the generation number and has three situations to consider:
- The previous and current generation numbers match: CM can safely upgrade the installed CSD.
- There is no previous generation number: CM can install the CSD since this is the first time it has seen this CSD type.
- The previous and current generation numbers don't match: CM surfaces an error and doesn't allow the user to upgrade the CSD.
When the previous and current generation numbers don't match, the CSD author is communicating that there are breaking changes to the CSD and the user needs to follow additional upgrade steps provided by the CSD author. This is the most heavyweight way of upgrading a CSD, and will require the user to uninstall the CSD and re-install the new CSD.
cdhVersion is used to restrict the installation of a CSD to a specific version of CDH. When the CSD is installed, the cluster CDH version is checked and if it does not fall within the CSD's compatibility range the service type is not available for the cluster.
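The three generation cases and the cdhVersion range check can be sketched in Python as follows. The function names and return strings are hypothetical, and the sketch assumes that the min and max bounds are inclusive and compared against the cluster's major CDH version, which the text above does not state explicitly.

```python
from typing import Dict, Optional


def generation_decision(previous: Optional[int], current: int) -> str:
    """Hypothetical helper mapping the three documented cases to an action."""
    if previous is None:
        return "install"   # first time this CSD type is seen
    if previous == current:
        return "upgrade"   # generations match: safe to upgrade
    return "error"         # generations differ: breaking change, block upgrade


def cdh_in_range(cluster_major: int, cdh_version: Dict[str, int]) -> bool:
    """Assumes inclusive bounds; a missing bound is treated as unbounded."""
    low = cdh_version.get("min")
    high = cdh_version.get("max")
    if low is not None and cluster_major < low:
        return False
    if high is not None and cluster_major > high:
        return False
    return True


compat = {"generation": 2, "cdhVersion": {"min": 4, "max": 5}}

decisions = [generation_decision(prev, compat["generation"]) for prev in (None, 2, 1)]
cdh5_ok = cdh_in_range(5, compat["cdhVersion"])
cdh6_ok = cdh_in_range(6, compat["cdhVersion"])
```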
Kerberos principals are used by services to enable security through Kerberos authentication. A Kerberos principal is of the form primary/instance@REALM. It can be described as follows:
{
"name" : "kerberos_principal_name",
"primary" : "principal_primary",
"instance" : "principal_instance"
}
The name of the principal. This name can be used to refer to the kerberos principal in configuration files or process environment for roles.
primary
First part of the principal. This is a required field.
instance
Optional second part of the principal. If omitted, then the principal will just be primary@REALM. This can refer to a parameter, and in case of a role, to its host. If this is a URI, Cloudera Manager will extract the host name and use that as the instance name.
- New in Cloudera Manager 5.2.0
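The way primary, instance and the realm combine can be illustrated with a short Python sketch. The function name `build_principal`, the realm argument, and the use of a `://` check to decide whether the instance is a URI are assumptions for the example; only the resulting forms (primary@REALM, primary/instance@REALM, and host extraction from a URI) come from the description above.

```python
from typing import Optional
from urllib.parse import urlparse


def build_principal(primary: str, realm: str, instance: Optional[str] = None) -> str:
    """Hypothetical helper: assemble primary[/instance]@REALM."""
    if not instance:
        return f"{primary}@{realm}"            # instance omitted
    host = None
    if "://" in instance:                      # assumed test for "this is a URI"
        host = urlparse(instance).hostname     # keep only the host name
    return f"{primary}/{host or instance}@{realm}"


no_instance = build_principal("echo", "REALM")
host_instance = build_principal("echo", "REALM", "hostname1.mycompany.com")
uri_instance = build_principal("echo", "REALM", "http://hostname1.mycompany.com:6060/ui")
```

Both `host_instance` and `uri_instance` resolve to `echo/hostname1.mycompany.com@REALM`, while `no_instance` is `echo@REALM`.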
Specifies the workflow for a rolling restart of a service. If a service supports rolling restart, this workflow must restart every daemon role within the service, so all roles of the service must be specified in either worker or non-worker steps. Roles are restarted one-by-one if they are non-worker roles, or in batches if they are worker roles.
Both non-worker and worker step descriptors have the following sections:
roleName
Role for which the steps are applied during rolling restart.
bringDownCommands
List of commands to run while bringing the role down during rolling restart. If "Stop" is specified as one of the commands, the regular role stop command is called. If this is not provided, the role is simply stopped to bring it down.
bringUpCommands
List of commands to run while bringing the role up during rolling restart. If "Start" is specified as one of the commands, the regular role start command is called. If this is not provided, the role is simply started to bring it up.
Note: The commands used for non-worker roles must be role-level commands and the commands used for worker roles must be service-level commands.
- New in Cloudera Manager 5.5.0