Workload Profile Monitoring for Azure Container Apps

This script monitors and assesses the provisioning of Azure Container Apps (ACA) environments and their associated workload profiles. It gathers resource usage data, evaluates whether the resources are underprovisioned, and sends metrics to Azure Monitor, where they can be used to build dashboards or alerts. The script also writes a message to stdout when it detects underprovisioning.

Quickstart

This quickstart assumes you already have an Azure Container Apps environment. We deploy from a ready-made container image (simon.azurecr.io/aca-wp-monitor:latest) hosted publicly on Azure Container Registry. The image sha256 is 0c7024c471593f6b2f41c21f13e856e3dfc577ba8047314f980df37cbfe5d35c. The job runs every hour, at the top of the hour.

export SUBSCRIPTION_ID=<ACA_ENVIRONMENT_SUBSCRIPTION>
export RESOURCE_GROUP=<ACA_ENVIRONMENT_RESOURCE_GROUP>
export ENVIRONMENT_NAME=<ACA_ENVIRONMENT_NAME>
export JOB_NAME=workload-profile-monitor

# we create the job first and enable managed identity
az containerapp job create \
  --name $JOB_NAME \
  --resource-group $RESOURCE_GROUP \
  --environment $ENVIRONMENT_NAME \
  --cpu 0.5 \
  --memory 1.0Gi \
  --image simon.azurecr.io/aca-wp-monitor:latest \
  --mi-system-assigned \
  --env-vars SUBSCRIPTION_ID=$SUBSCRIPTION_ID RESOURCE_GROUP=$RESOURCE_GROUP ENVIRONMENT_NAME=$ENVIRONMENT_NAME \
  --replica-timeout 3600 \
  --replica-retry-limit 3 \
  --replica-completion-count 1 \
  --parallelism 1 \
  --trigger-type Schedule \
  --cron-expression "0 */1 * * *"

# we retrieve the managed identity to use for role assignments
PRINCIPAL_ID=$(az containerapp job show --name $JOB_NAME --resource-group $RESOURCE_GROUP --query "identity.principalId" --output tsv)

# we assign the Reader role to be able to get the app and workload profile details
az role assignment create --assignee $PRINCIPAL_ID --role Reader --scope /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP

# we assign the Monitoring Metrics Publisher role to be able to send the metrics to the job's metrics panel
az role assignment create --assignee $PRINCIPAL_ID --role "Monitoring Metrics Publisher" --scope /subscriptions/$SUBSCRIPTION_ID

Once complete, the role assignments for the job's managed identity should list both Reader and Monitoring Metrics Publisher.

Details and Notes

This script is intended to be run as a job inside the ACA environment you intend to monitor.
It uses a system-assigned managed identity to authenticate itself as both a Monitoring Metrics Publisher and a Reader. These roles have to be assigned to the job's managed identity.
Metrics are published to, and available on, the job's Metrics tab.
The script requires the following environment variables: SUBSCRIPTION_ID, RESOURCE_GROUP, ENVIRONMENT_NAME. A fourth one, CONTAINER_APP_JOB_NAME for jobs or CURRENT_APP_NAME for apps, is provided automatically by the ACA runtime environment.
The script uses Azure's Python SDK.
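
The environment variable requirements above could be checked at startup along these lines. This is a minimal sketch, not the code in monitor.py: the function name `load_settings`, the `RUNNING_AS` key, and the fail-fast behaviour are assumptions made for illustration.

```python
import os

# The three variables the notes above list as required.
REQUIRED_VARS = ("SUBSCRIPTION_ID", "RESOURCE_GROUP", "ENVIRONMENT_NAME")


def load_settings(env=None):
    """Return the required settings, raising if any is missing or empty.

    `env` defaults to os.environ; passing a dict makes the function testable.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    settings = {name: env[name] for name in REQUIRED_VARS}
    # Hypothetical key: the job or app name the notes say the ACA runtime injects.
    # It is None when the script runs outside ACA.
    settings["RUNNING_AS"] = env.get("CONTAINER_APP_JOB_NAME") or env.get("CURRENT_APP_NAME")
    return settings
```

Failing early with the names of the missing variables tends to be easier to diagnose in a job's execution log than a later authentication or lookup error.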

Available Metrics & Output

Metric or Output	Description
Workload Profile	The name of the workload profile. This is represented as a dimension and can be accessed via filters.
Currently Used CPU	The amount of CPU currently being used across all the apps and their currently running replicas on the workload profile.
Currently Used Memory (GB)	The amount of memory currently being used across all the apps and their replicas on a workload profile.
Currently Available CPU	The amount of CPU currently available on the workload profile.
Currently Available Memory (GB)	The amount of memory currently available on the workload profile.
Max Scale Needed CPU	The maximum amount of CPU needed if all the apps on a workload profile would run at maximum scale.
Max Scale Needed Memory (GB)	The maximum amount of memory needed if all the apps on a workload profile would run at maximum scale.
Max Available CPU	The maximum amount of CPU available if the workload profile would be scaled to the maximum.
Max Available Memory (GB)	The maximum amount of memory available if the workload profile would be scaled to the maximum.
Provisioning Assessment	An assessment of whether Max Scale Needed (CPU or Memory) exceeds Max Available (CPU or Memory). This output is written to stdout.
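
The provisioning assessment in the last row can be sketched as a comparison of the "Max Scale Needed" and "Max Available" values. The class and function names below are hypothetical, and the numbers in the usage note are made up for illustration; the actual logic in monitor.py may differ.

```python
from dataclasses import dataclass


@dataclass
class ProfileCapacity:
    """Hypothetical container for the four 'Max' metrics of one workload profile."""
    name: str
    max_needed_cpu: float
    max_needed_memory_gb: float
    max_available_cpu: float
    max_available_memory_gb: float


def assess(profile):
    """Return (cpu_deficit, memory_deficit_gb); 0.0 means no shortfall."""
    cpu_deficit = max(0.0, profile.max_needed_cpu - profile.max_available_cpu)
    memory_deficit = max(
        0.0, profile.max_needed_memory_gb - profile.max_available_memory_gb
    )
    return cpu_deficit, memory_deficit


def is_underprovisioned(profile):
    """A profile is underprovisioned if either resource has a shortfall."""
    return any(deficit > 0 for deficit in assess(profile))
```

For example, `assess(ProfileCapacity("mem-pool", 20.0, 166.0, 16.0, 128.0))` returns `(4.0, 38.0)`: the apps at maximum scale would need 4 more vCPUs and 38 GB more memory than the profile can provide at its own maximum scale.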

Sample Provisioning Alert Message

The highest-level output is written to the console execution log of the job or app running the script. When a workload profile is underprovisioned, a message is printed there and should be alerted on.

WARNING: Workload Profile mem-pool is underprovisioned: CPU underprovisioned by 4.0 vCPUs. Memory underprovisioned by 38.0 GB.
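
To alert on such a message, a log consumer could extract the profile name and the shortfalls from the line. The sketch below infers the format from the single sample above; it assumes that either the CPU or the Memory clause may be absent, which the sample does not confirm. The names `WARNING_PATTERN` and `parse_warning` are hypothetical.

```python
import re

# Pattern inferred from the one sample message; both clauses are treated as optional.
WARNING_PATTERN = re.compile(
    r"WARNING: Workload Profile (?P<profile>\S+) is underprovisioned:"
    r"(?: CPU underprovisioned by (?P<cpu>[\d.]+) vCPUs\.)?"
    r"(?: Memory underprovisioned by (?P<memory>[\d.]+) GB\.)?"
)


def parse_warning(line):
    """Return the parsed fields of a warning line, or None if it does not match."""
    match = WARNING_PATTERN.search(line)
    if match is None:
        return None
    return {
        "profile": match.group("profile"),
        # A missing clause is reported as a zero deficit.
        "cpu_deficit": float(match.group("cpu") or 0),
        "memory_deficit_gb": float(match.group("memory") or 0),
    }
```

Applied to the sample line, `parse_warning` yields the profile `mem-pool` with a CPU deficit of 4.0 and a memory deficit of 38.0.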

Usage

Once gathered, the metrics become available under the Metrics panel for the job. The metric namespace is workload profile. All aggregations show the same number for any collection cycle. Individual workload profiles are available as dimension values and can be accessed via filters.
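
One way to see why the aggregations coincide is to look at a custom metric document carrying a single sample per collection cycle. The sketch below follows the general shape of an Azure Monitor custom metrics payload; whether monitor.py builds its payload this way, and the dimension name "Workload Profile", are assumptions. The function name `build_metric_payload` is hypothetical.

```python
from datetime import datetime, timezone


def build_metric_payload(metric_name, value, workload_profile,
                         namespace="workload profile", timestamp=None):
    """Build one custom-metric document holding a single sample."""
    timestamp = timestamp or datetime.now(timezone.utc)
    return {
        "time": timestamp.isoformat(),
        "data": {
            "baseData": {
                "metric": metric_name,
                "namespace": namespace,
                # Assumed dimension name, taken from the metrics table.
                "dimNames": ["Workload Profile"],
                "series": [
                    {
                        "dimValues": [workload_profile],
                        # One sample per cycle: min, max and sum are all the value.
                        "min": value,
                        "max": value,
                        "sum": value,
                        "count": 1,
                    }
                ],
            }
        },
    }
```

With a count of 1, the minimum, maximum, sum, and average (sum divided by count) are all equal to the sampled value, which would match the behaviour described above.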

Limitations and Todo

Jobs or Components are currently not part of the assessment.
GPU-based workload profiles are not included in the assessment.
