Adding documentation and usage example for complete overview. Adding reference links to OTEL is next.
alex-tsbk committed Aug 23, 2024
1 parent a624f7a commit c544109
Showing 28 changed files with 1,027 additions and 134 deletions.
138 changes: 10 additions & 128 deletions README.MD
@@ -46,7 +46,7 @@ At least one of `EcsClusters` or `CloudMapNamespaces` must be provided for appli
## Usage Example

> 🚧 Documentation is still in progress, it'll be finished in next week or so.
For the full example of running this in AWS, please navigate to the [/example](/example) folder.

You can also refer to [appsettings.json](src/Apptality.CloudMapEcsPrometheusDiscovery/appsettings.json) for supported configuration parameters.

@@ -82,8 +82,12 @@ Example run command:
# Will be added to all discovered targets.
-e DiscoveryOptions__ExtraPrometheusLabels="custom_static_tag=my-static-tag;" \
# Tag prefix to identify metrics port, [path | "/metrics"], [name | ""] triplets.
# Please refer to the documentation to learn more.
# Please refer to the configuration options to learn more.
-e DiscoveryOptions__MetricsPathPortTagPrefix="METRICS_" \
# Add new labels or modify existing ones in the response using token replacements,
# so that you do not need to modify your Grafana dashboards.
# Please refer to the configuration options to learn more.
-e DiscoveryOptions__RelabelConfigurations="cluster_and_service={{ecs_cluster}}-{{ecs_service}}" \
# Instructs .NET application to listen on 9001 inside the container
-e ASPNETCORE_URLS="http://*:9001" \
# ** DOCKER **
@@ -112,141 +116,19 @@ Example output:
"custom_resource_tag_environment": "dev",
"custom_resource_tag_component": "my-cool-application",
"custom_static_tag": "my-static-tag",
"cluster_and_service": "my-ecs-cluster-my-fargate-application",
"AmazonECSManaged": "true"
}
},
...
]
```
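
To illustrate how a `RelabelConfigurations` entry such as `cluster_and_service={{ecs_cluster}}-{{ecs_service}}` could produce the `cluster_and_service` label shown in the output above, here is a minimal Python sketch of the token replacement. It is not the application's actual implementation. The function names, the `;` separator between entries (borrowed from `ExtraPrometheusLabels`), and the handling of unknown tokens are all assumptions made for illustration.

```python
import re

# Matches {{token}} placeholders such as {{ecs_cluster}}.
TOKEN = re.compile(r"\{\{\s*([A-Za-z0-9_]+)\s*\}\}")


def parse_relabel_config(config: str) -> dict[str, str]:
    """Split 'name=template;other=template' into {name: template}.

    Assumption: entries are separated by ';', mirroring ExtraPrometheusLabels.
    """
    rules: dict[str, str] = {}
    for entry in config.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        name, _, template = entry.partition("=")
        rules[name.strip()] = template.strip()
    return rules


def apply_relabel(labels: dict[str, str], rules: dict[str, str]) -> dict[str, str]:
    """Return a copy of labels with each rule's template expanded and stored.

    Assumption: a token with no matching label expands to an empty string.
    """
    result = dict(labels)
    for name, template in rules.items():
        result[name] = TOKEN.sub(lambda m: result.get(m.group(1), ""), template)
    return result


labels = {"ecs_cluster": "my-ecs-cluster", "ecs_service": "my-fargate-application"}
rules = parse_relabel_config("cluster_and_service={{ecs_cluster}}-{{ecs_service}}")
print(apply_relabel(labels, rules)["cluster_and_service"])
# prints: my-ecs-cluster-my-fargate-application
```

Because the rule writes to a new label name, the original `ecs_cluster` and `ecs_service` labels are left untouched in this sketch.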

Plays nice with [OpenTelemetry receivers config](https://opentelemetry.io/docs/collector/configuration/#receivers):
Plays nicely with, and was originally designed for, [OpenTelemetry receivers config](https://opentelemetry.io/docs/collector/configuration/#receivers). See [example](/example) for more details.
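
Prometheus-style HTTP service discovery expects the endpoint to return a JSON array of target groups, each with a `targets` list of strings and an optional `labels` map of strings. The Python sketch below checks a payload against that shape, which can be handy before pointing a collector at the endpoint. The function name and the sample address are hypothetical and not part of this project.

```python
import json


def validate_http_sd(payload: str) -> list[dict]:
    """Parse an HTTP SD response body and check the shape of each target group."""
    groups = json.loads(payload)
    if not isinstance(groups, list):
        raise ValueError("top-level value must be a JSON array")
    for group in groups:
        if not isinstance(group, dict):
            raise ValueError("each target group must be a JSON object")
        targets = group.get("targets")
        if not isinstance(targets, list) or not all(isinstance(t, str) for t in targets):
            raise ValueError("'targets' must be a list of strings")
        labels = group.get("labels", {})
        if not isinstance(labels, dict) or not all(
            isinstance(k, str) and isinstance(v, str) for k, v in labels.items()
        ):
            raise ValueError("'labels' must map strings to strings")
    return groups


# Hypothetical sample payload; the address is made up for illustration.
sample = '[{"targets": ["10.0.0.12:8080"], "labels": {"custom_static_tag": "my-static-tag"}}]'
print(len(validate_http_sd(sample)))
```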

Considering your ECS Task Definition for discovery looks like this:
## Configuration Options

```terraform
resource "aws_ecs_task_definition" "task_definition" {
family = "${local.resource_prefix}-${local.ecs_name}"
network_mode = "awsvpc"
requires_compatibilities = ["FARGATE"]
cpu = "512"
memory = "1024"
execution_role_arn = aws_iam_role.ecs_task_role.arn
task_role_arn = aws_iam_role.ecs_task_role.arn
runtime_platform {
operating_system_family = "LINUX"
cpu_architecture = "X86_64"
}
container_definitions = jsonencode([
{
name = "${local.ecs_name}-config-reloader"
image = "apptality/aws-ecs-cloudmap-prometheus-discovery:0.1.0-alpha.29"
cpu = 128
memory = 128
essential = true
portMappings = [
{
containerPort = 9001
protocol = "tcp"
}
]
environment = [
{
name = "AWS_REGION"
value = local.region
},
... REST OF APPLICATION CONFIGURATION ...
]
logConfiguration = {
logDriver = "awslogs"
options = {
awslogs-group = var.cloud_watch_log_group_name
awslogs-region = local.region
awslogs-stream-prefix = "${local.ecs_name}-config-reloader"
}
}
},
{
name = "${local.ecs_name}-collector"
image = "public.ecr.aws/aws-observability/aws-otel-collector:latest"
cpu = 384
memory = 896
essential = true
logConfiguration = {
logDriver = "awslogs"
options = {
awslogs-group = var.cloud_watch_log_group_name
awslogs-region = local.region
awslogs-stream-prefix = "${local.ecs_name}-collector"
}
}
secrets = [
{
# Contents of "otel-collector-config.yaml" below stored in SSM
name = "AOT_CONFIG_CONTENT"
valueFrom = aws_ssm_parameter.ecs_opentelemtry_config.name
}
]
portMappings = [
{
containerPort = 2000
protocol = "udp"
}
]
dependsOn = [
{
containerName = "${local.ecs_name}-config-reloader"
condition = "START"
}
]
}
])
}
```

Your OTEL configuration may look something similar to:

```yaml
# otel-collector-config.yaml
receivers:
prometheus:
config:
global:
scrape_interval: 15s
scrape_timeout: 10s
scrape_configs:
- job_name: ecs_services
http_sd_configs:
- url: http://localhost:9001/prometheus-targets
refresh_interval: 10s
exporters:
prometheusremotewrite:
endpoint: https://aps-workspaces.${REGION}.amazonaws.com/workspaces/${WORKSPACE}/api/v1/remote_write
resource_to_telemetry_conversion:
enabled: true
auth:
authenticator: sigv4auth
extensions:
sigv4auth:
service: aps
region: ${REGION}
health_check: null
pprof:
endpoint: ":1888"
zpages:
endpoint: ":55679"
service:
extensions:
- sigv4auth
- pprof
- zpages
- health_check
pipelines:
metrics:
receivers: [prometheus]
exporters: [prometheusremotewrite]
```
Please refer to [appsettings.json](src/Apptality.CloudMapEcsPrometheusDiscovery/appsettings.json) for the complete list of supported configuration options.

## Contributing

46 changes: 46 additions & 0 deletions example/README.MD
@@ -0,0 +1,46 @@
# AWS ECS & CloudMap Prometheus Discovery - End-to-End Example
🚧 🚧 🚧

This example is a work in progress.
This note will be removed once the example has been tested in AWS.
Even so, it should be a solid starting point for you to build upon, with no major parts of the infrastructure missing.

---

This is a complete example of what it takes to run an observable
infrastructure in AWS ECS.

This example uses [Terraform](https://www.terraform.io/) to provision the infrastructure,
so some familiarity with Terraform is helpful, but not strictly required.

## Structure

`~/environments` - this is where you would store your environment configurations.
In this example I have a single configuration for `dev`.

`~/modules` - repeatable infrastructure constructs that can be used across multiple environments.

Under `~/modules` you will find the following:
* `infrastructure` - this module creates shared resources in which your applications are run.
This should typically be declared once per environment.
* `application` - this is the ECS application module that creates the ECS service and task definition.
You can have as many applications as you want.
* `monitoring` - this module creates the monitoring resources for your applications.

This example expects you to already have a VPC with public and private subnets defined.

## Setup
1. Install Terraform so it is available in your PATH.
   A reload or restart of your shell may be required.
1. Navigate to the `~/example/environments/dev` directory.
1. Paste AWS credentials for the target environment into your shell.
1. Run `terraform init`.
1. Run `terraform plan -out plan.tfplan`.
1. Inspect the plan and confirm it looks good.
1. Run `terraform apply plan.tfplan` to apply the changes.

When you're done, remember to clean up:
1. Run `terraform destroy`

> **Note:** This example is semi-production-ready. It is meant to be a starting point for you to build upon.
24 changes: 24 additions & 0 deletions example/environments/dev/locals.tf
@@ -0,0 +1,24 @@
locals {
# Identifier of VPC in which all resources are being created.
vpc_id = "vpc-1234567890abcdefg"
# Name of the integration environment, identified by its purpose (you can have many 'dev' environments).
environment = "dev"
# Name of the application environment. For example: aws-dev, aws-prod, etc.
# This is used as a lookup for the source artifacts, docker container tags, etc.
# This would also reference concrete configuration values for the application.
application_environment = "Development"
# List of public subnet ids. These are typically ones in which public facing services are deployed.
# It is recommended to have at least two public subnets in different availability zones for high availability.
public_subnet_ids = [
"subnet-12345678901234567", # vpc-public-subnet-1
"subnet-23456789012345679", # vpc-public-subnet-2
]
# List of private subnet ids. These are typically ones in which private services are deployed (DBs, internal services, etc.)
# It is recommended to have at least two private subnets in different availability zones for high availability.
private_subnet_ids = [
"subnet-34567890123456790", # vpc-private-subnet-1
"subnet-45678901234567901", # vpc-private-subnet-2
]
# Any additional tags to be added to the resources.
tags = {}
}
49 changes: 49 additions & 0 deletions example/environments/dev/main.tf
@@ -0,0 +1,49 @@
# 1. Bootstrap shared infrastructure
module "infrastructure" {
source = "../../modules/infrastructure"

environment = local.environment
vpc_id = local.vpc_id
public_subnet_ids = local.public_subnet_ids
private_subnet_ids = local.private_subnet_ids
tags = local.tags
}

# 2. Create as many services as needed.
# This template below creates a service called "webservice-backend"
module "app-webservice-backend" {
source = "../../modules/application"

environment = local.environment
application_environment = local.application_environment
vpc_id = local.vpc_id
public_subnet_ids = local.public_subnet_ids
private_subnet_ids = local.private_subnet_ids
ecs_custer_id = module.infrastructure.ecs_cluster_id
ecs_service_name = "webservice-backend"
ecr_repository_url = "123456789012.dkr.ecr.us-west-2.amazonaws.com/webservice-backend"
ecr_image_tag = "latest"
ecs_security_group_id = module.infrastructure.ecs_security_group_id
container_resources = { cpu = "512", memory = "1024" }
cloud_watch_log_group_name = module.infrastructure.cloud_watch_log_group_name
service_discovery_namespace = module.infrastructure.service_discovery_namespace
tags = local.tags
}

# 3. Monitoring
# This service is responsible for discovering Prometheus targets in the ECS cluster
# Once this is deployed and running - all you have to do is to hook this up to your Grafana:
# https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-onboard-query-standalone-grafana.html
module "monitoring" {
source = "../../modules/monitoring"

environment = local.environment
vpc_id = local.vpc_id
public_subnet_ids = local.public_subnet_ids
private_subnet_ids = local.private_subnet_ids
ecs_custer_id = module.infrastructure.ecs_cluster_id
ecs_cluster_name = module.infrastructure.ecs_cluster_name
cloud_watch_log_group_name = module.infrastructure.cloud_watch_log_group_name
service_discovery_namespace = module.infrastructure.service_discovery_namespace
tags = local.tags
}
20 changes: 20 additions & 0 deletions example/environments/dev/terraform.tf
@@ -0,0 +1,20 @@
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.59"
}
}
}

provider "aws" {
region = "us-west-2"

default_tags {
tags = {
terraform = true
component = "my-app"
environment = "dev"
}
}
}
31 changes: 31 additions & 0 deletions example/modules/application/cloudmap.tf
@@ -0,0 +1,31 @@
# This is just an example of how to create a Service Discovery Service with multiple scrapable target configurations for Prometheus.
#
# This application's primary purpose is to scrape CloudMap "Service Connect" configurations,
# while supporting CloudMap "Service Discovery" configurations as a fallback
# for backwards compatibility with ECS Service Discovery. AWS ECS Service Discovery is being deprecated.
resource "aws_service_discovery_service" "service_discovery" {
name = "sd-${local.ecs_service_name}"

dns_config {
namespace_id = var.service_discovery_namespace.id
routing_policy = "MULTIVALUE"
dns_records {
ttl = 60
type = "A"
}
}

tags = merge(var.tags, {
# Tags allow specifying multiple scrapable target configurations for Prometheus.
# They can be set at the ECS Service level, the ECS Task level, the CloudMap Namespace level, or the CloudMap Service level.
# The configuration below instructs the "aws-ecs-cloudmap-prometheus-discovery" application
# to scrape metrics from the container on port 8080, and from the container on port 9779.
# https://github.com/apptality/aws-ecs-cloudmap-prometheus-sd/blob/dev/src/Apptality.CloudMapEcsPrometheusDiscovery/appsettings.json#L115
"METRICS_PORT" = "8080"
"METRICS_PATH" = "/metrics"
"METRICS_NAME" = "/application"
"METRICS_PORT_ECS" = "9779"
"METRICS_PATH_ECS" = "/metrics"
"METRICS_NAME_ECS" = "ecs-exporter"
})
}
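
The tags above pair up by suffix: `METRICS_PORT`, `METRICS_PATH`, and `METRICS_NAME` form one target, and the `_ECS` variants form another. The Python sketch below shows one way such port/path/name triplets could be grouped. It is an illustration only, not the application's code. The function name and the grouping-by-suffix rule are assumptions; the defaults (`/metrics` for the path, an empty name) follow the run command comments in the main README.

```python
def metrics_targets(tags: dict[str, str], prefix: str = "METRICS_") -> list[dict]:
    """Group prefixed tags into port/path/name triplets keyed by their suffix."""
    groups: dict[str, dict[str, str]] = {}
    for key, value in tags.items():
        if not key.startswith(prefix):
            continue
        # "PORT" -> ("PORT", ""), "PATH_ECS" -> ("PATH", "ECS")
        field, _, suffix = key[len(prefix):].partition("_")
        if field not in ("PORT", "PATH", "NAME"):
            continue
        groups.setdefault(suffix, {})[field.lower()] = value

    targets = []
    for group in groups.values():
        if "port" not in group:
            continue  # Assumption: a triplet without a port is not scrapable.
        targets.append(
            {
                "port": int(group["port"]),
                "path": group.get("path", "/metrics"),
                "name": group.get("name", ""),
            }
        )
    return targets


tags = {
    "METRICS_PORT": "8080",
    "METRICS_PATH": "/metrics",
    "METRICS_NAME": "/application",
    "METRICS_PORT_ECS": "9779",
    "METRICS_PATH_ECS": "/metrics",
    "METRICS_NAME_ECS": "ecs-exporter",
}
for target in metrics_targets(tags):
    print(target)
```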
11 changes: 11 additions & 0 deletions example/modules/application/config.tf
@@ -0,0 +1,11 @@
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.59"
configuration_aliases = [
aws
]
}
}
}