eksctl create cluster --name=observability \
--region=ap-south-1 \
--zones=ap-south-1a,ap-south-1b \
--without-nodegroup
eksctl utils associate-iam-oidc-provider \
--region ap-south-1 \
--cluster observability \
--approve
eksctl create nodegroup --cluster=observability \
--region=ap-south-1 \
--name=observability-ng-private \
--node-type=t3.medium \
--nodes-min=2 \
--nodes-max=3 \
--node-volume-size=20 \
--managed \
--asg-access \
--external-dns-access \
--full-ecr-access \
--appmesh-access \
--alb-ingress-access \
--node-private-networking
aws eks --region ap-south-1 update-kubeconfig --name observability
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack \
-n monitoring --create-namespace \
-f ./k8s/custom-prometheus.yml
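A minimal sketch of what `./k8s/custom-prometheus.yml` might override is shown below; the actual file lives in the repo, and these keys are only illustrative (they are valid kube-prometheus-stack values, but not necessarily the ones this project uses):

```yaml
# hypothetical sketch of ./k8s/custom-prometheus.yml; the real file is in the repo
prometheus:
  prometheusSpec:
    retention: 7d
    # let Prometheus pick up ServiceMonitors created outside the Helm release
    serviceMonitorSelectorNilUsesHelmValues: false
```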
To verify that all the components were installed successfully, run the command below:
kubectl get pods -n monitoring
To access these apps we can use port-forwarding (kubectl proxy is an alternative). Run the commands below:
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090
kubectl port-forward -n monitoring svc/monitoring-grafana 8080:80
kubectl port-forward -n monitoring svc/alertmanager-operated 9093:9093
To access the Grafana dashboard, open a browser and navigate to http://localhost:8080. Use the default username and password (admin/prom-operator) to log in.
To access the Prometheus dashboard, open a browser and navigate to http://localhost:9090.
To access the Alertmanager dashboard, open a browser and navigate to http://localhost:9093.
- Review the Kubernetes manifest files located in `day-4/kubernetes-manifest`.
- Apply the Kubernetes manifest files to your cluster by running:
kubectl create ns dev
kubectl apply -k application/kubernetes-manifest/
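Since the apply uses `-k`, the directory is a kustomize root; its `kustomization.yaml` might look roughly like this (the resource file names are assumptions, the real list is in the repo):

```yaml
# hypothetical kustomization.yaml; actual resource names are in the repo
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: dev
resources:
  - deployment.yaml
  - service.yaml
```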
- Get the LoadBalancer DNS name, open it in a browser, and hit the DNS name with the following routes to test the application:
/
/healthy
/serverError
/notFound
/logs
/example
/metrics
/call-service-b
- Alternatively, you can run the automated script `test.sh`, which will automatically send random requests to the LoadBalancer and generate metrics:
./test.sh <<LOAD_BALANCER_DNS_NAME>>
- Review the Alertmanager configuration files located in `day-4/alerts-alertmanager-servicemonitor-manifest`; a brief overview follows.
- Before configuring Alertmanager, we need credentials to send emails. For this project we are using Gmail, but any SMTP provider such as AWS SES can be used, so grab the credentials for it.
- Open your Google account settings, search for "App passwords", create a new app password, and put it in `day-4/alerts-alertmanager-servicemonitor-manifest/email-secret.yml`.
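For reference, `email-secret.yml` is a plain Kubernetes Secret; a minimal sketch, assuming the secret name and key (check the file in the repo for the real ones):

```yaml
# sketch of email-secret.yml; metadata.name and the key are assumptions
apiVersion: v1
kind: Secret
metadata:
  name: mail-pass            # assumed name, referenced from the Alertmanager config
  namespace: monitoring
type: Opaque
stringData:
  password: <YOUR_GMAIL_APP_PASSWORD>
```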
- One last thing: add your email id in `day-4/alerts-alertmanager-servicemonitor-manifest/alertmanagerconfig.yml`.
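A sketch of what the email receiver in `alertmanagerconfig.yml` might look like, assuming the AlertmanagerConfig CRD and the secret name from the sketch above (the real routing and names are in the repo):

```yaml
# sketch of alertmanagerconfig.yml; names and routing are assumptions
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: email-alerts         # hypothetical name
  namespace: monitoring
spec:
  route:
    receiver: email
  receivers:
    - name: email
      emailConfigs:
        - to: <YOUR_EMAIL_ID>
          from: <YOUR_GMAIL_ADDRESS>
          smarthost: smtp.gmail.com:587
          authUsername: <YOUR_GMAIL_ADDRESS>
          authPassword:
            name: mail-pass  # the Secret from email-secret.yml (assumed name)
            key: password    # assumed key
```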
- HighCpuUsage: Triggers a warning alert if the average CPU usage across instances exceeds 50% for more than 5 minutes.
- PodRestart: Triggers a critical alert immediately if any pod restarts more than 2 times.
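Expressed as a PrometheusRule, these two alerts might look roughly like this (a sketch; the exact expressions and labels are in `day-4/alerts-alertmanager-servicemonitor-manifest`):

```yaml
# sketch of the two alert rules; expressions and names are assumptions
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: custom-alert-rules   # hypothetical name
  namespace: monitoring
  labels:
    release: monitoring      # lets the kube-prometheus-stack discover the rule
spec:
  groups:
    - name: custom-alerts
      rules:
        - alert: HighCpuUsage
          expr: avg(rate(node_cpu_seconds_total{mode!="idle"}[5m])) * 100 > 50
          for: 5m
          labels:
            severity: warning
        - alert: PodRestart
          expr: kube_pod_container_status_restarts_total > 2
          labels:
            severity: critical
```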
- Apply the manifest files to your cluster by running:
kubectl apply -k k8s/alerts-alertmanager-servicemonitor-manifest/
- Wait for 4-5 minutes and then check the Prometheus UI to confirm that the custom metrics implemented in the Node.js application are available:
  - `http_requests_total`: counter
  - `http_request_duration_seconds`: histogram
  - `http_request_duration_summary_seconds`: summary
  - `node_gauge_example`: gauge for tracking async task duration
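These metrics reach Prometheus through a ServiceMonitor pointed at the application's `/metrics` endpoint; a minimal sketch, assuming the Service's name, labels, and port (the real manifest is in the servicemonitor directory):

```yaml
# sketch of a ServiceMonitor for the Node.js app; selector and port are assumptions
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: nodejs-app           # hypothetical name
  namespace: monitoring
  labels:
    release: monitoring      # matches the Helm release so Prometheus discovers it
spec:
  namespaceSelector:
    matchNames: ["dev"]
  selector:
    matchLabels:
      app: nodejs-app        # assumed Service label
  endpoints:
    - port: http             # assumed port name
      path: /metrics
```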
- To test the alerting system, manually crash the container more than 2 times to trigger an alert (email notification).
- To crash the application container, hit the following endpoint
<<LOAD_BALANCER_DNS_NAME>>/crash
- You should receive an email once the application container has restarted at least 3 times.
- EFK is a popular logging stack used to collect, store, and analyze logs in Kubernetes.
- Elasticsearch: Stores and indexes log data for easy retrieval.
- Fluentbit: A lightweight log forwarder that collects logs from different sources and sends them to Elasticsearch.
- Kibana: A visualization tool that allows users to explore and analyze logs stored in Elasticsearch.
eksctl create iamserviceaccount \
--name ebs-csi-controller-sa \
--namespace kube-system \
--cluster observability \
--role-name AmazonEKS_EBS_CSI_DriverRole \
--role-only \
--attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
--approve
- This command creates an IAM role for the EBS CSI controller.
- The IAM role allows the EBS CSI controller to interact with AWS resources, specifically for managing EBS volumes in the Kubernetes cluster.
- Next, we will attach the role to the service account.
ARN=$(aws iam get-role --role-name AmazonEKS_EBS_CSI_DriverRole --query 'Role.Arn' --output text)
- This command retrieves the ARN of the IAM role created for the EBS CSI controller service account.
eksctl create addon --cluster observability --name aws-ebs-csi-driver --version latest \
--service-account-role-arn $ARN --force
- The above command deploys the AWS EBS CSI driver as an addon to your Kubernetes cluster.
- It uses the previously created IAM service account role to allow the driver to manage EBS volumes securely.
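Under the hood this is IRSA (IAM Roles for Service Accounts): the addon runs the controller under a ServiceAccount annotated with the role ARN, roughly like this (account ID is a placeholder):

```yaml
# illustrative only: the ServiceAccount the addon wires up for the controller
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ebs-csi-controller-sa
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/AmazonEKS_EBS_CSI_DriverRole
```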
kubectl create namespace logging
helm repo add elastic https://helm.elastic.co
helm install elasticsearch elastic/elasticsearch -n logging \
--set replicas=1 \
--set volumeClaimTemplate.storageClassName=gp2 \
--set persistence.labels.enabled=true
- Installs Elasticsearch in the `logging` namespace.
- It sets the number of replicas, specifies the storage class, and enables persistence labels to ensure data is stored on persistent volumes.
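Equivalently, the `--set` flags above translate into this values file:

```yaml
# values-file equivalent of the --set flags used above
replicas: 1
volumeClaimTemplate:
  storageClassName: gp2
persistence:
  labels:
    enabled: true
```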
# for username
kubectl get secrets --namespace=logging elasticsearch-master-credentials -ojsonpath='{.data.username}' | base64 -d
# for password
kubectl get secrets --namespace=logging elasticsearch-master-credentials -ojsonpath='{.data.password}' | base64 -d
- These commands retrieve the username and password for the Elasticsearch cluster from the `elasticsearch-master-credentials` secret.
- The values are base64 encoded, so they need to be decoded before use.
- Note: Please write down the password for future reference.
helm install kibana --set service.type=LoadBalancer elastic/kibana -n logging
- Kibana provides a user-friendly interface for exploring and visualizing data stored in Elasticsearch.
- It is exposed as a LoadBalancer service, making it accessible from outside the cluster.
- Note: Please update the `HTTP_Passwd` field in the `fluentbit-values.yaml` file with the password retrieved earlier in step 6 (e.g. NJyO47UqeYBsoaEU).
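For orientation, the Elasticsearch output section of `fluentbit-values.yaml` might look roughly like this (key layout assumed from the fluent/fluent-bit chart; only `HTTP_Passwd` needs your value):

```yaml
# sketch of the relevant part of fluentbit-values.yaml; layout assumed from the chart
config:
  outputs: |
    [OUTPUT]
        Name            es
        Match           *
        Host            elasticsearch-master
        Port            9200
        HTTP_User       elastic
        HTTP_Passwd     <YOUR_ELASTICSEARCH_PASSWORD>
        TLS             On
        TLS.Verify      Off
```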
helm repo add fluent https://fluent.github.io/helm-charts
helm install fluent-bit fluent/fluent-bit -f fluentbit-values.yaml -n logging
- This command retrieves the CA certificate from the Elasticsearch master certificate secret and decodes it, saving it to a `ca-cert.pem` file.
kubectl get secret elasticsearch-master-certs -n logging -o jsonpath='{.data.ca\.crt}' | base64 --decode > ca-cert.pem
- Creates a new Kubernetes namespace called tracing if it doesn't already exist, where Jaeger components will be installed.
kubectl create ns tracing
- Creates a ConfigMap in the tracing namespace, containing the CA certificate to be used by Jaeger for TLS.
kubectl create configmap jaeger-tls --from-file=ca-cert.pem -n tracing
- Creates a Kubernetes Secret in the tracing namespace, containing the CA certificate for Elasticsearch TLS communication.
kubectl create secret generic es-tls-secret --from-file=ca-cert.pem -n tracing
- Adds the official Jaeger Helm chart repository to your Helm setup, making it available for installations.
helm repo add jaegertracing https://jaegertracing.github.io/helm-charts
helm repo update
- Note: Please update the `password` field and other related fields in the `jaeger-values.yaml` file with the password retrieved earlier in day-4 at step 6 (e.g. NJyO47UqeYBsoaEU).
- The command below installs Jaeger into the tracing namespace using the custom `jaeger-values.yaml` configuration file. Ensure the password is updated in the file before installation.
helm install jaeger jaegertracing/jaeger -n tracing --values jaeger-values.yaml
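For reference, the storage section of `jaeger-values.yaml` might look roughly like this (key names assumed from the jaegertracing/jaeger chart; the TLS wiring against `es-tls-secret` is also configured in the file):

```yaml
# sketch of the storage section of jaeger-values.yaml; keys assumed from the chart
storage:
  type: elasticsearch
  elasticsearch:
    host: elasticsearch-master.logging.svc.cluster.local
    port: 9200
    user: elastic
    password: <YOUR_ELASTICSEARCH_PASSWORD>
```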
- Command forwards port 8080 on your local machine to the Jaeger Query service, allowing you to access the Jaeger UI locally.
kubectl port-forward svc/jaeger-query 8080:80 -n tracing
- To clean up the resources created in this project, run the following commands:
kubectl delete ns dev
kubectl delete ns logging
kubectl delete ns tracing
- The above commands delete the namespaces created for the application, logging, and tracing components, removing all resources within them.
eksctl delete cluster --name=observability --region=ap-south-1
- This command deletes the EKS cluster created for this project, including all associated resources, such as load balancers, security groups, and IAM roles.