Injecting Chaos into a single instance service with Kubernetes on Google Cloud
October 2nd, 2019
ChaosIQ experiments are run through Chaos Toolkit. This tutorial takes you through getting a single-node service running on Google Cloud with Kubernetes and running a chaos experiment against it to show the pod restart capability. By default, Kubernetes Pods have a restart policy, which means that if a Pod dies it is automatically restarted. The experiment at the end of this tutorial tests that restart hypothesis: after killing the pod and allowing a short delay for the restart, it verifies that the URL endpoint is available again with no further action.
Getting Started with Kubernetes on Google Cloud
You will need your Google account configured for billing, and some costs could be incurred if you leave resources running on Google Cloud. Steps are included at the end to tidy up and remove those resources.
Set up Google Cloud
- Visit the Kubernetes Engine page in the Google Cloud Platform Console.
- Create or select a project.
- Wait for the API and related services to be enabled. This can take several minutes.
- Make sure that billing is enabled for your Google Cloud Platform project. Learn how to enable billing.
Set up the Google command-line tools on your machine
We are using command-line tools locally, so apply the following steps:
- Install the Google Cloud SDK, which includes the gcloud command-line tool.
- Using the gcloud command line tool, install the Kubernetes command-line tool. kubectl is used to communicate with Kubernetes, which is the cluster orchestration system of GKE clusters:
gcloud components install kubectl
- If you don't have it already, install Docker Community Edition (CE) on your workstation. You will use this to build a container image for the application.
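Before moving on, it can help to confirm that the tools from the steps above are actually on your PATH. The following is a minimal sketch; the function names `missing_tools` and `check_tutorial_tools` are made up for this example and are not part of any of the tools.

```shell
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the three tools this tutorial installs (hypothetical helper).
check_tutorial_tools() {
  missing=$(missing_tools gcloud kubectl docker)
  if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing" >&2
    return 1
  fi
  echo "gcloud, kubectl and docker are all installed"
}
```

Running `check_tutorial_tools` after the installs should print the confirmation line, or list whatever still needs installing.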
Google Cloud config
Configure Google Cloud so you can create a cluster in your project:
$ gcloud config set project project-id
$ gcloud config set compute/zone compute-zone
$ gcloud container clusters create demo-cluster --num-nodes=1
- The project id is available from the Google Cloud Platform console under the project info, for example valued-fortress-239215.
- The compute-zone is a Google Compute Engine zone, for example europe-west2-a for London zone a.
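If you expect to recreate the cluster more than once, the three commands above can be wrapped in a small function. This is only a sketch: `create_demo_cluster` is a name invented here, and it simply refuses to call gcloud when either value is missing.

```shell
# Create the demo cluster for a given project and zone.
# Usage: create_demo_cluster PROJECT_ID COMPUTE_ZONE
create_demo_cluster() {
  if [ "$#" -ne 2 ] || [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: create_demo_cluster PROJECT_ID COMPUTE_ZONE" >&2
    return 2
  fi
  # Same commands as in the tutorial; stop at the first one that fails.
  gcloud config set project "$1" &&
    gcloud config set compute/zone "$2" &&
    gcloud container clusters create demo-cluster --num-nodes=1
}
```

Call it with your own values, for example `create_demo_cluster "$PROJECT_ID" "$COMPUTE_ZONE"`, where both variables are placeholders you set yourself.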
Building and deploying the Docker image
This tutorial deploys a Docker image onto the Kubernetes cluster created above. We are using a single-node service from the Chaos Toolkit Community Playground, specifically the files in the Starlette Starter folder, which build a Docker image and deploy it to Docker Hub. The image is a simple service built with Starlette, configured to respond with a JSON response to HTTP GET requests. You can either clone the Community Playground repository from GitHub, build your own image and deploy it on Docker Hub (see the README), or use the Deployment File as is and deploy the service from my Docker Hub. Once you have deployment.yaml in your local directory, you can deploy the services in the file by running:
$ kubectl apply -f ./deployment.yaml
Once this has been run you should have a single pod running:
$ kubectl get pods - it should show 1 pod in the Running state
$ kubectl get ingress - will show the Ingress service. If the service has been running long enough there will be an EXTERNAL-IP address, and you should be able to run curl or use a browser to see the response from the service, e.g. enter http://
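The external address can take a while to appear, so rather than re-running the command by hand you could poll for it. The helper below is a generic sketch (`wait_for_output` is a hypothetical name): it re-runs any command until it prints something. The kubectl example in the comments assumes the address is published in the Ingress status, and `INGRESS_NAME` is a placeholder for whatever name your deployment.yaml uses.

```shell
# Run a command repeatedly until it prints something, then echo that output.
# Usage: wait_for_output TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_for_output() {
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    # Treat a failing command the same as empty output and try again.
    output=$("$@" 2>/dev/null) || output=""
    if [ -n "$output" ]; then
      printf '%s\n' "$output"
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# Example (INGRESS_NAME is a placeholder, not a name from this tutorial):
# wait_for_output 30 10 kubectl get ingress "$INGRESS_NAME" \
#   -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```

The function returns a non-zero status if nothing was printed after the given number of tries, so it can be used in an `if` or chained with `&&`.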
Run Chaos experiment
Note: Kubernetes restarts the pod within a few seconds, so if you want to confirm that the pod goes down you can use the following bash command in another terminal:
while :; do curl http://
export APPLICATION_ENTRYPOINT_URL=http://220.127.116.11:8080/ ; export POD_LABEL=app=starlette; chaos run kill-pod.json
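To make the brief outage easier to spot, the watch loop from the note can also be written as a function that prints a timestamp and the HTTP status for each request. This is a sketch rather than the exact command from the note; `watch_endpoint` is a hypothetical name, and the default count and delay are arbitrary choices.

```shell
# Print a timestamp and the HTTP status of URL, COUNT times, DELAY seconds apart.
# curl prints 000 as the status when it gets no HTTP response at all.
# Usage: watch_endpoint URL [COUNT] [DELAY_SECONDS]
watch_endpoint() {
  url=$1
  count=${2:-60}
  delay=${3:-1}
  while [ "$count" -gt 0 ]; do
    # -s: quiet, -o /dev/null: discard the body, -m 2: give up after 2 seconds
    status=$(curl -s -o /dev/null -m 2 -w '%{http_code}' "$url") || true
    printf '%s %s\n' "$(date +%H:%M:%S)" "$status"
    count=$((count - 1))
    sleep "$delay"
  done
}
```

In a second terminal, pass it the same URL you export as APPLICATION_ENTRYPOINT_URL, start it just before `chaos run`, and look for the lines where the status is no longer 200.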
Open Chaos Catalog
As part of the Open Chaos Initiative we are developing and contributing to the chaos catalog on GitHub. Within the catalog there is an experiment to kill off a pod using its label, after which the pod will automatically restart. The experiment README contains the full details of how to run the experiment. In a nutshell, with the native chaos command, it can be as simple as:
(chaostk) export APPLICATION_ENTRYPOINT_URL=http://18.104.22.168:8080/ ; \
export POD_LABEL=app=starlette; \
chaos run https://raw.githubusercontent.com/open-chaos/experiment-catalog/master/kubernetes/kill_pod_by_label/kill_pod_by_label_experiment.json
This will run a locally installed Chaos Toolkit with the experiment defined in the GitHub catalog. Details of the environment variables and their meaning can be found in the README.
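To give a feel for what such an experiment contains, the snippet below writes a small illustrative experiment file. It is not the catalog's experiment: the file name, titles, namespace and pause length are assumptions made for this sketch, and it presumes the chaostoolkit-kubernetes extension is installed so that `terminate_pods` is available. It reads the same two environment variables, checks that the endpoint returns HTTP 200, kills the labelled pod, waits, and then lets Chaos Toolkit check the endpoint again.

```shell
# Write an illustrative experiment file (a sketch, NOT the catalog's experiment).
# The quoted 'EOF' stops the shell from expanding ${app_url} and ${pod_label}.
cat > kill-pod-sketch.json <<'EOF'
{
  "version": "1.0.0",
  "title": "The service recovers after its pod is killed",
  "description": "Illustrative sketch only; names and values are assumptions.",
  "configuration": {
    "app_url": {"type": "env", "key": "APPLICATION_ENTRYPOINT_URL"},
    "pod_label": {"type": "env", "key": "POD_LABEL"}
  },
  "steady-state-hypothesis": {
    "title": "The endpoint answers with HTTP 200",
    "probes": [
      {
        "type": "probe",
        "name": "endpoint-responds",
        "tolerance": 200,
        "provider": {"type": "http", "url": "${app_url}", "timeout": 3}
      }
    ]
  },
  "method": [
    {
      "type": "action",
      "name": "kill-pod-by-label",
      "provider": {
        "type": "python",
        "module": "chaosk8s.pod.actions",
        "func": "terminate_pods",
        "arguments": {"label_selector": "${pod_label}", "ns": "default"}
      },
      "pauses": {"after": 10}
    }
  ],
  "rollbacks": []
}
EOF
```

With both environment variables exported, `chaos run kill-pod-sketch.json` would run this sketch locally. The pause after the action is what allows the short delay for the restart before the hypothesis is checked again.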
The above is great when I want to run my experiments in isolation and I have full control over my environment. But what if I am part of a team or working in an enterprise? If I randomly kill off my production services at will, it could raise a few eyebrows and possibly have a negative impact on my career progression. This is where the ChaosIQ console comes in. As a user of the console I can view and control my experiments, but more importantly, the console gives my team visibility and control as well. My dashboard view can be seen below:
The console environment allows me to work with other members of my team, so I can share experiments and the results of executions. I can also set up Safeguards. These safeguards can, for example:
- stop me from running my experiment at a bad time operationally
- stop me from running an experiment that clashes with other team members' experiments
- let me set up a policy to stop all experiments now (big red button)
We are currently inviting people to join the ChaosIQ early access program. If you feel the features of ChaosIQ could benefit you or your team, or we can assist in any way on your journey to Chaos Engineering, please see our Early Access Request Page.
Cleaning up your gcloud resources
Having completed the tutorial, it's a good idea to tidy up and release your cloud resources to avoid being billed for them.
Delete the cluster
Deleting the cluster removes all GKE and Compute Engine resources:
$ gcloud container clusters delete demo-cluster
Chaos Engineering Resources
- Learning Chaos Engineering: takes you through all the steps required to get you on the journey to Chaos
- Chaos Engineering Observability: how to bring your chaos experiments into the world of system observability
- This week in Chaos Newsletter: keep up to date with current events and stories in Chaos Engineering
- ChaosToolkit: open source chaos toolkit to build and run your own experiments
- Open Chaos Initiative: open community to embrace free and open standards to enable everyone to share, collaborate on and learn from chaos engineering
- Open Chaos Github: open chaos resources on GitHub
- Chaos Experiment Catalog: experiments across a number of different platforms and services such as Kubernetes, Google Cloud, AWS and Azure