Injecting Chaos into a single instance service with Kubernetes on Google Cloud

ChaosIQ experiments are run through Chaos Toolkit. This tutorial walks you through getting a single-node service running on Google Cloud with Kubernetes, then running a chaos experiment against it to demonstrate the pod restart capability. By default, Kubernetes Pods have a restart policy, which means that if a Pod dies it is restarted automatically. The experiment at the end of this tutorial tests this restart hypothesis: after killing the pod, and allowing a short delay for the restart, it verifies that the URL endpoint is available again with no further action.
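
The restart hypothesis amounts to polling the endpoint until it responds again. The following is a minimal shell sketch of that idea, not the logic Chaos Toolkit itself uses. The retry helper, the attempt count, and the delay are illustrative assumptions.

```shell
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping DELAY
# seconds between failed attempts. Returns 0 on the first success, 1 otherwise.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (needs a reachable service): allow up to 10 tries, 3 seconds apart.
#   retry 10 3 curl -sf --max-time 2 -o /dev/null "$APPLICATION_ENTRYPOINT_URL"
```

In the example, curl's -f option makes it exit with a non-zero status on HTTP error responses, so the loop keeps trying until the service answers successfully or the attempts run out.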

Getting Started with Kubernetes on Google Cloud

You will need a Google account configured for billing. Some costs could be incurred if you leave resources running on Google Cloud, so steps are included at the end to tidy up and remove them.

Set up Google Cloud

  1. Visit the Kubernetes Engine page in the Google Cloud Platform Console.
  2. Create or select a project.
  3. Wait for the API and related services to be enabled. This can take several minutes.
  4. Make sure that billing is enabled for your Google Cloud Platform project. Learn how to enable billing.

Set up Google command-line tools on your machine

We use command-line tools locally, so apply the following steps:

  1. Install the Google Cloud SDK, which includes the gcloud command-line tool.
  2. Using the gcloud command line tool, install the Kubernetes command-line tool. kubectl is used to communicate with Kubernetes, which is the cluster orchestration system of GKE clusters:

gcloud components install kubectl

  3. If you don't already have it, install Docker Community Edition (CE) on your workstation. You will use this to build a container image for the application.

Google Cloud config

Configure Google Cloud so you can create a cluster in your project, replacing project-id and compute-zone with your own values:

  1. $ gcloud config set project project-id
  2. $ gcloud config set compute/zone compute-zone
  3. $ gcloud container clusters create demo-cluster --num-nodes=1

Building and deploying the Docker image

This tutorial deploys a Docker image onto the Kubernetes cluster created above. We use a single-node service from the Chaos Toolkit Community Playground, specifically the files in the Starlette Starter folder, which build a Docker image and deploy it to Docker Hub. The image is a simple service built with Starlette, configured to respond to HTTP GET requests with a JSON response. You can either clone the Community Playground repository from GitHub, build your own image, and deploy it to Docker Hub (see the README), or use the Deployment File as is and deploy the service from my Docker Hub. Once you have deployment.yaml in your local directory, deploy the services in the file by running:

$ kubectl apply -f ./deployment.yaml

Once this has been run, you should have a single pod running:

$ kubectl get pods - it should show 1 pod in the Running state

$ kubectl get ingress - shows the Ingress service. If the service has been running long enough, there will be an EXTERNAL-IP address, and you should be able to use curl or a browser to see the response from the service, e.g. enter http://:8080 in your browser address bar or run curl http://:8080, placing the EXTERNAL-IP address before :8080. Both should result in a JSON response.
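
To avoid copying the address by hand, the URL can be assembled in the shell. This is a small sketch, assuming the address is exposed under .status.loadBalancer.ingress[0].ip of the resource. The names service_url and INGRESS_NAME are hypothetical and are not part of the tutorial's files.

```shell
# Hypothetical helper: build the service URL from an external IP address and
# an optional port. The port defaults to 8080, as used in this tutorial.
service_url() {
  ip="$1"
  port="${2:-8080}"
  if [ -z "$ip" ]; then
    echo "usage: service_url IP [PORT]" >&2
    return 1
  fi
  printf 'http://%s:%s/\n' "$ip" "$port"
}

# Example (needs a running cluster; INGRESS_NAME is a placeholder you set):
#   ip=$(kubectl get ingress "$INGRESS_NAME" \
#          -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
#   curl "$(service_url "$ip")"
```

If the jsonpath query prints nothing, the address has probably not been assigned yet, and service_url reports a usage error instead of producing a URL with an empty host.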

Run Chaos experiment

Note: Kubernetes restarts the pod within a few seconds, so if you want to confirm that the pod goes down, run the following bash command in another terminal:

$ while :; do curl http://:8080/; sleep 1; done
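
A raw curl loop prints the response body or an error, which can be hard to scan while the pod restarts. The variation below is a sketch that prints one timestamped status line per request. The function names and the UP/DOWN/DEGRADED labels are illustrative assumptions.

```shell
# Hypothetical helper: turn an HTTP status code into a short label.
# curl reports 000 as the status code when no HTTP response was received.
describe_status() {
  case "$1" in
    2??) echo "UP ($1)" ;;
    000) echo "DOWN (no response)" ;;
    *)   echo "DEGRADED ($1)" ;;
  esac
}

# Request the URL once and print a timestamped status line.
probe_once() {
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$1")
  echo "$(date +%T) $(describe_status "$code")"
}

# Example: probe once per second until interrupted with Ctrl-C.
#   while :; do probe_once "$APPLICATION_ENTRYPOINT_URL"; sleep 1; done
```

The --max-time 2 limit keeps a hung connection from stalling the loop, so a short outage still appears as DOWN lines rather than a gap in the output.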

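Set the environment variables, replacing the example address with your own EXTERNAL-IP, and run the experiment:

export APPLICATION_ENTRYPOINT_URL=http://34.89.17.144:8080/
export POD_LABEL=app=starlette
chaos run kill-pod.json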
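
The contents of kill-pod.json are not shown in this tutorial. As a rough sketch only, an experiment of this kind might look like the file written below. It assumes the chaostoolkit-kubernetes extension is installed (it provides chaosk8s.pod.actions.terminate_pods), that the pod runs in the default namespace, and that a 10-second pause is enough for the restart. The function name, file name, titles, and probe name are made up for illustration.

```shell
# Hypothetical helper: write an illustrative experiment file to the given path.
# The quoted 'EOF' stops the shell from expanding ${app_url} and ${pod_label};
# Chaos Toolkit substitutes them from the "configuration" section at run time.
write_experiment_sketch() {
  cat > "$1" <<'EOF'
{
  "version": "1.0.0",
  "title": "Service recovers after its pod is killed (sketch)",
  "description": "Illustrative only; not the tutorial's kill-pod.json.",
  "configuration": {
    "app_url": {"type": "env", "key": "APPLICATION_ENTRYPOINT_URL"},
    "pod_label": {"type": "env", "key": "POD_LABEL"}
  },
  "steady-state-hypothesis": {
    "title": "The service endpoint responds",
    "probes": [
      {
        "type": "probe",
        "name": "endpoint-must-respond",
        "tolerance": 200,
        "provider": {"type": "http", "url": "${app_url}", "timeout": 3}
      }
    ]
  },
  "method": [
    {
      "type": "action",
      "name": "terminate-pod-by-label",
      "provider": {
        "type": "python",
        "module": "chaosk8s.pod.actions",
        "func": "terminate_pods",
        "arguments": {"label_selector": "${pod_label}", "ns": "default"}
      },
      "pauses": {"after": 10}
    }
  ],
  "rollbacks": []
}
EOF
}

# Example:
#   write_experiment_sketch kill-pod-sketch.json
#   chaos run kill-pod-sketch.json
```

Chaos Toolkit checks the steady-state hypothesis before and after the method, so the HTTP probe runs once against the healthy service and once more after the pod has been terminated and the pause has elapsed.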

Open Chaos Catalog

As part of the Open Chaos Initiative, we are developing and contributing to the chaos catalog on GitHub. Within the catalog there is an experiment that kills off a pod using its label; the pod is then restarted automatically. The experiment README contains the full details of how to run the experiment. In a nutshell, with the native chaos command, it can be as simple as:

(chaostk) export APPLICATION_ENTRYPOINT_URL=http://34.89.17.144:8080/ ; \
          export POD_LABEL=app=starlette; \
          chaos run https://raw.githubusercontent.com/open-chaos/experiment-catalog/master/kubernetes/kill_pod_by_label/kill_pod_by_label_experiment.json

This runs a locally installed Chaos Toolkit with the experiment defined in the GitHub catalog. Details of the environment variables and their meaning can be found in the README.

Chaos Console

The above is great when I want to run my experiments in isolation and have full control over my environment. But what if I am part of a team or working in an enterprise? Randomly killing off production services at will could raise a few eyebrows and possibly harm my career progression. This is where the ChaosIQ console comes in. As a user, I can view and control my experiments from the console and, more importantly, it provides visibility and control to my team. My dashboard view can be seen below:

Chaos Console Dashboard

The console environment lets me work with other members of my team, sharing experiments and the results of executions. I can also set up Safeguards. These safeguards can, for example:

  • protect me from running my experiment at a bad time operationally
  • protect me from running an experiment that clashes with other team members' experiments
  • let me set up a policy to stop all experiments now (big red button)

We are currently inviting people to join the ChaosIQ early access program. If you feel the features of ChaosIQ could benefit you or your team, or if we can assist in any way on your journey to Chaos Engineering, please see our Early Access Request Page.

Cleaning up your gcloud resources

Having completed the tutorial, it's a good idea to tidy up and release your cloud resources to avoid being billed for them.

Delete the cluster

Deleting the cluster removes all GKE and Compute Engine resources:

$ gcloud container clusters delete demo-cluster
