Why Use HAProxy
When we decided to use Kubernetes as our container orchestration solution, we had the opportunity to learn all of the Kubernetes terminology. I was familiar with pods, replication controllers, and services from previous work in Kubernetes, but since then Kubernetes had introduced Deployments, DaemonSets, load balancers, and Ingress resources (just to name a few).
We needed to expose our cloud monitoring service API endpoints for consumption both inside and outside of the Kubernetes cluster. Deciding which of the many Kubernetes resources to use in time for our beta release meant learning about every option, testing it, and bracing for the inevitable issues that appear once actual traffic hits it. An Ingress seemed like a reasonable way of exposing our services, but many questions loomed. Would Ingress even be around in the next version of Kubernetes? Things were changing quickly, and I was hesitant to spend lots of time configuring a resource that might not be supported in a year.
Could we enforce HSTS and ProxyProtocol? How easy would it be to route based on headers and paths, handle redirects, and route both external and internal API calls? Our team has used HAProxy before to do all of these things, and we like how flexible it is while handling very high load and consuming very few resources.
To make things even easier, HAProxy is stateless, making it a perfect candidate for containerization. After a bit of investigation (time is limited when trying to get ready for a beta launch), I determined that the most flexible setup for us going forward would involve running HAProxy in a Deployment, and exposing it with a NodePort service that is reachable by an AWS Elastic Load Balancer.
The Docker Image
At Blue Matador, I am more or less pioneering our use of Docker in production, so I try to balance doing things the Docker way with doing things in a way that is familiar to our engineering team, so that we can run things smoothly. While there are many existing public Docker images for HAProxy, I chose to roll my own so we could have the convenience of the tools we are used to, while also keeping things short and simple.
Our docker image basically consists of Ubuntu 16.04 (bloated, yes, but familiar) with some essentials added on like telnet and dnsutils, and then copying our haproxy.cfg into the container. To make debugging internal calls easier, I also install rsyslog and tail the HAProxy logs in the CMD.
I know this is less than ideal, but HAProxy only exposes logs via syslog and this is by far the quickest way to get up and running. Copying the config into the image also gives us a nice and simple way of updating our HAProxy config without managing Docker volumes: we rebuild the image and redeploy. As our team gets more familiar with our production system running in Docker, we will likely base our HAProxy image on a smaller OS to keep things light.
Earlier I mentioned enforcing HSTS and ProxyProtocol, redirecting, and routing based off of path and header. Below is a cleaned up version of our HAProxy config in case you are interested in doing any of these things.
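For a concrete starting point, here is a minimal sketch of a config covering those features, written out with a shell heredoc. It is an illustration rather than our exact production file: the backend names (api, api_v2) and their hostnames, the public hostname api.example.com, the X-Api-Version header, the "haproxy" service name used in the Host check, the stats port 1936, and all timeouts are assumptions.

```shell
# Write a sketch haproxy.cfg into the current directory.
cat > haproxy.cfg <<'EOF'
global
    log /dev/log local0
    daemon

defaults
    mode http
    log global
    option httplog
    timeout connect 5s
    timeout client  30s
    timeout server  30s

# Stats page, also usable as a health check target (port is an assumption)
listen stats
    bind *:1936
    stats enable
    stats uri /

# Port 9000: traffic from the ELB, which terminates SSL and sends ProxyProtocol
frontend external
    bind *:9000 accept-proxy
    # Enforce HSTS on every response that leaves through this frontend
    http-response set-header Strict-Transport-Security max-age=31536000
    # Route on path and header (both names are hypothetical)
    acl is_api   path_beg /api
    acl wants_v2 hdr(X-Api-Version) -i v2
    use_backend api_v2 if is_api wants_v2
    default_backend api

# Port 8000: in-cluster callers only, everyone else is redirected to HTTPS
frontend internal
    bind *:8000
    # Assumes in-cluster clients address HAProxy by a service named "haproxy"
    acl is_internal hdr_beg(host) -i haproxy
    http-request redirect prefix https://api.example.com code 301 unless is_internal
    default_backend api

backend api
    server api1 api.default.svc.cluster.local:80 check

backend api_v2
    server api2 api-v2.default.svc.cluster.local:80 check
EOF
```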
We rely on the ELB to terminate SSL and send traffic to port 9000. Only internal calls are allowed on port 8000; other clients are redirected to the SSL endpoint. We tell the two apart by the Host header, since only clients inside the cluster can address HAProxy through the service we set up later. Once your haproxy.cfg is ready, simply build the image, tag it, and push it to your Docker repo. An example Dockerfile is included below as a starting point.
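A minimal sketch of such a Dockerfile, again written with a heredoc, might look like the following. The package list follows the description above. The way rsyslog is started, the log path /var/log/haproxy.log (where the Ubuntu haproxy package's rsyslog snippet is assumed to write), and the exposed ports are assumptions, not a copy of our file.

```shell
# Write a sketch Dockerfile next to haproxy.cfg.
cat > Dockerfile <<'EOF'
FROM ubuntu:16.04

# HAProxy plus the debugging tools we are used to having around
RUN apt-get update \
 && apt-get install -y --no-install-recommends haproxy rsyslog telnet dnsutils \
 && rm -rf /var/lib/apt/lists/*

# Bake the config into the image so no volume is needed
COPY haproxy.cfg /etc/haproxy/haproxy.cfg

# Internal port, ELB port, and an assumed stats port
EXPOSE 8000 9000 1936

# Start syslog, start HAProxy in the background (assumes "daemon" is set in
# the global section), then tail the log so it reaches the container's stdout
CMD rsyslogd \
 && haproxy -f /etc/haproxy/haproxy.cfg \
 && tail -F /var/log/haproxy.log
EOF
```

From there, something like `docker build -t registry.example.com/haproxy:1 .` followed by `docker push registry.example.com/haproxy:1` builds, tags, and pushes the image; the registry name and tag are placeholders.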
The Kubernetes Config
Now that we have our Docker image ready to go, we can work on the Kubernetes config for actually running HAProxy. As mentioned earlier, I went with a Deployment resource to manage the lifecycle of the container. I had previous experience with ReplicationControllers, so a Deployment was a clear improvement for me. Basically, we just template out what the running container needs, label it so we can refer to it from other resources, and define a health check that goes to the HAProxy stats port.
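As a rough sketch of what such a Deployment could look like, the heredoc below writes haproxy_deployment.yaml. The apps/v1 API version, the image name, the replica count, the app: haproxy label, and the stats port 1936 are all assumptions chosen for illustration.

```shell
# Write a sketch Deployment manifest for HAProxy.
cat > haproxy_deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy
  labels:
    app: haproxy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: haproxy
  template:
    metadata:
      labels:
        app: haproxy          # label the Service selects on
    spec:
      containers:
        - name: haproxy
          image: registry.example.com/haproxy:1   # placeholder image
          ports:
            - containerPort: 8000   # internal calls
            - containerPort: 9000   # traffic from the ELB
            - containerPort: 1936   # stats page (assumed port)
          # Health checks go to the HAProxy stats page
          readinessProbe:
            httpGet:
              path: /
              port: 1936
          livenessProbe:
            httpGet:
              path: /
              port: 1936
            initialDelaySeconds: 5
EOF
```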
We run it with kubectl create -f haproxy_deployment.yaml and wait for the pods to start, checking on them with kubectl get pods. If any of your pods fail to start, it could be because HAProxy tries to resolve DNS for every configured backend immediately. If any of the backends you are referencing have not been created yet, create them now and then recreate the Deployment for HAProxy.
Now that our pods are running, it’s time to expose them to the cluster.
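A minimal sketch of a NodePort service for this, written to haproxy_service.yaml, could look like the following. The service name, the app: haproxy selector, and the nodePort values 30800 and 30900 are assumptions; any free ports in the cluster's NodePort range would do.

```shell
# Write a sketch NodePort Service manifest for HAProxy.
cat > haproxy_service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: haproxy
spec:
  type: NodePort
  selector:
    app: haproxy            # must match the pod label in the Deployment
  ports:
    - name: internal
      port: 8000
      targetPort: 8000
      nodePort: 30800       # assumed value
    - name: external
      port: 9000
      targetPort: 9000
      nodePort: 30900       # assumed value, the port the ELB forwards to
EOF
```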
kubectl create -f haproxy_service.yaml
Now, to expose our services to the internet, we need to create an ELB. We use Terraform to manage our AWS resources, and I have included an example Terraform config for an ELB that handles SSL termination and enables ProxyProtocol below. If you are unfamiliar with Terraform, the AWS documentation for Classic Load Balancers has instructions on how to enable ProxyProtocol.
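A hedged sketch of such a Terraform config is written to elb.tf below. It uses the aws_elb and aws_proxy_protocol_policy resources; the variable names, the instance port 30900 (an assumed NodePort for HAProxy's port 9000), and the health check settings are assumptions. Security groups and registering the Kubernetes nodes with the ELB are left out.

```shell
# Write a sketch Terraform config for the ELB.
cat > elb.tf <<'EOF'
variable "subnet_ids" {
  type = list(string)
}

variable "certificate_arn" {
  type = string
}

resource "aws_elb" "haproxy" {
  name    = "haproxy"
  subnets = var.subnet_ids

  # SSL is terminated at the ELB and forwarded as plain TCP to the NodePort
  listener {
    lb_port            = 443
    lb_protocol        = "ssl"
    instance_port      = 30900
    instance_protocol  = "tcp"
    ssl_certificate_id = var.certificate_arn
  }

  health_check {
    target              = "TCP:30900"
    interval            = 30
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 2
  }
}

# Enable ProxyProtocol on the instance port HAProxy listens behind
resource "aws_proxy_protocol_policy" "haproxy" {
  load_balancer  = aws_elb.haproxy.name
  instance_ports = ["30900"]
}
EOF
```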
So we had set up the HAProxy service in Docker successfully and configured the ELB to correctly handle SSL connections, and everything was great! That is, until a few days later, when I did end-to-end testing to make sure all of our services in Kubernetes played nicely together. I began noticing that some of our internal API calls were timing out after 5 seconds.
Considering some of those calls should not have even left the EC2 instance they were on, that was very alarming. After digging around in the application logs and HAProxy logs I noticed that the calls were not even making it to HAProxy from a container running on the same node. When you have something consistently failing on an interval like 5 seconds, you know there’s a timeout happening.
The time turned out to be going to DNS resolution. So why was it taking 5 seconds to resolve DNS for these internal calls? It turns out that DNS lookups were being made for both A and AAAA (IPv6) records in parallel, and the resolver was waiting for both responses. The internal DNS that was set up by default in Kubernetes did not respond to lookups for AAAA records, so the lookup timed out after, you guessed it, 5 seconds. The fix is simple really: all that is required is adding one line to the /etc/resolv.conf of the Kubernetes nodes (or wherever your containers inherit their resolv.conf from).
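With the glibc resolver, the option that produces this sequential behavior is single-request, so the line in question is most likely "options single-request". The sketch below assumes that option and works on a copy named resolv.conf.new (a hypothetical file name) so the result can be reviewed before anything on the node is changed.

```shell
# Work on a copy so the result can be reviewed before it is installed.
cp /etc/resolv.conf resolv.conf.new 2>/dev/null || : > resolv.conf.new

# Append the option only if it is not already present as a whole word
# (this deliberately does not match single-request-reopen).
if ! grep -Eq '(^|[[:space:]])single-request([[:space:]]|$)' resolv.conf.new; then
    echo 'options single-request' >> resolv.conf.new
fi

cat resolv.conf.new
# After review, install it on the node, for example:
#   sudo cp resolv.conf.new /etc/resolv.conf
```

On hosts where /etc/resolv.conf is generated by another tool, the change would need to go into that tool's configuration instead, or it will be overwritten.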
This simple configuration change makes it so that DNS lookups are performed sequentially, succeeding when the A record is returned.
When in a time crunch, it can be difficult to balance trying new technologies, preparing for future reliability, and making sure your engineering and ops teams do not face a steep learning curve when working with the system.
Using Docker for more of our system components is something I am adamant about because it makes managing development setup, testing, and delivering updates quickly much easier overall.