Tom Jackson, Lead Software Engineer at Nordstrom, discusses what StatefulSets mean for your database workloads.
If you have been following Kubernetes, you are probably aware that StatefulSets (aka pods with persistent volumes) recently graduated from alpha to beta. So does that mean they are ready for your database workloads?
Short answer: maybe.
Obviously using any beta feature in production is a buyer-beware proposition. But our experience is that Kubernetes is pretty conservative with beta declarations (they generally feel solid, feature complete, and stable), so that alone shouldn’t necessarily be a deal breaker.
Our experience is based on 12+ months of building a k8s cluster using AWS infrastructure and 3+ months of running production workloads on the cluster. Our cluster has 6 worker nodes and is Tectonic-like (CoreOS nodes, Prometheus monitoring, Grafana dashboarding, etc.) but is purely open-source (no Console/UI, no LDAP auth, no paid support, etc.).
In the last few months we’ve started to run more pods with persistent volumes, and in the last few weeks we’ve started to use StatefulSets. Am I ready to spike the ball and declare StatefulSets ready for prime time? Almost. StatefulSets are plenty expressive enough to describe our stateful applications, but it feels like the infrastructure support is not quite there, at least on AWS.
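To give a concrete sense of what "expressive enough" means, here is a minimal sketch of a StatefulSet that gives each replica its own persistent volume via a volume claim template. The names, image, and storage size are hypothetical, and the apiVersion shown is the beta-era `apps/v1beta1`:

```yaml
# Minimal StatefulSet sketch: three replicas, each with its own
# dynamically provisioned persistent volume (on AWS, an EBS volume).
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: example-db
spec:
  serviceName: example-db
  replicas: 3
  template:
    metadata:
      labels:
        app: example-db
    spec:
      containers:
      - name: db
        image: postgres:9.6
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  # One PersistentVolumeClaim is created per pod (data-example-db-0,
  # data-example-db-1, ...) and survives pod rescheduling.
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
```

The `volumeClaimTemplates` section is what makes each pod stateful: when a pod is rescheduled to another node, its claim (and the backing volume) follows it, which is exactly the attach/detach choreography the infrastructure has to get right.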
What do I mean by “infrastructure support”? Kubernetes is all about packing as many containers as possible onto as few physical (or virtual) hosts as possible. For stateful containers, that means that the OS on any given host could have many block devices under management (as many as one per container). And as new stateful containers come into and out of existence, the infrastructure has to seamlessly attach and detach the associated volumes. When we started experimenting with stateful containers six months ago, a significant fraction of our pods would fail to deploy due to failed or hung volume attachments. In the last month, we’ve seen dramatic improvements in the failure rate, but we still see occasional failures which usually require manual intervention to resolve.
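When one of those failures happens, the pod typically sits in `ContainerCreating` with attach or mount errors in its events. A few standard commands (the pod name and EBS volume ID below are placeholders) are usually enough to diagnose, and sometimes resolve, a stuck volume:

```shell
# Inspect the stuck pod; look for FailedAttachVolume / FailedMount events
kubectl describe pod example-db-0

# Cluster-wide events often surface the attach/detach controller's errors
kubectl get events --sort-by=.metadata.creationTimestamp

# On AWS, check whether the EBS volume is wedged in "attaching"/"detaching"
aws ec2 describe-volumes --volume-ids vol-0123456789abcdef0

# Last resort: force-detach the stuck volume so Kubernetes can reattach it
aws ec2 detach-volume --volume-id vol-0123456789abcdef0 --force
```

The force-detach is the "manual intervention" in question; it is safe for the data on the volume but obviously not something you want in your runbook for routine pod rescheduling.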
The moral of the story? If you are using a roll-your-own cluster approach like ours, I would encourage you to start experimenting with StatefulSets, but go in with reasonable expectations for stability and repeatability. If you are using GCE or a commercial Kubernetes product, you can probably expect better repeatability (and at least you’ll have a vendor you can have fun beating on). But my gut says any cluster based on AWS hosts is likely to have higher volume failure rates than you would like for production workloads. Your data will be safe and sound, but your pods won’t be as dynamic as you’d like.
The future looks bright, but you will likely find yourself in early-adopter mode for at least a few more months.