Mapping Demandbase’s Container Journey

As a DevOps Engineer at Demandbase, it’s my job to ensure the stability and reliability of our ABM platform. The DevOps team works closely with the rest of product and engineering to foster collaboration, build a culture of transparency and faster feedback, and create a system of seamless communication. (We also deploy some code every now and then.)

In the past few years, we’ve grown a lot as a team and have taken great strides to improve our processes and make our systems more efficient. In this blog post, we’ll share how our move to containers has helped us greatly speed up our development process and reduce friction across the entire product and engineering organization.

Going from a “Pets” to “Cattle” Methodology

When I started at Demandbase nearly two years ago, we were operating under a “pets” model. For those of you unfamiliar with the analogy, the “pets vs. cattle” meme that’s been making the rounds for quite a while now sums it up:

Essentially, under a “pets” model, web servers are treated as indispensable: they’re given unique names, are continually updated and modified, and require lots of manual intervention and documentation. The trouble with the “pets” methodology is that the infrastructure takes days to create and requires weekly servicing. In our case, it took us nearly seven hours to deploy one change, and if we found a problem, we’d have to start all over again. It was a cumbersome process, and it made testing and iterating incredibly difficult. We knew we needed to make a change and tighten up our processes.

Fortunately for us, a big trend at the time was immutable infrastructure—or what’s more commonly known as a “cattle” model. Under the “cattle” model, servers are built using automated tools and are designed for failure. If something needs to be updated, fixed, or modified, new servers are built from a new common image and replace the old ones.

Making the move from a “pets” to a “cattle” model required us to invest in an industry-standard tool called Packer. Packer gave us the ability to create those pieces of immutable infrastructure by using our existing Chef recipes to build versioned AMIs (Amazon Machine Images), which in turn cut our deployment times down to 30 minutes. Half an hour was a vast improvement over the previous seven hours, but we wanted to further streamline the development and testing experience. So the next step was putting all of this into containers with Docker.
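To give a sense of what this kind of build looks like, here is a stripped-down Packer template in the JSON format of that era. All values below—the region, source AMI, instance type, and cookbook name—are placeholders for illustration, not our actual configuration:

```json
{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-west-2",
      "source_ami": "ami-xxxxxxxx",
      "instance_type": "m3.medium",
      "ssh_username": "ubuntu",
      "ami_name": "app-server-{{timestamp}}"
    }
  ],
  "provisioners": [
    {
      "type": "chef-solo",
      "cookbook_paths": ["cookbooks"],
      "run_list": ["recipe[app-server::default]"]
    }
  ]
}
```

Running `packer build` against a template like this bakes a fresh, timestamped AMI from the Chef recipes, and deploying a change means launching new instances from the new image rather than mutating running servers.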

Docker made it possible for us to set up local development environments that were exactly like our live servers and made managing and deploying our applications much easier. In fact, Docker not only cut our development times down to almost five minutes, but it also helped us eliminate a lot of the guesswork that tends to occur when a difference between the development and test environment causes a deployment to fail. Now, if our developers do a release and there’s a major issue we need to roll back, the process only takes a few minutes.
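The key property is that the same image runs everywhere, locally and in production. As a hypothetical sketch (the base image, port, and commands are illustrative, not our actual setup), a Dockerfile like this produces one artifact that developers run on their laptops and we deploy to live servers:

```dockerfile
# Illustrative only: base image, port, and start command are placeholders.
FROM node:6

WORKDIR /app

# Install dependencies first so Docker's layer cache can skip this step
# when only application code changes.
COPY package.json ./
RUN npm install --production

COPY . .

EXPOSE 8080
CMD ["npm", "start"]
```

A developer can `docker build -t app .` and `docker run -p 8080:8080 app` to get the same environment the cluster runs, which is what eliminates the “works on my machine” class of deployment failures.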

However, the more we worked with Docker, the more we realized we needed a good orchestration framework to manage all of our containers in an efficient and deterministic way. Since containers have a relatively short shelf life, running them in production requires a wide variety of technologies, all of which have to be integrated and managed. While there were several container management systems on the market at the time, including Amazon ECS, Kubernetes, and Rancher, we needed to determine which solution would work best given our team and infrastructure.

To identify the right vendor, we held an old-fashioned bake-off: each of our engineers stood up a test cluster and ran a test application to gauge ease of use. Of all the vendors we tested, Rancher ended up winning, since at the time (Spring 2016) it provided the most mature solution: authorization integration, open source licensing, support for multiple environments and clouds, full Docker support, and a simplified framework based on the docker-compose syntax. Because most of Demandbase’s apps are 12-Factor-App compliant and don’t rely on persistent storage, this model allows us to run 100% of our container clusters on spot instances, for up to 80% cost savings. Currently, 90% of our applications run on Rancher clusters, and we can quickly deploy multiple applications using containers in a multi-cloud environment.
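Because Rancher consumes the docker-compose syntax, a stack is described declaratively in a familiar format. A hypothetical two-service stack might look like the following—the image names, tags, and ports here are made up for illustration, not our production configuration:

```yaml
# Illustrative docker-compose.yml for a Rancher stack; registry paths
# and version tags are placeholders.
version: '2'
services:
  web:
    image: example-registry/web-app:1.4.2
    ports:
      - "80:8080"
    links:
      - api
  api:
    image: example-registry/api:2.0.1
    expose:
      - "9090"
```

Because services defined this way hold no persistent state (in 12-Factor style), any container—or the spot instance underneath it—can disappear, and the orchestrator simply reschedules a replacement elsewhere in the cluster.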

Wrapping Up

By using container orchestration, we now have a continuous delivery process for internal testing. In addition to greatly speeding up our deployments, we’ve also reduced the amount of friction involved in the development process.

We’ve learned a ton in the past year and a half, and now we’re going to start looking at additional technologies, such as Kubernetes, to gain a deeper understanding of containers and make our systems even more stable and flexible. If this seems like the sort of work you’d be into, or you’d like to talk more about DevOps at Demandbase and the technologies covered in this post, send me a note on LinkedIn.