Infrastructure without Incident

By Andrew Duncan

Tech Spotlight: Terraform

Like a lot of software engineers, I started my career getting familiar with and supporting other people’s code, application design and setup. Trial by fire was the way of a junior developer trying to prove he had what it takes to contribute to the broader project. “One of the best ways to get up to speed is to fix bugs and support the app” was the message from the tenured few who built the brick house our customers were reliant on and paying us to maintain. But one thing I learned quickly was that this was a house of lies! Okay, maybe not lies, but things behind the scenes were definitely not as smooth and stable as our team and project leads indicated. I can remember one day looking through server configuration and noticing a lot of unusual settings, such as wide-open file system permissions and high timeout values. Asking about them led to abrupt moments of “Don’t touch that, it’ll probably break something,” but no one ever seemed to know why or remember the impetus that led to the current state of things.

Fast forward to 2016. While many of you will likely still sympathize with this state of affairs, I’ve been continuously looking for better ways to document and shed light on the bowels of my applications’ infrastructure. We’ve tried everything from maintaining wikis to doing our best to follow best practices, yet we still often get caught in a moment of downtime where we jump in and make changes and tweaks to get things up and running again, only to further clutter the landscape we rely on so heavily for stable applications. As the head of a technology team in an industry that changes often, busy building for and serving clients’ needs day in and day out, I knew we needed a better way.

Cue our most recent dive into a new technology, Terraform. With Terraform we’ve been able to fully automate the build, setup and configuration of cloud environments across major providers such as AWS and Azure. Even more, Terraform lets us fully comment the scripts for clarity and check them in and out of source control, so we have a controlled history of all builds and releases, which makes it a real game changer. With our move to Terraform we’ve also instituted a strong “No Touch” production support model in which all environments follow the “Immutable Server” (unchangeable) practice: nobody edits a running server by hand, which forces us to keep our focus on improving, tweaking and maintaining only our master Terraform environment scripts. This practice alone can dramatically reduce downtime and interruptions caused by environment changes, patches or worse.

The value goes well beyond the maintenance and tracking of changes. Take another scenario, in which we have dev environments that we know we’ll only actively use during business hours. We’ve been able to pair Terraform with our automation practices to build and destroy our non-prod environments on demand, reducing cloud waste and ultimately saving money. While we understand that this sort of approach may be a major shift, many of our clients are excited about the possibilities it provides in helping projects run on infrastructure without incident.
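To make the business-hours scenario a little more concrete, here is a minimal sketch in Python of how a scheduled job might decide whether a non-prod environment should be built or torn down. It is an illustration only, not our actual tooling: the 8:00 to 18:00 weekday window, the function names and the idea of running it from a scheduler such as cron are all assumptions made for the example. It also assumes a current Terraform CLI, where both `terraform apply` and `terraform destroy` accept `-auto-approve`.

```python
import subprocess
from datetime import datetime, time
from typing import List, Optional

# Assumed business hours for the example; adjust to your own team's schedule.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def desired_action(now: datetime) -> str:
    """Return "apply" during weekday business hours, otherwise "destroy"."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    in_hours = BUSINESS_START <= now.time() < BUSINESS_END
    return "apply" if is_weekday and in_hours else "destroy"


def terraform_command(action: str) -> List[str]:
    """Build the Terraform CLI invocation for the given action."""
    if action not in ("apply", "destroy"):
        raise ValueError("unsupported action: " + action)
    # -auto-approve skips the interactive confirmation prompt.
    return ["terraform", action, "-auto-approve"]


def reconcile(workdir: str, now: Optional[datetime] = None,
              dry_run: bool = True) -> List[str]:
    """Work out the command for the current time and optionally run it.

    workdir is a hypothetical directory holding the environment's
    Terraform scripts. With dry_run=True nothing is executed.
    """
    command = terraform_command(desired_action(now or datetime.now()))
    if not dry_run:
        subprocess.run(command, cwd=workdir, check=True)
    return command
```

Because `reconcile` defaults to a dry run, a scheduler can call it safely to see which command would be issued before anyone lets it touch a real environment. Keeping the decision (`desired_action`) separate from the execution also keeps the schedule itself in source control alongside the Terraform scripts.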

If you are looking for a better way to manage your cloud environments, create a history of builds and releases, all while minimizing cloud waste, we’d love to talk about our experience with Terraform and help you design a solution.

About The Author

Andrew Duncan is the Director of Software for Richmond. He is a driven technologist focused on modern technology stacks and best practices. Andrew believes nothing is more rewarding than making software needs a reality with a focus on flexible, scalable and supportable code.