One cannot escape the fact that the COVID-19 pandemic has drastically affected everyday life across the globe, especially with day-to-day work being impacted by stay-at-home orders, quarantines, and hopefully good sense about our mutual physical health.
What does that mean for businesses that continue running with employees working remotely? And what does that have to do with DevOps?
DevOps sits at the intersection of development teams and operations teams. While there’s no clear consensus as to what exactly DevOps does for a given organization, several principles have shaken out of the DevOps mindset that are worth exploring for every organization with an IT staff, which these days could be internal or external but is probably remote either way.
First, there is Infrastructure as Code (IaC). Whether an organization is on the HashiCorp bandwagon with Terraform, tied to a specific cloud provider’s tooling such as Amazon Web Services’ CloudFormation, or has gone another route with a homebrewed or lesser-known solution, the result is hopefully similar: infrastructure that is relatively self-documenting and deployable with little to no modification across multiple infrastructure locations.
IaC’s value is easy to see, especially given the current global situation – it is a minimalist disaster recovery (DR) plan for anything it incorporates. More than that, and yes, this is a dark consideration, if COVID-19 strikes your DevOps staff particularly hard, IaC makes it less difficult for others to maintain continuity and smooth running of the resources managed this way.
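To make the disaster recovery point concrete, here is a minimal sketch of the idea behind IaC, written in plain Python rather than any real tool. The names `Resource`, `STACK`, `render`, and `plan`, as well as the location labels `region-a` and `region-b`, are hypothetical and chosen only for illustration; this is not how Terraform or CloudFormation work internally. It only shows that one definition, kept as data, can be rendered for a new location and compared against what already exists there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One piece of infrastructure, described as data (hypothetical model)."""
    kind: str
    name: str
    size: str


# The single source of truth: the same definition is reused for every location.
STACK = (
    Resource("server", "web", "small"),
    Resource("database", "orders", "medium"),
)


def render(stack, location):
    """Expand the location-independent definition for one location."""
    return {
        f"{location}/{r.kind}/{r.name}": {"size": r.size, "location": location}
        for r in stack
    }


def plan(desired, actual):
    """Compare desired state with actual state and list the changes needed."""
    shared = desired.keys() & actual.keys()
    return {
        "create": sorted(desired.keys() - actual.keys()),
        "delete": sorted(actual.keys() - desired.keys()),
        "update": sorted(k for k in shared if desired[k] != actual[k]),
    }


if __name__ == "__main__":
    # Recovery scenario: the second location is empty, so everything is created.
    recovery_plan = plan(render(STACK, "region-b"), actual={})
    for action, keys in recovery_plan.items():
        print(action, keys)
```

Because the definition lives in version control rather than in one person’s head, anyone on the team can produce the same plan, which is what makes continuity easier when staff are unavailable.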
The second principle to consider is a strong Continuous Integration (CI)/Continuous Deployment (CD) pipeline. The tools here are far more numerous than in the IaC space, but commonly include Jenkins as an independent tool, Travis CI and CircleCI as services, and AWS’ CodeBuild and CodePipeline. With such a pipeline, stable, tested code moves through a process that lets developers and business owners get both new features and fixes to market quickly.
There are some caveats with CI/CD, however. CI/CD requires good tests and good code hygiene around those tests. While Test Driven Development (TDD) is not an absolute must, if it is not used, then there must be extra effort to document the code’s functionality so that the tests that are written can accurately assess the results in a strong pass/fail/skip methodology. If every test passes no matter what changes, CI/CD is useless for anything except pushing code out. Without proper tests, an automated integration and deployment process becomes a possibly disastrous exercise in mediocrity and generates huge amounts of technical debt.
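The difference between a test that always passes and one that can actually fail is easy to sketch with Python’s standard `unittest` module. Everything here is hypothetical and for illustration only: `apply_discount` stands in for real business code, `broken_apply_discount` simulates a regression, and `run_case` is a small helper that runs a test class against either version.

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical business function: reduce a price by a percentage."""
    return round(price * (1 - percent / 100), 2)


def broken_apply_discount(price, percent):
    """Simulated regression: the discount is added instead of subtracted."""
    return round(price * (1 + percent / 100), 2)


class VacuousTest(unittest.TestCase):
    func = staticmethod(apply_discount)

    def test_runs(self):
        # Exercises the code but asserts nothing, so it passes for any result.
        self.func(200.0, 25)


class MeaningfulTest(unittest.TestCase):
    func = staticmethod(apply_discount)

    def test_known_value(self):
        # Pass/fail: a documented input must produce a documented output.
        self.assertEqual(self.func(200.0, 25), 150.0)

    @unittest.skip("hypothetical: rounding rules not yet specified")
    def test_currency_rounding(self):
        # Skip: recorded explicitly instead of silently passing.
        pass


def run_case(case, func):
    """Run a test class against the given implementation and return the result."""
    patched = type(case.__name__, (case,), {"func": staticmethod(func)})
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(patched)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    for case in (VacuousTest, MeaningfulTest):
        result = run_case(case, broken_apply_discount)
        print(case.__name__, "passed against broken code:", result.wasSuccessful())
```

A pipeline gated only on `VacuousTest` would ship the regression, while one gated on `MeaningfulTest` would stop it.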
Finally, every environment and codebase should have appropriate monitoring. Monitoring is a key aspect of identifying trouble quickly, before it becomes a major issue for the end users of the product being developed and deployed. This is probably the widest product space, ranging across open source, closed source, and hybrid products along with a myriad of licensing and support options.
Monitoring on its own isn’t enough; it needs to be paired with some sort of alerting for critical resources. That might mean trusting the teams to diligently review alert mailboxes, using text alerts, or using an app or service, but if the right people do not know about problems, the right people cannot fix them.
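The pairing of monitoring with alerting can be sketched in a few lines. This is an illustration under stated assumptions, not the API of any monitoring product: the `Check` class, the `evaluate` function, the metric names, and the thresholds are all hypothetical, and `notify` stands in for whatever channel (email, text, or a paging service) actually reaches the right people.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Check:
    """A monitored metric with an upper limit (hypothetical model)."""
    name: str
    threshold: float
    critical: bool  # only critical checks notify a person


def evaluate(
    checks: Iterable[Check],
    readings: Dict[str, float],
    notify: Callable[[str], None],
) -> List[str]:
    """Record every problem found; notify someone only for critical ones."""
    problems = []
    for check in checks:
        value = readings.get(check.name)
        if value is None:
            # A metric that stopped reporting is itself a problem.
            message = f"{check.name}: no data received"
        elif value > check.threshold:
            message = f"{check.name}: {value} exceeds limit {check.threshold}"
        else:
            continue
        problems.append(message)
        if check.critical:
            notify(message)
    return problems


if __name__ == "__main__":
    checks = [
        Check("disk_used_percent", 90.0, critical=True),
        Check("queue_depth", 1000.0, critical=False),
        Check("api_error_rate", 0.05, critical=True),
    ]
    readings = {"disk_used_percent": 97.0, "queue_depth": 1500.0}
    # print stands in for a real notification channel.
    evaluate(checks, readings, notify=print)
```

The design choice worth noting is the split between what is recorded and what interrupts a person: everything over its limit is logged, but only critical resources trigger a notification.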
IaC, CI/CD, and monitoring are three core concepts of DevOps that should always be applied to reduce the overhead of supporting an IT infrastructure, and they become critical when staff shortages are compounded by increased demand. Being able to get in front of problems is the most important principle, but being able to understand and remediate them does not fall far behind.
Focusing on these concepts will increase the capabilities of any DevOps team under normal circumstances, and will help that team manage major issues more effectively in a crisis as well.