I’ve spent the last 3 years of my working life on deployment tools, first building them myself and then as a product manager on teams that build them.
Deployment tools should be…
Computers are much better than humans at following lists of predefined instructions.
It’s important that humans get to decide whether or not to proceed with the deployment, but once the decision has been made to go ahead, it should be automated as much as possible. CLIs make good automation tools.
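As a rough sketch of that split between the human decision and the automated run, here is a tiny CLI. The step names, the `--yes` flag and the function names are all made up for illustration; a real tool would plug in its own deployment logic.

```python
import argparse

# Hypothetical steps; a real tool would call its own deployment logic here.
STEPS = ["build artifact", "upload artifact", "switch traffic"]


def run_steps(steps, log=print):
    """Run every step in order with no further prompts; return the steps run."""
    done = []
    for step in steps:
        log(f"running: {step}")
        done.append(step)
    return done


def deploy(approved, steps=STEPS, log=print):
    """The human supplies the decision; everything after it is automated."""
    if not approved:
        log("deployment not approved; nothing was changed")
        return []
    return run_steps(steps, log)


def main(argv=None, ask=input):
    parser = argparse.ArgumentParser(description="Sketch of a deployment CLI")
    parser.add_argument("--yes", action="store_true",
                        help="approve the deployment without prompting")
    args = parser.parse_args(argv)
    # Either the flag or an interactive answer counts as the human decision.
    approved = args.yes or ask("Proceed with deployment? [y/N] ").strip().lower() == "y"
    return deploy(approved)


if __name__ == "__main__":
    main()
```

The prompt is the only place a human is consulted. Passing `--yes` lets the same command run unattended from a script once the decision has been made elsewhere.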
Many teams are indirectly involved in software deployment, from the teams who build the software being deployed, to the teams who build the platforms being deployed onto, to the teams who handle the deployments.
Everyone should be able to see how the software is deployed and monitor its success.
It is important that the maintainers of each of these tools can review and contribute to the code that deployments run from, and can help to keep them efficient and up to date.
Deployment tools should provide the maximum amount of information to the user but allow them to filter or hide it. The tool cannot tell up front which information will be key to triaging a deployment issue, so it should not attempt to withhold any. This is why GUIs make the best observation tools.
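One way to picture "record everything, filter at viewing time" is an event log that never drops anything and only narrows what is shown. This is a minimal sketch; the `Event` and `EventLog` names and the level and source values are assumptions for the example, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    level: str    # e.g. "debug", "info", "error"
    source: str   # which part of the deployment emitted the event
    message: str


class EventLog:
    """Keeps every event; filtering happens only when the log is viewed."""

    def __init__(self):
        self._events = []

    def record(self, level, source, message):
        # Nothing is discarded here, whatever its level.
        self._events.append(Event(level, source, message))

    def view(self, levels=None, source=None):
        """Return events matching the optional filters, in recorded order."""
        return [
            event for event in self._events
            if (levels is None or event.level in levels)
            and (source is None or event.source == source)
        ]
```

A viewer can default to `view(levels={"error"})` and widen to `view()` when the errors alone do not explain the problem, because the debug events were kept all along.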
Deployment works best when it is done by the authors of the software being deployed. They are ultimately responsible for the overall success of the software and are therefore the most motivated to ensure a successful deployment. They also have the best knowledge of the potential failure modes.
Deployment tools should look to bring safety of deployment to software teams rather than bringing software to deployment teams.
Like it or not, the first step in fixing a failed deployment is often rerunning it. The principle of “first do no harm” applies. The deployment should be rerunnable without much thought and without making a failed deployment worse.
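A common way to get that property is to make each step idempotent: check the current state first and act only if it differs from the desired state. The sketch below assumes the deployed state can be modelled as a simple mapping from service name to version; `ensure_version` and `deploy` are hypothetical names.

```python
def ensure_version(state, service, version):
    """Idempotent step: bring `service` to `version`, or do nothing if it is already there."""
    if state.get(service) == version:
        return "unchanged"
    state[service] = version
    return "updated"


def deploy(state, desired):
    """Apply every desired version; safe to call again after a partial failure."""
    return {
        service: ensure_version(state, service, version)
        for service, version in desired.items()
    }
```

If an earlier run got as far as updating one service before failing, rerunning `deploy` with the same `desired` mapping leaves that service alone and only touches the ones that still need work.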
Deployment often involves running the same or similar steps for different environments or different services. There will be the same services deployed in different ways or different services deployed in the same ways.
Deployment workflows for new services should be built by composing pieces from existing workflows.
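To make the idea of composition concrete, here is a small sketch in which a workflow is just a function built from smaller functions. The step names and the shape of the context dictionary are invented for the example; real steps would do actual work instead of appending to a log.

```python
def compose(*steps):
    """Build a workflow that runs the given steps in order on a shared context."""
    def workflow(context):
        for step in steps:
            context = step(context)
        return context
    return workflow


def step(name):
    """Make a placeholder step that records what it would have done."""
    def run(context):
        entry = f"{name} {context['service']}"
        return {**context, "log": context["log"] + [entry]}
    return run


# Hypothetical building blocks shared by several services.
build = step("build")
migrate = step("migrate database for")
release = step("release")
smoke_test = step("smoke test")

# A composed workflow is itself a step, so it can be reused as a piece.
finish = compose(release, smoke_test)
deploy_stateless = compose(build, finish)
deploy_with_database = compose(build, migrate, finish)
```

A new service that needs a database migration gets `deploy_with_database` without anyone rewriting the build, release or smoke test pieces.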
Since deployment software operates in production environments, error handling is a particular concern.
Errors can occur in all parts of the stack. The deployment tool must handle all of these errors gracefully and surface information to a human as quickly as possible. Error information should include what the error is, where it occurred, suggestions for how to fix it, and suggestions of who to call for help.
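Those four pieces of information can be carried on the error itself. The following is only a sketch of one possible shape; `DeploymentError`, `run_step` and the field names are assumptions made for the example.

```python
class DeploymentError(Exception):
    """Carries the four pieces of information a human needs to start triage."""

    def __init__(self, what, where, suggested_fix, contact):
        super().__init__(what)
        self.what = what
        self.where = where
        self.suggested_fix = suggested_fix
        self.contact = contact

    def report(self):
        """Format the error for display to whoever is running the deployment."""
        return "\n".join([
            f"What failed:   {self.what}",
            f"Where:         {self.where}",
            f"Suggested fix: {self.suggested_fix}",
            f"Who to call:   {self.contact}",
        ])


def run_step(name, action, suggested_fix, contact):
    """Run one step and wrap any failure with triage information."""
    try:
        return action()
    except Exception as exc:
        # Keep the original exception chained so no detail is lost.
        raise DeploymentError(
            what=f"{type(exc).__name__}: {exc}",
            where=name,
            suggested_fix=suggested_fix,
            contact=contact,
        ) from exc
```

Attaching the fix and the contact when the step is defined, rather than when it fails, means the people who know the step best write that advice ahead of time.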
Since deployment software is just a class of software, general principles of good software design will apply. From that list, though, two principles stand out especially strongly…
Deployments should be testable before they are run in production. It may not be feasible to reproduce exact production environments, but the composable parts of the deployment should be individually testable.
The deployment tool should be highly available so that it is unlikely to go down in the middle of a deployment.
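One way to test a single part without a production environment is to pass the platform in as an argument, so that a test can hand the step an in-memory stand-in. This is a sketch only; `FakePlatform` and its two methods are a hypothetical interface, not any real platform's API.

```python
class FakePlatform:
    """In-memory stand-in for a real platform client (hypothetical interface)."""

    def __init__(self):
        self.running = {}

    def start(self, service, version):
        self.running[service] = version

    def current_version(self, service):
        return self.running.get(service)


def release(platform, service, version):
    """Start the requested version and verify that the platform reports it."""
    platform.start(service, version)
    if platform.current_version(service) != version:
        raise RuntimeError(f"{service} did not reach {version}")
    return version


def test_release_starts_requested_version():
    platform = FakePlatform()
    assert release(platform, "billing", "2.4.0") == "2.4.0"
    assert platform.running == {"billing": "2.4.0"}
```

The fake does not prove that production will behave the same way, but it does let the logic of each step be checked long before a real deployment depends on it.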