I don’t normally write about my day job, but the subject of this post is something I care a lot about. Over the past year I was in charge of researching, deploying, and maintaining modern software development tools at my workplace. It’s been an enlightening experience and it’s given me a firm understanding of the motivation behind these tools. Prior to this, the company had a draconian system for source control, which consisted of senior engineers copying source over the network and using a visual merge tool to maintain a known-good version of the source. The difference since introducing these tools has been night and day, so I will spend today’s post discussing the benefits of continuous integration (CI) and how it enforces high-quality source.
I mentioned Git in a blog post long ago, so I won’t talk about what it is or how it works here. Instead, I will comment on how you can go wrong with Git and how to integrate CI with it. With regard to the former, the “Git Flow” model almost encourages bad practice. It entangles the commit tree by mandating separate master, develop, feature, and release branches connected via a stream of pointless merge commits. To preserve the clarity and utility of the Git log, there should ideally be only one master branch on origin. Developers must make sure that their local branches are continuously rebased on the latest origin branch. Only when a topic branch is complete can it be pushed to origin, and then only as a single commit. The commit message should completely describe the feature being added and nothing else. This model ensures that the developer is continuously integrating the latest source into their development branch, which minimizes conflicts and bugs. It also ensures that each commit on the origin branch signifies a meaningful version of the source that someone would conceivably check out. The commit log is then a very clear sequence of new features and bug fixes being introduced.
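To make the rebase-then-squash workflow concrete, here is a minimal sketch in Python that lists the Git commands involved. The function names (`publish_commands`, `publish`), the branch name `feature-x`, and the default of `master` as the main branch are my own assumptions for illustration. The `git` subcommands themselves are standard, but treat this as one possible way to script the workflow rather than the exact procedure we use.

```python
import subprocess
from typing import List


def publish_commands(topic: str, message: str, main: str = "master") -> List[List[str]]:
    """Return the git commands that land `topic` on origin as a single commit.

    Hypothetical helper: the names and defaults are illustrative only.
    """
    return [
        # Bring the topic branch up to date with the latest origin source.
        ["git", "fetch", "origin"],
        ["git", "checkout", topic],
        ["git", "rebase", f"origin/{main}"],
        # Move the local main branch to the tip of origin.
        ["git", "checkout", main],
        ["git", "merge", "--ff-only", f"origin/{main}"],
        # Stage the whole topic branch as one change, then commit it once.
        ["git", "merge", "--squash", topic],
        ["git", "commit", "-m", message],
        ["git", "push", "origin", main],
    ]


def publish(topic: str, message: str, main: str = "master", dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, stopping at the first failure."""
    for cmd in publish_commands(topic, message, main):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run only: prints the commands instead of touching a repository.
    publish("feature-x", "Add feature X")
```

Keeping the command list separate from the code that executes it makes the sequence easy to inspect before anything touches a real repository. In a workflow with code review, the final push would only happen after the change has been approved.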
With CI embedded into our source control, we now look at how to improve it with code review. In a pre-commit code review workflow, colleagues must review and approve code changes before a push to origin. While there’s the obvious benefit of having a second pair of eyes to catch common errors in code, the true benefit of code review comes from the social aspects. I don’t have scientific research to back this up, but I’ve noticed the following social benefits of code review:
- The idea that someone else will be looking at my source makes me code better.
- Reviewing the source of others enforces style consistency and improves the understanding of the rest of the code base.
- Constructive criticism from code review provides a positive feedback loop for developers looking to improve.
Programmers today are familiar with GitHub. It provides source control and a platform for collaboration around software, and code review there works through pull requests. The only issue is that it’s SaaS, and you have to buy a subscription for things like private repositories, which was more than my budget (or lack thereof) allowed. To get around this I chose to deploy Phabricator, which was open sourced by Facebook. All I had to do was salvage an old computer and self-host the application on the company’s intranet. Just like that, the company suddenly had a platform for documentation, task tracking, and code review. Documentation and issue tracking are critical to software development in their own right, but that will be a discussion for another day. Just know that we went from “tribal knowledge” to having a wiki for documentation, and from email threads to a working ticketing system. Phabricator completely changed our workflow in ways that CI alone could not have.
Finally, we look into automating the review process and providing a truly continuous feedback loop for source pushed to the main repository. This is where Jenkins comes into play. Initially, my task was to automate the build process of our software and take that responsibility away from our company president. Jenkins does this at the click of a button by cleanly checking out the latest source, running a build script, and archiving all the resulting software artifacts. Its role in CI is to apply this process continuously to every commit that gets pushed to the repository. If a change breaks the build of any product, even one it seemed unrelated to, the server immediately notifies the developer, who must correct the issue. This exposes problems that would previously have gone unnoticed until the time came to build a new version of a particular product. Although our source was never designed with testing in mind, I’ve added a number of automated test scripts to Jenkins that run against the built artifacts in lieu of unit tests. These tests continuously verify the basic operation of our products without the need for human testing resources. They serve as an extra layer of automated quality control which source must pass before being truly committed.
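The per-commit check described above boils down to a simple loop: build every product, run its smoke tests, and tell the developer about anything that failed. The sketch below shows that logic in plain Python. It is not Jenkins’ API or configuration (Jenkins is set up through its own jobs), and the names `check_commit` and `run_command`, along with the shape of the `products` mapping, are assumptions made for illustration.

```python
import subprocess
from typing import Callable, Dict, Sequence, Tuple

Command = Sequence[str]


def run_command(cmd: Command) -> int:
    """Run a command and return its exit code (0 means success)."""
    return subprocess.run(cmd).returncode


def check_commit(
    products: Dict[str, Tuple[Command, Command]],
    notify: Callable[[str], None],
    run: Callable[[Command], int] = run_command,
) -> bool:
    """Build and smoke-test every product; return True only if all pass.

    `products` maps a product name to (build command, test command).
    `notify` receives one message per failure, e.g. to email the developer.
    """
    all_ok = True
    for name, (build_cmd, test_cmd) in products.items():
        if run(build_cmd) != 0:
            notify(f"{name}: build failed")
            all_ok = False
            continue  # no artifacts to test if the build broke
        if run(test_cmd) != 0:
            notify(f"{name}: smoke test failed")
            all_ok = False
    return all_ok
```

Every product is checked even after an earlier one fails, so a single commit that breaks two products produces two notifications instead of hiding the second failure behind the first.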
With these tools we are able to complete the CI workflow, which enforces high standards for source code being committed to a repository. Git ensures that the developer can always test against the latest source. Phabricator adds a feedback-based quality control check and social pressure for high-quality contributions. Finally, Jenkins automates the grunt work of building and testing each contribution against each product before giving the final approval. The result is that only the highest quality source code possible gets pushed into the repository. Adopting and deploying CI has changed the way I develop, and I can’t imagine ever going back to anything less.