A few years ago Codility had a (very rare) outage...
Here's what happened: we updated our website, it started slowing down and, within a few minutes, it ground to a halt. After looking at recent changes to the code we quickly determined the culprit: a recently added feature that stored additional data for all test sessions. It was coded so that every Codility Dashboard page load caused a number of rows to be written to the database. With a large volume of users browsing the tests, our database soon became overloaded. The mistake was pretty simple, and if one more person had looked at the code before deployment, we probably would have avoided it.
That, in fact, is what we did to prevent such mistakes in the future: we decided that all changes to our system should go through code review. As we adopted more such practices, we understood that getting them right can make a tremendous impact: we avoid mistakes, we work faster and we write better code. Here are a few things that worked really well for us.
The first step is to be able to make changes without fear. To ensure that, we practice continuous integration. There are automated tests for every part of our system, and we run them on every commit. As a result, we're able to deploy changes soon after they are merged, without going through a lengthy QA process. Most behind-the-scenes releases are a "non-event": usually, it just means one developer pushing a button.
Automated unit and integration tests are enough to smoothly make changes to the code, but may not be enough to ensure the quality of these changes. For instance, an engineer might end up doing something in an inefficient way, as in the case of our outage. That person might also not know all the conventions, or may introduce a subtle bug.
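To make the outage example concrete, here is a minimal sketch of that kind of mistake. It is not our actual code: the table `session_extra`, the handler `render_dashboard`, and the numbers are all hypothetical, and an in-memory SQLite database stands in for the real one. It only shows how a page view that should be read-only can end up writing one row per test session on every load.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical table for the extra session data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE session_extra (session_id INTEGER, viewed_by TEXT)")
    return conn


def render_dashboard(conn, user, session_ids):
    """Hypothetical page handler with the problem pattern: every page load
    writes one row per listed test session before returning the page."""
    for session_id in session_ids:
        conn.execute(
            "INSERT INTO session_extra (session_id, viewed_by) VALUES (?, ?)",
            (session_id, user),
        )
    conn.commit()
    return f"dashboard for {user}: {len(session_ids)} sessions"


def count_rows(conn):
    """Return how many rows have been written so far."""
    return conn.execute("SELECT COUNT(*) FROM session_extra").fetchone()[0]


def simulate(page_loads, sessions_per_page):
    """Simulate a number of page loads and return the total rows written."""
    conn = make_db()
    for i in range(page_loads):
        render_dashboard(conn, f"user{i}", range(sessions_per_page))
    return count_rows(conn)
```

In this sketch, `simulate(100, 20)` writes 2000 rows for 100 page views. A test that only checks the rendered page would still pass, while someone reading the diff has a good chance of noticing a write inside what should be a read path.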
That's the value of code reviews! Instead of committing code directly to the main branch, the author creates a merge request for another person to review. The reviewer has to read the code and either approve it, or point out problems.
There are many benefits here. First of all, it’s a great teaching tool: we can make sure that new engineers know our coding style, standards, and architecture. The reviewer might also point out code that violates some assumptions that the system makes, and breaks as a result. Since two people know the code now, they can discuss it: often, the author and the reviewer come up with a better design together. And of course, if more people are familiar with the code, the team is better off in the long run.
The downside is that the review is asynchronous. The author writes some code, then the reviewer has to read it and comment. After that, it goes back to the author, and so on. We've since found some ways to shorten this cycle. Reviewers can prepare a list of issues and then fix them together with the author. Or they can “ping-pong”: make the changes themselves and pass the code back to the original author to review.
When code reviews are too slow, we also have pair programming. Often, it’s simply useful to have a “co-pilot” sitting together with a programmer: someone who will discuss the design, and watch for mistakes. Pair programming is something like an immediate code review: Instead of reading after the fact, you can talk about the code as it is written.
One thing that we also noticed is that pair programming helps us avoid getting stuck. When working alone, it’s easy to fall into one line of thinking and spend many hours on a bad approach. If someone is forced to explain their ideas to another person, and convince them, there’s a chance that together the two will arrive at a better way of doing something.
The greatest benefit here, however, is that people learn from each other. If a new developer joins, the fastest way to show them around is to sit together and start coding. You can also notice all kinds of interesting habits and tools that the other person is using.
Recently, we also started to practice “ping-pong TDD” (test-driven development) during pair programming. We connect two keyboards to one computer. The first person writes a failing test, the second person makes it pass, and the first person refactors the code. Then the second person writes the next failing test, and so on. This is not only a really fun way of doing TDD, but also ensures that both people participate equally and understand everything that is happening.
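As an illustration, here is roughly what the code might look like after two rounds of ping-pong. The `median` function and its tests are a made-up example, not something from our codebase, and the comments mark which person would have written each part under the rotation described above.

```python
import unittest


def median(values):
    """State of the code after two ping-pong rounds (hypothetical example)."""
    if not values:
        # Added in round 2 by person A to satisfy B's failing test.
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        # Round 1: person B wrote just enough to pass A's odd-length test.
        return ordered[mid]
    # Round 1 refactor by person A: handle even-length input explicitly.
    return (ordered[mid - 1] + ordered[mid]) / 2


class MedianTest(unittest.TestCase):
    def test_odd_length(self):
        # Round 1: person A writes this test first and watches it fail.
        self.assertEqual(median([3, 1, 2]), 2)

    def test_even_length(self):
        # Added by person A during the round 1 refactoring step.
        self.assertEqual(median([4, 1, 3, 2]), 2.5)

    def test_empty_raises(self):
        # Round 2: person B writes the next failing test; A makes it pass.
        with self.assertRaises(ValueError):
            median([])
```

Saved as a file, the tests can be run with `python -m unittest`. Because the keyboard changes hands at every step, neither person can go far without the other understanding the code.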
There is one more practice that worked really well combined with those above: The team owns the tasks. There are no individual owners for the programming tasks we do, but we’re responsible for finishing them as a team. This turns out to be really important.
When we were starting out, our first attempt at planning was to split work into weekly iterations and, at the beginning of each iteration, decide who would work on what. We would exchange merge requests and review each other’s code, but generally everybody had their own backlog of things to do.
Unfortunately, this meant we worked in isolation. Since everybody was focused on their own work, there was no reason for us to synchronize. This reduced the opportunity to learn from each other. As a result, people were often stuck on things that somebody else knew how to do much faster.
We noticed that and decided to change the whole arrangement. Now, there is nothing allocated up front. We simply take the tasks from the common backlog when we are free, and decide on the fly how best to work on them. Often, multiple people work on a single task, either by pair programming or by dividing up the work.
When we introduced that change, something interesting happened. Suddenly we were able to finish things much faster. An iteration planned for ten working days was finished in six. The next iteration was also finished ahead of schedule. It seems that we were able to unlock some potential for collaboration that we didn’t know existed.
Our development practices and culture are very much a work in progress (the way they should be!). As an example, we recently updated the onboarding procedure for new engineers: we decided that a new engineer should be able to make a commit and deploy it on their first day. We’re also thinking of starting a more formal mentorship program.
Looking at our practices, I see an underlying theme: most of them are designed to encourage communication and give us more opportunities to learn from each other. As long as we can ensure that, development speed and code quality will follow.