Saturday, July 11, 2015

Continuous delivery culture. Why do we do the things we do the way we do them?

Usually there is first a problem to be solved. A solution is conjured up and implemented. After a while, the solution is re-used, and then re-used again. It changes depending on the person implementing it and his or her background, ideas, motives, likes and dislikes. People start implementing the solution because other people do it or because someone orders them to. The solution becomes part of a culture. This can go so far that the solution causes an increasing number of side effects: new problems which in turn require new solutions.


In software development, solutions are often methods and/or pieces of software which change rapidly. This is especially true for the area of continuous delivery, which is relatively young and still very much in development. Continuous delivery tools and methods are meant to increase software quality and to make software development, testing and deployment easier. Are your continuous delivery efforts actually increasing your software quality and decreasing your time to market, or have they lost their momentum and become a bother?

Sometimes it is a good idea to look at the tools you are using or are planning to use and think about what they contribute. Is using them intuitive and do they help avoid errors and misunderstandings? Do you spend more time on merging changes and solving deployment issues than on actually creating new functionality? Maybe that is the time to think about how you can improve things.

In this article I will look at the current usage of version control and artifact repositories. I will not go to the level of specific products. For each, I will describe some common challenges and give some suggestions on how you can deal with them. The purpose is to help the reader not take continuous delivery culture for granted, but to think about the why before and during the what.

Version Control

A purpose of software version control is to track changes in software versions. Who made which change in which version of the software? In version control you can trace back what is in a certain version of the software. A release can be installed on an environment, and thus version control indirectly allows you to trace back which code is installed (which comes in handy when something goes wrong).

When using version control, you should ask yourself: can I still, without a doubt, identify a (complete) version of the software? Do I still know who made which change in which version? If someone says a certain version is installed in a certain environment, can I, without a doubt, identify the code which was installed from my version control system?

Branching and merging; dangerous if not done right

Most software development projects I've seen have implemented a branching and merging strategy. People want to work on their own independent code-base, in their own isolated sandbox, and not be bothered by changes other people make (and the other way around). The idea is that when a change is completed (and conforms to certain agreements, such as quality and testing), it is merged back to the originating branch. After merging has been completed, the branch usually ceases to have a function.

Projects and code modularity

Sometimes you see the following happen, which can be quite costly and annoying. Project A and Project B partially share the same code (common components) and each also has its own separate, non-overlapping code. One of the projects creates a version control branch in order to have a stable base to work with and an independent life-cycle, and to not be bothered by development done by the other project. Both projects go their own way, both also editing the common components (which are now living in two places). At a certain moment they realize they need to come back together again, for example due to environment constraints (a single acceptance environment) or because Project A has something useful which Project B also wants. The branches have to be merged again. This can be a problem: are all the changes Project A and Project B have made to the common components compatible with each other? After merging is complete (this could take a while), an entire regression test has to be performed for both projects if you want to ensure the merged code still works as expected for both. In my experience, this can be painful, especially if automated regression testing is not in place.

Lots of copies

Branching and keeping the branch alive for a prolonged time is against the continuous delivery principle of integrating early and often.

The problem started with the creation of the branch and the separate development by the different projects. A branch is essentially an efficient copy of the code. Having multiple copies of the same code is not the way we were taught to develop: Don't Repeat Yourself (DRY), Duplication is Evil (DIE), Once and Only Once (OAOO), Single Point of Truth (SPoT), Single Source Of Truth (SSOT).

Remember Agent Smith from The Matrix? Copies are not good!

Increase development time

When developing new features, the so-called 'feature branch' is often used. This can be a nice way to isolate development of a specific piece of software. However, at a certain moment the feature has to be merged with the other code, which in the meantime might have changed a lot. Essentially, the feature has to be rebuilt on another branch. This is especially so when the technology used is not easy to merge. In some cases this can dramatically increase the development time of a feature.

Danger of regression

When bug fixes are created and there are feature branches and several release branches, is it still clear where a certain fix should go? Is your branching strategy making things easy for you, or are you introducing extra complexity and more work? If you do not update both the branch used for the current release and the branches used for future releases, your fix might get lost somewhere and the bug might resurface at a later time.

A similar issue arises with release branches on which different teams develop. Team A works on release 1.0, which is in production. Team B works on release 2.0, which is still in development. Are all the fixes Team A makes also applied (when relevant) to release 2.0? Is this checked and enforced?
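The check asked for above can be sketched in a few lines of Python. This is only a minimal illustration, assuming every fix carries an issue id and that you can obtain the list of fixes per release branch from your own tooling. The `Fix` class, the issue ids and the descriptions are all hypothetical. The sketch only lists candidates; whether a fix is actually relevant for the newer release still needs a human decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    """A bug fix, identified by a (hypothetical) issue-tracker id."""
    issue_id: str
    description: str


def missing_fixes(older_release, newer_release):
    """Return the fixes found in the older release but not in the newer one."""
    applied = {fix.issue_id for fix in newer_release}
    return [fix for fix in older_release if fix.issue_id not in applied]


# Hypothetical data: the fixes recorded on each release branch.
release_1_0 = [
    Fix("BUG-101", "Null check in order import"),
    Fix("BUG-102", "Timeout on invoice export"),
]
release_2_0 = [
    Fix("BUG-101", "Null check in order import"),
]

for fix in missing_fixes(release_1_0, release_2_0):
    print(f"{fix.issue_id} is not yet applied to release 2.0: {fix.description}")
```

Running something like this as part of the build for release 2.0 is one way to turn "is this checked?" into "yes, on every build".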

Solutions

In order to counter such issues, there are several possible and quite obvious solutions. Try to keep the number of separate branches small to avoid misunderstandings and reduce merge effort. Merge back (part of) the changes made on the branch regularly (integrate early and often) and check if they still function as expected. Do not forget to allow unique identification of a version of the software. Introduce a separate life-cycle for the shared components (think about project modularity) and project specific components. This way branching might not even be needed.


Artifact repository

An artifact repository is used for storing artifacts. An artifact has a certain version number, which can usually be traced back to a version control system. An artifact repository uniquely identifies an artifact of a specific version. Usually deployable units are stored. An artifact stored in a repository usually has a certain status, which for example allows you to distinguish released artifacts from 'work-in-progress' or snapshot artifacts. An artifact repository is also often used as a means to transfer responsibility for the artifact from one group to another. For example, when development is done, the artifact is put in the artifact repository for operations to deploy.

When working with an artifact repository, you should consider the following (among other things). If someone says an artifact with a specific version is deployed, can I still say I know exactly what was deployed from the artifact repository, even for example after a year? Once a version is created and released, is it immutable in the artifact repository? If I have deployed a certain artifact, can I at a later time repeat the procedure and get exactly the same result?
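One simple way to answer the immutability question is to record a checksum at release time and compare it again before every deployment. The sketch below uses SHA-256 from the Python standard library. The function names and the artifact content are made up for illustration, and where you store the recorded checksum (release notes, a database, the repository itself) is up to you.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of the artifact content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(artifact_bytes: bytes, recorded_checksum: str) -> bool:
    """Check that an artifact still matches the checksum recorded at release."""
    return sha256_of(artifact_bytes) == recorded_checksum


# Hypothetical artifact content; in practice you would read the file
# fetched from the artifact repository.
released = b"contents of my-app-1.0.zip"
recorded = sha256_of(released)  # stored somewhere safe at release time

# A year later: fetch the artifact again and compare before deploying.
print(is_unchanged(released, recorded))             # same bytes, so True
print(is_unchanged(released + b"patch", recorded))  # modified, so False
```

If the comparison fails, the artifact in the repository is no longer the artifact that was released, and you can no longer say with certainty what was deployed.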

Artifact repository as a means of communication

An artifact repository can be used to transfer an artifact from development to operations. Sometimes the artifact in the repository is not complete. For example, environment-dependent properties are added by operations. Some of the placeholders in the artifact are replaced, and several artifacts are combined and reordered to make deployment easier. Over time the deployment tooling changes or a property file is added. Do I still know a year later exactly what was deployed, or have the deployment steps performed after the artifact is fetched from the repository modified the original artifact in such a way that it is not recognizable anymore?

Changes in deployment software

Suppose the deployment software has been enhanced with several cool new features. For example, it now supports deploying to clustered environments, and new property files make deployment more flexible, for example by allowing you to specify which database the database code should be deployed to. However, I can't deploy my old artifacts anymore because the artifact structure and the added property files are different. Now there is a problem.

Solutions

Carefully think about the granularity of your artifacts. With small artifacts it might be more difficult to keep track of dependencies, but you gain flexibility in your deployment software and better traceability from artifact to deployment. With large artifacts, some extra actions (custom scripts) might be required to allow deployment of your custom deployment unit. You will also get more artifact versions, since every code change leads to a new version and a larger artifact contains more code that can change.

Carefully think about how you link your deployment to your artifact and how to deal with changes in the deployment software. You can add a dependency to the version of the deployment software to your artifacts or make your deployment software backwards compatible. You can also accept that after you change your deployment software, you cannot deploy old artifacts anymore. This might not be a problem if the new style artifacts are already installed in the production environment and the old style artifacts will never be altered or deployed anymore. You can also create new versions of the different artifacts in the new structure or update as you go.
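The first option, adding a dependency on the version of the deployment software to your artifacts, could look something like the sketch below. The manifest keys `min_deploy_tool` and `max_deploy_tool` are invented for this example, and the version handling is deliberately simple (dot-separated numbers only). It is meant to show the idea, not to prescribe a format.

```python
def parse_version(text):
    """Turn a version such as '1.4.2' into a comparable tuple (1, 4, 2)."""
    return tuple(int(part) for part in text.split("."))


def can_deploy(artifact_manifest, deploy_tool_version):
    """Check whether the deployment tool version is within the range
    the artifact declares support for (both bounds inclusive)."""
    lowest = parse_version(artifact_manifest["min_deploy_tool"])
    highest = parse_version(artifact_manifest["max_deploy_tool"])
    return lowest <= parse_version(deploy_tool_version) <= highest


# Hypothetical manifest shipped inside an old-style artifact.
old_artifact = {
    "name": "my-app",
    "version": "1.0",
    "min_deploy_tool": "1.0",
    "max_deploy_tool": "1.9",
}

print(can_deploy(old_artifact, "1.4.2"))  # within the declared range
print(can_deploy(old_artifact, "2.0"))    # newer tool, old artifact structure
```

With a check like this the deployment fails early and with a clear reason, instead of halfway through because a property file is missing. It also tells you which version of the deployment software you need to keep around for old artifacts.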

Conclusion

Implementing continuous delivery can be a pain in the ass! It requires a lot of thought about responsibilities and implementation methods (not even talking about the implementation itself). It is easy to just do what everyone else does and what smart people say you should do, but it has never hurt to think for yourself and to understand what you are doing and why. It is also important to realize what the limitations of the methods and tools are in order to make sound judgments about them. Try to keep it easy to use and make sure it adds value.