October 08, 2003
Repository Structures and CI
Everyone has their own ideas on how the world should be organized. Today I am going to share mine with, well, with the world of course.
How to organize a repository is very important to teams practicing Continuous Integration (CI). Granted, it is important to all repository users -- but I think certain themes have particular importance if your team is practicing CI, and are less noticeable (if not less important) if it is not. CI teams use the repository differently than other teams: things change frequently, and everyone is encouraged to get the "entire" tree when they work (remember that word "entire"). That means certain usage patterns can cause you unnecessary pain if you try to apply them to a CI team's repository.
One thing I know for sure is that on every project I've done using CI, getting the repository to play nice with the daily development cycle has initially been a challenge, and once solved, provided a huge boost in productivity.
As an aside (or perhaps a further aside; we were talking about the world a minute ago), here's the basic daily cycle for individuals on a team practicing CI (and presumably Test Driven Development). Starting at the beginning of the day and going to the end, it looks something like this:
First thing: check to ensure that the last build was successful. Broken code is useless to a team practicing CI. Getting that build machine passing (usually this means compiling and testing properly for CI-practicing teams) comes first.
OK, build's working? Next, grab the latest version of the entire code base. (It's the definition of "entire" that this blog post is really about. I promise I will get back to it.)
Run the build locally to make sure all is well.
Do the TDD test/code/refactor cycle.
Periodically sync up. To do that, you essentially repeat the pattern: get latest, run the build locally with your changes and, if all's well, check in. If you are using a build tool like CruiseControl or CC.NET, make sure the build passes.
At the end of the day you should have done this larger cycle several times. Finish up with one last check-out/check-in/watch build pass cycle, and head home happy with the knowledge that you made progress today.
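To make the sync-up step concrete, here is a minimal sketch in Python. The four callables are hypothetical stand-ins for whatever your version control and build tools provide; none of the names come from a real tool, and the guard on the last team build is my reading of the "first thing" rule above.

```python
from typing import Callable


def sync_cycle(
    last_build_passed: Callable[[], bool],
    get_latest: Callable[[], None],
    run_local_build: Callable[[], bool],
    check_in: Callable[[], None],
) -> str:
    """Run one get-latest / build / check-in pass and report what happened."""
    # A broken team build comes first; do not pile more changes on top of it.
    if not last_build_passed():
        return "fix the team build first"
    # Pull everyone else's changes into the working copy.
    get_latest()
    # Build and test locally with your own changes merged in.
    if not run_local_build():
        return "local build failed, nothing checked in"
    check_in()
    return "checked in, now watch the team build"


if __name__ == "__main__":
    steps = []
    print(sync_cycle(
        last_build_passed=lambda: True,
        get_latest=lambda: steps.append("get latest"),
        run_local_build=lambda: True,
        check_in=lambda: steps.append("check in"),
    ))
```

The point of the sketch is the ordering: nothing gets checked in unless the team build was already green and the local build passed with the latest code.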
OK, but there's that small bit about the entire code base. Getting this right can be tricky.
Josh MacKenzie, one of my fellow ThoughtWorkers, said it best: "When I am on a project I want the entire world in the repository, so that there is no question of where I need to go to get everything." And that's how I feel too -- but just exactly how the world gets put into the repository gets us into trouble sometimes.
(See, I told you this was about my vision of world order.)
Documentation seems to be the big culprit here. Storing docs in the repository makes sense, but having lots of docs in the same tree as the code makes for hefty check-outs, difficulty in cleaning (try checking out several MB of Word documents every time you do a clean build) and false positives on when to rebuild (a document check-in can kick off a build even though no code changed).
So, the principle that we seem to veer toward is: "Separate the things needed to build the code, from the things that are not needed." Simple, obvious -- too obvious for a long-winded blog entry. But perhaps a bit too simple.
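As a rough illustration of that principle, here is a small Python sketch of how a build trigger might ignore changes outside the build tree. The folder name `source` and the example paths are assumptions made up for this sketch; a real CI tool would normally be configured to watch a path rather than run code like this.

```python
from pathlib import PurePosixPath

# Hypothetical top-level folder holding everything the build needs.
BUILD_ROOT = "source"


def needs_rebuild(changed_paths):
    """Return True if any changed path lives under the build tree."""
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        if parts and parts[0] == BUILD_ROOT:
            return True
    return False


if __name__ == "__main__":
    # A docs-only check-in should not trigger a build.
    print(needs_rebuild(["docs/schedule.doc"]))
    # A check-in touching code should.
    print(needs_rebuild(["docs/schedule.doc", "source/app/main.c"]))
```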
On my latest project, one more refinement to "entire" has presented itself. One of the things needed to build the code base is the set of development tools. Things like CruiseControl and Ant, or an xUnit tool like NUnit, and many other development tools, are often needed to build the code, often must be versioned along with the code, and thus often end up in the repository.
But (and finally, I arrive at my point) tools don't need to be included in that definition of "entire." These meta-build artifacts change relatively infrequently, and so can be given their own home in the repository.
Phew, glad I got that off my chest. OK, here's my current notion of how the world should be organized:
- Source and dependent third-party binaries go in a tree or sub-tree.
- At the root of this tree goes the master build script, solution file, makefile, whatever.
- This tree is the basis of builds (and what gets deleted for clean builds).
- Tools go in their own parallel tree that is visible to the build scripts, but not under that root tree.
- Tools get checked-out and integrated when they are changed -- but separately from the code itself.
- Everything else (documents, schedules, etc.) gets put into its own parallel (or higher) tree.
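Here is a minimal Python sketch of that layout and of what a clean build deletes. The folder names `source`, `tools` and `docs`, and the `build.xml` placeholder, are assumptions chosen for illustration only; the sketch runs against a temporary directory rather than a real working copy.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical names for the three parallel trees.
SOURCE_TREE = "source"  # code, third-party binaries, master build script
TOOLS_TREE = "tools"    # build and test tools, updated separately
DOCS_TREE = "docs"      # documents, schedules, everything else


def make_layout(root: Path) -> None:
    """Create the three parallel trees under root."""
    for name in (SOURCE_TREE, TOOLS_TREE, DOCS_TREE):
        (root / name).mkdir(parents=True, exist_ok=True)
    # The master build script sits at the root of the source tree.
    (root / SOURCE_TREE / "build.xml").write_text("<project/>\n")


def clean_source_tree(root: Path) -> None:
    """Delete only the source tree, ready for a fresh check-out."""
    shutil.rmtree(root / SOURCE_TREE, ignore_errors=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        make_layout(root)
        clean_source_tree(root)
        # Tools and docs survive the clean.
        print(sorted(p.name for p in root.iterdir()))  # prints ['docs', 'tools']
```

Because the tools and documents live outside the tree that gets deleted, a clean build never has to fetch them again.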
At least those are my current thoughts on the subject of world order -- and see, I didn't mention George W. Bush (or his father) once. Err, well, OK, twice then.
Posted by wcaputo at October 8, 2003 05:34 PM
Hi Bill,
Interesting entry. Just for clarity, do you mean something like this?
ProjectRoot
---|
---CodeAnd3rdPartyBinaries
Posted by: Dađi at October 14, 2003 04:30 PM
Yes. Then the project's CI build is only concerned with the "CodeAndThirdPartyBinaries" tree.
Incidentally, the tools tree can have its own integration cycle (IOW manage a CI process for the tools tree independently) -- or even (I haven't tried this) a higher-level build process that watches for changes on Tools and indicates to the CodeAnd' tree's build process to update the build machine's tools.
But the simplest thing is simply to update tools when they change (presumably they change less often; if they were changing as often as the main code base, I would probably keep them together).
Posted by: Bill at October 16, 2003 08:25 AM