Tuesday, 26 August 2014

I have a dream

I have a dream

Where MapGuide and FDO source code are hosted and/or mirrored on GitHub.

Where by the virtue of being hosted on GitHub, these repositories are set up to take advantage of every free service available to improve our code quality and developer workflow:
  • TravisCI for Continuous Integration of Linux builds
  • AppVeyor for Continuous Integration of Windows builds
  • CoverityScan for static code analysis
  • Coveralls for code coverage analysis
  • What other awesome services can we hook on here? Please enlighten me. I'd really want to know.
Where a single commit (from svn or git) can start an avalanche of cloud-based services that will immediately tell me in several hours time (because C++ code builds so fast doesn't it?):
  1. If the build is OK (thanks to Travis and AppVeyor)
  2. Where we should look to improve our test coverage (thanks to coveralls)
  3. Areas in our codebase where we should look to change/tweak/refactor (thanks to coverity scan)
  4. Other useful reports and side-effects.
Now the difference between dream and reality is that there are clear obstacles preventing our dream from being realised. Here's some that I've identified.

1. Git presents a radically different workflow than Subversion

Yes, we're still using subversion for MapGuide and FDO (har! har! Welcome to two-thousand-and-late!). Moving to GitHub means not only a change of tools, but a change of developer workflows and mindset.

So in this respect, rather than a full migration, an automated process of mirroring svn trunk/branches (and any commits made to them) to GitHub would be a more practical solution. Any pointers on how to make this an idiot-proof process?

2. Coverage/support is not universal

MapGuide/FDO are multi-platform codebases. Although TravisCI can cover the Ubuntu side and AppVeyor can cover the Windows side, it does leave out CentOS. I've known enough from experience (or plain ignorance) that CentOS and Ubuntu builds need their own separate build/test/validate cycles.

And actually, Travis VMs being 64-bit Ubuntu Linux doesn't help us either. Our ability to leverage Travis would hinge on whether we can either get 64-bit builds working on Linux or am able to cross-compile and run 32-bit MapGuide on 64-bit Ubuntu, something that has not been tried before.

Also most service hooks (like coveralls and CoverityScan) target Travis and not AppVeyor, meaning whatever reports we get back about code quality and test coverage may have a Linux-biased point of view attached to them.

3. The MapGuide and FDO repositories are HUGE!

The repositories of MapGuide and FDO not only contain the source code of MapGuide and FDO respectively, but the source code of every external thirdparty library and component that MapGuide/FDO depends on, and there's a lot of third-party libraries we depend on.

If we transfer/mirror the current svn repositories to GitHub as-is, I'm guessing we'd probably be getting some nice friendly emails from GitHub about why our repos are so big in no time.

Also would Travis and AppVeyor let us get away with such giant clones/checkouts happening every time a build is triggered in response to a commit? I probably don't think so. Then again, I do live in a country where bandwidth doesn't grow on trees and our current government has destroyed our dreams of faster internet. What do I know?

So what do you think? Is this dream something worth pursuing?


Johan Van de Wauw said...


I actually may be able to help you out with a lot of things.
1) It is quite easy to mirror svn in to git. I'm doing so here for saga gis: https://github.com/johanvdw/SAGA-GIS-git-mirror
In my case I use svn2git on a copy of the svn sources (which I get through rsync from sourceforge). The rsync step is not really necessary but useful the first time you run svn2git.

2) You can speed up building of C++ a lot by using ccache. I'm sure that if you remove most of the third party dependencies (which should not be in the repo, next point) you can get a build in less than 5 minutes.

3) You should just remove some of the third-party libraries from your repository. No need to build them again each time, security updates are applied, ...
Moreover it forces you to make changes in the third-party library in the library itself (which is what a good open source citizen should do, see eg changes to agg-2.4). You also don't need all build tools for those libraries which are properly packaged.
You could still provide scripts to download the right version of these libraries (and keep a mirrored version). But I don't see the point of adding them. Especially the common ones such as php, apache, geos

Daniel O'Connor said...

You can also look at things like http://www.cloudbees.com/dev if you have a complicated, multiple step pipeline. IE: the deps could be built, turned into artifacts and when those change, trigger new builds/be external.