Version control

I’ve commented before that there are a lot of skills our science graduates need to have that don’t get explicitly taught at university. That’s because they don’t fit neatly into compartmentalized degree courses where the structure is dictated by technical knowledge. So things such as how to give a half-decent presentation, how to keep accurate and useful records of your experiments, or how to put together a paragraph of coherent scientific writing often don’t get covered at all.

Another skill to add to the list is how to keep control of your computer programmes. I’m not talking about learning a programming language or two, which most physics students will probably do in their studies. Rather, I mean how to organize computer software at the top level. I was talking to one of our PhD students this morning, and she was asking for help in keeping her computer programmes under control. Specifically, she has a set of programmes that she has written, and they were working yesterday, but today they’re not. What she has done is change a few things in the code, to try something else out, and then tried to change them back to how they were. However, she hasn’t changed them back to exactly what she had previously, and now she can’t remember what they were supposed to be.

That’s probably a problem that most early programmers have experienced. She’s been taught how to write the language, but not how to manage the programmes. She’s swimming in a sea of different versions of the same programme. She has problems with version control. 

When working on a computer programme with a team of people, as I did in my previous job, version control has to be done properly. Otherwise one ends up with a mess of different versions, and no-one knows exactly whose version does what and to what extent it has been verified and validated. To avoid this, there’s a code custodian, who looks after the latest, approved version. Other people can work on copies of the code, to try out modifications and improvements, but when their modifications are tested and approved it’s the custodian who formally amends the approved code and updates the version number. And a written record is kept of exactly what the changes are (even ‘trivial’ ones) and why they’ve been made. The result: only a single programme is ever ‘the current’ programme, it should ‘work’, and there will be a formal record of all the tests carried out on it to give confidence that it really does work.
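For anyone who likes to see an idea as code, here is a minimal sketch of the custodian’s record-keeping in Python. It is an illustration only, not the system we used: the names `Change`, `ChangeLog`, `approve` and `matches_current` are all made up for this example, and the ‘source code’ is a one-line string standing in for a real programme. Each approved change gets the next version number, a note of what was changed and why, and a checksum of the approved source, so you can later check whether the code in front of you really is the current version.

```python
from dataclasses import dataclass, field
from datetime import date
import hashlib


@dataclass
class Change:
    """One approved change (hypothetical record layout)."""
    version: int
    day: str
    what: str
    why: str
    checksum: str  # fingerprint of the approved source at this version


@dataclass
class ChangeLog:
    """The custodian's written record: one entry per approved version."""
    entries: list = field(default_factory=list)

    def approve(self, source: str, what: str, why: str) -> Change:
        """Record a tested change and assign the next version number."""
        change = Change(
            version=len(self.entries) + 1,
            day=date.today().isoformat(),
            what=what,
            why=why,
            checksum=hashlib.sha256(source.encode("utf-8")).hexdigest(),
        )
        self.entries.append(change)
        return change

    def matches_current(self, source: str) -> bool:
        """True only if `source` is identical to the latest approved version."""
        if not self.entries:
            return False
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        return digest == self.entries[-1].checksum


# Example: two approved versions of a (pretend) one-line programme.
log = ChangeLog()
log.approve("step = 0.10\n", what="Initial version", why="Baseline that passes the tests")
log.approve("step = 0.05\n", what="Halved the step size", why="Improve convergence")

print(log.entries[-1].version)                 # latest version number
print(log.matches_current("step = 0.05\n"))    # identical to the approved code
print(log.matches_current("step = 0.050\n"))   # 'changed back', but not exactly
```

The last line is the situation our student found herself in: the code looks as though it has been put back, but it isn’t exactly what was approved. Dedicated version control tools such as Git do this job far more thoroughly, and also keep the old versions themselves so that you can return to them.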

However, when you’re writing a programme for your own use and probably your use only (as many PhD students will), it’s tempting to think that you can keep track of where you are with it in your head. You don’t need formal version numbers, formal verification tests, and so forth. Unfortunately, most of these students will eventually discover that it is, in fact, a really good idea to do things properly. While we think we’ll remember why we changed a few lines here and there, the reality is that in three months’ time, we don’t. I have trouble just remembering my passwords after coming back from holiday; remembering minor code changes is nigh on impossible. Looking after your codes in a formal way saves you a lot of that stress.

Unfortunately, the chances are that the students are left to find this out for themselves.

I’ll be on holiday for a couple of weeks enjoying the last of a wonderful northern hemisphere summer, and forgetting my passwords and what I did to my computer codes this week. Back in time for a Waikato spring. 
