A revision control system, for those with even less programming experience than myself, manages “changes to documents, programs, and other information stored as computer files.” The most advanced ones are used by teams of programmers who simultaneously edit the same code. Simpler revision control is built into things like wikis and word processors.
I’m wondering whether a revision control system would be helpful for me now, or in the future, even if all I’m doing is statistics.
I’m working with a big dataset (ok Scott, not that big) and I’ve written a fair bit of code. Nothing too complicated: it’s half data preparation and half analysis and graphics. Every so often I save my code under a new name; that way, if I accidentally save bad changes, I can always revert to a previous state. I do the same thing with the dataset itself and, in R, with my workspace. In fact, I have an extra reason to do this with the data and my R workspace: memory management. R often complains that it’s running out of memory, so I respond by deleting variables that I probably won’t need or could recreate without too much trouble.
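For what it’s worth, the manual save-under-a-new-name workflow maps almost one-to-one onto what a tool like git does. Here is a minimal sketch, not a recommendation of any particular tool; the directory, file names, and commit messages are all made up for illustration:

```shell
# A sketch of replacing "save under a new name" with git.
# All paths and file names here are hypothetical.
rm -rf /tmp/stats-demo && mkdir -p /tmp/stats-demo && cd /tmp/stats-demo
git init -q
git config user.email "you@example.com"   # required before committing in a fresh repo
git config user.name "You"

# First saved state of the analysis script
echo 'x <- rnorm(100)   # data preparation' > analysis.R
git add analysis.R
git commit -q -m "data preparation"

# A later change, committed as a second state
echo 'hist(x)           # graphics' >> analysis.R
git commit -q -am "add histogram"

# Instead of hunting through analysis-v1.R, analysis-v2.R, ...,
# restore the previous version of the file directly:
git checkout HEAD~1 -- analysis.R
```

The point of the sketch is that every “save under a new name” becomes a commit, and reverting is a single command rather than a guess about which old copy was the good one.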
Sometimes it’s annoying just to find code that I’ve already written, because there is so much text to go through. And I can only organize the file one way: e.g., I could put all the code that makes graphs together, but then the graphing code wouldn’t sit next to the code that creates the data the graphs are based on.
Is a revision control system overkill for what I’m doing? Any other thoughts?