Summary
This article explains why version control matters for research software and how to apply it.
Introduction
Version control is an important aspect of research software organisation and development. It is essential for tracking and managing changes to a file, a set of files, or scripts over time.
Why is version control important?
Version control is especially important when you work on a collaborative project with multiple contributors and need to track the working histories, review changes, and revert or return to earlier versions. Recording all changes in your research software is a key practice for reproducible research.
How to apply version control to research software?
Today, most platforms that host files or software provide some form of version control. For instance, the autosave feature in Microsoft Office lets you access version histories through cloud storage. Dedicated version control systems, such as Git, Mercurial and Subversion (SVN), provide much more powerful tools that can support version control of complex research projects.
A typical procedure for using version control is as follows:
- Create files, including text, code, or data.
- Work on these files by modifying, deleting or adding content.
- Create a snapshot of the file status (known as a version) at the time.
- Document what was changed in the version history.
For text and other types of documents, the snapshot is often made manually by saving a copy of the file with a suffix such as v01, v02 and so on. Descriptions of the changes made in each version are sometimes stored in a separate document, such as a spreadsheet, but these can easily be lost if they are not managed carefully.
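The manual procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `save_snapshot`, the file names `analysis.py` and `changes.csv`, and the log columns (snapshot name, date, description) are hypothetical choices, not part of any WUR tool.

```python
import csv
import re
import shutil
import tempfile
from datetime import date
from pathlib import Path


def save_snapshot(working_file: Path, log_file: Path, description: str) -> Path:
    """Copy working_file to the next free _vNN name and log what changed.

    Hypothetical helper: it mimics the manual v01, v02, ... naming habit.
    """
    # Find the version numbers that already exist next to the working file.
    pattern = re.compile(rf"^{re.escape(working_file.stem)}_v(\d+)$")
    candidates = working_file.parent.glob(
        f"{working_file.stem}_v*{working_file.suffix}"
    )
    existing = [
        int(match.group(1))
        for path in candidates
        if (match := pattern.match(path.stem))
    ]
    next_number = max(existing, default=0) + 1

    # Create the snapshot as a copy with a zero-padded version suffix.
    snapshot = working_file.with_name(
        f"{working_file.stem}_v{next_number:02d}{working_file.suffix}"
    )
    shutil.copy2(working_file, snapshot)

    # Document the change in a separate log file (assumed CSV layout).
    with log_file.open("a", newline="") as handle:
        csv.writer(handle).writerow(
            [snapshot.name, date.today().isoformat(), description]
        )
    return snapshot


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "analysis.py"
        log = Path(tmp) / "changes.csv"

        script.write_text("threshold = 0.5\n")
        save_snapshot(script, log, "Initial version of the analysis")

        script.write_text("threshold = 0.7\n")
        save_snapshot(script, log, "Raised threshold")

        print(log.read_text())
```

Even in this small sketch, the history lives in two places (the copied files and the log) that nothing keeps in sync, which is the weakness a dedicated version control system removes.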
In Git, this process of creating a snapshot and recording a description of the changes is called a “commit”. The system then saves the version history automatically, together with rich metadata such as the author and the date of each commit. You can easily find and access a specific version of a file or revert your entire project to a previous state. For more information, see The Turing Way handbook on reproducible data science.
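As a minimal sketch of the commit workflow, the Python script below drives the Git command line through `subprocess`. It assumes Git is installed and available on your PATH. The helper names `git` and `commit_file`, the file name `analysis.py`, and the example author details are hypothetical; in day-to-day work you would normally type the same `git` commands directly in a terminal.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run one git command inside repo and return its output, stripped."""
    result = subprocess.run(
        [
            "git",
            # Example author details so the sketch runs without global config.
            "-c", "user.name=Example Researcher",
            "-c", "user.email=researcher@example.org",
            *args,
        ],
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def commit_file(repo: Path, filename: str, content: str, message: str) -> str:
    """Write a file, commit it with a message, and return the commit hash."""
    (repo / filename).write_text(content)
    git(repo, "add", filename)            # stage the change
    git(repo, "commit", "-m", message)    # snapshot plus description
    return git(repo, "rev-parse", "HEAD")


if __name__ == "__main__":
    if shutil.which("git") is None:
        raise SystemExit("Git is not installed, so this sketch cannot run.")

    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init")

        first = commit_file(repo, "analysis.py", "x = 1\n", "Add analysis script")
        commit_file(repo, "analysis.py", "x = 2\n", "Change x to 2")

        # The version history, newest commit first.
        print(git(repo, "log", "--format=%h %s"))

        # The file exactly as it was in the first commit.
        print(git(repo, "show", f"{first}:analysis.py"))
```

Here `git log` lists the recorded versions and `git show <commit>:<file>` retrieves a file as it was at a given commit, without changing the current working copy.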
Manage your code with Git@WUR
At WUR you can securely work on code, scripts and software with the version control system Git at Git@WUR, a GitLab instance. Unlike the web versions of GitLab and GitHub, Git@WUR is hosted locally at WUR, so data is stored on WUR premises. On Git@WUR you can manage and resolve issues, document code, and share code privately, WUR-wide, or publicly. You are free to use the web versions of GitLab and GitHub as well, but WUR does not offer repository, account, storage, or access management, or any other support for those web versions.
Features of Git@WUR are:
- Security: code is stored in secure data centres on WUR premises
- Control: you decide who can see the source code by setting your project repository to private (only you and the people you invite), internal (visible to WUR employees), or public (visible to anyone).
- Visibility: if you publish a research article based on source code in Git@WUR, it is possible to apply a persistent identifier to the source code and link it to your research article. This way, both your publication and the source code become more visible.
- Collaboration: you can collaborate on your work with others inside and outside WUR. WUR employees can be freely added to any project repository. If you want to add people who are not WUR employees, please request an external account by creating a ticket. Include the following information: the person's email address, full name, organisation and the reason for the request.
Support
Do you have any questions? Feel free to contact us at data@wur.nl.
Curious to find out what else WUR Library can offer you?
Visit the Library's website to access the Library’s databases and get a full overview of the Library’s services, tools, and support. You can contact a librarian anytime through the chat box on our website. We’re happy to help you!