CS340:Source Control

Source Control, or Version Control (a more general term for the same thing, acknowledging that more than just source code may be versioned), is a must for any realistic software system. As the number of developers and the number of source files belonging to a project increase, it quickly becomes impossible to maintain a consistent baseline through simple copy-and-paste file manipulation. Multiple edited versions of a file become common, and manually merging changes submitted by multiple developers becomes an enormous headache. The solution to these problems? Let the computer do what it's good at and manage the information flow for you.

An Overview of Source Control

The basic model underpinning the vast majority of version control systems is that of a library. All of the files associated with a project are collected into a central location, referred to as the repository. When a developer wants to interact with a file in the repository, he or she checks out the file, much as one might check out a book from the library. Advanced version control systems will, like a library, remember who checked out which files and to what location, so that if a developer goes missing in action, the file can be tracked down. When the developer is finished modifying the file, he or she checks in or commits the file back to the repository, at which point the changes are available to other developers working on the project.

Unlike a library, however, many version control systems allow multiple, concurrent checkouts. Developers Alice and Bob can both check out a file and modify it as each needs; the checkout need not be exclusive. When Alice checks in her file (if she is faster than Bob), the system commits those changes exactly as it would had the checkout been exclusive. When Bob checks in his changes, however, the system must diff the two versions and determine whether Bob changed some of the same code as Alice. If he did not, that is, if Alice's and Bob's changes were orthogonal, then the system automatically merges the two versions of the file. If the changes overlap, the system requires that Bob resolve the conflict manually and then resubmit his changes. For collaborative work, this feature of a version control system is the most compelling reason for its use.

Most unlike a library, however, is the very notion of versioning. The version control system most properly records versions of a file, rather than merely files themselves. That is to say that every revision of a file that has ever been submitted to the repository is cataloged and stored for later retrieval. Many version control systems even track the version of entire projects, so that at any point, an earlier snapshot of the system can be recovered and examined.

For programmers, this capability is incredibly powerful. How often, having made changes to a file only to discover several days later that the path you've been going down is leading to a dead end, have you wanted to roll back the clock and recover the version of the file before you started making those changes? Source control systems give you this ability. Even in non-collaborative projects, this time-machine-like feature of version control systems should encourage you to use them.

Source Control in CS340

Given the importance of employing source control in real projects of any significant size, it shouldn't be surprising that we are asking you to use it in CS340. There are many cool and useful things that source control can do for you, beyond the activities of checking code out, checking code in, and restoring and viewing earlier versions of a file, but those activities are beyond the scope of the class and hence will not be expected of you. The interested reader will find justification of and information about these other activities in the documentation of the source control system we have selected for your use, Subversion.

Our hope and our goal is for source control to feel as little like a hoop that you have to jump through and as much like a useful tool as possible. That being said, even if you don't find source control to be useful, we want you to use it. The pedagogical element is as important here as the practical: you will run into source control at some point in your career as a software developer; we would be remiss in our instruction if we did not present it to you now.

Let's quickly take a look at the technology that you'll be using, and then we'll jump into descriptions of the specific actions you'll need to perform.

Subversion

Subversion is an open-source version control system that is and should be replacing CVS as the de facto open-source version control standard. We are using subversion, rather than CVS, for a number of reasons; they are well summarized on the Subversion website.

The CS Systems Staff have installed Subversion on cs-tl1 and on the lava nodes for your use in CS340. An empty repository, with a basic directory structure to support branching and tagging, has been created and placed in each group's home directory. You have two supported choices for interacting with the version control system: TortoiseSVN and the command-line Subversion client. Windows users will find TortoiseSVN to be the most comfortable, most convenient mechanism for interacting with the repository; Linux, MacOS, and other *NIX users will need to use the command-line tools, as (sadly) no version of TortoiseSVN exists for non-Windows platforms. TortoiseSVN is installed in Olsson 001, so even if you do not use a Windows machine at home, you will want to familiarize yourself with its behavior so you're not lost when you get to the labs and need to interact with the repository.

TortoiseSVN

TortoiseSVN is a Windows shell extension that enables Windows users to interact with subversion repositories through contextual menus and custom icons that reflect the state of files vis-a-vis the repository. It is based on TortoiseCVS; if you've used the latter, you will find TortoiseSVN quite comfortable to use. The CS Systems Staff has installed TortoiseSVN on the machines in Olsson 001 for your use; if you wish to work on CS340 material at home, you should visit the TortoiseSVN website and download and install the software directly to your machine.

The screenshots above show TortoiseSVN in use.

svn

*NIX users will need to download and install svn, the command-line subversion client. The Subversion download page lists packages for most flavors of Linux and BSD, and mentions that MacOS X users can acquire binary versions of the software through fink. There is also a command-line version of the client for Windows, but TortoiseSVN is so good that you'll likely prefer using it.

Using Subversion in CS340

For the purposes of this project, you will not need to administer the source control repository. The CS Systems Staff has already created and configured the repository for your use. The very first thing you have to do is create a project baseline, and check it into the trunk of the project.

Establishing a Baseline

The baseline of a project is the set of files that are up to date and ready to be built and run. It encapsulates the notion of "always a working system." It is to the baseline that you will commit your daily changes to source files, after you have verified that the project will build (and performed some nominal testing to ensure that you've not reduced the functionality of your project by your changes), and it is from the baseline that you will check out files for modification.

Establishing the baseline is a task that you only have to perform once and is relatively easy to do. First, check out the project.

The next thing you will need to do is determine what the baseline should look like and what it should contain. You may want to consider keeping documentation about the system in the baseline, as well as information about the design. You may want to keep the specifications in the baseline. At a minimum, all of the code for the system you are building should be placed in the baseline. I suggest you create a directory called "baseline" and add to it all the elements of your system you want to form the baseline. Then, copy this directory to the location of your checkout. Follow the instructions under "Adding Files" to add the baseline to source control and then check in the project.

Congratulations: you've established a baseline for your project.
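
For those using the command-line client, the steps above might look roughly like the following sketch. It runs against a throwaway local repository rather than your group's repository on cs-tl1, and the scratch directory, the "project" working copy, and the two sample files are made up for illustration.

```shell
set -e
# Needs the svn and svnadmin command-line tools; stop quietly if they are missing.
command -v svnadmin >/dev/null 2>&1 || { echo "svn tools not found"; exit 0; }

work=$(mktemp -d)                          # scratch area (hypothetical layout)
svnadmin create "$work/repos"              # stand-in for the group repository
url="file://$work/repos"
svn mkdir -q -m "Create trunk" "$url/trunk"

# First, check out the (still empty) project.
svn checkout -q "$url/trunk" "$work/project"

# Assemble the baseline in its own directory (sample contents only).
mkdir -p "$work/baseline/src" "$work/baseline/docs"
echo 'int main(void) { return 0; }' > "$work/baseline/src/main.c"
echo 'Design notes go here.' > "$work/baseline/docs/design.txt"

# Copy it into the checkout, add it, and check in.
cp -R "$work/baseline" "$work/project/"
svn add -q "$work/project/baseline"
svn commit -q -m "Establish project baseline" "$work/project"

listing=$(svn list -R "$url/trunk")        # what the repository now holds
echo "$listing"
```

In the class setting you would skip the svnadmin and mkdir lines, since the repository and its trunk already exist, and check out from the svn+ssh URL instead.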

Checking Out Your Project

A checkout generally only needs to be performed once per machine; at the most, once per session. Once you have checked out your files, svn annotates the directories containing the checked-out files with a hidden directory to track the state of each file. Thus, to bring your working copy up to date, you merely need to update the working copy.

With TortoiseSVN

First, browse to the directory into which you want to check out your project. Once there, right click and select "SVN Checkout" from the menu:

Image:Tortoise Checkout step 1.png

You will be presented with a dialog box asking you to provide the URL of the repository and the location of your working copy:

Image:Tortoise Checkout step 2.png

You'll set it up as I have, except that where I entered "maa2q" you will enter your username and where I entered "group25" you will substitute your group number. If you entered (as I did) the name of a folder into which your files will go that does not yet exist, a dialog will ask you if you want the folder created. Click "Yes."

You will then be informed (if this is your first checkout) that the server key is not in the registry. This is similar to the complaint issued by ssh when you first contact a server. Accept the key, or check with Systems Staff if you are paranoid and want to ensure you're talking to the correct machine. Then you'll be asked to enter your password. For some reason that I've not been able to determine, you'll be asked to enter your password thrice. A bother, but it's a one-time deal.

TortoiseSVN then checks out the contents of your project, listing the files it pulls down, and telling you the revision number of the project.

With svn

From the shell prompt, type the following command:

svn co svn+ssh://name@cs-tl1/home/cs340/spring06/groupXX/repos/trunk project

where name is your UNIX username on the server, groupXX is your group number (e.g., group01), and project is the name of a directory on your local filesystem in which the project files will be placed (it should not exist). If you omit project, svn will create the directory trunk and place your files there.

You will be asked to provide your password, thrice for some reason I know not; once you have done so, svn will check out your files.

Updating Your Working Copy

Once you have established a working copy, you need to keep it up to date with the most recent changes made to the repository by other members of your group. You do this with the update command. Unlike with CVS, you should not use update to determine what changes you have made yourself to your working copy; svn provides the status command for that purpose.

When the update runs, you will be shown the state of all of the files in your working copy that have been changed, added, deleted, or renamed since the last time you ran an update. Any changes that were made to the repository will be retrieved and merged, if possible, into your working copy. If, however, you have modified a file that has been changed on the repository and those changes cannot be automatically merged, you will be notified that the changes to the file are in conflict and you will have to resolve the conflict.

With TortoiseSVN

Right-click either within the top level of your working copy (as in the image below) or on the top-level folder of your working copy, and select "SVN Update" from the menu:

Image:Tortoise Update step 1.png

You will have to enter your password. Once you have done so, you will be presented with a list of files as TortoiseSVN retrieves the changes from the repository. Of greatest interest are those files marked with a "C" — these are files that are in conflict. You will have to resolve the conflict manually, and then tell svn that the conflict has been resolved.

With svn

From the top-level directory of your working copy, issue:

svn update

You are presented with a list of files as svn retrieves the changes from the repository. Of greatest interest are those files marked with a "C" — these are files that are in conflict. You will have to resolve the conflict manually, and then tell svn that the conflict has been resolved.
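
To see the difference between status (what you have changed) and update (what others have changed), here is a small sketch on a throwaway local repository. The working-copy names "mine" and "teammate" and the file names are hypothetical.

```shell
set -e
# Needs the svn and svnadmin command-line tools; stop quietly if they are missing.
command -v svnadmin >/dev/null 2>&1 || { echo "svn tools not found"; exit 0; }

work=$(mktemp -d)                          # scratch area (hypothetical layout)
svnadmin create "$work/repos"
url="file://$work/repos"
svn mkdir -q -m "Create trunk" "$url/trunk"
svn checkout -q "$url/trunk" "$work/mine"
svn checkout -q "$url/trunk" "$work/teammate"

# A teammate commits a new file while I start one of my own.
echo 'from a teammate' > "$work/teammate/theirs.txt"
svn add -q "$work/teammate/theirs.txt"
svn commit -q -m "Add theirs.txt" "$work/teammate"
echo 'my own work' > "$work/mine/mine.txt"
svn add -q "$work/mine/mine.txt"

# status reports my local changes only; update fetches everyone else's.
local_changes=$(svn status "$work/mine")
update_report=$(svn update "$work/mine")
echo "$local_changes"
echo "$update_report"
```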

Adding Files

Note that when adding, deleting and renaming files, all you actually do is schedule an activity to be performed on the repository at a later time. Until you commit, the repository is not changed.

With TortoiseSVN

Right click on the file(s) or directory you wish to add, and select "Add..." from the "TortoiseSVN" submenu:

Image:Tortoise Add step 1.png

You will be presented with a dialog box that shows all of the files you have tried to add (listed recursively in the event of a directory add). Initially, all files are checked as to be added; I've deselected "badfile," deciding not to add it to the repository.

Image:Tortoise Add step 2.png

Select "OK." TortoiseSVN marks the files as being scheduled for addition. You'll notice that a blue plus icon appears on the files and folders added, letting you know they're scheduled for addition.

With svn

In the directory containing the files (or directories) you wish to add, type the following:

svn add files

where files is a list of files you wish to add. Unlike CVS, svn recursively adds directories.
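
The point that an add is only scheduled until you commit can be checked with a short sketch like this one. It uses a throwaway local repository, and the file name hello.txt is made up.

```shell
set -e
# Needs the svn and svnadmin command-line tools; stop quietly if they are missing.
command -v svnadmin >/dev/null 2>&1 || { echo "svn tools not found"; exit 0; }

work=$(mktemp -d)                          # scratch area (hypothetical layout)
svnadmin create "$work/repos"
url="file://$work/repos"
svn mkdir -q -m "Create trunk" "$url/trunk"
svn checkout -q "$url/trunk" "$work/wc"

echo 'hello' > "$work/wc/hello.txt"
svn add -q "$work/wc/hello.txt"

scheduled=$(svn status "$work/wc")         # "A" marks a scheduled addition
before=$(svn list "$url/trunk")            # the repository is still empty
svn commit -q -m "Add hello.txt" "$work/wc"
after=$(svn list "$url/trunk")             # now the file is in the repository
echo "status: $scheduled"
echo "before commit: [$before]  after commit: [$after]"
```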

Deleting Files

With TortoiseSVN

Right click on the file(s) or directory you wish to delete, and select "Delete" from the "TortoiseSVN" submenu:

Image:Tortoise Delete step 1.png

TortoiseSVN marks the files as being scheduled for deletion. You'll notice that a red "x" appears on the files and directories, letting you know they're scheduled for deletion.

With svn

In the directory containing the files (or directories) you wish to delete, type the following:

svn delete files

where files is a list of files you wish to remove.

As always, the changes will not be made to the repository until you perform a checkin.
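
A deletion can be watched the same way. In this sketch (throwaway local repository, hypothetical file name obsolete.txt) the file stays in the repository until the checkin.

```shell
set -e
# Needs the svn and svnadmin command-line tools; stop quietly if they are missing.
command -v svnadmin >/dev/null 2>&1 || { echo "svn tools not found"; exit 0; }

work=$(mktemp -d)                          # scratch area (hypothetical layout)
svnadmin create "$work/repos"
url="file://$work/repos"
svn mkdir -q -m "Create trunk" "$url/trunk"
svn checkout -q "$url/trunk" "$work/wc"
echo 'no longer needed' > "$work/wc/obsolete.txt"
svn add -q "$work/wc/obsolete.txt"
svn commit -q -m "Add obsolete.txt" "$work/wc"

svn delete -q "$work/wc/obsolete.txt"
scheduled=$(svn status "$work/wc")         # "D" marks a scheduled deletion
before=$(svn list "$url/trunk")            # still listed in the repository
svn commit -q -m "Remove obsolete.txt" "$work/wc"
after=$(svn list "$url/trunk")             # gone from the latest revision
echo "status: $scheduled"
echo "before commit: [$before]  after commit: [$after]"
```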

Renaming Files

Unlike CVS, svn keeps version information for directories. This is important, as it allows svn to do history-preserving renames of both files and directories. Hence, with svn, you can change the name of a file and yet, when you retrieve an earlier version of the system, the old file with the old filename will be retrieved.

With TortoiseSVN

Right click on the file or directory you wish to rename, and select "Rename..." from the "TortoiseSVN" submenu:

Image:Tortoise Rename step 1.png

Enter the new name for the file or directory in the dialog box that pops up:

Image:Tortoise Rename step 2.png

TortoiseSVN schedules the addition of the file or directory with the new name, schedules the deletion of the file or directory with the old name, and moves the contents of the directory (if needed). This process illustrates what actually happens when you rename: you're atomically removing the old name and re-adding the file with the correct name. This is done in a history-preserving manner.

With svn
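
On the command line, the corresponding subcommand is svn rename (an alias of svn move), which takes the old name followed by the new name. The sketch below tries it on a throwaway local repository; the names old.txt and new.txt are made up, and the revision numbers in the comments hold only for this scratch repository.

```shell
set -e
# Needs the svn and svnadmin command-line tools; stop quietly if they are missing.
command -v svnadmin >/dev/null 2>&1 || { echo "svn tools not found"; exit 0; }

work=$(mktemp -d)                          # scratch area (hypothetical layout)
svnadmin create "$work/repos"
url="file://$work/repos"
svn mkdir -q -m "Create trunk" "$url/trunk"          # revision 1
svn checkout -q "$url/trunk" "$work/wc"
echo 'contents' > "$work/wc/old.txt"
svn add -q "$work/wc/old.txt"
svn commit -q -m "Add old.txt" "$work/wc"            # revision 2
svn update -q "$work/wc"                   # bring the whole copy to one revision

svn rename -q "$work/wc/old.txt" "$work/wc/new.txt"
svn commit -q -m "Rename old.txt to new.txt" "$work/wc"   # revision 3

now=$(svn list "$url/trunk")               # the new name
earlier=$(svn list -r 2 "$url/trunk")      # the old name, as of revision 2
history=$(svn log -q "$url/trunk/new.txt") # history reaches back past the rename
echo "now: $now  earlier: $earlier"
echo "$history"
```

As with adds and deletes, the rename is only scheduled until you check in.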

Checking In Your Project

The first thing you need to do, before you execute a checkin, is ensure that your working copy reflects the most recent changes to the repository. If you fail to do this, and if someone else in your group has committed a change to one of the files you are checking in since your last update, the checkin will fail.

Unlike in CVS, you do not determine the state of your working copy through the update command; rather, you use the status command, which tells you how the state of your working copy differs from that of the repository. If your working copy is not up to date, you will need to perform an update and then, possibly, resolve conflicts.

Once you have reconciled any conflicts, you can check in your files.

With TortoiseSVN

First, you need to perform an update of your working copy.

As the update runs, you will be presented with a list of files that are being changed. Any files marked with a "C" are in conflict, that is, you've made changes to that file and so has someone else. You now need to resolve the conflict.

Once the conflict has been resolved, and you have told SVN that it is resolved, you can check in your files. Right click on the top-level folder of your working copy and select "SVN Commit" from the menu:

Image:Tortoise Checkin step 1.png

You will be presented with a dialog box that lists the files that have changed along with their state and which provides a box in which you can enter your log message. Do so, and select "OK":

Image:Tortoise Checkin step 2.png

You will have to enter your password. Should the checkin fail, svn will tell you why and you will have to go back and correct the problem. The most likely reason for a checkin failure is a failure to update your working copy and to resolve conflicts.

With svn

First, you need to perform an update of your working copy.

As the update runs, you will be presented with a list of files that are being changed. Any files marked with a "C" are in conflict, that is, you've made changes to that file and so has someone else. You now need to resolve the conflict.

Once the conflict has been resolved, and you have told SVN that it is resolved, you can check in your files. Do so by issuing:

svn ci --message "Message text"

where "Message text" is a checkin message explaining what you've done and why. If you fail to provide the "--message" flag and a message, a text editor will launch and you can enter your message through the editor.

Should the checkin fail, svn will tell you why and you will have to go back and correct the problem. The most likely reason for a checkin failure is a failure to update your working copy and to resolve conflicts.
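
Here is a sketch of that failure and the fix, played out on a throwaway local repository. The working-copy names "mine" and "teammate", the file notes.txt, and the log messages are all hypothetical. The two edits touch different lines, so the update can merge them without a conflict.

```shell
set -e
# Needs the svn and svnadmin command-line tools; stop quietly if they are missing.
command -v svnadmin >/dev/null 2>&1 || { echo "svn tools not found"; exit 0; }

work=$(mktemp -d)                          # scratch area (hypothetical layout)
svnadmin create "$work/repos"
url="file://$work/repos"
svn mkdir -q -m "Create trunk" "$url/trunk"

# A teammate adds a seven-line file; I then check out my own working copy.
svn checkout -q "$url/trunk" "$work/teammate"
printf '%s\n' one two three four five six seven > "$work/teammate/notes.txt"
svn add -q "$work/teammate/notes.txt"
svn commit -q -m "Add notes" "$work/teammate"
svn checkout -q "$url/trunk" "$work/mine"

# The teammate changes the first line and checks in before I do.
printf '%s\n' 'one (teammate)' two three four five six seven > "$work/teammate/notes.txt"
svn commit -q -m "Edit the first line" "$work/teammate"

# I change the last line; checking in without updating first is rejected.
printf '%s\n' one two three four five six 'seven (me)' > "$work/mine/notes.txt"
if svn ci -q --message "Edit the last line" "$work/mine" 2>/dev/null; then
    first_try=accepted
else
    first_try=rejected                     # my copy of notes.txt is out of date
fi

# Update (which merges the teammate's change), review, and check in again.
svn update -q "$work/mine"
svn status "$work/mine"
svn ci -q --message "Edit the last line" "$work/mine"
merged=$(cat "$work/mine/notes.txt")
echo "first try: $first_try"
echo "$merged"
```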

Resolving Conflicts on Checkin

When a conflict occurs, svn will provide you with at least 3 files:

  • your original copy of the file that is in conflict, suffixed with ".mine"
  • each revision of the file from the repository, suffixed with the revision number (e.g., ".r10")
  • the file itself, with no suffix, in which svn has marked the regions where your changes and the repository's changes conflict.

Once you have inspected the different files and manually (or with the help of a file-merging tool) merged in the changes to the non-suffixed file, you need to declare the conflict resolved. Subversion will not allow you to check in your changes until you have declared the conflict resolved. It is not sufficient to remove the extra files with which the update command provides you.

With TortoiseSVN

Once the conflict has been resolved, right-click on the file and select "Resolved" from the menu.

With svn

From the directory containing the conflicting file, issue:

svn resolved file

where file is the name of the file that was in conflict.
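
The whole cycle can be rehearsed safely on a throwaway local repository, as in the sketch below. The working-copy names, the file notes.txt, and its contents are made up. I am also assuming a client recent enough (Subversion 1.5 or later) to accept "--accept postpone", which tells update to leave the conflict in place rather than prompt about it. The revision numbers in the comments hold only for this scratch repository.

```shell
set -e
# Needs the svn and svnadmin command-line tools; stop quietly if they are missing.
command -v svnadmin >/dev/null 2>&1 || { echo "svn tools not found"; exit 0; }

work=$(mktemp -d)                          # scratch area (hypothetical layout)
svnadmin create "$work/repos"
url="file://$work/repos"
svn mkdir -q -m "Create trunk" "$url/trunk"          # revision 1
svn checkout -q "$url/trunk" "$work/alice"
echo 'shared line' > "$work/alice/notes.txt"
svn add -q "$work/alice/notes.txt"
svn commit -q -m "Add notes" "$work/alice"           # revision 2
svn checkout -q "$url/trunk" "$work/bob"

# Alice and Bob both rewrite the same line; Alice checks in first.
echo 'line as Alice wants it' > "$work/alice/notes.txt"
svn commit -q -m "Alice rewrites the line" "$work/alice"   # revision 3
echo 'line as Bob wants it, at greater length' > "$work/bob/notes.txt"

# Bob's update cannot merge the two edits, so the file is left in conflict.
svn update -q --accept postpone "$work/bob"
extras=$(ls "$work/bob")                   # notes.txt plus .mine, .r2 and .r3 copies
state=$(svn status "$work/bob/notes.txt")  # "C" marks the conflict
echo "$extras"

# Bob merges by hand (here he keeps both lines), declares it resolved, checks in.
printf '%s\n' 'line as Alice wants it' 'line as Bob wants it, at greater length' \
    > "$work/bob/notes.txt"
svn resolved "$work/bob/notes.txt"
svn commit -q -m "Merge Alice's and Bob's edits" "$work/bob"
```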

Retrieving an Earlier Version

From time to time, it is useful to be able to go back and see an earlier version of the system. Subversion makes this very easy to do, as the entire project advances in version number each time new changes are committed. Therefore, you can truly use the source control system like a time machine, and see what the baseline looked like at a particular revision number.

You retrieve an earlier version through the checkout mechanism, so you will want to do this from a suitable directory (i.e., not your current svn working copy).

With TortoiseSVN

TortoiseSVN makes this operation particularly easy. Simply follow the instructions for performing a checkout and, when you are presented with the Checkout dialog, select "Revision" rather than "HEAD Revision" and enter the revision number. TortoiseSVN allows you to browse the project checkin logs so that you can more easily find the revision you need.

With svn

To retrieve an earlier revision, issue:

svn co svn+ssh://name@cs-tl1/home/cs340/spring06/groupXX/repos/trunk project --revision number

where number is the number of the revision you wish to check out, and name, project and groupXX are your UNIX username on the server, the working directory name, and your group number, respectively. I highly recommend naming project something like baseline-revision-#, where # is the number of the revision you've checked out. Otherwise, you'll need to use:

svn info

at the top of your working copy, in order to determine the revision it represents.
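
A sketch of the same idea on a throwaway local repository follows. The file notes.txt and its contents are made up, and revision 2 is simply the revision at which this scratch repository held the first draft.

```shell
set -e
# Needs the svn and svnadmin command-line tools; stop quietly if they are missing.
command -v svnadmin >/dev/null 2>&1 || { echo "svn tools not found"; exit 0; }

work=$(mktemp -d)                          # scratch area (hypothetical layout)
svnadmin create "$work/repos"
url="file://$work/repos"
svn mkdir -q -m "Create trunk" "$url/trunk"          # revision 1
svn checkout -q "$url/trunk" "$work/wc"
echo 'first draft' > "$work/wc/notes.txt"
svn add -q "$work/wc/notes.txt"
svn commit -q -m "First draft" "$work/wc"            # revision 2
echo 'second draft, much revised' > "$work/wc/notes.txt"
svn commit -q -m "Second draft" "$work/wc"           # revision 3

# Check out revision 2 into its own directory, named after the revision.
svn co -q --revision 2 "$url/trunk" "$work/baseline-revision-2"
old=$(cat "$work/baseline-revision-2/notes.txt")
rev=$(svn info "$work/baseline-revision-2" | grep '^Revision:')
echo "$rev: $old"
```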

Final Notes and More Information

The instructions here only capture a very small part of all you might need to know to really use source control to its fullest potential; they are not meant to be complete but rather to provide a sort of quick-start guide to source control.

The best place I know of to learn more about Subversion is the Subversion Book. It's generally very readable, but quite long. I use it mostly as a reference, and suggest that you do the same.