Darcs is a revision control system, along the lines of CVS or arch. That means that it keeps track of various revisions and branches of your project, and allows changes to propagate from one branch to another. Darcs is intended to be an ``advanced'' revision control system. Darcs has two particularly distinctive features that set it apart from other revision control systems: 1) each copy of the source is a fully functional branch, and 2) underlying darcs is a consistent and powerful theory of patches.
Really, only the first of these three layers is of particular interest to me, so the other two are done as simply as possible. At the database layer, darcs just has an ordered list of patches along with the patches themselves, each stored as an individual file. Darcs' distribution system is strongly inspired by that of arch. Like arch, darcs uses a dumb server, typically apache or just a local or network file system when pulling patches. Unlike arch, darcs can only use scp to write to a remote file system. The recommended method is to send patches through gpg-signed email messages, which has the advantage of being mostly asynchronous.
In the last paragraph, I explained revision control systems in terms of three layers. One can also look at them as having two distinct uses. One is to provide a history of previous versions. The other is to keep track of changes that are made to the repository, and to allow these changes to be merged and moved from one repository to another. These two uses are distinct, and almost orthogonal, in the sense that a tool can support one of the two uses optimally while providing no support for the other. Darcs is not intended to maintain a history of versions, although it is possible to kludge together such a revision history, either by making each new patch depend on all previous patches, or by tagging regularly. In a sense, this is what the tag feature is for, but the intention is that tagging will be used only to mark particularly notable versions (e.g. released versions, or perhaps versions that pass a time-consuming test suite).
Other revision control systems are centered upon the job of keeping track of a history of versions, with the ability to merge changes being added as it was seen that this would be desirable. But the fundamental object remained the versions themselves.
In such a system, a patch (I am using patch here to mean an encapsulated set of changes) is uniquely determined by two trees. Merging changes that are in two trees consists of finding a common parent tree, computing the diffs of each tree with their parent, and then cleverly combining those two diffs and applying the combined diff to the parent tree, possibly at some point in the process allowing human intervention, to allow for fixing up problems in the merge such as conflicts.
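As a rough illustration of the tree-based merge just described, here is a minimal Python sketch. It is not taken from any particular revision control system: it works at whole-file granularity rather than on line-level diffs, it models a tree as a dictionary from path to contents, and the name three_way_merge is my own. Paths that both sides changed differently are reported as conflicts and left for a human to resolve.

```python
def three_way_merge(base, ours, theirs):
    """Merge two trees against their common parent, one file at a time.

    Each tree is a dict mapping path -> contents; a missing path means
    the file does not exist in that tree. Returns (merged, conflicts).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree (or neither changed it)
        elif o == b:
            result = t          # only "theirs" changed this path
        elif t == b:
            result = o          # only "ours" changed this path
        else:
            conflicts.append(path)  # both changed it differently
            continue
        if result is not None:      # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"a": "1", "b": "2"}
ours = {"a": "1x", "b": "2"}
theirs = {"a": "1", "b": "2y", "c": "3"}
merged, conflicts = three_way_merge(base, ours, theirs)
```

Note that the whole procedure depends on having the parent tree, base, in hand before anything else can happen.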
In the world of darcs, the source tree is not the fundamental object, but rather the patch is the fundamental object. Rather than a patch being defined in terms of the difference between two trees, a tree is defined as the result of applying a given set of patches to an empty tree. Moreover, these patches may be reordered (unless there are dependencies between the patches involved) without changing the tree. As a result, there is no need to find a common parent when performing a merge. Or, if you like, their common parent is defined by the set of common patches, and may not correspond to any version in the version history.
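The following Python sketch is a toy model of this idea, not darcs' implementation. It assumes that a repository is an ordered list of (name, patch) pairs, that a tree is a dictionary from path to contents, and that every patch touches a different file, so that all patches commute trivially. The helper names add_file, build_tree and merge are hypothetical.

```python
def add_file(path, contents):
    """Return a patch (a function on trees) that creates `path`."""
    def apply(tree):
        new_tree = dict(tree)
        new_tree[path] = contents
        return new_tree
    return apply


def build_tree(repo):
    """Apply every patch in the repository, in order, to an empty tree."""
    tree = {}
    for _name, patch in repo:
        tree = patch(tree)
    return tree


def merge(ours, theirs):
    """Append the patches from `theirs` that `ours` does not already have."""
    have = {name for name, _patch in ours}
    return ours + [(name, patch) for name, patch in theirs if name not in have]


shared = ("add-readme", add_file("README", "hello\n"))
mine = ("add-makefile", add_file("Makefile", "all:\n"))
yours = ("add-license", add_file("LICENSE", "BSD\n"))

repo_a = [shared, mine]
repo_b = [shared, yours]

# Merging in either direction yields the same tree, although the
# patches end up stored in a different order.
tree_ab = build_tree(merge(repo_a, repo_b))
tree_ba = build_tree(merge(repo_b, repo_a))

# The "common parent" is simply the set of patch names both sides share.
common = {name for name, _ in repo_a} & {name for name, _ in repo_b}
```

In this model no parent tree is ever looked up: the merge only compares the sets of patch names. Patches that touch the same lines do not commute so simply, which is where the formalism of Appendix A comes in.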
One useful consequence of darcs' patch-oriented philosophy is that since a patch need not be uniquely defined by a pair of trees (old and new), we can have several ways of representing the same change, which differ only in how they commute and what the result of merging them is. Of course, creating such a patch will require some sort of user input. This is a Good Thing, since the user creating the patch should be the one forced to think about what they really want to change, rather than the user merging the patch. An example of this is the token replace patch (see Section A.5). This feature makes it possible to create a patch, for example, which replaces every instance of the variable ``stupidly_named_var'' with ``better_var_name'', while leaving ``other_stupidly_named_var'' untouched. When this patch is merged with any other patch involving ``stupidly_named_var'', that instance will also be changed to ``better_var_name''. This is in contrast to a more conventional merging method, which would not only fail to change new instances of the variable, but would also involve conflicts when merging with any patch that modifies lines containing the variable. By using additional information about the programmer's intent, darcs is thus able to make the process of changing a variable name the trivial task that it really is: a search and replace, modulo tokenizing the code appropriately.
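To make the token replace idea concrete, here is a small Python sketch. It is only an approximation of the behaviour described above, not the format or algorithm of Section A.5: the function token_replace and its token_chars parameter are my own names, and a token is assumed to be a run of letters, digits and underscores. The merge is modelled crudely, by applying the replace to text that already contains a line another contributor added in parallel.

```python
import re


def token_replace(text, old, new, token_chars="A-Za-z_0-9"):
    """Replace `old` with `new` wherever `old` appears as a complete token."""
    # A match must not be preceded or followed by another token character,
    # so "other_stupidly_named_var" is not touched.
    pattern = re.compile(
        rf"(?<![{token_chars}]){re.escape(old)}(?![{token_chars}])"
    )
    return pattern.sub(lambda match: new, text)


original = "stupidly_named_var = other_stupidly_named_var + 1\n"
# A line that another contributor added in parallel, still using the old name.
added_elsewhere = "print(stupidly_named_var)\n"

renamed = token_replace(original + added_elsewhere,
                        "stupidly_named_var", "better_var_name")
print(renamed, end="")
# better_var_name = other_stupidly_named_var + 1
# print(better_var_name)
```

Because the patch records the intent (``rename this token'') rather than a list of changed lines, the line added elsewhere is renamed too, and there is nothing for a line-based merge to conflict over.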
The patch formalism discussed in Appendix A is what makes darcs' approach possible. In order for a tree to consist of a set of patches, there must be a deterministic merge of any set of patches, regardless of the order in which they are merged. This requires that one be able to reorder patches. While I don't know that the patches are required to be invertible as well, my implementation certainly requires invertibility. In particular, invertibility is required to make use of Theorem 2, which is used extensively in the manipulation of merges.
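Invertibility itself is easy to picture. The Python sketch below assumes a hypothetical patch type, Replace, that changes the entire contents of one existing file from an old value to a new one; darcs' real patch types and the theorems of Appendix A are considerably richer. The point is only that applying a patch and then its inverse gives back the tree one started with.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replace:
    """Hypothetical patch: change the contents of `path` from `old` to `new`."""
    path: str
    old: str
    new: str

    def apply(self, tree):
        # The patch only applies to a tree in which `path` holds `old`.
        if tree.get(self.path) != self.old:
            raise ValueError(f"patch does not apply to {self.path}")
        result = dict(tree)
        result[self.path] = self.new
        return result

    def invert(self):
        # The inverse swaps the roles of the old and new contents.
        return Replace(self.path, self.new, self.old)


tree = {"main.c": "int x;\n"}
patch = Replace("main.c", "int x;\n", "int count;\n")

patched = patch.apply(tree)
restored = patch.invert().apply(patched)
```

Here restored is equal to tree, and inverting a patch twice gives back the original patch.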
[FIXME: sections in brackets in this file are notes to myself or explanatory notes indicating something that is incomplete. I must work more on this.]
[Note: this section is incomplete, but is intended to orient CVS users as to how darcs is different, and how to do with darcs what they would have done with CVS.]
Darcs is very different from CVS.
CVS breaks the users into two categories: those who can commit and those who can't. For those who can't, CVS is just a way of getting the latest version. If they want to contribute to the project, they have to use a different tool (probably patch/diff). Darcs doesn't have this clear distinction between those who can commit and those who can't. With darcs, any contributor can take advantage of darcs to make changes and share those changes with others, either with a central repository, or simply with other users who might like to have those improvements. Since it is easy to apply a darcs patch from an email, and easy to use darcs to push patches via email, there is less need to give contributors write access to a centralized repository.
[Note: this section is incomplete, but is intended to orient arch users as to how darcs is different, and how to do with darcs what they would have done with arch.]
Although arch, like darcs, is a distributed system, and the two systems have many similarities (both require no special server, for example), their essential organization is very different--perhaps more so than the differences between darcs and CVS. But hopefully the biggest difference that arch users will find is that darcs is much simpler and easier to use.
Like CVS, arch has a two-level system--there are repositories, and in order to modify a repository one must check out a working directory. This leads to ``interesting'' possibilities such as checking out a working directory from one repository and then committing to a different repository. On top of this, arch has a rather complicated system for dealing with branches and versioning within each repository. Darcs uses a much simpler scheme, in which each working directory has an associated repository carrying just one branch. Every repository (and every working directory) is a branch.
Unlike darcs, arch is fully capable of running in a truly centralized manner, and when used in that manner (i.e. with only one repository) is roughly feature-equivalent (and complexity-equivalent?) with CVS.
When using arch in a distributed manner, each contributor creates a repository to store his or her modifications. Getting those modifications into a central repository is then a two-step process. First you do a commit to your repository, and then either you or someone with write permissions on the central repository runs a [I don't recall what command] to move the patchset from your repository to the central one. An analogous process is used in darcs. First you use ``record'' to record your changes locally (this is like committing to your local arch repository). Then either someone with write access pulls your changes into the central repository, or you use push to send your changes to it (or to its maintainer, if you don't have write access).
Isaac Jones 2004-04-12