Manual for Pijul
# Why Pijul?

Pijul is the first distributed version control system to be based on a sound mathematical theory of changes. It is inspired by [Darcs](http://darcs.net), but aims at solving the soundness and performance issues of Darcs.

Pijul has a number of features that allow it to scale to very large repositories and fast-paced workflows. In particular, **change commutation** means that changes written independently can be applied in any order, without changing the result. This property simplifies workflows, allowing Pijul to **clone sub-parts of repositories**, to **solve conflicts reliably**, and to **easily combine different versions**.

### Change commutation

In Pijul, for any two changes A and B, either A and B can be applied in any order, or A depends on B, or B depends on A.
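This property can be illustrated with a toy model. The sketch below is not Pijul's actual data structure: it assumes, for illustration only, that a change adds lines with unique identifiers plus "comes before" edges between lines, and that a change depends on another when it refers to a line the other introduced. The names `Change`, `apply_change` and `depends_on` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Toy change: new lines (id, text) plus ordering edges (before_id, after_id)."""
    name: str
    new_lines: frozenset
    edges: frozenset


EMPTY = (frozenset(), frozenset())  # a state is a pair (lines, edges)


def apply_change(state, change):
    """Apply a change by set union; fail if it refers to a line that is missing."""
    lines, edges = state
    lines = lines | change.new_lines
    known_ids = {line_id for line_id, _ in lines}
    for edge in change.edges:
        if not set(edge) <= known_ids:
            raise ValueError(f"{change.name} depends on a change that is not applied")
    return (lines, edges | change.edges)


def depends_on(a, b):
    """True if change `a` refers to a line introduced by change `b`."""
    own_ids = {line_id for line_id, _ in a.new_lines}
    referenced = {line_id for edge in a.edges for line_id in edge} - own_ids
    return any(line_id in referenced for line_id, _ in b.new_lines)


# Hypothetical example data: a base file, then two independent edits.
base = Change("base", frozenset({("l1", "A"), ("l2", "B")}), frozenset({("l1", "l2")}))
alice = Change("alice", frozenset({("a1", "G")}), frozenset({("a1", "l1")}))
bob = Change("bob", frozenset({("b1", "X")}), frozenset({("l1", "b1"), ("b1", "l2")}))

start = apply_change(EMPTY, base)
one_way = apply_change(apply_change(start, alice), bob)
other_way = apply_change(apply_change(start, bob), alice)

print(one_way == other_way)     # True: independent changes commute
print(depends_on(alice, base))  # True: alice refers to a line created by base
print(depends_on(alice, bob))   # False: alice and bob are independent
```

In this model, applying a change is a set union, which is why the order does not matter for independent changes. Applying `alice` before `base` raises an error, because the change it depends on is missing.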

- **[Use case: early stage of a project]** Change commutation makes Pijul a
  highly forgiving system, as you can "unapply" (or "unrecord")
  changes made in the past, without having to change the identity of
  newer changes (that is, without "rebasing", in Git terms).

  This tends to happen pretty often in the early stages of a project,
  when most things are still uncertain. With Pijul, exploring new
  features and new implementations comes at no extra cost in time.

- **[Use case: the project becomes stable]** As your project grows, change
  commutation saves even more time: imagine a project with two main
  branches, a stable one containing only the original product and
  bugfixes, and an unstable one, where new features are constantly
  added.

  The team working on the unstable branch is likely to discover old
  bugs, and fix them in the stable branch too.

  In Pijul, maintainers of the stable branch can simply pull only the
  changes they are interested in. Those changes *do not* change when
  imported, which means that pulling new changes in will work just as
  expected.


### Associativity

In Pijul, change application is an *associative* operation, meaning
that applying some change A, and then a set of changes (BC) at once,
yields the same result as applying (AB) first, and then C.

With branches, the first scenario looks like this: Bob creates A,
while Alice creates B and C, and Bob finally merges both B and C at once.

<div style="text-align:center">
<img src="./associativity0.svg"/>
</div>

The second scenario would look like the following, with Bob creating
change A, and then pulling B. At that moment, Bob has both A and B on
his branch, and wants to pull C from Alice.

<div style="text-align:center">
<img src="./associativity1.svg"/>
</div>


Note that this is different from change reordering: here, we apply A,
then B, then C, in the same order in both scenarios.

Using math words such as "associative" for such a simple operation may
sound like nitpicking, because intuition suggests that it should
always be the case. However, **Git doesn't guarantee that property**,
even if A, B, and C do not conflict.
Concretely, this means that Git (and relatives) can **sometimes shuffle lines around**, because these systems *only* track versions, rather than the changes that happen between the versions. And even though one can reconstruct one from the other, the following example (taken from [here](https://tahoe-lafs.org/~zooko/badmerge/simple.html)) shows that tracking versions only does not yield the expected result.


<div style="text-align:center">
<div style="display:inline-block">
<img style="margin:1em 2em;clear:both;display:block" src="./badmerge0.svg"/>
Git merge (which A is which?)
</div>
<div style="display:inline-block">
<img style="margin:1em 2em;clear:both;display:block" src="./goodmerge.svg"/>
Pijul merge
</div>
</div>

In this diagram, Alice and Bob start from a common file with the lines A and B. Alice adds G above everything, and then another instance of A and B above that (her new lines are shown in green). Meanwhile, Bob adds a line X between the original A and B.

Git, SVN, Mercurial and similar tools will merge this example into the file shown on the left, with the relative positions of G and X swapped, whereas Pijul (and Darcs) yield the file on the right, preserving the order between the lines. Note that this example **has nothing to do with a conflict**, since the edits happen in different parts of the file. In fact, neither Git nor Pijul will report a conflict in this case.

The reason for the counter-intuitive behaviour in Git is that Git runs a heuristic algorithm called *three-way merge* or *diff3*, which extends *diff* to two "new" versions instead of one. Note, however, that *diff* has multiple optimal solutions, and the same change can be described equivalently by different diffs. While this is fine for *diff* (since the patch resulting from *diff* has a unique interpretation), it is ambiguous in the case of *diff3* and might lead to arbitrary reshuffling of files.
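The ambiguity of *diff* can be seen on Alice's edit from the example. The sketch below is not Git's algorithm: it is a brute-force enumeration, with a hypothetical helper name `alignments`, of every way the original lines can be matched inside Alice's version. Each match describes the same edit as three inserted lines, so all of them are equally short diffs.

```python
from itertools import combinations


def alignments(old, new):
    """Return every way to match all lines of `old`, in order, inside `new`."""
    return [
        positions
        for positions in combinations(range(len(new)), len(old))
        if all(new[i] == line for i, line in zip(positions, old))
    ]


original = ["A", "B"]
# Alice's version: new A and B on top, then G, then the original A and B.
alice_version = ["A", "B", "G", "A", "B"]

print(alignments(original, alice_version))
# [(0, 1), (0, 4), (3, 4)]
```

Only the last alignment, `(3, 4)`, matches what Alice actually did. A tool that only sees the two versions has no way to tell it apart from `(0, 1)`, which reads the edit as "G, A, B appended at the end".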


Obviously, preserving the order of lines **does not mean that the merge will have the intended semantics**: code should still be reviewed and tests should still be run. But at least a **review of the change will not be made useless** by a reshuffling of lines by the version control tool.

## Modelling conflicts

Conflicts are a normal thing in the internal representation of a Pijul
repository. Actually, after applying new changes, we even have to do extra work to find where the conflicts are.

In particular, changes editing sides of a conflict can be applied
without resolving the conflict. This guarantees that no information
ever gets lost.

This is different from both Git and Darcs:

- In Git, conflicts are not really handled after they are output to
  files. For instance, if one commits just after a conflict occurs,
  Git will commit the entire conflict (including markers).

- In Darcs, conflicts can lead to the [exponential merge
  problem](http://darcs.net/FAQ/Performance#is-the-exponential-merge-problem-fixed-yet),
  which might cause it to take several hours to merge even a two-line
  change.
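
As a rough illustration of storing conflicts and only locating them afterwards, the sketch below uses a toy model, not Pijul's actual representation: a file is assumed to be a set of lines plus "comes before" edges, changes only add lines and edges, and a conflict is a group of lines whose relative order the edges leave undetermined. The function name `find_conflicts` and the use of line text as identifiers are assumptions made for the example.

```python
def find_conflicts(lines, edges):
    """Return groups of lines that the (acyclic) edges leave mutually unordered.

    The result is empty exactly when the edges determine one total order.
    """
    indegree = {line: 0 for line in lines}
    successors = {line: [] for line in lines}
    for before, after in edges:
        successors[before].append(after)
        indegree[after] += 1

    ready = sorted(line for line in lines if indegree[line] == 0)
    conflicts = []
    while ready:
        if len(ready) > 1:  # several lines could come next: order is ambiguous
            conflicts.append(tuple(ready))
        next_ready = []
        for line in ready:
            for after in successors[line]:
                indegree[after] -= 1
                if indegree[after] == 0:
                    next_ready.append(after)
        ready = sorted(next_ready)
    return conflicts


# Base file A, B; Alice and Bob each insert a line between A and B.
lines = {"A", "B", "X", "Y"}
edges = {("A", "B"), ("A", "X"), ("X", "B"), ("A", "Y"), ("Y", "B")}
print(find_conflicts(lines, edges))  # [('X', 'Y')]

# A further change edits Alice's side without resolving the conflict.
lines_2 = lines | {"X2"}
edges_2 = edges | {("X", "X2"), ("X2", "B")}
print(find_conflicts(lines_2, edges_2))  # [('X', 'Y')]: still there, nothing lost

# A resolution is just another change that orders the two sides.
print(find_conflicts(lines, edges | {("X", "Y")}))  # []
```

Applying a change never fails in this model. The conflict is a property of the resulting graph, found by a separate pass, and it disappears once a later change adds the missing ordering.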


## Comparisons with other version control systems

### Pijul for Git/Mercurial/SVN/… users

The main difference between Pijul and Git (and related systems) is
that Pijul deals with changes (or patches), whereas Git deals only
with snapshots (or versions).

There are many advantages to using changes. First, changes are the
intuitive atomic unit of work. Moreover, changes can be merged
according to formal axioms that guarantee correctness in 100% of
cases, whereas commits have to be *stitched together based on their
contents, rather than on the edits that took place*. This is why in
these systems, conflicts are often painful, as there is no real way to
solve a conflict once and for all (for example, Git has the `rerere`
command to try and simulate that in some cases).

### Pijul for Darcs users

Pijul is mostly a formally correct version of Darcs' theory of
changes, as well as a new algorithm for merging changes. Its main
innovation compared to Darcs is to use a better data structure for its
pristine, allowing for:

- A sane representation of conflicts: Pijul's pristine is stored in a
  "conflict-tolerant" data structure. Many changes can be applied to
  it, and the presence or absence of conflicts is only computed
  afterwards, by looking at the pristine.

- Conflicting changes always commute in Pijul, and never commute in Darcs.

- Fast algorithms: Pijul's pristine can be seen as a "cache" of
  applied changes, to which new changes can be applied directly,
  without having to compute anything on the repository's history.


However, Pijul's pristine format was designed to comply with axioms on
a specific set of operations only. As a result, some of Darcs'
features, such as `darcs replace`, are not (yet) available.