README.md

What is this?

This is intended to serve as a reference for developers interested in working on/contributing to Pijul.

Notation/conventions used herein

Out of respect for the time and sanity of readers, we will try to be careful about how we refer to elements in the hierarchy of nest_user > repository > remote > channel > (change|discussion), as certain concepts only apply to particular combinations of these elements. For example, when we want to refer specifically to the pair of a repository and a channel in plain text, we'll write it explicitly as a pair (repo, channel), since "repo/channel" is commonly read as "repo OR channel". When we use backticks to show a (pseudo-)code block, we'll continue to use a forward slash, since that's the path separator in nest URLs. For example, pijul pull user/repo.

Before submitting a patch

Please run cargo fmt to format your changes before submitting a patch. As long as you have cargo fmt installed, you can get pijul to do this for you by adding the following to .pijul/config:

[hooks]
record = [ "cargo fmt" ]

For contributors using Rust Analyzer (esp. with VS Code):

By default, Rust Analyzer runs cargo check with the --test flag included, which can really slow down editor interactions and feedback. For VS Code users, it's recommended to set up a workspace in your local copy of the pijul repository. An example of a basic workspace file can be found below; edit the <your_cargo_bin_path> entry to point to your cargo binary, and place the file in the project root as, for example, pijul.code-workspace. When you open VS Code, it will ask whether you want to open the workspace.

Example pijul.code-workspace:

{
    "folders": [
        {
            "path": "."
        }
    ],
    "settings": {
        "rust-analyzer.checkOnSave.overrideCommand": [
            "<your_cargo_bin_path>",
            "check",
            "--workspace",
            "--message-format=json",
            "--manifest-path",
            "./Cargo.toml",
            "--lib",
            "--bins",
            "--examples"
        ]
    }
}

For users of other editors, if Rust Analyzer is taking a long time and reporting a large number of errors, the check command it runs is the first place to look. If you have a specific workaround for your setup, please submit it as a patch and it will be added to this list.

Scripting tricks (for tests, automation, etc.)

When testing modifications or debugging issues, it's often the case that you'll want to recreate fairly complex situations and repository states quickly and in a reproducible manner. To that end, it's useful to know how to write scripts that interact with pijul.

Forcing specific timestamps

pijul record --timestamp <time>

Minimizing interactive editor pop-ups/pager interactions during script execution

Record a patch without being prompted for hunk selection/change message

pijul record --all --message "<msg>"

Push/pull without being prompted for hunk selection

pijul pull --all <remote>

pijul push --all <remote>
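If your test harness is itself written in Rust, the non-interactive invocations above can be assembled with std::process::Command. The following is a minimal sketch, not part of Pijul: the helper names record_command and sync_all_command are hypothetical, and it assumes a pijul binary on your PATH. It uses only the flags listed in this section.

```rust
use std::process::Command;

/// Build (but do not run) a `pijul record` call that never prompts:
/// `--all` skips hunk selection and `--message` skips the editor.
/// `timestamp` is passed through verbatim to `--timestamp`.
fn record_command(repo_dir: &str, message: &str, timestamp: Option<&str>) -> Command {
    let mut cmd = Command::new("pijul");
    cmd.current_dir(repo_dir)
        .args(["record", "--all", "--message", message]);
    if let Some(ts) = timestamp {
        cmd.args(["--timestamp", ts]);
    }
    cmd
}

/// Build (but do not run) `pijul pull --all <remote>` or
/// `pijul push --all <remote>`; `op` is expected to be "pull" or "push".
fn sync_all_command(repo_dir: &str, op: &str, remote: &str) -> Command {
    let mut cmd = Command::new("pijul");
    cmd.current_dir(repo_dir).args([op, "--all", remote]);
    cmd
}
```

Calling .status() or .output() on the returned Command runs it; keeping construction separate from execution makes it easy to log or assert on the exact arguments a script would use.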

Logging

Pijul uses log and env_logger to log messages to stdout; messages are tagged with priority levels (log provides the levels error, warn, info, debug, trace). The function that sets up pijul's logging is pijul::env_logger_init. You can consult the documentation for log and env_logger to get more information, but to turn logging on, run pijul with:

RUST_LOG=<level> <command>

for example:

RUST_LOG=warn pijul change
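To make the effect of the level concrete: a configured level lets through messages at that level and anything more severe, so RUST_LOG=warn shows warn and error messages but hides info, debug, and trace. The sketch below is a toy model of that rule using only the standard library. It is not how log or env_logger are implemented, the names Level, parse_level, and is_shown are made up for illustration, and it covers only the single-level form of RUST_LOG shown above.

```rust
/// Toy mirror of the five priority levels, declared most severe first so
/// that the derived ordering gives Error < Warn < Info < Debug < Trace.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Parse a bare level name such as "warn" (this sketch accepts any case).
fn parse_level(s: &str) -> Option<Level> {
    match s.to_ascii_lowercase().as_str() {
        "error" => Some(Level::Error),
        "warn" => Some(Level::Warn),
        "info" => Some(Level::Info),
        "debug" => Some(Level::Debug),
        "trace" => Some(Level::Trace),
        _ => None,
    }
}

/// A message is shown when it is at least as severe as the configured level.
fn is_shown(configured: Level, message: Level) -> bool {
    message <= configured
}
```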

A quick and dirty example of how you can add new logging messages with the macros provided by log (which use standard format string syntax):

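fn some_fun() {
    // ... existing code ...
    let value = 42; // placeholder for whatever you want to inspect
    log::warn!("I want to see this value: {}", value);
    // ... existing code ...
}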

Important types, API touchstones

Transactions

The big-ticket source of state is a transaction (these variables are often called txn in the source code). There are different _TxnT traits implemented by different types of transaction, but the basic concrete type for a transaction is GenericTxn.

pijul::repository::Repository

Unsurprisingly, this contains a lot of state pertaining to a particular repository.

pijul::remote::RemoteRepo

RemoteRepo is a concrete type frequently used to interact with a Transaction. RemoteRepo is an enum with a variant for each kind of remote: Local, Ssh, Http, and LocalChannel, as well as a bookkeeping None variant. For clarity, Local represents a local repo other than the one the user is currently working within, while LocalChannel is a channel within the current repo.

libpijul::pristine::Hash and libpijul::pristine::Merkle

Hash and SerializedHash are sequences of bytes which represent changes.
Merkle and SerializedMerkle are sequences of bytes which represent the state of a repository.
Both Hash and Merkle appear to users as strings of base32.

You can freely convert between Hash <-> SerializedHash and Merkle <-> SerializedMerkle using the From and Into traits. Example: Hash::from(_) or my_hash.into().
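For readers less familiar with this pattern, here is a self-contained sketch of a two-way From conversion between an in-memory type and its serialized form. ToyHash and ToySerializedHash are stand-ins invented for this example, and the byte layout (one tag byte followed by the payload) is an assumption chosen for illustration. Neither reflects libpijul's actual definitions.

```rust
/// Stand-in for an in-memory hash (NOT libpijul's Hash).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ToyHash([u8; 4]);

/// Stand-in for the serialized form: a hypothetical tag byte, then the payload.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ToySerializedHash([u8; 5]);

impl From<ToyHash> for ToySerializedHash {
    fn from(h: ToyHash) -> Self {
        let mut bytes = [0u8; 5];
        bytes[0] = 1; // made-up tag identifying the hash algorithm
        bytes[1..].copy_from_slice(&h.0);
        ToySerializedHash(bytes)
    }
}

impl From<ToySerializedHash> for ToyHash {
    fn from(s: ToySerializedHash) -> Self {
        let mut bytes = [0u8; 4];
        bytes.copy_from_slice(&s.0[1..]); // drop the tag byte
        ToyHash(bytes)
    }
}
```

Implementing From in both directions gives you Into for free, so either ToySerializedHash::from(h) or h.into() works; with .into() the target type usually has to be spelled out in a let binding or function signature.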

Common tasks in Pijul's codebase

Visiting the change log

Use TxnTExt::reverse_log(..) (most recent first) or TxnTExt::log(..) (oldest first). You'll need a transaction and a ChannelRef. An example can be found in pijul::command::log.rs.

Working internally with push/pull (remotes, caching, and you)

Pijul recognizes two "views" of a (remote, channel) pair. The first is just the actual state of the (remote, channel). The second is a locally stored version of the remote, which is the last version of the actual remote that we've signed off on (we'll continue to call this the "local remote cache"). During a push or pull, the local remote cache almost always just downloads and caches whatever new patches are present in the (remote, channel) before you select what you actually want to push or pull via hunk selection. (NOTE: For LocalChannel remotes, there's no local cache; LocalChannel means it's just another channel in the same repository you're already working in).

API notes:

  • As usual, we need some kind of Transaction for most of the interactions with either view of a remote.
  • pijul::remote::RemoteRepo represents the actual remote.
  • libpijul::pristine::RemoteRef represents the local cache of the last remote we've signed off on.

Example of the base case:

  • local/master is comprised of the patches [(0, A), (1, B)]
  • the last time we interacted with remote/bugfix, it was also at [(0, A), (1, B)]
  • remote/bugfix is now comprised of the patches [(0, A), (1, B), (2, C), (3, D), (4, E)]

If we invoke pijul pull remote/bugfix, pijul will put the new patches [(2, C), (3, D), (4, E)] in the local remote cache for remote/bugfix, and ask which of the new changes you actually want to pull into local/master. Now, even if only (2, C) is pulled via hunk selection, subsequent pull operations generally won't have to re-download (3, D) and (4, E).
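The base case can be modelled in a few lines. This is a toy model with plain vectors, not Pijul's implementation: the names Entry, refresh_cache, and pullable are invented for this sketch, and changes are identified by letters rather than hashes so they line up with the example.

```rust
/// Toy log entry: (position in the channel, change name).
type Entry = (u64, char);

/// Add every entry of the actual remote that the cache lacks, and return
/// the entries that had to be "downloaded" to do so.
fn refresh_cache(cache: &mut Vec<Entry>, remote: &[Entry]) -> Vec<Entry> {
    let downloaded: Vec<Entry> = remote
        .iter()
        .filter(|e| !cache.contains(e))
        .copied()
        .collect();
    cache.extend(downloaded.iter().copied());
    downloaded
}

/// Entries known to the cache but absent from the local channel: these are
/// the candidates offered during selection.
fn pullable(cache: &[Entry], local: &[Entry]) -> Vec<Entry> {
    cache
        .iter()
        .filter(|e| !local.contains(e))
        .copied()
        .collect()
}
```

With the logs from the example, the first refresh_cache call returns [(2, C), (3, D), (4, E)]. If only (2, C) is then appended to local/master, a second refresh_cache call returns an empty list while pullable still offers (3, D) and (4, E).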

At the time of writing, the only notable exception to this straightforward caching strategy is when the actual remote has unrecorded a patch that ALSO exists in the channel we're pulling to or pushing from.

Example of the exceptional case:
we're pulling from remote/bugfix into local/master.

  • local/master is comprised of the patches [(0, A), (1, B), (2, C), (3, D)],
  • the last version of remote/bugfix we saw was [(0, A), (1, B), (2, C)]

If, when we invoke pijul pull remote/bugfix, we discover that the new set of changes comprising that (remote, channel) is [(0, A), (_, _), (_, _), (3, X), (4, Y)], meaning that the actual remote has unrecorded (1, B) and (2, C) since the last time we interacted with it, then the patches after the dichotomy (the last point at which the local remote cache and the actual remote were the same; here (0, A)) will not be cached.

We want to notify the user that (1, B) and (2, C) have been unrecorded in the remote they're pulling from (or pushing to) before they're presented with hunk selection, and preserving the divergence between the local remote cache and the actual remote allows us to do this. Furthermore, we don't want to overwrite the local remote cache, because we want to continue to remind the user of this unrecord until either a) the user forces the cache to update (with the --force-cache flag), or b) the user fixes the discrepancy by unrecording (1, B) and (2, C) in local/master.

For the sake of completeness, if the actual remote has unrecorded one or more patches, but they do NOT exist in the channel we're trying to pull/push to, the cache will be updated and the user will not be notified. It is assumed that the user isn't concerned with those patches since they were either actively ignored during a previous hunk selection, or we never knew about them in the first place.
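The decision described in this section can be summarized as a small function. This is a simplified sketch under stated assumptions, not the code Pijul runs: CacheAction, common_prefix_len, contains_name, and plan_cache_update are hypothetical names, logs are plain slices, changes are compared by letter instead of by hash, and the unrecorded slots written as (_, _) above are simply left out of the remote's log.

```rust
/// Toy log entry: (position in the channel, change name).
type Entry = (u64, char);

/// What to do with the local remote cache after inspecting the actual remote.
#[derive(Debug, PartialEq)]
enum CacheAction {
    /// Bring the cache in line with the actual remote.
    Update,
    /// Leave the cache divergent and warn the user about these changes.
    KeepAndWarn(Vec<Entry>),
}

/// Number of leading entries on which cache and remote agree; the
/// dichotomy is the last entry inside this prefix.
fn common_prefix_len(cache: &[Entry], remote: &[Entry]) -> usize {
    cache.iter().zip(remote.iter()).take_while(|(c, r)| c == r).count()
}

/// True if `log` contains a change with the given name.
fn contains_name(log: &[Entry], name: char) -> bool {
    log.iter().any(|e| e.1 == name)
}

fn plan_cache_update(
    cache: &[Entry],
    remote: &[Entry],
    local: &[Entry],
    force_cache: bool,
) -> CacheAction {
    let split = common_prefix_len(cache, remote);
    // Changes the cache remembers after the dichotomy that the remote has
    // since unrecorded, and that also exist in the local channel.
    let unrecorded_in_local: Vec<Entry> = cache[split..]
        .iter()
        .filter(|e| !contains_name(remote, e.1) && contains_name(local, e.1))
        .copied()
        .collect();
    if force_cache || unrecorded_in_local.is_empty() {
        CacheAction::Update
    } else {
        CacheAction::KeepAndWarn(unrecorded_in_local)
    }
}
```

For the exceptional case, the cache is [(0, A), (1, B), (2, C)], the remote is [(0, A), (3, X), (4, Y)], and local/master still holds B and C, so the result is KeepAndWarn([(1, B), (2, C)]). Passing force_cache = true, or first unrecording B and C locally, yields Update instead.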