The sound distributed version control system

#625 push/pull patch order mismatch

Opened by joshua.landau.ws on January 20, 2022
joshua.landau.ws on January 20, 2022

I have two channels, c1 and c2, with two different patches each, all of which are independent.

$ pijul log --channel c1 --hash-only
CKUKQ7QZZSN3W32YTGAWCVODLAJXT62KWEBIDR3MBP7HFOIPTZXAC
R2O3SWB5AL35D4CGCSFCBOBRY2SBQBQ3NTHAEYEQSXXAZBIPCZ6QC
K7IQJDHXQBZ5BVRL6CZ6U567IMITXR4NFWLDSKBBY6PLRSGVVDRQC

$ pijul log --channel c2 --hash-only 
JEURNCSH72YVXQBJZFZ322MX6JKZES7Y7ZLLQ3G3U4TKOFQ5EPBQC
ZKYMZON6L2ZXIUI64S2M3UE7HAEJTWEXB3J4C4TXAIHXVTJSCHHAC
K7IQJDHXQBZ5BVRL6CZ6U567IMITXR4NFWLDSKBBY6PLRSGVVDRQC

If I push the changes from c2 into c1, I get what I expect,

$ pijul push . --from-channel c2 --to-channel c1     
Uploading changes [==================================================] 2/2                                                         

$ pijul log --channel c1 --hash-only            
JEURNCSH72YVXQBJZFZ322MX6JKZES7Y7ZLLQ3G3U4TKOFQ5EPBQC
ZKYMZON6L2ZXIUI64S2M3UE7HAEJTWEXB3J4C4TXAIHXVTJSCHHAC
CKUKQ7QZZSN3W32YTGAWCVODLAJXT62KWEBIDR3MBP7HFOIPTZXAC
R2O3SWB5AL35D4CGCSFCBOBRY2SBQBQ3NTHAEYEQSXXAZBIPCZ6QC
K7IQJDHXQBZ5BVRL6CZ6U567IMITXR4NFWLDSKBBY6PLRSGVVDRQC

c1’s history has c2’s patches applied on top of it, in the same order that they were in for c2.

If instead I pull, the applied patches switch order.

$ pijul unrecord ZKY JEU ; pijul reset

$ pijul pull . --from-channel c2 --to-channel c1
Downloading changes [>                                                 ] 0/2                                                   
            Applying [==================================================] 2/2                                                     
Completing changes [                                                  ] 0/0                                                   
Outputting repository ↖                                                                                                            

$ pijul log --channel c1 --hash-only            
ZKYMZON6L2ZXIUI64S2M3UE7HAEJTWEXB3J4C4TXAIHXVTJSCHHAC
JEURNCSH72YVXQBJZFZ322MX6JKZES7Y7ZLLQ3G3U4TKOFQ5EPBQC
CKUKQ7QZZSN3W32YTGAWCVODLAJXT62KWEBIDR3MBP7HFOIPTZXAC
R2O3SWB5AL35D4CGCSFCBOBRY2SBQBQ3NTHAEYEQSXXAZBIPCZ6QC
K7IQJDHXQBZ5BVRL6CZ6U567IMITXR4NFWLDSKBBY6PLRSGVVDRQC

I don’t understand what order guarantees log is meant to provide, if any, but I would hope that it preserves the original order whenever possible, which it is not doing in this case.

spacefrogg on January 26, 2022

There are no (strong) order guarantees, because a pull could receive independent changes in arbitrary order.

Your assumption would break down anyway if, for some reason, you had pulled one of the changes out of order at an earlier time. Should the output of pijul log be re-ordered to match the other side when you pull the remaining patches? By which criterion should that be decided? The order on the original side was arbitrary to begin with.
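To illustrate the point about independent changes, here is a minimal Python sketch (not Pijul's implementation) that enumerates every application order consistent with a dependency map. The `valid_orders` helper and the `deps` map are hypothetical; in particular, treating K7I as a dependency of the other two changes is an assumption made only for the example.

```python
from itertools import permutations

def valid_orders(deps):
    """Return every order in which each change comes after its dependencies.

    deps maps a change to the set of changes it depends on.
    """
    orders = []
    for perm in permutations(deps):
        seen = set()
        ok = True
        for change in perm:
            if not deps[change] <= seen:
                ok = False  # a dependency has not been applied yet
                break
            seen.add(change)
        if ok:
            orders.append(list(perm))
    return orders

# Assumed dependency map: ZKY and JEU are independent of each other.
deps = {"K7I": set(), "ZKY": {"K7I"}, "JEU": {"K7I"}}
orders = valid_orders(deps)
for order in orders:
    print(" -> ".join(order))
```

Under these assumptions both ZKY-then-JEU and JEU-then-ZKY are consistent with the dependencies, so the dependency graph alone cannot choose between them.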

joshua.landau.ws on January 29, 2022

People rely on having a true history. Pijul’s DAG tracks physical dependencies, but code has lots of other implicit dependencies, both with itself and with the outside world. It needs to be useful to look at a log and say that this ordering represents an actual history of the channel as people used it, and when preserving that order is free, like it is here, there is no reason not to do so.

It is fine for two codebases to have different history orders as long as they both reflect the true local histories of the repository. If your centralized master is cherry-picking people’s patches in a different order than they were created, that’s fine; that order is still a true order that is being tested and built upon by everybody else. If you then bisect that history you are still going to get a state that is verified to work. You can’t do that if commands are randomly shuffling history.

I’ll note the Pijul blog acknowledges the importance of preserving linear history.

Indeed, everybody wants to see the order of operations in a repository, for many reasons. For example:

  • We want to keep a record of the operations performed on our repository.
  • We want to go back in time.

And in fact, Pijul allows you to do exactly that, but in a more rigorous way than Git. Indeed, take the scenario where Alice and Bob work together, Alice makes a change A while Bob makes B. When they put their work together, Alice applies Bob’s change, resulting in the log AB, while Bob applies Alice’s change, resulting in the log BA. In this case, there is no “true” linear history, since they worked on different things, and took different steps at different times. However, both of them want to be able to go back in time, step-by-step, and not just “step-by-step-according-to-Bob’s-order”.

In this case the fix is just to make sure these operations preserve that order as best they can.

Should the output of pijul log be re-ordered to match the other side when you pull the remaining patches? By which criterion should that be decided? The order on the original side was arbitrary to begin with.

It seems fine to default to preserving the local patch order. It does seem like there should be some shorthand method of rebasing your channel’s log on top of another channel’s log, but it’s not something you’d have to do often, and there’s no obvious benefit to doing it eagerly.
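As a rough illustration of what "preserving the local patch order" could mean, here is a minimal Python sketch. It is not how Pijul applies changes; the function name `pull_preserving_order` is hypothetical, logs are modelled as plain lists of abbreviated hashes (oldest first), and it assumes the remote log is itself a valid application order.

```python
def pull_preserving_order(local, remote):
    """Append remote-only changes to the local log, keeping the remote's order.

    Both logs are lists of change hashes, oldest first. Changes already
    present locally keep their local position.
    """
    present = set(local)
    return local + [change for change in remote if change not in present]

# Abbreviated hashes from the logs above, oldest first.
c1 = ["K7I", "R2O", "CKU"]
c2 = ["K7I", "ZKY", "JEU"]

merged = pull_preserving_order(c1, c2)

# pijul log prints the newest change first, so reverse for display.
for change in reversed(merged):
    print(change)
```

With these inputs the displayed order is JEU, ZKY, CKU, R2O, K7I, which matches the log shown after the push earlier in the thread.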

joyously on February 23, 2022

If each change kept its original recording timestamp, those timestamps would give the history of when changes were created. Applying a change to a repo could store a separate timestamp, which would give the history of application. The log could then be sorted by either one, with one of them as the default.
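A minimal Python sketch of that idea, assuming each log entry carries two timestamps. The `LogEntry` type, its field names, the `sort_log` helper, and the dates are all made up for illustration; nothing here reflects Pijul's actual storage format.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class LogEntry:
    change: str            # abbreviated change hash
    recorded_at: datetime  # when the change was first recorded (hypothetical field)
    applied_at: datetime   # when it was applied to this channel (hypothetical field)

def sort_log(entries, key="applied_at"):
    """Return entries newest first, ordered by the chosen timestamp."""
    if key not in ("recorded_at", "applied_at"):
        raise ValueError(f"unknown sort key: {key}")
    return sorted(entries, key=lambda e: getattr(e, key), reverse=True)

# Invented timestamps: JEU was recorded after ZKY but applied before it.
entries = [
    LogEntry("ZKY", datetime(2022, 1, 18, 9, 0), datetime(2022, 1, 20, 12, 1)),
    LogEntry("JEU", datetime(2022, 1, 19, 9, 0), datetime(2022, 1, 20, 12, 0)),
]

by_recorded = [e.change for e in sort_log(entries, "recorded_at")]
by_applied = [e.change for e in sort_log(entries, "applied_at")]
print(by_recorded)
print(by_applied)
```

Sorting by `recorded_at` gives the creation history, while sorting by `applied_at` gives the local history of the channel, so the two views can disagree without either being wrong.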