pijul_org / pijul

#278 new corruption bug

Opened by yory, on May 20, 2018
Open
yory commented on May 20, 2018

Yesterday I was working heavily with pijul webview. I pushed a patch with an accidental newline in the message, so I unrecorded it in my own repo and recorded it again, because I knew there was an unrecord button in the Nest for doing the same with the online copy. But the button fails with a server error, so I couldn't remove it there. Now if you try to clone my repo you get a conflict.

The bigger issue is that something got heavily corrupted in my own local copy too, independently of the issue with the Nest, probably when doing unrecord + record again. When doing a diff, it told me that the file pijul_webview.py had been moved, even though it was still there and unchanged (then, when I moved it into a new directory, pijul diff panicked with this backtrace:

   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::print
             at libstd/sys_common/backtrace.rs:71
             at libstd/sys_common/backtrace.rs:59
   2: std::panicking::default_hook::{{closure}}
             at libstd/panicking.rs:207
   3: std::panicking::default_hook
             at libstd/panicking.rs:223
   4: std::panicking::rust_panic_with_hook
             at libstd/panicking.rs:402
   5: std::panicking::begin_panic_fmt
             at libstd/panicking.rs:349
   6: rust_begin_unwind
             at libstd/panicking.rs:325
   7: core::panicking::panic_fmt
             at libcore/panicking.rs:72
   8: core::panicking::panic
             at libcore/panicking.rs:51
   9: libpijul::record::<impl libpijul::backend::GenericTxn<sanakirja::transaction::MutTxn<'env, ()>, R>>::record_inode
  10: libpijul::record::<impl libpijul::backend::GenericTxn<sanakirja::transaction::MutTxn<'env, ()>, R>>::record_children
  11: libpijul::record::<impl libpijul::backend::GenericTxn<sanakirja::transaction::MutTxn<'env, ()>, T>>::record
  12: pijul::commands::status::run
  13: pijul::main
  14: std::rt::lang_start::{{closure}}
  15: std::panicking::try::do_call
             at libstd/rt.rs:59
             at libstd/panicking.rs:306
  16: __rust_maybe_catch_panic
             at libpanic_unwind/lib.rs:102
  17: std::rt::lang_start_internal
             at libstd/panicking.rs:285
             at libstd/panic.rs:361
             at libstd/rt.rs:58
  18: main
  19: __libc_start_main
  20: _start

).

I suppose that pijul was seeing it as pijul_webview.py.ALPHANUMERIC_MESS. Sadly, I deleted the .pijul folder and re-initialized it on impulse, which I regret, as it might have helped with debugging.

I tried to replicate it in a new repo, but couldn't.

pmeunier commented on May 20, 2018

This sounds to me like you've created a conflict on a file name, i.e. two files with the same name, or one file with two names (ALPHANUMERIC_MESS is the local id of the patch that gave the file this name). I'm not sure what happens when you try to unrecord or rollback a resolution of one of these conflicts.

Thanks for the backtrace, but unfortunately, it isn't super useful as such. For next time, it would be much more useful to do the following:

  • Include the cause of the panic (the line before the start of the backtrace).
  • Use a debug build of Pijul instead of a release build.
  • Run Pijul with environment variable RUST_LOG="pijul=debug,libpijul=debug".
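
A minimal sketch of how the logging step above could be scripted, assuming a debug build is reachable on PATH as `pijul` and that the log and panic message are written to stderr. The helper names `debug_env` and `run_pijul` are hypothetical, and `RUST_BACKTRACE=1` is the standard Rust variable for printing a backtrace on panic rather than something specific to Pijul:

```python
import os
import subprocess


def debug_env(base=None):
    """Return a copy of the environment with Pijul debug logging enabled."""
    env = dict(os.environ if base is None else base)
    # Log filter quoted in the list above.
    env["RUST_LOG"] = "pijul=debug,libpijul=debug"
    # Standard Rust variable: print a backtrace when the program panics.
    env["RUST_BACKTRACE"] = "1"
    return env


def run_pijul(args, binary="pijul"):
    """Run a Pijul command with debug logging; return (exit code, stderr text).

    `binary` is an assumption: point it at your debug build if it is not
    the `pijul` found on PATH.
    """
    proc = subprocess.run(
        [binary, *args],
        env=debug_env(),
        stderr=subprocess.PIPE,
        text=True,
    )
    return proc.returncode, proc.stderr
```

Calling `run_pijul(["status"])` in the affected repository would then return the captured stderr, which should include the line stating the cause of the panic just before the backtrace.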

pmeunier commented on May 21, 2018

By the way, what patch are you trying to unrecord on the Nest? Can you give me its hash?

yory commented on May 21, 2018

The patch I want to unrecord is 7FvZcQ7BK7KXmTaWuX6EUeWMkuBhGPWKzsWUKpsprhDyvYtvBiezCQTowvY5WPwUxtuN4peZCaJC7WinGmmb1pFs

Yes, the backtrace was the only thing I could salvage, as it was still in my clipboard manager's memory. Next time I'll follow your steps. It is probably just as you suppose, as I was doing some heavy reorganizing of the project's structure.

pmeunier commented on May 21, 2018

Ok, the Nest still uses libpijul 0.10. I can unrecord it fine locally, but the Nest seems to fail, probably due to a bug we fixed between 0.10 and 0.11.

lthms commented on May 21, 2018

> I'm not sure what happens when you try to unrecord or rollback a resolution of one of these conflicts.

There is probably something to do here, for instance, writing test cases to “control” the behaviour? (know what it is exactly, avoid changes we don't expect, etc.)

yory commented on May 21, 2018

@lthms definitely. I tried again to create a test case, but I still can't reproduce it.

jcarr commented on June 10, 2018

I ran into this issue on another repository. Pijul is version 0.10.1.

A zipped copy of the repository is here (I can't seem to get pijul push to work at all right now): https://drive.google.com/file/d/1F9NPi3MZqyJFG2TLhBNxNUyZmIVUh0F5/view?usp=sharing

The error occurs on record. There seem to be general bugs with the Artifact lines (such as having two sets of changes for one real change), but the error seems to occur when adding the Job.scala lines.

I did an unrecord before this, which may have corrupted it. I have also renamed one of the files in the patch.

Exact versions of everything are at: https://github.com/P-E-Meunier/nixpkgs/blob/71347239f8c4025fc548e6596e801f03ff9352d8/pkgs/applications/version-management/pijul/default.nix (see https://github.com/P-E-Meunier/nixpkgs/commit/71347239f8c4025fc548e6596e801f03ff9352d8#diff-5d51f803f9f6ea89d4cbbe8f3943e694 for all deps)