Pijul debug output from a plain run:
λ ./pijul clone https://nest.pijul.com/pijul/pijul x
Repository created at /home/levi/experiment/x
Downloading changes [==> ] 59/990
Applying [==> ] 59/990
[2023-12-31T16:13:46Z ERROR pijul] Error: TxnErr(
Sanakirja(
IO(
Os {
code: 0,
kind: Uncategorized,
message: "No error: 0",
},
),
),
)
Error: No error: 0 (os error 0)
Pijul debug output from a gdb session:
~/experiment took 1m18s
λ rust-gdb -q ./pijul
Reading symbols from ./pijul...
(gdb) b main.rs:160
Breakpoint 1 at 0xa3603c: main.rs:160. (2 locations)
(gdb) clone https://nest.pijul.com/pijul/pijul pijul_repo
No symbol 'https' in current context
(gdb) run clone https://nest.pijul.com/pijul/pijul pijul_repo
Starting program: /home/levi/experiment/pijul clone https://nest.pijul.com/pijul/pijul pijul_repo
[New LWP 141540 of process 99972]
[New LWP 141541 of process 99972]
[New LWP 141542 of process 99972]
[New LWP 141543 of process 99972]
[New LWP 141544 of process 99972]
[New LWP 141545 of process 99972]
[New LWP 141546 of process 99972]
[New LWP 141547 of process 99972]
[New LWP 141548 of process 99972]
[New LWP 141549 of process 99972]
[New LWP 141550 of process 99972]
[New LWP 141551 of process 99972]
[New LWP 141552 of process 99972]
Repository created at /home/levi/experiment/pijul_repo
[New LWP 141553 of process 99972]
[New LWP 141554 of process 99972]
Downloading changes [==> ] 59/990
Applying [==> ] 59/990
Thread 1 hit Breakpoint 1.1, pijul::main::{async_block#0} () at src/main.rs:160
Downloading changes [===> ] 60/990
(gdb) n
Thread 1 hit Breakpoint 1.2, pijul::main::{async_block#0} () at src/main.rs:160
160 log::error!("Error: {:#?}", e);
(gdb)
Applying [==> ] 59/990
Downloading changes [===> ] 60/990
Applying [==> ] 59/990
Downloading changes [===> ] 60/990
Applying [==> ] 59/990
[2023-12-31T16:19:27Z ERROR pijul] Error: TxnErr(
Sanakirja(
IO(
Os {
code: 0,
kind: Uncategorized,
message: "No error: 0",
},
),
),
)
Downloading changes [===> ] 60/990
Applying [==> ] 59/990
Downloading changes [===> ] 60/990
164 Err(e) => writeln!(std::io::stderr(), "Error: {}", e).unwrap_or(()),
(gdb)
Applying [==> ] 59/990
Downloading changes [===> ] 60/990
Error: Applying [==> ] 59/990
No error: 0 (os error 0)
165 }
(gdb)
Downloading changes [===> ] 60/990
Applying [==> ] 59/990
[LWP 141553 of process 99972 exited]
[LWP 141544 of process 99972 exited]
[LWP 141548 of process 99972 exited]
[LWP 141551 of process 99972 exited]
[LWP 141546 of process 99972 exited]
[LWP 141541 of process 99972 exited]
[LWP 141552 of process 99972 exited]
[LWP 141549 of process 99972 exited]
[LWP 141540 of process 99972 exited]
[LWP 141550 of process 99972 exited]
[LWP 141543 of process 99972 exited]
[LWP 141554 of process 99972 exited]
[LWP 141542 of process 99972 exited]
[LWP 141547 of process 99972 exited]
[LWP 141545 of process 99972 exited]
[Inferior 1 (process 99972) exited with code 01]
(gdb)
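The session above shows the same error printed twice: once through `log::error!` with `{:#?}` (main.rs:160) and once through `writeln!` with `{}` (main.rs:164). The following is a minimal sketch of why the two outputs look so different. The types `TxnErr`, `AppError` and `StoreError` are hypothetical stand-ins that only mirror the nesting seen in the log, they are not the real pijul or Sanakirja definitions, and `render` is a helper invented for this illustration.

```rust
use std::fmt;
use std::io;

// Hypothetical stand-ins mirroring the nesting in the log output;
// the real pijul / Sanakirja error types are different.
#[derive(Debug)]
enum StoreError {
    IO(io::Error),
}

#[derive(Debug)]
enum AppError {
    Sanakirja(StoreError),
}

#[derive(Debug)]
struct TxnErr(AppError);

impl fmt::Display for TxnErr {
    // Display forwards to the innermost io::Error, so only its message shows.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.0 {
            AppError::Sanakirja(StoreError::IO(e)) => write!(f, "{}", e),
        }
    }
}

/// Return the pretty Debug rendering and the Display rendering of the error.
fn render(err: &TxnErr) -> (String, String) {
    (format!("{:#?}", err), format!("{}", err))
}

fn main() {
    let err = TxnErr(AppError::Sanakirja(StoreError::IO(
        io::Error::from_raw_os_error(0),
    )));
    let (debug, display) = render(&err);
    eprintln!("Error: {}", debug); // nested form, like the log::error! line
    eprintln!("Error: {}", display); // short form, like the writeln! line
}
```

The message text for code 0 comes from the operating system, so it differs between platforms; only the trailing `(os error 0)` is produced by Rust itself.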
Last lines of the session with (gdb) b clone.rs:1:
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
284 }
(gdb)
Downloading changes [==> ] 52/990
Applying [==> ] 52/990
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
284 }
(gdb)
Downloading changes [==> ] 53/990
Applying [==> ] 53/990
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
284 }
(gdb)
Downloading changes [==> ] 54/990
Applying [==> ] 54/990
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
Downloading changes [==> ] 55/990
286 self.park();
(gdb)
Applying [==> ] 55/990
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
284 }
(gdb)
Downloading changes [==> ] 56/990
Applying [==> ] 56/990
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
284 }
(gdb)
Downloading changes [==> ] 57/990
Applying [==> ] 57/990
282 if let Ready(v) = crate::runtime::coop::budget(|| f.as_mut().poll(&mut cx)) {
(gdb)
Downloading changes [==> ] 58/990
Applying [==> ] 58/990
284 }
(gdb)
Downloading changes [==> ] 58/990
Applying [==> ] 58/990
Downloading changes [==> ] 59/990
Applying [==> ] 58/990
[2023-12-31T18:01:46Z ERROR pijul] Error: TxnErr(
Sanakirja(
IO(
Os {
code: 0,
kind: Uncategorized,
message: "No error: 0",
},
),
),
)
This can be closed, since the issue does not arise on a UFS mount point.
This bug could be related to #849. This and all future FreeBSD and ZFS file system I/O problems could share a cause, so the solution would probably be the same. Yesterday @tankf33der discovered that this is neither a FreeBSD-only issue nor a ZFS-only issue. Pijul works fine on ZFS under Linux, and works fine on FreeBSD under the UFS file system, but the combination of FreeBSD and ZFS causes the problem. We have several hypotheses, including a missing or incomplete implementation (see #849) and ZFS kernel knobs set too low.
To rule out at least one of these, we need the opinion of a programmer familiar with tokio::io::bsd.
All these problems seem to share one root cause: handling concurrency properly.
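One way to narrow the hypotheses is to check whether plain file I/O on the affected mount point fails outside of pijul. The sketch below is only a rough probe under stated assumptions: it uses nothing but `std::fs`, it is single-threaded, and it does not reproduce what Sanakirja or Tokio actually do. The function `probe_dir`, the file name `io_probe.tmp`, the block size and the round count are all made up for this illustration.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Create a scratch file in `dir`, grow it block by block, sync after each
/// write, then read everything back. Returns the number of bytes verified.
fn probe_dir(dir: &Path, rounds: u64) -> io::Result<u64> {
    let path = dir.join("io_probe.tmp"); // hypothetical scratch file name
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(&path)?;

    let block = [0xABu8; 4096];
    let block_len = block.len() as u64;
    let mut total = 0u64;

    for i in 0..rounds {
        // Grow the file first, then fill the new region and flush it to disk.
        file.set_len((i + 1) * block_len)?;
        file.seek(SeekFrom::Start(i * block_len))?;
        file.write_all(&block)?;
        file.sync_all()?;
        total += block_len;
    }

    // Read the whole file back and compare the length with what was written.
    file.seek(SeekFrom::Start(0))?;
    let mut buf = Vec::new();
    file.read_to_end(&mut buf)?;
    drop(file);
    fs::remove_file(&path)?;

    if buf.len() as u64 != total {
        return Err(io::Error::new(io::ErrorKind::Other, "short read"));
    }
    Ok(total)
}

fn main() {
    // Pass the directory to test as the first argument, e.g. a ZFS or UFS path.
    let dir = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    match probe_dir(Path::new(&dir), 64) {
        Ok(n) => println!("ok: {} bytes written and read back in {}", n, dir),
        Err(e) => eprintln!("probe failed: {:?}", e),
    }
}
```

If this probe succeeds on the ZFS mount point while `pijul clone` still fails there, that would point away from basic file I/O and toward something more specific, such as the concurrency handling mentioned above.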
Let me remind you that pijul clone fails only on larger repositories (unproven), and pijul record fails… only some of the time.
Just to add, the FreeBSD+ZFS tandem is the de facto standard: it is recognized as the default and used in more than 90% of cases.
Here’s the thing: I have always downloaded my experimental repositories, which were rather small. This time I wanted to download the repository of Pijul itself.
First the test: let’s try it on a rather small repo, say https://nest.pijul.com/finchie/new_manual/changes
Looks good, now let’s try the pijul repo.
Unfortunately, error 0 is marked as “Not used”, so I conclude that the error is unknown. And it’s probably not only on FreeBSD. In any case, this is an error that is not defined in the C standard library, so it must be something more bizarre. I know it’s Rust, but apparently Rust libraries follow the same conventions.
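For context, an OS error code of 0 conventionally means that no error was recorded. One way such a value can surface in Rust is a call to `std::io::Error::last_os_error()` at a point where the failing operation did not set `errno`; whether that is what happens here is an assumption, not something confirmed for Sanakirja or pijul. The sketch below only shows how the standard library represents code 0. `describe_os_error` is a hypothetical helper written for this example.

```rust
use std::io;

/// Build an io::Error from a raw OS code and return the code it reports
/// together with its Display text.
fn describe_os_error(code: i32) -> (Option<i32>, String) {
    let err = io::Error::from_raw_os_error(code);
    (err.raw_os_error(), err.to_string())
}

fn main() {
    // Code 0: the message part is supplied by the platform, so it varies
    // between operating systems; the "(os error 0)" suffix is added by Rust.
    let (raw, text) = describe_os_error(0);
    println!("raw = {:?}, text = {}", raw, text);

    // For comparison, the code reported in #849 (os error 2).
    let (raw, text) = describe_os_error(2);
    println!("raw = {:?}, text = {}", raw, text);
}
```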
It does not even create a directory, and it always stops at 58/990 in the case of the pijul repo. A working hypothesis is that this somehow depends on the size of the repository. It is probably a similar error to the one in another thread, #849 "Error: No such file or directory (os error 2) (FreeBSD)": an OS-dependent error, related to async I/O in the Tokio module?
This time, setting RUST_BACKTRACE=0 for my shell no longer helps. I rather doubt that changing to beta-8 will help this time. Pijul has a very hard time running on the BSDs. I will post the debug session shortly…