This is rather surprising, are you sure you compiled with --release? @tankf33der tests Pijul on the Linux kernel regularly, and it’s much, much faster than that (at least 100-1000 times).
There was one regression lately in the ZStd library, where compressing patches would take thousands of times longer than before, but I doubt that’s enough to explain your results.
Yes, I did these tests before and it was much faster, so something must have broken. That was like 3-4 months ago.
It seems I built release:

cargo install pijul --version "~1.0.0-alpha"
pijul v1.0.0-alpha.52
Finished release [optimized] target(s) in 3m 17s
But, I now tried from the repository:
The latest change:
Change OQWST3TEEDXBU44SFXPV2SBWYLNMQDIXWYAGHUBHNUB6LGYJENNQC
Author: FZQ2g7VfnzLYM4mtTVDk9HAZjA8Jk9ndkwN1icgbtWUr
Date: 2021-07-28 08:20:11.733136979 UTC

Fixing errors in pijul-git
cd pijul
cargo build --release
Compiling pijul v1.0.0-alpha.52 (/home/ganwell/git/pijul/pijul)
Finished release [optimized] target(s) in 3m 07s
Is that correct?
The new results:
$> time /home/ganwell/git/pijul/target/release/pijul record -a -m "init"
Hash: 7MEAMB2ZLLKUO6KJ5F55X6RWWAMYLMMNPML7DCN7FRNT25ETZVEQC
real 42m1.438s
user 41m54.659s
sys 0m2.751s
$> time /home/ganwell/git/pijul/target/release/pijul fork new
real 0m0.026s
user 0m0.004s
sys 0m0.004s
$> time /home/ganwell/git/pijul/target/release/pijul channel switch new
Outputting repository ↑
real 315m35.897s
user 328m8.412s
sys 130m54.262s
This is my performance analysis pijul vs. linux (latest, version 5.14, 1M commits, 72.8k files, 4.8k dirs).
Long story short: pijul is OK in this scenario.
My environment: a cheap, 5-year-old HP laptop with an “Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz” and an SSD.
$ git clone https://github.com/torvalds/linux
$ cd linux
$ rm -rf .git
$ time pijul init
real 0m0,020s
user 0m0,004s
sys 0m0,012s
$ time pijul add -r .
real 0m1,128s
user 0m2,021s
sys 0m1,719s
$ pijul ls | wc -l
77646
$ time pijul record -am"."
Hash: P3XIHBZAULWGSEHAYYQPRJ7WIEXFPGJGT2V24HUFBNS3TZ6PMH7AC
real 1m18,132s
user 1m14,686s
sys 0m3,139s
$
Let’s stop here.
@ganwell, check my timings; your initial record took 42 min.
What is your local zstd library version?
For anybody who cares about performance: darcs cannot do an initial record of the Linux 2.0 sources out of the box.
@tankf33der I use the version from the OS:
$> ldd /home/ganwell/.cargo/bin/pijul
linux-vdso.so.1 (0x00007fffd3b42000)
libzstd.so.1 => /usr/lib/libzstd.so.1 (0x00007f04fad33000)
$> pamac list | grep zstd
lib32-zstd 1.5.0-2 multilib 1.1 MB
zstd 1.5.0-1 core 3.8 MB
What version do I need? Or is it vendored somewhere somehow?
1.5.0 has the performance degradation; downgrade to 1.4.9 and you will get the performance back.
Now pijul is fast again. Thanks a lot!
$> time pijul record -a -m "init"
Hash: OTFSNWOHD4UWAZLG3JNOYEMH4QJNBZHGZELK6SXWCUR5U3ZYW6CAC
real 0m54.084s
user 0m51.786s
sys 0m1.790s
By the way, there are probably things we could do to make it faster. @ganwell, if you’re interested, feel free to add timers to the functions in libpijul::record and see what is fast and what isn’t.
The other part of recording a patch is applying it, but I don’t really think we can gain much in that area, although timing libpijul::apply::apply wouldn’t do any harm.
If it turns out the major part of the time is spent in ZStd, there isn’t much we can do, but I don’t think that’s the case.
I am interested in making Pijul as fast as possible (even though correctness and debugging are my top priority ATM).
The tankf33der/peace-and-war runbook gets much slower after ~350 records, and importing the history from pkgsrc gets slower every ~1000 records.
@pmeunier channel switch is still as slow as before. I started to only use shallow repositories, not importing the history, so no problem for me. Still I tried to profile:
Some questions I would ask myself if I knew the code better:
It seems find_path() is called from Reset::reset; I think this means it goes via a queue.
The project has 1700 files and 1400 changes.
Here is the profile: https://user.fm/files/v2-27e087b672a16aa1844e5340ec54d453/callgrind.out.good.bz2
You can open it with kcachegrind.
EDIT: The project is caluma again. https://github.com/projectcaluma/caluma
If you want to profile you need to lower the min stack size:
RUST_MIN_STACK=0 valgrind --tool=callgrind /home/ganwell/git/pijul/target/release/pijul channel switch new
Otherwise the program will go into a backtrace loop: valgrind only provides 1 MB of stack, and Rust expects more (not knowing that it is running inside the valgrind VM).
The above makes it sound like I profiled the complete channel switch, but I couldn’t: after a while Rust uses too much stack. So question 2 can’t be answered, sorry.
Thanks for that. I don’t know if find_path is called too often. If you don’t use more than one channel, get should return instantly when called from
I believe this discussion is resolved, feel free to reopen if not.
(I’m trying to close as many discussions as possible in order to move to beta.)
I tested with the Linux kernel again and I wanted to share my timing tests.
I gave up waiting on a channel switch; I started it last night and it had finished by morning, so it definitely completes. Switching channels is a slow operation in general, even in my very small projects.