The sound distributed version control system

#585 Fix the hunk_roundtrip test

Closed on February 22, 2023
potocpav on December 9, 2021

I fixed the hunk_roundtrip_test. I had to do some small cleanups in the parser to fix it. I think there were actually some edge-case bugs.

To better parse edge lists, I changed the syntax to include square brackets around them. Therefore, what used to be this:

21. Edit in "libpijul/src/change/parse.rs":358 6.64713 "UTF-8"
B:BD 6.74568 -> 9.293:336/9
-     let has_encoding = encoding.is_some();

would now be this:

21. Edit in "libpijul/src/change/parse.rs":358 6.64713 "UTF-8"
[B:BD 6.74568 -> 9.293:336/9]
-     let has_encoding = encoding.is_some();

This is nice to have when there are empty edge lists. Before, the line would just be missing, but that is a bit harder to parse.

If you don’t like this change, @pemeunier, I will revert that, and adapt the parser to handle the case without the brackets correctly.

pmeunier on December 9, 2021

That’s a pretty big change, and a breaking one. I didn’t see any failure of that test since #HW7DZ2B42HNF35ZIJAVB4RMQS6CTKUWX5DBBBVF6P4ZQKHGUCTCQC, did you see failures?

potocpav added a change on December 9, 2021
O3HBS22OJLTL737HXDAVOFIAXBD536LYUVHELWNMIF6UPKLDWSPQC
potocpav on December 9, 2021

Yeah, I will just revert this part of the change.

pmeunier on December 9, 2021

A related concern: I don’t know much about quickcheck, but I loved your test’s simplicity. Do you think we can make it deterministic?

potocpav on December 9, 2021

What do you mean by that? I think a seed can be supplied to QuickCheck to make it sort-of deterministic (changing only when you change the data representation or generation parameters)

potocpav on December 9, 2021

I was definitely seeing test failures, of multiple kinds. Were you not? I don’t know how that would be possible. I used

cargo test hunk_roundtrip_test

EDIT: Errors were (at least) in the constructors Replacement, FileAddition, and one out-of-bounds array access

pmeunier on December 9, 2021

I wasn’t! The best we can do is to add a random seed, and have panics show the seed, so we still get random samples, but we can also reproduce failures and fix them.

Another comment I just saw about parse_contents, here is something I wrote:

        let not_empty = if backslash.is_some() && vec[vec.len() - 1] == b'\n' {
            vec.pop().is_some()
        } else {
            !vec.is_empty()
        };

I didn’t think much about vec[vec.len()-1], but that can panic if vec.is_empty(). If you want to fix that in your patch’s next iteration, you can!

I won’t touch that file before reading your next version anyway.

potocpav added a change on December 9, 2021
C4TC7UKFYIOOABSXDCS5H4EVOMAIZK5OEQ5IF4O7RQYY72AZJRVQC
potocpav added a change on December 9, 2021
D2YXPJIY5E4TEPUQIAJGTNVE3JJISUVTXYSNSOALUQE546WV44EQC
potocpav on December 9, 2021

The snippet you showed here is probably fine. The problem with out-of-bounds was on this line:

if e[0].flag.deleted {
    "-"
} else {
    "+"
}
potocpav added a change on December 9, 2021
YQMLICLW2FABJBX5AQGPZIHICYLECR53TLPEBGUEH23YNKIMKC7AC
main
potocpav on December 9, 2021

Sorry, I was having trouble cleaning up the patches and pushing them correctly. The last one should be OK. I (of course) removed the breaking change.

Question: Why is the block if has_encoding || not_empty { .. } there?

        let not_empty = if backslash.is_some() && vec[vec.len() - 1] == b'\n' {
            vec.pop().is_some()
        } else {
            !vec.is_empty()
        };
        if has_encoding || not_empty {
            return Ok((i, vec));
        }

I replaced it with simple Ok((i, vec)) in my patch. Is there something I’m missing? The roundtrip was failing on it.

Also, as a side-note, the first if is fully equivalent to:

        let not_empty = !vec.is_empty();
        if backslash.is_some() && vec[vec.len() - 1] == b'\n' {
           vec.pop();
        }
pmeunier on February 22, 2023

Thanks! And sorry it took me so long to get back to this.

I don’t fully remember, but I believe the has_encoding block had to do with patches on binary files, which do not have an encoding defined.

But the previous solution was indeed incorrect, as we shouldn’t be returning errors as soon as there is no encoding.

pmeunier closed this discussion on February 22, 2023