
#505 Structured CLI output

Opened by ammkrn on August 3, 2021
ammkrn on August 3, 2021

I’d like to solicit opinions/concerns about having a flag for structured output from the CLI in the same way cargo has a --message-format flag. The idea is to let users more easily write scripts, programs, and tools by getting pijul’s output in something like JSON or another serde serialization target. There has already been some discussion on Zulip about how to create workflows that accommodate certain styles or preferences which appear very doable as just compositions of existing commands and their output.

At first blush, it seems like users should be able to get all of the information they need by just composing CLI commands and piecing together the JSON output, avoiding something nuts like git's --pretty format strings.

zseri on August 4, 2021

I support this. It should be noted that in https://nest.pijul.com/pijul/pijul:main/SL45MHGVMBZRS.DIAAA I already used the existing format, which is sometimes already easily parsable.

In pijul2svg.sh I used pijul log | grep ^Change which could be optimized to pijul log --hash-only. It would be nice if something similar could be added to output only parts of changes, e.g. pijul change --dependencies-only or pijul change --section Dependencies, which would only print the Dependencies section.
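To illustrate what that grep does on the consumer side today, here is a minimal Rust sketch that pulls hashes out of text shaped like the log output. It assumes only that each change starts with a line of the form `Change <hash>`, as the grep pattern implies; the function name `change_hashes` is made up for this example and is not part of pijul.

```rust
/// Collect the hash from every line that starts with `Change `,
/// mirroring `pijul log | grep ^Change` and then dropping the prefix.
/// Assumption: the hash is the rest of that line.
fn change_hashes(log_output: &str) -> Vec<&str> {
    log_output
        .lines()
        // Only lines beginning with the literal prefix are kept, so an
        // indented commit message that mentions "Change" is ignored.
        .filter_map(|line| line.strip_prefix("Change "))
        .map(str::trim)
        .collect()
}
```

A structured output mode would make this kind of prefix matching unnecessary, since the hash would arrive as a named field.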

For things for which newline-separated listing suffices, I don’t think that needs to change. But I think it should be possible to render the change format into something other than the current weird combination of Markdown, TOML and patches…

spacefrogg on August 6, 2021

I also support this. This should be an extensible output format, meaning the writer should have liberty to add new fields and the reader should be able to skip unknown fields. The specification should also contain presentation of arbitrary binary data as text.
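As a rough sketch of those two requirements (readers skip unknown fields, binary data is presented as text), the following Rust reads a line-oriented `key: value` record. The layout, the field names `hash`, `author` and `signed_by`, and the choice of hex encoding are all assumptions made for illustration, not a proposed specification.

```rust
/// Hypothetical record with the only two fields this reader knows about.
#[derive(Debug, Default, PartialEq)]
struct ChangeRecord {
    hash: Option<String>,
    author: Option<String>,
}

/// Parse `key: value` lines, silently skipping keys this reader does
/// not know, so a newer writer can add fields without breaking it.
fn parse_record(text: &str) -> ChangeRecord {
    let mut record = ChangeRecord::default();
    for line in text.lines() {
        if let Some((key, value)) = line.split_once(':') {
            let value = value.trim().to_string();
            match key.trim() {
                "hash" => record.hash = Some(value),
                "author" => record.author = Some(value),
                _ => {} // unknown field: ignore rather than fail
            }
        }
    }
    record
}

/// One possible text presentation of arbitrary bytes: lowercase hex.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

Here a record containing an extra `signed_by` line parses to the same `ChangeRecord` as one without it, which is the forward-compatibility property being asked for.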

I would throw GNU recutils (the rec format) into the ring here, as it provides explicit format specification and can represent data as text. It is also human-readable and human-editable.

ammkrn on August 7, 2021

Thanks for that script as a data point zseri. I agree that allowing people to get specific subsets of information is a good goal.

To spacefrogg’s point, ideally this would just use serde so everyone can have the format they want (though possibly behind a feature flag). I agree that at least for now it should just be extensible (but documented) rather than something with a strict schema.

ammkrn on August 10, 2021

Now that I’m starting to work on this I think we might have to pick an output format and go with it, at least initially. Serde requires the whole data set to be serialized first in order to format/write the thing as a JSON object, but for very large data sets (like if someone wanted to get the entire change log for a big/old project as JSON) you really want the output to be streaming so it doesn’t take up a huge amount of memory. At first blush this seems like it will require writing a printer that carries around state for keeping track of scopes.
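For reference, a std-only sketch of what such a stateful streaming printer could look like for the simplest case, a flat JSON array of strings. The type `JsonArrayWriter` and the helper `write_json_string` are hypothetical names for this example and do not come from the patch; nested scopes would need a stack of states instead of the single `first` flag.

```rust
use std::io::{self, Write};

/// Streams a JSON array of strings to any writer, one element at a time,
/// so the full data set never has to be held in memory.
struct JsonArrayWriter<W: Write> {
    out: W,
    /// Scope state: true until the first element has been written,
    /// which decides whether a separating comma is needed.
    first: bool,
}

impl<W: Write> JsonArrayWriter<W> {
    /// Open the array scope.
    fn begin(mut out: W) -> io::Result<Self> {
        out.write_all(b"[")?;
        Ok(Self { out, first: true })
    }

    /// Write one element, preceded by a comma if it is not the first.
    fn element(&mut self, value: &str) -> io::Result<()> {
        if !self.first {
            self.out.write_all(b",")?;
        }
        self.first = false;
        write_json_string(&mut self.out, value)
    }

    /// Close the array scope and hand the writer back.
    fn end(mut self) -> io::Result<W> {
        self.out.write_all(b"]")?;
        self.out.flush()?;
        Ok(self.out)
    }
}

/// Write `s` as a JSON string literal with the required escapes.
fn write_json_string<W: Write>(out: &mut W, s: &str) -> io::Result<()> {
    out.write_all(b"\"")?;
    for c in s.chars() {
        match c {
            '"' => out.write_all(b"\\\"")?,
            '\\' => out.write_all(b"\\\\")?,
            '\n' => out.write_all(b"\\n")?,
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => write!(out, "\\u{:04x}", c as u32)?,
            c => write!(out, "{}", c)?,
        }
    }
    out.write_all(b"\"")
}
```

Because the writer is generic over `std::io::Write`, the same code can target a locked stdout handle for the CLI or a `Vec<u8>` in tests.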

zseri on August 11, 2021

Common Serde formats can also serialize directly to a writer (anything implementing std::io::Write), e.g. stdout.

ammkrn on August 12, 2021

Oh, right. Good eye.

ammkrn added a change on August 14, 2021
OU6JOR3CDZTH2H3NTGMV3WDIAWPD3VEJI7JRY3VJ7LPDR3QOA52QC
main
ammkrn on August 14, 2021

@zseri @pmeunier @spacefrogg

This combines the filtering and the serialization, but let me know a) what you think of this way of doing the streaming serialization thing, and b) if you have opinions on what the interface should be for excluding/choosing certain fields. The plan is to do something similar for the other CLI commands as I get time. There are notes in the description field of the patch.

pmeunier on September 9, 2021

Done! The only thing I changed is the double formatting of the error, since serde::ser::Error::custom already takes a Display.

pmeunier closed this discussion on September 9, 2021
pmeunier on September 9, 2021

I guess I’ll leave it open, so that we can do the same for other commands in the future. Thanks for the change!

pmeunier reopened this discussion on September 9, 2021
ammkrn added a change today at 06:41
structured output for pijul change created yesterday at 05:37
HCJ7BOUVOEFS6X7OTHI6JZR2ASFIGEPQCTL5K3JTSBI7Y7M7W3KQC
ammkrn today at 06:42

@pmeunier

I got around to doing the hard one (change); credit and diff should be relatively tame by comparison. There’s almost certainly some stuff you’ll want to change here since you have a better idea of what info is/isn’t going to be important, specifically the rendering of edge_map/new_vertex, and the best names for fields such that they make sense to consumers.

The general approach I took was to wrap stuff in newtypes called Pretty<x> which encompass the relevant state and have their own implementation of Serialize. There was a lot of hand-rolling, but it isn’t yet clear to me whether there’s a better or more mechanical way to do this. I’ve used a derive macro for some similar stuff in the past, but the state has to be much more centralized, and it’s not going to work well when you need specific logic to extract stuff (like slicing to get the change contents).

My assumption is that the format of Hunks isn’t going to undergo huge changes in the near future, so hopefully it won’t be too much of a maintenance burden. If you’re interested, the write_<x> methods can also be moved into Display implementations on the Pretty<x> structures.
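For readers who have not seen the patch, the newtype-plus-state pattern described above might look roughly like the following, shown here with a Display implementation. This is a simplified sketch under assumptions: `DemoHunk` is a hypothetical stand-in that only stores a byte range, not one of libpijul's hunk types, and the `+ ` line prefix is an arbitrary rendering choice.

```rust
use std::fmt;

/// Hypothetical stand-in for a hunk: it does not own its text, only a
/// byte range into the change's shared contents buffer.
struct DemoHunk {
    start: usize,
    end: usize,
}

/// Newtype that pairs a value with the extra state needed to render it.
struct Pretty<'a, T> {
    inner: &'a T,
    contents: &'a [u8],
}

impl<'a> fmt::Display for Pretty<'a, DemoHunk> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Slice the shared buffer to get this hunk's contents; an
        // out-of-range hunk becomes a formatting error, not a panic.
        let bytes = self
            .contents
            .get(self.inner.start..self.inner.end)
            .ok_or(fmt::Error)?;
        // Invalid UTF-8 is replaced rather than rejected in this sketch.
        for line in String::from_utf8_lossy(bytes).lines() {
            writeln!(f, "+ {}", line)?;
        }
        Ok(())
    }
}
```

The same wrapper could carry a Serialize implementation instead; the point is that the slicing logic lives next to the state it needs, which is hard to express with a derive macro alone.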