-
Very well written post, with tons of good information. Thanks! Disclaimer: I'm in favour of using Protobuf, and always rely on protojson during de-/serialization.
Even if the formal definition is done in protobuf, the canonical serialization can still be protojson-generated JSON, so JSON would be what the majority of developers interact with. We should also have all services expose both gRPC and HTTP APIs; e.g., Grafeas does this by using grpc-gateway: https://github.com/grafeas/grafeas/blob/master/go/v1beta1/server/server.go#L75
IMHO we should encourage all libraries to rely on the generated code via protobuf.
IMHO we should discourage libraries from manually parsing JSON and have them always rely on protojson.
I think we should. And with by offering HTTP API with JSON (via the
Initially I'm in favour of the former (decode into the protobuf-generated format). With the future in mind, when all Sigstore services offer APIs defined by protobufs, this will not be an issue, as they would natively be in the expected protojson format (or via gRPC directly).
I'm in favour of this. It simplifies the mental model, as we can add context without long, hard-to-read attribute names.
IMHO yes, see above on using
I can help with this if wanted, as by defining the Sigstore bundle I think some documentation/comments have already been improved.
Having all certificates in the chain, as done by Fulcio, seems easier, but I don't have a strong opinion here. Protobuf has quite widespread language support. The official protobuf compiler supports:
Other implementations exist for:
-
Great writeup - I am in favor of protobuf, and of continuing the pattern started in Fulcio of using grpc-gateway to expose both gRPC and HTTP/JSON interfaces.
-
+1000. Thanks for this thorough and clarifying write-up, @znewman01. In the community meeting I voted for JSON because, from my point of view, the current user-facing API is the bundle, and IMO as long as it's stored as JSON we should optimize that representation to be easy to parse and use... because folks inevitably will treat it as the API even if we'd rather they use it through our tools. I like @kommendorkapten's arguments for using protojson, as it sounds like it might allow us to have our cake and eat it too. Offline we chatted about potentially making the bundle opaque to discourage user slicing and dicing and to minimize footguns. In that model cosign and the SDKs are the only API, at which point I think we could go all-in on proto and considerably simplify client code.
-
@kommendorkapten @bobcallaway anybody want to throw in 2c? You all good with (b)?
-
Sigstore has a few traditional APIs:
It's also got a few non-traditional "APIs," especially data formats:
There's not a lot of consistency across all of these, and there are some awkward bits:
Yes, that's base64'd PEM (which is, yes, already base64'd).
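To make the double encoding concrete, here is a rough sketch of what a consumer has to do with a base64'd PEM value. The `pem_encode`/`pem_decode` helpers and the placeholder bytes are made up for illustration (they are not Sigstore code, and the bytes are not a real certificate); only Python's standard `base64` module is used.

```python
import base64

def pem_encode(der: bytes, label: str = "CERTIFICATE") -> str:
    """PEM = base64 of the bytes, wrapped at 64 chars, plus header/footer lines."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join([f"-----BEGIN {label}-----", *lines, f"-----END {label}-----"]) + "\n"

def pem_decode(pem: str) -> bytes:
    """Drop the header/footer lines and base64-decode what is left."""
    body = [ln for ln in pem.strip().splitlines() if not ln.startswith("-----")]
    return base64.b64decode("".join(body))

# Placeholder bytes standing in for a DER-encoded certificate (not a real one).
der = bytes(range(48))

pem = pem_encode(der)
# "base64'd PEM": a second base64 layer on top of PEM's own base64 body.
double_encoded = base64.b64encode(pem.encode("ascii")).decode("ascii")

# A consumer has to peel both layers to get back to the DER bytes.
recovered = pem_decode(base64.b64decode(double_encoded).decode("ascii"))
assert recovered == der

# Plain base64 of the same bytes is shorter and needs only one decode step.
single_encoded = base64.b64encode(der).decode("ascii")
print(len(single_encoded), len(double_encoded))
```

The second layer adds size and an extra parsing step without carrying any new information, which is the awkwardness being pointed at here.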
While there are relatively few consumers of this API, we have the opportunity to make some improvements. I believe that the new bundle format is looking good (thanks, @kommendorkapten and reviewers) except we're still a little confused about how to encode things.
A key point of disagreement is: do we try to make the proto representation idiomatic, or the JSON representation idiomatic? I posit that we can't have both in the same format, and that trying to have both (following the spirit of this comment) is an underlying cause of a lot of inconsistencies and awkwardness.
Background on encoding
This section is a whirlwind tour of encodings for cryptographic structures, data formats, and APIs. It's not going to be able to do a good job, so click through if you really want to learn more. But hopefully it will help set the stage.
Data definitions all start with an interface definition language (IDL). This is a formal way to write down the format of data. In Sigstore, we care about three:
You can also skip the IDL and just write a server that handles different routes, as CTFE does; I recommend against this, because the IDL is a nice, declarative way to document and define an interface all at once.
At the end of this process, you've described a representation of an abstract thing. For instance, a "birthday" is an abstract concept; an `(int, int)` tuple representing month and day is the representation. Note that at this point we still don't know how to write our abstract concept as bits and bytes! For that, we need an encoding. The relevant ones here are:
`0123456789ABCDEF`.
You may compose these encodings. For instance, you may take a DER-encoded value and PEM-encode it. Some of these encodings, like JSON, don't support binary data, so such data must be encoded in a text-friendly manner.
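As a small sketch of the representation/encoding split and of composing encodings, here is the birthday example in Python. The field names and the two-byte layout are arbitrary choices made for illustration, not anything Sigstore defines.

```python
import base64
import json
import struct

# Representation: an (int, int) tuple of (month, day).
birthday = (7, 4)

# Encoding A: JSON text, which handles small integers natively.
as_json = json.dumps({"month": birthday[0], "day": birthday[1]})

# Encoding B: a compact binary layout, one unsigned byte per field.
as_bytes = struct.pack("BB", *birthday)

# JSON cannot carry raw bytes, so the binary form has to go through a
# text-friendly encoding (hex or base64) before it can sit in a JSON string.
as_hex = as_bytes.hex()
as_b64 = base64.b64encode(as_bytes).decode("ascii")
composed = json.dumps({"birthday": as_b64})

# Decoding peels the layers in reverse order: JSON, then base64, then the byte layout.
decoded = struct.unpack("BB", base64.b64decode(json.loads(composed)["birthday"]))
assert decoded == birthday
```

Every layer added on the way in is a layer each consumer has to know about on the way out, so the order and choice of layers is part of the interface.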
Sigstore is moving towards proto for its interfaces (source: I looked at two data points and drew a line). Protos have their own encoding, but also support serialization to JSON, and this is often how they're used for remote procedure calls (RPCs) over the web.
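To illustrate the two serializations, here is a hand-rolled sketch for a hypothetical message with a single `bytes content = 1;` field. It deliberately does not use the protobuf library; it only mimics what the binary wire format (tag, length, payload) and the proto JSON mapping (bytes become a base64 string) would produce for this one field. The message, field name, and placeholder bytes are assumptions.

```python
import base64
import json

def encode_varint(n: int) -> bytes:
    """Protobuf base-128 varint: 7 bits per byte, high bit set on all but the last."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_bytes_field(field_number: int, value: bytes) -> bytes:
    """Length-delimited field: tag (field number and wire type 2), length, payload."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(value)) + value

# Placeholder payload standing in for signature bytes (hypothetical).
content = b"\x30\x45\x02\x21"

# Binary proto encoding of the hypothetical message: raw bytes, no text layer.
wire = encode_bytes_field(1, content)

# JSON form: same field, but the bytes have to be base64'd to fit in a string.
as_json = json.dumps({"content": base64.b64encode(content).decode("ascii")})

print(wire.hex())
print(as_json)
```

In real code the generated classes and the library's JSON support would do this; the point is only that the same proto definition yields both forms, and that bytes fields pick up a base64 layer on the JSON side.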
Example encodings
Here is a table of possible encodings of most types that appear in either the bundle, Fulcio API, or Rekor API, including representations I was able to find and some reasonable alternatives; it also has my overly-opinionated take on how idiomatic the encodings to proto and JSON are.
(Table not preserved in this copy; surviving cell fragments: `SHA2_256`, `sha256:...`, `SHA256(DER(pubkey))`.)
(a) @haydentherapper likes PEM, but IMO the point of PEM is that it's for when you don't have an easy way to encode bytes or communicate the data format; in the proto setting, we have both. PEM parsing can be the source of bugs, and in some languages, PEM requires an additional dependency over common cryptographic libraries.
(b) PEM is just DER + base64 with line wrapping and a header/footer line, so this is almost PEM.
(c) Algorithm-dependent, opaque binary blob.
(d) You can't really encode DER directly into a string; I haven't figured out how to actually trigger this code path from the API.
(e) Stringly-typed, so you could consider that idiomatic if you like that sort of thing.
Some arguments for each
Arguments for a proto-first worldview:
For a JSON-first worldview:
`jq`.
For a proto-first worldview, but use PEM for certs/keys:
I'm sure I'm missing a lot; that's why this is a discussion! Please weigh in 😄
Open Questions