container: typo and clarity in the spec

Michael Muré
2025-01-08 18:55:08 +01:00
parent 8f3f1c775e
commit c792a4cce5

@@ -25,9 +25,11 @@ The UCAN spec itself is transport agnostic. This specification describes how to
## 2.1 Inner structure ## 2.1 Inner structure
UCAN tokens, regardless of their kind ([Delegation], [Invocation], [Revocation], [Promise]) MUST be first signed and serialized into bytes according to their respective specification. As the token's CID is not part of the serialized container, any CID returned by this operation is to be ignored. UCAN tokens, regardless of their kind ([Delegation], [Invocation], [Revocation], [Promise]) MUST be first signed and serialized into DAG-CBOR bytes according to their respective specification. As the token's CID is not part of the serialized container, any CID returned by this operation is to be ignored.
All the tokens' bytes MUST be assembled in a [CBOR] array, which is then inserted as the value under the `ctn-v1` string key, in a CBOR map. The ordering of tokens in the array MUST NOT matter. For clarity, the CBOR shape is given below: All the tokens' bytes MUST be assembled in a [CBOR] array, which is then inserted as the value under the `ctn-v1` string key, in a CBOR map. The ordering of tokens in the array MUST NOT matter. Also, this array SHOULD NOT have duplicate entries.
For clarity, the CBOR shape is given below:
```json ```json
{ {
@@ -41,12 +43,12 @@ All the tokens' bytes MUST be assembled in a [CBOR] array, which is then inserte
## 2.2 Serialisation ## 2.2 Serialisation
To serialize the container into bytes, the inner CBOR structure MUST be serialized into bytes according to the CBOR specification. The resulting bytes MAY be compressed by a supported algorithm, then MAY be encoded with a supported base encoding. To serialize the container into bytes, the inner CBOR structure MUST then be serialized into bytes according to the CBOR specification. The resulting bytes MAY be compressed by a supported algorithm, then MAY be encoded with a supported base encoding.
The following compression algorithm are REQUIRED to be supported: The following compression algorithms are REQUIRED to be supported:
- [GZIP] - [GZIP]
The following base encoding combination are REQUIRED to be supported: The following base encoding combinations are REQUIRED to be supported:
- base64, standard alphabet, padding - base64, standard alphabet, padding
- base64, URL alphabet, no padding - base64, URL alphabet, no padding
@@ -61,23 +63,25 @@ The CBOR bytes MUST be prepended by a single byte header to indicate the selecti
| 0x4F | O | base64 std padding | gzip | | 0x4F | O | base64 std padding | gzip |
| 0x50 | P | base64 url (no padding) | gzip | | 0x50 | P | base64 url (no padding) | gzip |
For clarity, the resulting serialisation is in the form of `<header byte><cbor bytes, optionally compressed, optionally encoded>`.
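The assembly and serialisation steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes each token is stored as a CBOR byte string, it hand-rolls only the small CBOR subset needed for the `ctn-v1` map, and it covers only the two header rows visible in this hunk (`O` and `P`). The function names `cbor_head`, `encode_inner` and `serialize` are hypothetical.

```python
import base64
import gzip


def cbor_head(major: int, n: int) -> bytes:
    """Encode a CBOR head: a major type (0-7) plus its length argument."""
    if n < 24:
        return bytes([(major << 5) | n])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < (1 << (8 * size)):
            return bytes([(major << 5) | info]) + n.to_bytes(size, "big")
    raise ValueError("length too large for CBOR")


def encode_inner(tokens: list[bytes]) -> bytes:
    """Build the inner structure: a one-entry map {"ctn-v1": [token bytes...]}.

    Assumption: each token is written as a CBOR byte string (major type 2).
    """
    unique = list(dict.fromkeys(tokens))  # drop duplicates, keep first occurrence
    key = "ctn-v1".encode("utf-8")
    out = cbor_head(5, 1)                 # map with one key/value pair
    out += cbor_head(3, len(key)) + key   # text string key
    out += cbor_head(4, len(unique))      # array of tokens
    for token in unique:
        out += cbor_head(2, len(token)) + token
    return out


# Only the two header rows shown in this hunk; both select gzip compression.
BASE_ENCODERS = {
    "O": lambda raw: base64.b64encode(raw),                      # base64 std, padding
    "P": lambda raw: base64.urlsafe_b64encode(raw).rstrip(b"="),  # base64 url, no padding
}


def serialize(tokens: list[bytes], header: str = "O") -> bytes:
    """Return <header byte><gzip-compressed, base-encoded CBOR bytes>."""
    encode = BASE_ENCODERS[header]
    return header.encode("ascii") + encode(gzip.compress(encode_inner(tokens)))
```

A reader would reverse the steps: inspect the first byte, undo the base encoding and compression it announces, then decode the CBOR map.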
# 3 FAQ # 3 FAQ
## 3.1 Why not include the UCAN CIDs? ## 3.1 Why not include the UCAN CIDs?
Several attacks are possible if UCAN tokens aren't validated. If CIDs aren't validated, at least two attacks are possible: [privilege escalation] and [cache poisoning], as UCAN delegation proofs depends on a correct hash-linked structure. Several attacks are possible if UCAN tokens aren't validated. If CIDs aren't validated, at least two attacks are possible: [privilege escalation] and [cache poisoning], as UCAN delegation proofs depends on a correct hash-linked structure.
By not including the CID in the container, the recipient is forced to hash (and thus validate) the CIDs for each token. If presented with a claimed CID paired with the token bytes, implementers could ignore CID validation, breaking a core part of the proof chain security model. Hash functions are very fast on a couple kilobytes of data so the overhead is still very low. It also reduces significantly the size of the container. By not including the CID in the container, the recipient is forced to hash (and thus validate) the CIDs for each token. If presented with a claimed CID paired with the token bytes, implementers could ignore CID validation, breaking a core part of the proof chain security model. Hash functions are very fast on a couple kilobytes of data so the overhead is still very low. It also significantly reduces the size of the container.
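As a rough illustration of the recipient-side hashing described above, the sketch below derives a CID from the token bytes alone. The CID parameters (CIDv1, `dag-cbor` codec, SHA2-256) are an assumption, since this section does not state them; the function names are hypothetical and the result is the binary CID, with no multibase string encoding applied.

```python
import hashlib

# Assumed CID parameters (not stated in this section):
CID_VERSION = 0x01       # CIDv1
CODEC_DAG_CBOR = 0x71    # multicodec code for dag-cbor
MH_SHA2_256 = 0x12       # multihash code for sha2-256
MH_DIGEST_LEN = 0x20     # 32-byte digest


def compute_cid(token_bytes: bytes) -> bytes:
    """Recompute the binary CID from the token bytes found in the container."""
    digest = hashlib.sha256(token_bytes).digest()
    return bytes([CID_VERSION, CODEC_DAG_CBOR, MH_SHA2_256, MH_DIGEST_LEN]) + digest


def matches_cid(token_bytes: bytes, expected_cid: bytes) -> bool:
    """Check a CID referenced elsewhere (e.g. in a proof chain) against the token."""
    return compute_cid(token_bytes) == expected_cid
```

Because the CID is always recomputed locally, a sender has no opportunity to pair token bytes with a CID that does not belong to them.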
## 3.2 Why compress? Why not always compress? ## 3.2 Why compress? Why not always compress?
Compression is a relatively demanding operation. As such, using it is a tradeoff between size on the wire and CPU/memory usage, both when writing and reading a container. The transport itself can make compression worthwhile or note: for example, HTTP/2 and HTTP/3 headers are already compressed, but HTTP/1 headers are not. This being highly contextual, the choice is left to the final implementer. Compression is a relatively demanding operation. As such, using it is a tradeoff between size on the wire and CPU/memory usage, both when writing and reading a container. The transport itself can make compression worthwhile or not: for example, HTTP/2 and HTTP/3 headers are already compressed, but HTTP/1 headers are not. This being highly contextual, the choice is left to the final implementer.
# 4 Implementation recommendations # 4 Implementation recommendations
## 4.1 Dissociate reader and writer ## 4.1 Dissociate reader and writer
While it tempting to write a single implementation to read and write a container, it is RECOMMENDED to separate the implementation into a reader and a writer. The writer can simply accept arbitrary tokens as bytes, while the reader provide a read-only view with convenient access functions. While it is tempting to write a single implementation to read and write a container, it is RECOMMENDED to separate the implementation into a reader and a writer. The writer can simply accept arbitrary tokens as bytes, while the reader provides a read-only view with convenient access functions.
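One possible shape for this separation is sketched below. The class and method names are hypothetical, and the sketch only models the in-memory split between an append-only writer and a read-only view; encoding and decoding of the container bytes are left out.

```python
from typing import Iterable, Iterator


class ContainerWriter:
    """Accepts arbitrary tokens as opaque bytes; performs no parsing."""

    def __init__(self) -> None:
        self._tokens: list[bytes] = []

    def add(self, token: bytes) -> None:
        # Skip duplicates, since the token array should not contain any.
        if token not in self._tokens:
            self._tokens.append(bytes(token))

    def tokens(self) -> tuple[bytes, ...]:
        return tuple(self._tokens)


class ContainerReader:
    """Read-only view over the tokens of a decoded container."""

    def __init__(self, tokens: Iterable[bytes]) -> None:
        self._tokens: tuple[bytes, ...] = tuple(tokens)  # immutable snapshot

    def __len__(self) -> int:
        return len(self._tokens)

    def __iter__(self) -> Iterator[bytes]:
        return iter(self._tokens)

    def __contains__(self, token: object) -> bool:
        return token in self._tokens
```

Keeping the reader immutable means that code validating a proof chain cannot accidentally alter the set of tokens it was handed.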
# 5 Acknowledgments # 5 Acknowledgments