Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures
draft-ietf-cbor-cddl-08
Yes
(Alexey Melnikov)
No Objection
(Alvaro Retana)
(Deborah Brungard)
(Martin Vigoureux)
(Spencer Dawkins)
(Terry Manderson)
Note: This ballot was opened for revision 05 and is now closed.
Alexey Melnikov Former IESG member
Yes
(2018-11-19 for -06)
Not sent
Adam Roach Former IESG member
(was Discuss)
No Objection
(2018-11-19 for -06)
Sent for earlier
I also have a handful of non-critical comments of varying importance.

Please expand "CBOR":
(1) In the title
(2) Upon first use in the document body

See https://www.rfc-editor.org/materials/abbrev.expansion.txt for details.

---------------------------------------------------------------------------

§1.2:

> New terms are introduced in _cursive_. CDDL text in the running text
> is in "typewriter".

This is perplexing, as I know of no tool that will render the canonical form of current RFCs in the way being described. Is the intention to hold this document until the new RFC format is available?

---------------------------------------------------------------------------

§2:

> The rest of this section introduces a number of basic concepts of
> CDDL, and section Section 3 defines additional syntax. Appendix C

Nit: "...and Section 3..."

---------------------------------------------------------------------------

§2.2.2:

> delimited by a "//" (double slash). Note that the "//" operators
> binds much more weakly than the other CDDL operators, so each line

Nit: "...operator binds..." or "...operators bind..."

---------------------------------------------------------------------------

§3.1:

> o A name can consist of any of the characters from the set {'A',
> ..., 'Z', 'a', ..., 'z', '0', ..., '9', '_', '-', '@', '.', '$'},

This looks like a formal syntax of some kind, but I don't know where it's defined. Notably, since this document has just defined ".." to be an inclusive range operator and "..." to be an exclusive range operator, defining the set of allowed characters in this way seems to run the risk of interpreting, e.g., "Z" to be disallowed.

I suggest either defining the set of allowed characters using a formally defined and cited grammar (e.g., ABNF), or using prose.

---------------------------------------------------------------------------

§3.1:

> o outside strings, whitespace (spaces, newlines, and comments) is
> used to separate syntactic elements for readability (and to
> separate identifiers or numbers that follow each other); it is
> otherwise completely optional.

This seems nominally at odds with the following text in §2.2.2.1, which points to at least one other case where whitespace is mandatory:

> When using a name as
> the left hand side of a range operator, use spacing as in
>
> min .. max
>
> to separate off the range operator.

---------------------------------------------------------------------------

§3.1:

> If prefixed as "h" or "b64", the string is
> interpreted as a sequence of pairs of hex digits (base16) or a
> base64(url) string, respectively

Please normatively cite RFC 4648, sections 8 and 5 respectively.

---------------------------------------------------------------------------

§3.8.1:

> When applied to an unsigned integer, the ".size" control restricts
> the range of that integer by giving a maximum number of bytes that
> should be needed in a computer representation of that unsigned
> integer. In other words, "uint .size N" is equivalent to
> "0...BYTES_N", where BYTES_N == 256**N.
>
> audio_sample = uint .size 3 ; 24-bit, equivalent to 0..16777215
>
> Figure 9: Control for integer size in bytes

While they're semantically the same, the example is oddly mismatched with the preceding text. Consider instead:

audio_sample = uint .size 3 ; 24-bit, equivalent to 0...16777216

---------------------------------------------------------------------------

Appendix B:

> / "#" "6" ["." uint] "(" S type S ")" ; note no space!

No space where? I see two space productions in that rule (so it clearly applies to some specific location), and there are several places where spaces cannot appear.

> type1 = type2 [S (rangeop / ctlop) S type2]

This rule doesn't seem to properly capture the ambiguity of "a...b". There is a terribly complex way to address this by defining parallel "type2" and "type3" rules that differ only in whether a dot is allowed to appear in their value, and defining type1 as requiring a space after the type that can contain dots -- but that is probably overkill. It's probably sufficient to reiterate the warning about requiring a space under such circumstances as a comment on this rule.

> HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

It is a common implementor mistake to forget that ABNF is, by default, case-insensitive. It is probably worth adding a comment here as a reminder. (The same applies to "0x", "0b", "e", and "p" above, but these seem less likely to appear in arbitrary case.)

---------------------------------------------------------------------------

Appendix B:

> SCHAR = %x20-21 / %x23-5B / %x5D-10FFFD / SESC
> SESC = "\" %x20-10FFFD
...
> PCHAR = %x20-10FFFD

These almost certainly should be:

SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
SESC = "\" %x20-7E / %x80-10FFFD
...
PCHAR = %x20-7E / %x80-10FFFD

(i.e., exclude the control character %x7F)

---------------------------------------------------------------------------

Appendix C:

> (It is not an error to extend a rule name
> that has not yet been defined; this makes the right hand side the
> first entry in the choice being created.)

Is it an error to redefine a rule name that has already been defined?
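The effect of the proposed character-range change in Appendix B can be checked mechanically. The following Python sketch compares the original PCHAR range with the proposed one over all Unicode code points. The function names are hypothetical and are not taken from the draft; the ranges are copied from the two PCHAR rules quoted above.

```python
def is_pchar_original(cp):
    """PCHAR as quoted from the draft: %x20-10FFFD."""
    return 0x20 <= cp <= 0x10FFFD


def is_pchar_proposed(cp):
    """PCHAR as proposed in the comment: %x20-7E / %x80-10FFFD."""
    return 0x20 <= cp <= 0x7E or 0x80 <= cp <= 0x10FFFD


def differing_code_points():
    """List every code point on which the two rules disagree."""
    return [
        cp
        for cp in range(0x110000)
        if is_pchar_original(cp) != is_pchar_proposed(cp)
    ]


if __name__ == "__main__":
    # Show the disagreeing code points in ABNF-style hex notation.
    print(["%x{:02X}".format(cp) for cp in differing_code_points()])
```

Under these assumptions the two rules disagree only at %x7F (DEL), which matches the stated intent of the proposal.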
Alissa Cooper Former IESG member
No Objection
(2018-11-19 for -06)
Sent
Thank you for a clear document and for addressing the Gen-ART review comments.
Alvaro Retana Former IESG member
No Objection
(for -06)
Not sent
Ben Campbell Former IESG member
No Objection
(2018-11-20 for -06)
Sent
Most of my comments have already been captured by others, save one: Is there a specific reason the normative appendices are not part of the main body? I think a lot of RFC readers assume that appendices are optional to read. We should not surprise them without reason.
Benjamin Kaduk Former IESG member
No Objection
(2018-11-18 for -05)
Sent
Thanks for updating the editor's copy pursuant to the secdir review!

As I was reading, I wondered about potential confusion between a numerical value and the corresponding text string when used as a keytype, especially for barewords. The bareword ABNF requires a leading EALPHA, which should force the right parsing, while the memberkey ABNF still allows literal values to be used as keys. I do wonder, though, if the 'id' ABNF's limitations on textual names (i.e., strings that could be interpreted as numbers are disallowed) should be mentioned in the main text as how disambiguation is enforced in general.

It's a little weird to use PersonalData as an example, given the privacy considerations inherent in storing personal data, but I guess this is not really a flaw in the spec.

Section 1

Nit: bullet (G3) lacks grammatical parallelism with its sibling bullets; something like "Be able to" would restore parity.

Section 2

   1. Instead of defining all four types of composition in CDDL
      separately, or even defining one kind for arrays (vectors and
      records) and one kind for maps (tables and structs), there is
      only one kind of composition in CDDL: the _group_ (Section 2.1).

This perhaps reads a bit strongly, as we do go on to define syntactic sugar for arrays and maps, even though they build on the shared group abstraction.

Section 2.1

   Note that the (curly) braces signify the creation of a map; the
   groups themselves are neutral as to whether they will be used in a
   map or an array.
   [...]
   Note that the lists inside the braces in the above definitions
   constitute (anonymous) groups, while "identity" is a named group.

I might add another sentence in one of these places foreshadowing the behavior that groups are "macro-like" in the sense that when used in the description of another group, their contents are siblings of the elements that are new in the other group, as opposed to being part of a nested structure.

Section 3.1

   o CDDL uses UTF-8 [RFC3629] for its encoding.

It's pretty rare for it to be sufficient to just say "UTF-8" in a technical spec; what kind of internationalization review has been done? Do we need to specify anything about normalization or canonicalization?

Section 3.5.1

   The "struct" usage of maps is similar to the way JSON objects are
   used in many JSON applications.

   A map is defined in the same way as defining an array (see
   Section 3.4), except for using curly braces "{}" instead of square
   brackets "[]".

Taken together, these paragraphs read as if (1) a struct is a type of map, and (2) a map uses curly brackets. But the following example shows a struct as enclosed within square brackets. Where am I going wrong?

   GpsCoordinates = {
     longitude : uint, ; multiplied by 10^7
     latitude : uint, ; multiplied by 10^7
   }

It is perhaps irresponsible to include an example that does not specify the units of the measurement (e.g., degrees or radians).

Section 3.8.6

   value from being sent over the wire. This control is only meaningful
   when the control type is used in an optional context; otherwise there
   would be no way to express the default value.

Maybe s/express/utilize/? That is, the ".default" control still expresses what the default value would be, but that information would never be used.

Section 5

   o Where the CDDL includes extension points, the impact of extensions
     on the security of the system needs to be carefully considered.

Would it make sense to also add guidance for judicious use of .within to constrain extension points?

   Writers of CDDL specifications are strongly encouraged to value
   simplicity and transparency of the specification over its elegance.
   Keep it as simple as possible while still expressing the needed data
   model.

Perhaps "simplicity of [type] constructions", since some readers may equate simplicity [of design] and elegance.

Section 6.1

I don't really understand why there's a need for distinctions based on the presence of an internal dot, especially given that this document does not define any such operators. What would such a control operator look like?

Section 7.2

It seems that RFC 4648 might need to be a normative reference given that it specifies how some byte string literals are interpreted in EDN.

Appendix B

On first glance I wonder if some of the S should be 1*WS to avoid parsing ambiguities, but I did not think about it very hard.

   Note that this ABNF does not attempt to reflect the detailed rules of
   what can be in a prefixed byte string.

Before I made it this far, I was going to note that the "bytes" definition seems to allow me to use an "h" or "b64" prefix with "arbitrary" contents; it seems that an alternate construction could embody the semantic restrictions for such strings into the ABNF. How bad would it be if a future update to this document attempted to actually reflect the "detailed rules of what can be in a prefixed byte string"?

Appendix D

I can't decide if most of the "#" entries need double-quotes around them to parse properly as ABNF. Is it best to think about this CBOR major/minor notation as an extension to standard ABNF?
Deborah Brungard Former IESG member
No Objection
(for -06)
Not sent
Eric Rescorla Former IESG member
(was Discuss)
No Objection
(2019-03-23 for -07)
Sent
Thank you for addressing my DISCUSS.
Martin Vigoureux Former IESG member
No Objection
(for -06)
Not sent
Spencer Dawkins Former IESG member
No Objection
(for -06)
Not sent
Suresh Krishnan Former IESG member
No Objection
(2018-11-21 for -06)
Sent
* Section 3.8.1 Looks like there is an off-by-one error here. Shouldn't BYTES_N == 256**N be BYTES_N == 256**N-1 instead?
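For reference, the arithmetic behind this question can be worked through in a short Python sketch. It assumes the range semantics described in Adam Roach's comment on this ballot, namely that ".." includes its upper bound and "..." excludes it. The helper names are hypothetical and are not part of CDDL.

```python
def in_inclusive_range(value, low, high):
    """Membership test for a CDDL-style 'low..high' range (upper bound included)."""
    return low <= value <= high


def in_exclusive_range(value, low, high):
    """Membership test for a CDDL-style 'low...high' range (upper bound excluded)."""
    return low <= value < high


def fits_in_bytes(value, n):
    """True if a non-negative integer needs at most n bytes as an unsigned value."""
    return value >= 0 and (value.bit_length() + 7) // 8 <= n


def size_limit(n):
    """BYTES_N as defined in the quoted draft text: 256**N."""
    return 256 ** n


if __name__ == "__main__":
    n = 3
    limit = size_limit(n)
    for value in (0, limit - 1, limit):
        print(
            value,
            in_exclusive_range(value, 0, limit),      # 0...256**N
            in_inclusive_range(value, 0, limit - 1),  # 0..256**N-1
            fits_in_bytes(value, n),
        )
```

Under that assumption, "0...256**N" and "0..256**N-1" accept the same integers, so whether the draft text is off by one depends on which of the two range operators is used with BYTES_N.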
Terry Manderson Former IESG member
No Objection
(for -06)
Not sent