Compacted-DNS (C-DNS): A Format for DNS Packet Capture
draft-ietf-dnsop-dns-capture-format-10
Yes
Warren Kumari
No Objection
(Deborah Brungard)
(Ignas Bagdonas)
(Spencer Dawkins)
Recuse
Note: This ballot was opened for revision 08 and is now closed.
Warren Kumari
Yes
Adam Roach Former IESG member
No Objection
No Objection
(2018-11-20 for -08)
Sent
I support Benjamin's DISCUSS regarding a treatment of the privacy issues related to this capture format. --------------------------------------------------------------------------- id-nits reports: ** There are 11 instances of too long lines in the document, the longest one being 9 characters in excess of 72. -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? (See https://www.iab.org/2016/11/07/iab-statement-on-ipv6/ for more information about the second issue) --------------------------------------------------------------------------- §5: > o CBOR is an IETF standard and familiar to IETF participants. It is While CBOR is standards-track, it's nowhere near standard yet. Suggest: "...is an IETF specification..." (See BCP 9) --------------------------------------------------------------------------- §9.1: > DNS style name compression is used on the individual names within the Nit: "DNS-style" --------------------------------------------------------------------------- Appendix A: > file-type-id : tstr .regexp "C-DNS", I'm far from a CDDL expert, but I just read through that specification, and it seems to me that this is a bit overwrought. I think you can accomplish the same with the much simpler production: file-type-id : "C-DNS", Similarly: > major-format-version => uint .eq 1, > minor-format-version => uint .eq 0, would seem to mean the same as the simpler: > major-format-version => 1, > minor-format-version => 0, --------------------------------------------------------------------------- Appendix B: > The next name added is bar.com. This is matched against example.com. bar.com is allocated to a private individual who has already had to contend with a lot of unwanted traffic (see https://www.bar.com/ for details). We should consider not making things worse for them. Please use an RFC 2606 address instead.
Alexey Melnikov Former IESG member
(was Discuss)
No Objection
No Objection
(2018-11-30 for -09)
Sent
Thank you for addressing my DISCUSS and comments.
Alissa Cooper Former IESG member
No Objection
No Objection
(2018-11-20 for -08)
Sent
I support Benjamin's first DISCUSS point. In addition to documenting the privacy considerations, I think it's important for this document to be crystal clear about who is meant to be doing the data collection -- namely, the server operator. There are some statements in the document that otherwise could be construed to be encouraging third-party passive monitoring of DNS traffic without explaining why, which seems like a problem: Section 1: "There has long been a need to collect DNS queries and responses on authoritative and recursive name servers for monitoring and analysis." Section 3: "In an ideal world, it would be optimal to collect full packet captures of all packets going in or out of a name server."
Alvaro Retana Former IESG member
No Objection
No Objection
(2018-11-19 for -08)
Not sent
I support Benjamin's DISCUSS.
Ben Campbell Former IESG member
No Objection
No Objection
(2018-11-20 for -08)
Sent
Benjamin Kaduk Former IESG member
(was Discuss)
No Objection
No Objection
(2018-12-02 for -09)
Sent
Thank you for addressing (or being in the process of addressing) my Discuss points! Adding IANA registries for the bitmaps will address that concern, so I am proactively clearing my position on the basis that those registry creations are in the works. [Original COMMENT section preserved below] Section 2 Please consider using the RFC 8174 version of the BCP 14 boilerplate. Section 3 Because of these considerations, a major factor in the design of the format is minimal storage size of the capture files. maybe "storage and transmission"? Section 6 In Figure 2, the Query name is marked as "(q)" (only present if there is a query), but the running text in Section 4 (bullet 1) says that the Question section from the response can be used as an identifying QNAME if there is a response with no corresponding query. Am I misexpanding QNAME here, or is there a disagreement between these two parts of the text? In particular, I do not see a part of Figure 2 that would correspond to a Question section in the response, given the various "(q)"/"(r)" markings. Section 6.2.2 Messages with OPCODES known to the recording application but not listed in the Storage Parameters are discarded (regardless of whether they are malformed or not). (Do we need to say anything that the "discarded" is only w.r.t. the capture process, and not meant to imply that DNS queries would not get a normal response?) Section 6.2.4 Please consider using IPv6 examples, per https://www.iab.org/2016/11/07/iab-statement-on-ipv6/ . Section 7.2 o The column T gives the CBOR data type of the item. * U - Unsigned integer * I - Signed integer This is venturing a bit far from my normal area of expertise, but my understanding is that CBOR native major types are only provided for unsigned integer and negative integer, with "signed integer" being an abstraction at a slightly higher layer that needs to be managed in the application. Do we need to add any clarifying text here or will the meaning be clear to the reader? Section 7.4 Should probably forward-reference section 8 for the format version numbers' semantics. Section 7.4.1.1 We should we reference the IANA registries by name for any of these fields (e.g., opcodes, rr-types, etc.). (Also in Section 7.5.3.1, etc.) Are the storage flags going to be allocated in sequence by updating standards-track documents, or some other mechanism? (Is a registry necessary?) For the various address prefix fields, do we need to specify that the full addresses are stored when the corresponding prefix field is absent? Section 7.4.1.1.1 Am I parsing the "query-response-hints" text correctly to say that a bit is set in the bitmap if the corresponding field is recorded (if present) by the collecting implementation? The causality of "if the field is omitted the bit is unset" goes in a direction that is not what I expected. (Similarly for the other fields in this table.) Section 7.4.2 Do we need a reference for "promiscuous mode"? Just to check: in "server-addresses", I just infer the IP version from the length of the byte string? Do we need to say more about where the vlan-ids identifiers are taken from? Is the "generator-id" string intended to only be human readable? Only within a specific (administrative) context? Section 7.5.1 Does "earliest-time" include leap seconds? Section 7.5.3 The "ip-address" description seems to imply that very short ipv6 prefix lengths could cause confusion as to the address type being indicated (e.g., setting to 32 when no ipv4 prefix length is set, or setting to the same value as the ipv4 prefix length). Do we need to restrict the ipv6 prefix lengths to being 33 or larger? Are the "name-rdata" contents in wire format or presentation format? Section 7.5.3.2 What's the allocation policy/procedure for the remaining qr-transport-flags transport values? For additional bits in any/all of the flags fields listed here? Something of a side note, what's the mnemonic for the "sig" in "qr-sig-flags"? That is, what is it a signature of or over (it doesn't seem like it's a cryptographic signature, which may be what is confusing me)? For "query-rcode"/"response-rcode", should there be a reference for "OPT", and/or for any of the EDNS stuff in here? (The Terminology section only mentions using the naming from RFC 1035, that I can see.) The "mm-transport-flags" here bear a striking resemblance to the "qr-transport-flags" from Section 7.5.3.2; should there be a shared registry for their contents? (I guess the TransportFlags CDDL to some extent serves this function.) Section 7.7 How is the value of the "ae-code" determined? Appendix A We could perhaps apply some constraints on (e.g.) the address-prefex length fields to be .le the relevant lengths. Appendix C.6 Using a strong compression, block sizes over 10,000 query/response pairs would seem to offer limited improvements. nit: Using a strong compression scheme
Deborah Brungard Former IESG member
No Objection
No Objection
(for -08)
Not sent
Eric Rescorla Former IESG member
No Objection
No Objection
(2018-11-20 for -08)
Sent
Rich version of this review at: https://mozphab-ietf.devsvcdev.mozaws.net/D4670 I support Ben's DISCUSS. Am I understanding the design correctly in that you can't have literals in the individual per-query values, but instead references to the block tables, so you can't stream inside a block? IMPORTANT S 7.5.1. > | earliest-time | O | A | A timestamp (2 unsigned integers, | > | | | | "Timestamp") for the earliest record | > | | | | in the "Block" item. The first integer | > | | | | is the number of seconds since the | > | | | | Posix epoch ("time_t"). The second | > | | | | integer is the number of ticks since | I don't know what a "tick" is, so you need some definitionf or this. COMMENTS S 7.2. > > o The column O marks whether items in a map are optional. > > * O - Optional. The item may be omitted. > > * M - Mandatory. The item must be present. FWIW, I think you might be happier with a mandatory "X" and optional nothing, but that's up to you. S 7.4.1.1.1. > > +------------------+---+---+----------------------------------------+ > | Field | O | T | Description | > +------------------+---+---+----------------------------------------+ > | query-response | M | U | Hints indicating which "QueryResponse" | > | -hints | | | fields are omitted, see section | Nit: generally, you would say "indicating which fields are included" if 1 means included. S 7.5.3. > +---------------------+---+---+-------------------------------------+ > > 7.5.3. "BlockTables" > > Arrays containing data referenced by individual "QueryResponse" or > "MalformedMessage" items in this "Block". Each element is an array This is not a sentence. S 7.5.3. > | qrr | O | A | Array of type "Question". Each entry | > | | | | is the contents of a single question, | > | | | | where a question is the second or | > | | | | subsequent question in a query. See | > | | | | Section 7.5.3.3. | > | | | | | So if I understand this correctly: QRR is a map of integers to question QLIST is a map of integers to question lists, with each question list consisting of a set of indexes int o QRR? S 7.5.3.2. > > +--------------------+---+---+--------------------------------------+ > | Field | O | T | Description | > +--------------------+---+---+--------------------------------------+ > | server-address | O | U | The index in the item in the "ip- | > | -index | | | address" array of the server IP | So am I understanding correctly that you can't put the server address literally in here. It has to be in the block tables? S 7.5.3.2. > +--------------------+---+---+--------------------------------------+ > | server-address | O | U | The index in the item in the "ip- | > | -index | | | address" array of the server IP | > | | | | address. See Section 7.5.3. | > | | | | | > | server-port | O | U | The server port. | Isn't the server port generally constant? It seems like you might save some bits by having this attached to the server and then indixed abvoe. S 7.5.3.2. > | | | | used to service the query. | > | | | | Bit 0. IP version. 0 if IPv4, 1 if | > | | | | IPv6 | > | | | | Bit 1-4. Transport. 4 bit unsigned | > | | | | value where 0 = UDP, 1 = TCP, 2 = | > | | | | TLS, 3 = DTLS. Values 4-15 are | You might want to specify DoH S 7.5.3.5. > | | | | Bit 0. IP version. 0 if IPv4, 1 if | > | | | | IPv6 | > | | | | Bit 1-4. Transport. 4 bit unsigned | > | | | | value where 0 = UDP, 1 = TCP, 2 = | > | | | | TLS, 3 = DTLS. Values 4-15 are | > | | | | reserved for future use. | Again, probably want to specify DoH. S 17.3. > [18] https://github.com/dns-stats/draft-dns-capture- > format/blob/master/file-size-versus-block-size.svg > > Appendix A. CDDL > > This appendix gives a CDDL [I-D.ietf-cbor-cddl] specification for Is this a normative appendix?
Ignas Bagdonas Former IESG member
No Objection
No Objection
(for -08)
Not sent
Martin Vigoureux Former IESG member
No Objection
No Objection
(2018-11-21 for -08)
Not sent
few nits: s/types are are omitted/types are omitted/ s/type is are omitted/type is omitted/ s/common collection and and storage parameters/common collection and storage parameters/
Spencer Dawkins Former IESG member
No Objection
No Objection
(for -08)
Not sent
Suresh Krishnan Former IESG member
No Objection
No Objection
(2018-11-21 for -08)
Sent
* Section 7.4.1.1. Looks like you can limit the {client,server}-address-prefix-{ipv4,ipv6} fields to one byte to restrict the range. e.g. client-address-prefix-ipv6 => uint .size 1 Similar restrictions can be used for port (2) and TTL/hop limit (1) fields.
Terry Manderson Former IESG member
Recuse
Recuse
(2018-11-20 for -08)
Sent
I'm a co-author and financial sponsor of this work - recusing.