Negotiating Human Language in Real-Time Communications
draft-ietf-slim-negotiating-human-language-24
Yes
(Alexey Melnikov)
No Objection
Warren Kumari
(Alia Atlas)
(Deborah Brungard)
(Kathleen Moriarty)
(Spencer Dawkins)
(Suresh Krishnan)
Note: This ballot was opened for revision 19 and is now closed.
Warren Kumari
No Objection
Adam Roach Former IESG member
Yes
(2018-01-09 for -22)
I'm glad to see this document being published; thanks to everyone who worked on it.

One tiny nit; section 5.1 contains the following text:

> This document defines two media-level attributes starting with
> 'hlang' (short for "human interactive language")...

I think this is a hold-over from when the string was "humintlang" rather than "hlang" -- it probably makes more sense to say:

> This document defines two media-level attributes starting with
> 'hlang' (short for "human language")...
Alexey Melnikov Former IESG member
Yes
(for -19)
Ben Campbell Former IESG member
Yes
(2018-01-09 for -22)
I'm balloting "yes" because I think this is important work, but I have some comments:

Substantive Comments:

- General: It seems to me that this is as much about human behavior as it is about capabilities negotiation. Example case: I make a video call and express that I would like to receive Klingon. (Is there a tag for that? :-) The callee can speak Klingon and Esperanto, so we agree on Klingon. What keeps the callee from speaking Esperanto instead? I realize we can't force people to stick to the negotiated languages--but should we expect that users should at least be given some sort of UI indication about the negotiated language(s)? It seems like a paragraph or two on that subject is warranted, even if it is just to say it's out of scope.

- 1, paragraph 6: (related to Ekr's comments) Does the selection of a single tag in an answer imply an assumption that only one language will be used? There are communities where people tend to mix 2 or more languages freely and fluidly. Is that sort of thing out of scope?

- 5.1, paragraph 2: Can you elaborate on the motivation to have separate hlang-send and hlang-recv parameters vs having a single language parameter and instead setting the stream to send or receive only, especially in light of the recommendation to set both directions the same for bi-directional language selection? I don't mean to dispute that approach; I just think a bit more explanation of the design choice would be helpful to the reader. I can imagine some use cases; for example, a speech-impaired person who does not plan to speak on a video call may still wish to send video to show facial expressions, etc. (I just re-read the discussion resulting from Ekr's comments, and recognize that this overlaps heavily with that.)

- 5.1, paragraph 3: "... which in most cases is one of the languages in the offer's..." Are there cases where it might not be?

- 5.1, last paragraph: "This is not a problem." Can you elaborate? That sort of statement usually takes the form "This is not a problem, because..."

- 5.2, last paragraph: Is there a reason to give such weak guidance on how to indicate the call is rejected? (Along those lines, are non-SIP uses of SDP in scope?)

Editorial Comments and Nits:

- 5.1, paragraph 4: The first MUST seems like a statement of fact.
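The offer/answer behavior questioned in the comments on section 5.1 can be illustrated with a minimal sketch. It is not taken from the draft: the function names `first_common` and `build_answer` are hypothetical, the lists are assumed to be in preference order, and plain case-insensitive equality stands in for the richer tag-matching mechanisms that BCP 47 describes. The only behavior borrowed from the quoted text is the pairing of directions: the answerer's 'hlang-send' is chosen from the offer's 'hlang-recv', and its 'hlang-recv' from the offer's 'hlang-send'.

```python
from typing import Dict, List, Optional


def first_common(preferred: List[str], supported: List[str]) -> Optional[str]:
    """Return the first tag in `preferred` that also appears in `supported`.

    Assumption: simple case-insensitive equality, not full BCP 47 matching.
    """
    supported_lower = {tag.lower() for tag in supported}
    for tag in preferred:
        if tag.lower() in supported_lower:
            return tag
    return None


def build_answer(
    offer_send: List[str],
    offer_recv: List[str],
    can_send: List[str],
    can_recv: List[str],
) -> Dict[str, Optional[str]]:
    """Pick one language per direction for the answer (hypothetical helper)."""
    return {
        # The answerer sends a language the offerer asked to receive.
        "hlang-send": first_common(offer_recv, can_send),
        # The answerer receives a language the offerer proposed to send.
        "hlang-recv": first_common(offer_send, can_recv),
    }


if __name__ == "__main__":
    # Offerer wants to send Hungarian and receive Portuguese;
    # the answerer handles only Portuguese, so one direction has no match.
    print(build_answer(["hu"], ["pt"], ["pt"], ["pt"]))
```

Keeping the two directions as separate inputs makes the asymmetric case visible: a `None` in one direction and a match in the other is exactly the "difficult to match together" situation raised in other ballot comments. What an endpoint should do with a `None` is left open here, as it is in the comments above.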
Alia Atlas Former IESG member
No Objection
(for -19)
Alissa Cooper Former IESG member
No Objection
(2018-01-09 for -22)
== Section 7 ==

"In addition, if the 'hlang-send' or 'hlang-recv' values are altered or deleted en route, the session could fail or languages incomprehensible to the caller could be selected; however, this is also a risk if any SDP parameters are modified en route."

Given that one of the primary use cases for the attributes defined here is for emergency calling, it seems worthwhile to call out the new specific threat that these attributes enable in that case, namely the targeted manipulation/forgery of the language attributes for the purposes of denying emergency services to a caller. This general class of attacks is contemplated in Section 5.2.2 of RFC 5069, although there may be a better reference to cite here for what to do if you don't want your emergency calls subject to that kind of attack (I can't recall another document off the top of my head).

== Section 8 ==

This seems weak for not including some words to indicate what to do to mitigate the risks of exposing this information.
Alvaro Retana Former IESG member
No Objection
(2018-01-06 for -19)
Thanks for writing an interesting document!

Given that this document doesn’t mandate the behavior in the case of not having languages in common, why does it matter if the combination is “difficult to match together” or not? I’m wondering about this piece of text (from 5.2):

   ...The two SHOULD NOT be set to languages which are difficult to match together (e.g., specifying a desire to send audio in Hungarian and receive audio in Portuguese will make it difficult to successfully complete the call).

I don’t understand how “difficult to match” can be enforced from a normative point of view. Difficulty seems to be a subjective criterion -- the example shows a pair that I would consider difficult too (I don't speak Hungarian!), but other pairings could still be difficult for me but easy for others. Using “SHOULD NOT” (instead of “MUST NOT”) implies that there are cases in which it is ok to do it (again, probably subjectively). It seems to me that the “SHOULD NOT” could be a simple “should not”.

BTW, that reminds me: please use the template text from rfc8174 (instead of rfc2119).

Nit: It would be nice to expand SDP in the abstract and put a reference to rfc4566 in the Introduction.
Deborah Brungard Former IESG member
No Objection
(for -22)
Eric Rescorla Former IESG member
No Objection
(2018-01-06 for -19)
Document: draft-ietf-slim-negotiating-human-language-17.txt

1. I'm not marking this first point DISCUSS, but I do think it's important it be addressed and I trust the AD will ensure that it is.

This document is ambiguous about the contents of the answer attribute. Specifically, it says:

   In an answer, 'hlang-send' is the language the answerer will send if using the media for language (which in most cases is one of the languages in the offer's 'hlang-recv'), and 'hlang-recv' is the language the answerer expects to receive if using the media for language (which in most cases is one of the languages in the offer's 'hlang-send').

However, the next paragraph permits >1 tag, as does the ABNF in S 6.1.

   Each value MUST be a list of one or more language tags per BCP 47 [RFC5646], separated by white space. BCP 47 describes mechanisms for matching language tags. Note that [RFC5646] Section 4.1 advises to "tag content wisely" and not include unnecessary subtags.

So, how am I supposed to interpret an answer with >1 tag? Is this forbidden? I can imagine a number of semantics, but it's important it be clear in the document.

2. The negotiation structure here does not match that which is conventionally used with SDP, where each side indicates the formats it is prepared to receive and the other side can send any of them. Why did you use this structure? One reason you might is that you expect the answer to resolve which language is in use, however because SIP supports early media (i.e., media which is delivered prior to the answer.)
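To make the ambiguity in point 1 concrete, here is a minimal parsing sketch. It assumes only what the quoted text states (each value is a list of one or more language tags separated by white space) plus the general SDP attribute form `a=<name>:<value>`. The function name `parse_hlang` and the sample SDP fragment are hypothetical, and the sketch deliberately does not decide what more than one tag in an answer should mean.

```python
from typing import Dict, List


def parse_hlang(media_section: str) -> Dict[str, List[str]]:
    """Collect 'hlang-send' and 'hlang-recv' tags from one SDP media section.

    Hypothetical helper: it splits each value on white space and performs
    no validation of the tags against BCP 47.
    """
    result: Dict[str, List[str]] = {"hlang-send": [], "hlang-recv": []}
    for raw_line in media_section.splitlines():
        line = raw_line.strip()
        if not line.startswith("a="):
            continue  # only attribute lines are of interest
        name, sep, value = line[2:].partition(":")
        if sep and name in result:
            result[name] = value.split()
    return result


if __name__ == "__main__":
    # Illustrative fragment only; the tag values are made up for the example.
    section = (
        "m=audio 49170 RTP/AVP 0\r\n"
        "a=hlang-send:es eu en\r\n"
        "a=hlang-recv:es\r\n"
    )
    parsed = parse_hlang(section)
    for name, tags in parsed.items():
        note = " (more than one tag: semantics unclear in an answer)" if len(tags) > 1 else ""
        print(f"{name}: {tags}{note}")
```

A parser written this way accepts a multi-tag value in an offer and in an answer alike, which is why the document text, rather than the grammar, has to say how an answer carrying several tags is to be interpreted.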
Kathleen Moriarty Former IESG member
No Objection
(for -19)
Mirja Kühlewind Former IESG member
No Objection
(2018-01-08 for -19)
One question: I can't really imagine cases where the send and recv would be used to indicate different things. Can you provide an example (and better explain in the document why this 'complexity' was added)? One purely editorial note: I think section 5.1 could simply be removed before final publication as part of the reasoning is given in the intro already.
Spencer Dawkins Former IESG member
No Objection
(for -19)
Suresh Krishnan Former IESG member
No Objection
(for -22)