Negotiating Human Language in Real-Time Communications
Draft of message to be sent after approval:
From: The IESG <firstname.lastname@example.org>
To: IETF-Announce <email@example.com>
Cc: firstname.lastname@example.org, The IESG <email@example.com>, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org
Subject: Protocol Action: 'Negotiating Human Language in Real-Time Communications' to Proposed Standard (draft-ietf-slim-negotiating-human-language-24.txt)

The IESG has approved the following document:
- 'Negotiating Human Language in Real-Time Communications'
  (draft-ietf-slim-negotiating-human-language-24.txt) as Proposed Standard

This document is the product of the Selection of Language for Internet Media Working Group. The IESG contact persons are Adam Roach, Alexey Melnikov and Ben Campbell.

A URL of this Internet-Draft is:
https://datatracker.ietf.org/doc/draft-ietf-slim-negotiating-human-language/
Technical Summary:

In establishing a multimedia communications session, it can be important to ensure that the caller's language and media needs match the capabilities of the called party. This matters in non-emergency uses (such as calling a company call center) and in emergencies, where a call can be handled by a call taker capable of communicating with the user, or where a translator or relay operator can be bridged into the call during setup. This document describes the problem of negotiating human (natural) language needs, abilities, and preferences for spoken, written, and signed languages. It also provides a solution using new stream attributes within the Session Description Protocol (SDP).

Working Group Summary:

This draft has undergone 13 revisions since its initial IETF Last Call (which occurred on draft-06). These revisions addressed issues raised by the IETF community, including:

1. The meaning of the "*" in language negotiation. The SDP directorate review in the initial IETF Last Call expressed concern over the handling of the asterisk, which had the properties of a session-level attribute while being included within individual m-lines. WG consensus was to remove the asterisk, whose role had been advisory.

2. Routing of calls. The SDP directorate review in the initial IETF Last Call questioned whether the document intended SDP to be used for routing of SIP traffic. Language was added to state clearly that call routing is out of scope.

3. Combining of hlang-send/hlang-recv. In IETF Last Call, a reviewer suggested that the document allow the hlang-send and hlang-recv indications to be combined, permitting a more compact representation when language preferences are symmetrical. The WG did not accept this suggestion, since it was not clear that the efficiency gain was worth the additional complexity.

In addition to issues raised in IETF Last Call, there was substantial WG discussion of the following points:
4. Undefined language/modality combinations. Language tags do not always distinguish spoken from written language, so some combinations of languages and media are not well defined. The text in Section 5.4 resulted from WG discussion of several scenarios:

   a. Captioning. While the document supports negotiation of sign language in a video stream, it does not define how to indicate that captioning (e.g., placement of text within the video stream) is desired. WG consensus did not support use of suppressed script tags for this purpose.

   b. SignWriting (communicating sign language in written form). Currently only a single language tag has been defined for SignWriting, so written communication of sign language in a text stream (or in captioning) is likewise not defined.

   c. Lipreading (spoken language within video). There was no WG consensus for explicitly indicating the desire for spoken language in a video stream (e.g., by use of the -Zxxx script subtag), since the ability to negotiate "lip sync" is already provided by RFC 5888.

As a result of these discussions, Section 5.4 leaves a number of potential combinations of language and media undefined. If implementation experience shows a need to define these scenarios, they can be addressed in future work.

5. Preferences between media. As an example, an individual might be able to understand written English communicated using real-time text but prefer spoken English audio. The current draft enables all modes of communication to be negotiated, but does not indicate a preference between them. WG consensus was that it is acceptable, and possibly more reliable, for mutually supported media to be negotiated and brought up, then letting the conversants decide which media to use, rather than taking on the additional complexity of negotiating media preference beforehand. During discussion, it was pointed out that quality issues could influence media preferences during a call.
For example, on a call where audio, video, and text are all available, sending video may interfere with audio quality, so that video sending needs to be disabled. Alternatively, audio quality could be poor enough that the conversants need to resort to text. Media quality issues can thus negate the "best laid plans" of media preference negotiation.

Document Quality:

There are no current implementations of draft-ietf-slim-negotiating-human-language. However, the National Emergency Number Association (NENA) has referenced it in NENA 08-01 (i3 Stage 3, version 2) in describing attributes of emergency calls presented to an ESInet, and within 3GPP some CRs introduced in SA1 have referenced the functionality. Implementation is therefore expected.

Personnel:

Bernard Aboba is the Document Shepherd. The responsible Area Director is Alexey Melnikov.
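For reference, the hlang-send/hlang-recv attributes discussed in the Working Group Summary can be illustrated with a sketch of an SDP offer. This is a hedged example, assuming the media-level attribute syntax of the -24 draft (BCP 47 language tags, listed in preference order); the ports and payload types are arbitrary:

```
m=text 45020 RTP/AVP 103 104
a=hlang-send:es eo
a=hlang-recv:es eo

m=audio 49250 RTP/AVP 20
a=hlang-send:eo es
a=hlang-recv:eo es

m=video 51372 RTP/AVP 31 32
a=hlang-send:ase
a=hlang-recv:ase
```

Here the offerer proposes Spanish or Esperanto in the text stream, Esperanto or Spanish in audio, and American Sign Language (tag "ase") in video. As noted in point 3 above, hlang-send and hlang-recv remain separate attributes even when, as here, the preferences are symmetrical.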