Last Call Review of draft-faltstrom-unicode11-07
review-faltstrom-unicode11-07-i18ndir-lc-alvestrand-2019-03-27-00

Request Review of draft-faltstrom-unicode11
Requested rev. no specific revision (document currently at 08)
Type Last Call Review
Team Internationalization Directorate (i18ndir)
Deadline 2019-03-05
Requested 2019-02-05
Draft last updated 2019-03-27
Completed reviews I18ndir Last Call review of -07 by Harald Alvestrand
Genart Last Call review of -07 by Dan Romascanu
Secdir Last Call review of -08 by Samuel Weiler
Genart Telechat review of -08 by Dan Romascanu
Assignment Reviewer Harald Alvestrand
State Completed
Review review-faltstrom-unicode11-07-i18ndir-lc-alvestrand-2019-03-27
Reviewed rev. 07 (document currently at 08)
Review result On the Right Track
Review completed: 2019-03-27

Review
review-faltstrom-unicode11-07-i18ndir-lc-alvestrand-2019-03-27

Overall conclusion: Not ready yet, needs some updates. New I-D recommended.
[Note: As part of the discussion that resulted in this text, a new I-D has been issued.]

Context issues
=============
The discussion of draft-faltstrom-unicode11 in the directorate has shown that the directorate members share a number of concerns about the current state of IDNA, only some of which are directly relevant to this memo.

IDNA2008 considered limits to what was reasonable to register and use in the DNS at a number of levels:

- A level of “don’t register stuff that causes confusion”. This requires human judgment, and reasonable people may disagree about what causes confusion.
- A level of “don’t register stuff that is structurally invalid under the relevant writing system”. Aspects of this can be captured in rulesets (ICANN’s RZ-LGR efforts fall into this category), but doing so requires deep expertise; this is captured in IDNA2008 as the “don’t register what you don’t understand” rule.
- A level of “this is stuff that you should never register, and applications can reasonably choose to treat it as an error or an attack if it ever shows up”. This is the distinction that is captured in the classification of codepoints as DISALLOWED, and where IDNA2008 (with updates) gives precise rules.

The current document focuses on the last level only: maintaining the distinction between PVALID and DISALLOWED. (It also considers whether new CONTEXTO and CONTEXTJ rules are needed.)

It is clear from directorate discussion that work needs to be done at the other levels outlined above too, but it is not clear what form that work should take or in which fora it can reasonably be performed; the work may or may not involve a revision of the basic IDNA2008 specifications.

We suggest inserting a paragraph in the document that describes the context of the state of IDNA2008 and explains which issues this document does not attempt to address. Specifically, it should say that the conclusion of the document concerns what to do regarding Unicode versions up to and including 11, and that this conclusion is not to be read as setting expectations for future versions of Unicode.

In addition, it has become clear that IDNA2008 does not specify the mechanisms and expectations for the review of new versions of Unicode in enough detail; with the review of a number of versions of Unicode behind us, we should be able to describe those procedures and expectations better than IDNA2008 does. However, this may need to happen in a document other than this one.

Content issues
============
Section 4.1 does not specify where to find the conclusion of the IETF discussion on U+08A1.
It is not easy to see from the text whether the algorithms and procedures will render U+0628 U+0654 an illegal sequence or a legal sequence. No matter what the resolution is, the document should make it obvious what the conclusion is (and why).
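
To make the question concrete, the following is a minimal sketch (not part of the draft or of IDNA2008) that uses Python's standard `unicodedata` module to show how Unicode normalization treats the two forms. The helper name `nfc_code_points` is made up for illustration. The sketch only shows that NFC does not map one form onto the other; it says nothing about whether either form is PVALID or DISALLOWED, which is the conclusion the document still needs to state.

```python
import unicodedata

# The two forms discussed in the review.
BEH_HAMZA_SEQUENCE = "\u0628\u0654"   # U+0628 followed by U+0654
BEH_WITH_HAMZA_ABOVE = "\u08A1"       # the single code point U+08A1


def nfc_code_points(text):
    """Return the code points of `text` after NFC normalization."""
    normalized = unicodedata.normalize("NFC", text)
    return [f"U+{ord(ch):04X}" for ch in normalized]


# An empty decomposition string means U+08A1 has no decomposition mapping,
# so normalization cannot relate it to the two-code-point sequence.
print(repr(unicodedata.decomposition(BEH_WITH_HAMZA_ABOVE)))
print(nfc_code_points(BEH_HAMZA_SEQUENCE))
print(nfc_code_points(BEH_WITH_HAMZA_ABOVE))
```

Because the two strings stay distinct after normalization, any statement about the legality of the sequence has to come from the IDNA rules themselves rather than from normalization.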

RFC 5892 states that SPHERICAL ANGLE OPENING UP is DISALLOWED, not PVALID:
27D0..2B4C  ; DISALLOWED
The draft says it is PVALID; this needs changing.
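
The check behind this comment can be reproduced with a short sketch. It assumes only the range and property quoted above, and it looks the character up by name through Python's `unicodedata` module instead of hard-coding a code point. The constant and function names are hypothetical.

```python
import unicodedata

# Range and property exactly as quoted in this review from RFC 5892.
RANGE_START, RANGE_END = 0x27D0, 0x2B4C
RANGE_PROPERTY = "DISALLOWED"


def in_quoted_range(code_point):
    """True if the code point lies inside the quoted 27D0..2B4C range."""
    return RANGE_START <= code_point <= RANGE_END


# Resolve the character by its Unicode name, then test range membership.
cp = ord(unicodedata.lookup("SPHERICAL ANGLE OPENING UP"))
print(f"U+{cp:04X} in quoted range: {in_quoted_range(cp)} ({RANGE_PROPERTY})")
```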

Section 4.1 ought to include numbers for how many characters ended up in DISALLOWED vs PVALID - ideally, for each Unicode version since IDNA2008 was issued. This may also be something that is recommended for the IANA tables rather than this document.
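
As a rough illustration of how such numbers could be produced, the sketch below tallies code points per property from rows in the `range ; PROPERTY` layout quoted above. The function name `count_by_property` and the sample rows are assumptions chosen for illustration; the sample is not a real derived-property table, and real counts would have to come from the full IANA tables for each Unicode version.

```python
from collections import Counter


def count_by_property(table_lines):
    """Count code points per property from 'XXXX[..YYYY] ; PROPERTY' rows."""
    counts = Counter()
    for line in table_lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        code_points, prop = (field.strip() for field in line.split(";", 1))
        first, _, last = code_points.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start  # single code point row
        counts[prop] += end - start + 1
    return counts


# Illustrative sample only (assumed rows, not a complete table).
SAMPLE_ROWS = [
    "002D        ; PVALID",
    "0061..007A  ; PVALID",
    "27D0..2B4C  ; DISALLOWED",
]
print(dict(count_by_property(SAMPLE_ROWS)))
```

Running the same function over the table for each Unicode version would give the per-version totals suggested here.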

Given the time that has passed since this work started, we should consider whether or not to include Unicode 12.

Nits
====
These have been submitted separately to the author, and are not enumerated here.