Early Review of draft-hoffman-xml2rfc-04
review-hoffman-xml2rfc-04-genart-early-campbell-2014-08-08-00

Request Review of draft-hoffman-xml2rfc
Requested rev. no specific revision (document currently at 23)
Type Early Review
Team General Area Review Team (Gen-ART) (genart)
Deadline 2014-08-08
Requested 2014-03-21
Draft last updated 2014-08-08
Completed reviews Genart Early review of -04 by Elwyn Davies
Genart Early review of -04 by Joel Halpern
Genart Early review of -04 by Ben Campbell

Review
review-hoffman-xml2rfc-04-genart-early-campbell-2014-08-08

I am the assigned Gen-ART reviewer for this draft. For background on
Gen-ART, please see the FAQ at
<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other comments
you may receive.

Document: draft-hoffman-xml2rfc-04
Reviewer: Ben Campbell
Review Date:  2014-04-03
IETF LC End Date: Not in last call

Note: This is an early review, rather than the usual Gen-ART review at IETF last call. The Gen-ART processes do not precisely apply.

Summary: This draft is on the right track. I have a few comments and questions that may or may not be worth consideration.

*** Major Issues: ***

None

*** Minor issues: ***

-- General:

There are a few "to do" items left in the draft.  (Not surprising for an early review.)

-- Section 1, paragraph 2:

I find it confusing that we have two drafts in parallel, one for V2 that is obsoleted by this draft, and this draft for V3. With this draft copying much of V2 forward, which is authoritative?

-- 1.1, entry for <tident>

THANK YOU!!!! I think faking indented text using the <list> workaround was one of the most annoying things about previous versions.

-- 2.2:

Can I have more than one of any given address sub-element? If so, how should it render?

-- 2.5.3:

If "height" ought to be avoided, should it be deprecated?

Also, what's the unit for height?

(Same questions apply to "width")

-- 2.5.5:

An example of how to use inlined graphics files might be helpful.

-- 2.5.6:

A generic "code" value might be useful. Otherwise, how do you expect these to be used? Syntax highlighting? I assume the processor should never actually try to interpret code based on the "type" attribute. (Are there security considerations there?)

-- 2.7:

It seems like <b> (and also <i>) is just as format-dependent as things like <width>. Wouldn't something like <em> be more useful for this purpose, where you have an abstract concept of emphasis that can be rendered differently for different formats? This seems like a place where our use cases differ from those for HTML, where markup is usually rendered by browsers that have similar font display capabilities.

In particular, what happens to <b>text</b> when rendered into ASCII text?
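
To make the question concrete, here is a minimal sketch of one way a text renderer could handle inline emphasis. It is not what any xml2rfc processor does; the function name `render_ascii` and the choice of `*` and `_` as markers are assumptions made purely for illustration, and the draft does not specify them.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from inline markup to plain-text conventions.
# These markers are illustrative assumptions; the draft does not define them.
ASCII_MARKERS = {"b": "*", "i": "_", "em": "_"}


def render_ascii(elem):
    """Flatten an element to text, wrapping known inline tags in markers."""
    marker = ASCII_MARKERS.get(elem.tag, "")
    parts = [marker, elem.text or ""]
    for child in elem:
        # Render the child, then keep the text that follows it.
        parts.append(render_ascii(child))
        parts.append(child.tail or "")
    parts.append(marker)
    return "".join(parts)


if __name__ == "__main__":
    t = ET.fromstring("<t>This is <b>important</b> and <i>subtle</i>.</t>")
    print(render_ascii(t))
```

Under this sketch, <i> and a hypothetical <em> collapse to the same marker in text output, which is the kind of format-specific decision the vocabulary would otherwise leave to each processor.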

-- 2.9:

Why does the "cite" attribute have different requirements than, say, xref? In particular, why wouldn't I want to cite a reference in the current doc, rather than a URL?

-- 2.9.1:

How would one reference an auto-generated anchor identifier? If it doesn't exist until processing time...?

-- 2.15.2

What is the format for Month? Must you write it out? Abbreviate? Number?
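
If the format is left open, every processor ends up needing something like the following. This is only a sketch of the ambiguity, assuming English month names, three-letter abbreviations, and decimal numbers are all accepted; the helper name `normalize_month` is hypothetical and nothing here is taken from the draft.

```python
# English month names, written out explicitly to avoid locale dependence.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def normalize_month(value):
    """Return the month number (1-12) for a full name, a three-letter
    abbreviation, or a decimal number. Raise ValueError otherwise."""
    text = value.strip()
    if text.isdecimal():
        number = int(text)
        if 1 <= number <= 12:
            return number
        raise ValueError("month number out of range: " + value)
    lowered = text.lower()
    for index, name in enumerate(MONTHS, start=1):
        # Accept either the full name or its first three letters.
        if lowered == name.lower() or lowered == name[:3].lower():
            return index
    raise ValueError("unrecognized month: " + value)


if __name__ == "__main__":
    for sample in ("April", "Apr", "04", "4"):
        print(sample, "->", normalize_month(sample))
```

Pinning the attribute to a single form in the specification would make this kind of guessing unnecessary.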

-- 2.26

An example might be helpful.

-- 2.33.2

Is it possible to continue numbering from a previous list?

-- 2.43.5

The text says that "iprExtract" is used to reference a specific section. Can you elaborate on how? E.g., do you reference an anchor? Are you allowed to have more than one "as-is" section?

-- 2.44.2:

Why are numberless subsections disallowed? It's pretty common to see documents that have non-numbered sub-heads inside numbered sections.

-- 2.44.5:

The default for the TOC attribute is "default"? Does that mean the actual meaning of "default" is controlled elsewhere?

Also, is there a way to limit the depth for ToCs?

-- 2.45 (and subsections)

Should the seriesInfo "name" values match the table from 2.40?

-- 2.54:

Is it possible to select the symbol to be used for an unordered list?

-- 5, security considerations

I wonder if the possibility of executable content existing in an RFC or draft is worth mentioning here? For an extreme example, see RFC 6716, Appendix A. But more realistically, 2.5.5 mentions the possibility of in-lined binary content, and 2.5.6 allows the identification of code in a way that a naive processor implementation might take as an invitation to interpret said code.

Also, does the need for a processor to potentially render binary content in general (in-lined or otherwise) expand the attack surface over that for previous versions of XML2RFC?
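
As a sketch of the conservative behavior this comment has in mind, a processor could treat the "type" attribute strictly as a label and emit the content as inert, escaped text. The function name `render_code_block`, the `artwork` element used in the example, and the HTML output shape are all assumptions for illustration, not something the draft specifies.

```python
import html
import xml.etree.ElementTree as ET


def render_code_block(elem):
    """Render an element's text as inert, escaped HTML.

    The "type" attribute is used only as a label (a CSS class here);
    the content itself is never parsed, evaluated, or executed.
    """
    raw_type = elem.get("type", "")
    # Keep only conservative characters so the label cannot inject markup.
    label = "".join(ch for ch in raw_type if ch.isalnum() or ch in "-_")
    body = html.escape(elem.text or "")
    return '<pre class="code-{}">{}</pre>'.format(label or "unknown", body)


if __name__ == "__main__":
    # Hypothetical input; the element name is an assumption.
    sample = ET.fromstring('<artwork type="python">print(1 &lt; 2)</artwork>')
    print(render_code_block(sample))
```

The point of the sketch is that the type value influences presentation only; a security considerations section could state that expectation explicitly.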


*** Nits/editorial comments: ***

-- 1, paragraph 3:

I find the paragraph a little confusing. Saying certain elements will not be used in text generation may be a bit overstated, in that the generation of metadata may well become part of the process of generating the final text. It might be more correct to say "may not affect the rendering of the final text".  (And in the case of the example of index generation, I see that the text of this draft has an embedded index--so I assume the related elements _were_ used in generating the text.)

-- 1.1:

Is the section on changes from V2 intended to remain in the final RFC? Often such sections are removed, but there is info in here (esp. 1.1.1) that may be useful in the long run.

-- 1.1, bulleted list of changes:

I find the past tense a bit confusing. I assume that the list represents incremental differences from V2, but the past tense made me wonder if it was meant as ways that V2 is different from V3.  (I suspect the author thought in terms of "changes that were made as we wrote this text", but I think the reader will think in terms of "changes this text makes to V2".)

-- List entry for "postalline"

I had to stare at "postalline" for a while before I realized it meant "postal line" and not some sort of material, perhaps made from recycled mail (poh-stah-LEEN). The point being that I find combined words like this very hard to read without some sort of separator or mixed case. Please consider that when creating new elements with compound word names. (E.g., <postal-line> or <postalLine> are much easier to read, for me at least.)

-- 1.1, 2nd paragraph, "less importantly"

I suggest dropping the value judgement. (I would personally find the lack of offline operation more of an issue than a need to change my anchor names, but that's just me.)

-- 2.5.8:

Please expand CJK on first use.

-- 2.40, series value table:

The value for 3GPP is "TBD". Is there an expectation that the RFC will have something real there?

-- 2.48, Content Model:

It might be worth marking deprecated elements. Otherwise a reader may not follow the xref to discover a legal child element is deprecated.

-- 2.58:

It might be helpful to mention what happens if an <xref> is _not_ empty.