Network Working Group                                          J. Levine
Internet-Draft                                      Taughannock Networks
Intended status: Informational                                P. Hoffman
Expires: April 20, 2013                                   VPN Consortium
                                                        October 17, 2012


     Variants in Second-Level Names Registered in Top Level Domains
                      draft-levine-tld-variant-02

Abstract

   Internationalized Domain Names for Applications (IDNA) provides a
   method to map a subset of names written in Unicode into the DNS.
   Some languages allow a particular host name to be written in multiple
   ways because some characters have multiple representations, often
   known as "variants".  This document surveys the approaches that
   ICANN-contracted top level domains have taken to the registration and
   provisioning of domain names that have variants.  This document is
   not (and will not be) a product of the IETF, does not (and will not)
   propose any method to make variants work "correctly", and is not (and
   will not be) an introduction to internationalization or IDNA.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 20, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of



Levine & Hoffman         Expires April 20, 2013                 [Page 1]


Internet-Draft    Variants in second-level domain names     October 2012


   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  4
   3.  Base Documents . . . . . . . . . . . . . . . . . . . . . . . .  5
   4.  Domain Practices . . . . . . . . . . . . . . . . . . . . . . .  5
     4.1.  AERO . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.2.  ASIA . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.3.  BIZ  . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.4.  CAT  . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
     4.5.  COM  . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
     4.6.  COOP . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
     4.7.  INFO . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
     4.8.  JOBS . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
     4.9.  MOBI . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
     4.10. MUSEUM . . . . . . . . . . . . . . . . . . . . . . . . . .  7
     4.11. NAME . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
     4.12. NET  . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
     4.13. ORG  . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
     4.14. POST . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
     4.15. PRO  . . . . . . . . . . . . . . . . . . . . . . . . . . .  8
     4.16. TEL  . . . . . . . . . . . . . . . . . . . . . . . . . . .  8
     4.17. TRAVEL . . . . . . . . . . . . . . . . . . . . . . . . . .  8
     4.18. XXX  . . . . . . . . . . . . . . . . . . . . . . . . . . .  9
   5.  References . . . . . . . . . . . . . . . . . . . . . . . . . .  9
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12


1.  Introduction

   Internationalized Domain Names for Applications (IDNA) [RFC5890]
   allows host names in the DNS [RFC1035] to contain characters from the
   Unicode repertoire.  Some Unicode characters are considered to be
   "variants" of one another.  In some cases, the variants are multiple
   equally valid ways of writing the same thing, such as traditional and
   simplified Chinese characters.  Some languages written in Latin
   characters with accents and diacritical marks, known as decorated
   characters, allow the decorations to be omitted in some situations;
   for example, French sometimes omits accents on capital letters,
   depending on country and culture.  Due to the difficulty of
   representing decorated characters in ASCII systems, many users have
   informally used undecorated characters in DNS host names, even when
   they are not linguistically equivalent to the decorated versions.
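
   As a rough illustration of the point above, the following Python
   sketch converts a decorated name to its ASCII-compatible form and
   compares it with the undecorated spelling.  It uses the "idna" codec
   in the Python standard library, which implements the older IDNA2003
   rules rather than those of [RFC5890]; the two agree for this simple
   name.  The function name to_a_label and the name "example" are
   assumptions made for illustration.

```python
def to_a_label(name: str) -> str:
    """Return the ASCII-compatible (xn--) form of a dotted name."""
    # Python's built-in "idna" codec applies IDNA2003 processing to
    # each label and passes all-ASCII labels through unchanged.
    return name.encode("idna").decode("ascii")


decorated = to_a_label("café.example")
undecorated = to_a_label("cafe.example")

# As far as the DNS is concerned, these are two unrelated names.
print(decorated)
print(undecorated)
print(decorated == undecorated)
```

   Nothing in the DNS ties the two results together; any relationship
   between them has to come from registry policy or provisioning.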

   ICANN has no single agreed-on definition of "variant".  In IDN
   Variant TLDs [VARIANTTLDS], ICANN says that variants "occur when a
   single conceptual character can be identified with two or more
   different Unicode Code Points with graphic representations that may
   be visually similar".  In ICANN's IDN Variant Issues Project report
   [VIPREPORT], it says in part "There is today no fully accepted
   definition for what may constitute a variant relationship between
   top-level labels".  RFC 3743 [RFC3743] (an Informational RFC, not the
   product of the IETF) says that the idea of variants is "wherein one
   conceptual character can be identified with several different Code
   Points in character sets for computer use".

   The proper handling of variant names has been a topic of extensive
   debate and research, with little consensus reached on how to handle
   them, or even on which characters are variants of each other.  Many
   people would like variant names to behave "the same", for a diverse
   range of meanings of "same".  In some cases it is textual
   similarity, such as variants having corresponding DNS records; in
   some it is functional similarity, such as variant names resolving to
   the same web server; in others it is user experience similarity,
   such as names resolving to web sites that, while not identical, are
   perceived by human users as equivalent.

   This document provides a snapshot of variant handling in the top
   level domains managed by ICANN, so-called gTLDs (generic TLDs) and
   sTLDs (sponsored TLDs), as of late 2012.  We chose those domains
   because ICANN requires each TLD to describe its IDN and variant
   practices, and the TLD zone files are available for inspection, to
   verify what actually goes into the zones.

   Since "variant" can mean vastly different things to different people,
   there is also no agreement about when two zones are supposed to
   "behave the same".  Also, the gTLDs and sTLDs might have different
   views of what variants are and are not required to report to ICANN
   about their policies.


2.  Terminology

   We use some terminology that has become fairly well agreed upon when
   discussing variant names.

   Bundle:  The IDN practices documents (see below) can identify sets of
      code points that are considered equivalent using Language Variant
      Tables, defined in [RFC3743].  A set of names in which the
      characters in each position are equivalent is known as a bundle,
      or more technically as an "IDL Package".  The variant rules vary
      among languages, and for the same language can vary among TLDs.
      Many languages do not define equivalent characters, and hence do
      not have bundles.

   Preferred variant:  When a Language Variant Table defines sets of
      equivalent characters, one character in each set is designated as
      preferred.  In a bundle, the variant that consists entirely of
      preferred characters is the preferred variant.  Typically it is
      the variant that best matches the way that words are written in
      natural language.  Preferred variants are specific to both
      languages and countries.  For example, in some Chinese-speaking
      countries, the preferred variant is simplified characters, while
      in others it is traditional characters.

   Blocking:  When one name (the "base name") in a bundle is registered
      in a TLD, the rest of the names in the bundle are often blocked at
      the registry level: nobody other than the person who registered
      the base name can register any of the other names in the bundle.
      In some cases, even though the names are blocked from registration
      by anyone else, the registrant or registry can activate some or
      all of the otherwise blocked names.

   Parallel NS:  Multiple names in a bundle are provisioned in the TLD
      with identical NS records, so they all are handled by the same
      name servers.

   DNAME aliasing:  The DNAME [RFC6672] DNS record creates a shadow tree
      of DNS records, roughly as though there were a CNAME in the shadow
      tree pointing to each name in the target tree.  DNAMEs have been
      used both as second-level names, to provide resolution for several
      names in a bundle, and as first-level names, to provide resolution
      for every name under a TLD.
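
   The bundle, preferred variant, and blocking terms above can be
   sketched in a few lines of Python.  This is only an illustration
   under stated assumptions: the two-entry variant table, the choice of
   simplified characters as preferred, and the names VARIANT_SETS,
   bundle, preferred_variant, and Registry are all invented here and do
   not describe any registry's actual tables or software.

```python
from itertools import product

# Hypothetical sets of equivalent characters, preferred character first.
# Real Language Variant Tables are published per language and per TLD.
VARIANT_SETS = [("国", "國"), ("学", "學")]

# Map every character to the set it belongs to.
EQUIVALENTS = {ch: group for group in VARIANT_SETS for ch in group}


def bundle(label: str) -> list[str]:
    """Enumerate all names equivalent to label, position by position."""
    choices = [EQUIVALENTS.get(ch, (ch,)) for ch in label]
    return ["".join(combo) for combo in product(*choices)]


def preferred_variant(label: str) -> str:
    """Replace each character with the preferred member of its set."""
    return "".join(EQUIVALENTS.get(ch, (ch,))[0] for ch in label)


class Registry:
    """Toy registry that blocks a whole bundle once one name is taken."""

    def __init__(self) -> None:
        # Every name in a bundle shares one preferred variant, so that
        # variant serves as the key for the whole bundle.
        self.owner_of_bundle: dict[str, str] = {}

    def register(self, label: str, registrant: str) -> bool:
        key = preferred_variant(label)
        owner = self.owner_of_bundle.setdefault(key, registrant)
        # Only the holder of the bundle may register further names in it.
        return owner == registrant


registry = Registry()
print(bundle("國学"))
print(registry.register("國学", "alice"))  # first registration in the bundle
print(registry.register("国學", "bob"))    # blocked: same bundle, other person
```

   Keying the registry on the preferred variant is one simple way to
   make every name in a bundle collide; a registry could equally store
   the full list of blocked names.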




3.  Base Documents

   ICANN has published a variety of documents on variant management.
   The most important are the "Guidelines for the Implementation of
   Internationalized Domain Names" issued in Version 1.0 [G1] and
   Version 3.0 [G3].

   ICANN says that TLDs are supposed to register an IDN practices
   document with IANA for each language in which the TLD accepts IDN
   registrations, to be entered in the IANA Repository of IDN Practices
   [IANAIDN].  The practices document lists the Unicode characters
   allowed in names in the language, which characters are considered
   equivalent, and which of an equivalent group is preferred.  Some TLDs
   have been more diligent than others at keeping the registry up to
   date.

   Some of the ICANN agreements with individual TLDs [ICANNAGREE]
   describe the TLD's IDN practices, but most do not.


4.  Domain Practices

4.1.  AERO

   The .AERO TLD has no IDNs, and no rules or practices for them.

4.2.  ASIA

   The .ASIA domain accepts registrations in many Asian languages.  They
   have IANA tables for Japanese, Korean, and Chinese.  The IANA tables
   refer to their CJK IDN policies [ASIACJK], which say that applied-for
   and preferred IDN variants are "active and included in the zone."  No
   IDN publication mechanism is described in the documentation, but
   since the zone file contains no DNAMEs, they must be using parallel
   NS for variants.
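
   The kind of zone file inspection behind this inference can be
   sketched as follows.  The sketch assumes a simplified layout with one
   record per line in the order owner, TTL, class, type, rdata; real
   zone files can omit or reorder fields.  The sample records and the
   function name summarize are invented for illustration.

```python
from collections import defaultdict

# Invented sample data.  xn--fiqs8s and xn--fiqz9s are the A-labels of
# 中国 and 中國, the simplified and traditional spellings of one word.
SAMPLE_ZONE = """\
xn--fiqs8s.example. 3600 IN NS ns1.host.example.
xn--fiqs8s.example. 3600 IN NS ns2.host.example.
xn--fiqz9s.example. 3600 IN NS ns1.host.example.
xn--fiqz9s.example. 3600 IN NS ns2.host.example.
other.example. 3600 IN NS ns.elsewhere.example.
"""


def summarize(zone_text):
    """Return the DNAME owner names and the groups of names sharing an NS set."""
    dname_owners = set()
    ns_by_owner = defaultdict(set)
    for line in zone_text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0].startswith(";"):
            continue  # skip blanks, comments, and lines not in the assumed layout
        owner, rtype, rdata = fields[0].lower(), fields[3].upper(), fields[4].lower()
        if rtype == "DNAME":
            dname_owners.add(owner)
        elif rtype == "NS":
            ns_by_owner[owner].add(rdata)
    # Group owner names by their complete set of name servers.
    owners_by_ns = defaultdict(list)
    for owner, servers in ns_by_owner.items():
        owners_by_ns[frozenset(servers)].append(owner)
    shared = sorted(sorted(owners) for owners in owners_by_ns.values() if len(owners) > 1)
    return dname_owners, shared


dname_owners, shared_ns_groups = summarize(SAMPLE_ZONE)
print(len(dname_owners), shared_ns_groups)
```

   Identical NS sets are only consistent with parallel NS, not proof of
   it, since unrelated names hosted by the same provider also share name
   servers.  The groups would still need to be compared against the
   variant tables.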

4.3.  BIZ

   ICANN gave the registry (Neustar) non-specific permission to register
   IDNs in a 2004 letter [TWOMEY04A].  The IDN rules were apparently
   discussed with ICANN, but not defined; see Appendix 9 of the registry
   agreement [ICANNBIZ9].

   They have about a dozen IANA tables.  No IDN publication mechanism is
   described, but from inspection it appears that variants are blocked.

4.4.  CAT

   The IDN rules are described in Appendix S Part VII.2 [ICANNCATS] of
   the ICANN agreement, which says: "Registry will take a very cautious
   approach in its IDN offerings.  IDNs will be bundled with the
   equivalent ASCII domains."  The only language is Catalan.  No IDN
   publication mechanism is described.

   Although the Catalan IDN practices document does not identify variant
   characters, in practice a bundle consists of names that differ only
   in accented versus unaccented vowels, or in "ll" versus the Catalan
   "ela geminada", which is written as two L's with a dot in between.

   When a registrant registers an IDN, the registry also includes the
   ASCII version.  From inspection of the zonefile, the ASCII version is
   provisioned with NS, and the IDN is a DNAME alias of the ASCII
   version.
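
   A minimal sketch of this provisioning pattern follows.  It is not the
   registry's actual algorithm: the folding rule (strip combining marks,
   turn the ela geminada into "ll"), the record layout, and the names
   ascii_equivalent and zone_records are assumptions made for
   illustration, and the A-label is produced with Python's IDNA2003
   "idna" codec.

```python
import unicodedata

MIDDLE_DOT = "\u00b7"  # the dot in the Catalan "ela geminada" (l·l)


def ascii_equivalent(label):
    """Fold a label to an unaccented ASCII spelling (assumed rule)."""
    label = label.lower().replace("l" + MIDDLE_DOT + "l", "ll")
    # NFD splits an accented letter into base letter plus combining mark,
    # so dropping the combining marks leaves the bare letter.
    decomposed = unicodedata.normalize("NFD", label)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def zone_records(idn_label, name_servers, tld="cat"):
    """Build NS records for the ASCII name and a DNAME for the IDN."""
    ascii_name = f"{ascii_equivalent(idn_label)}.{tld}."
    idn_name = idn_label.encode("idna").decode("ascii") + f".{tld}."
    records = [f"{ascii_name} IN NS {ns}." for ns in name_servers]
    if idn_name != ascii_name:
        # The IDN carries no NS of its own; it is an alias of the ASCII name.
        records.append(f"{idn_name} IN DNAME {ascii_name}")
    return records


for record in zone_records("camió", ["ns1.example.net", "ns2.example.net"]):
    print(record)
```

   With this layout, a change to the name servers of the ASCII name
   needs no matching change for the IDN, which is one practical
   difference from parallel NS.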

4.5.  COM

   ICANN and Verisign have extensive correspondence about IDNs and
   variants, including letters to ICANN from Ben Turner [TURNER03] and
   Ed Lewis [LEWIS03].

   The IANA registry has tables for several dozen languages, including
   archaic languages such as hieroglyphics and Aramaic.  Verisign
   publishes documents describing Scripts and Languages [VRSNLANG],
   Character Variants [VRSNCHAR], Registration Rules [VRSNRULES], and
   additional registration logic [VRSNADDL].

   In Chinese, variants are blocked (see [VRSNADDL]).  In other
   languages there appears to be no bundling or blocking.

4.6.  COOP

   The .COOP TLD has no IDNs, and no rules or practices for them.

4.7.  INFO

   The IANA registry has tables for Danish, Hungarian, Lithuanian,
   Latvian, and Swedish from 2005.  The domain also has names in Greek,
   Russian, Arabic, and other languages, but no IANA tables for them.

   The registry agreement Appendix 9 [ICANNINFO9] refers to a 2003
   letter from Paul Twomey [TWOMEY03] that refers to blocking variants.

4.8.  JOBS

   The .JOBS TLD has no IDNs, and no rules or practices for them.

4.9.  MOBI

   The zone file has about 22,000 IDNs.  The domain has no tables at
   IANA.  The registry agreement Appendix S [ICANNMOBIS] says that IDNs
   are provisioned according to [G1].

4.10.  MUSEUM

   The zone file has many IDNs, but spot checks find that many are lame
   or dead.  A 2004 letter from Paul Twomey [TWOMEY04] refers to [G1].

   The registry has a detailed policy page [MUSEUMIDN].  IDNs are
   accepted in Latin and Hebrew scripts, with plans for Arabic, Chinese,
   Japanese, Korean, Cyrillic, and Greek.  They do no bundling or
   blocking, but names that may be confusable due to visual similarity
   are not allowed, apparently determined by manual inspection, which is
   practical due to the very small size of the domain.

4.11.  NAME

   The .NAME domain is now managed by Verisign, and has the same long
   list of scripts as .COM and .NET.  A 2004 letter from Paul Twomey
   [TWOMEY04B] refers to Appendix K of the agreement, but the appendices
   are numbered, not lettered.  Appendix 11 [ICANNNAME11] is about
   restrictions on names, but says nothing about IDNs.  The same letter
   refers to [G1].

4.12.  NET

   The domain is managed the same as .COM.

4.13.  ORG

   A 2003 letter from Paul Twomey [TWOMEY03A] refers to [G1].  The
   registry has a list of IDN languages [PIRIDN], all written in Latin
   script.  The practices for some but not all are registered with IANA.
   Since none of the languages do bundling, there is presumably no
   blocking.

4.14.  POST

   The .POST TLD appears to have no registrations at all yet.

4.15.  PRO

   The .PRO TLD has no IDNs, and no rules or practices for them.

4.16.  TEL

   The zone has many IDNs.  It is probably operating according to a 2004
   letter from Paul Twomey [TWOMEY04A] to Neustar, which did not mention
   specific TLDs.  Its policy page [TELPOLICY] has links to IDN
   practices for 17 languages, all but one of which are registered with
   IANA.  None of the languages written in Latin script do bundling or
   blocking.  The Japanese practices say that variants are blocked.  The
   Chinese practices document says:

      Therefore, in addition to the blocking mechanism, bundling is also
      implemented for the Chinese language IDNs.  When registering a
      Chinese language IDN (primary domain name) up to two additional
      variant domain names will be automatically registered.  The first
      variant will consist entirely of simplified Chinese characters
      that correspond to those comprising the primary domain name.  The
      second variant will consist exclusively of traditional Chinese
      characters that correspond to those comprising the primary domain
      name.

      The primary domain name together with the requested variants
      constitutes a bundle on which all operations are atomic.  For
      example, if the registrant adds a name server to the primary
      domain name, all names in the bundle will be associated with that
      new name server.

   The zone has no DNAME records, so the second paragraph strongly
   suggests parallel NS.
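
   The atomic behavior described in the quoted policy, combined with
   parallel NS, might look roughly like the sketch below.  The Bundle
   class, its method names, and the sample names are invented for
   illustration and are not taken from Telnic's documentation; the names
   are shown as Unicode rather than as A-labels for readability.

```python
class Bundle:
    """A primary name plus its variants, provisioned as one unit."""

    def __init__(self, primary, variants):
        self.names = [primary, *variants]
        self.name_servers = set()

    def add_name_server(self, host):
        # A single operation on the bundle applies to every name in it.
        self.name_servers.add(host)

    def ns_records(self):
        # Parallel NS: each name gets the same set of NS records.
        return [
            f"{name}. IN NS {host}."
            for name in self.names
            for host in sorted(self.name_servers)
        ]


tel_bundle = Bundle("國学.tel", ["国学.tel", "國學.tel"])
tel_bundle.add_name_server("ns1.example.net")
for record in tel_bundle.ns_records():
    print(record)
```

   Storing the name servers once per bundle, rather than once per name,
   is what keeps the names from drifting apart.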

   The .TEL TLD, intended as an online directory, does not allow
   registrants to enter arbitrary RR's in the zone.  Nearly all names
   have NS records pointing to Telnic's own name servers.  The A records
   all point to Telnic's own web server that shows directory
   information.  NAPTR records provide the telephone number of
   registrants for whom they have one.  Users can directly provision
   only MX records.  The exceptions are 16 domains, none of them IDNs,
   that point to other, unrelated name servers and mostly appear to be
   parked.

4.17.  TRAVEL

   The .TRAVEL TLD has no IDNs, and no rules or practices for them.

4.18.  XXX

   The .XXX TLD has no IDNs, and no rules or practices for them.


5.  References

   [ASIACJK]  DotAsia Organisation, ".ASIA CJK (Chinese Japanese Korean)
              IDN Policies", May 2011, <http://dot.asia/policies/
              DotAsia-CJK-IDN-Policies-COMPLETE--2011-05-04.pdf>.

   [G1]       ICANN, "Guidelines for the Implementation of
              Internationalized Domain Names, Version 1.0", <http://
              www.icann.org/en/resources/idn/
              idn-guidelines-20jun03-en.htm>.

   [G3]       ICANN, "Guidelines for the Implementation of
              Internationalized Domain Names, Version 3.0", <http://
              www.icann.org/en/resources/idn/
              idn-guidelines-02sep11-en.htm>.

   [IANAIDN]  IANA, "Repository of IDN Practices",
              <http://www.iana.org/domains/idn-tables>.

   [ICANNAGREE]
              ICANN, "ICANN Registry agreements",
              <http://www.icann.org/en/about/agreements/registries>.

   [ICANNBIZ9]
              ICANN, "Appendix 9 of ICANN .BIZ Registry agreement",
              Dec 2006, <http://www.icann.org/en/about/agreements/
              registries/biz/appendix-09-08dec06-en.htm>.

   [ICANNCATS]
              ICANN, "Appendix S of ICANN .CAT Registry agreement",
              Mar 2006, <http://www.icann.org/en/about/agreements/
              registries/cat/cat-appendixs-22mar06-en.htm>.

   [ICANNINFO9]
              ICANN, "Appendix 9 of ICANN .INFO Registry agreement",
              Dec 2006, <http://www.icann.org/en/tlds/agreements/info/
              appendix-09-08dec06.htm>.

   [ICANNMOBIS]
              ICANN, "Appendix S of ICANN .MOBI Registry agreement",
              Nov 2005, <http://www.icann.org/en/about/agreements/
              registries/mobi/mobi-appendixs-23nov05-en.htm>.

   [ICANNNAME11]
              ICANN, "Appendix 11 of ICANN .NAME Registry agreement",
              Mar 2011, <http://www.icann.org/en/about/agreements/
              registries/name/appendix-11-25mar11-en.htm>.

   [LEWIS03]  Lewis, E., "Letter from Ed Lewis to Paul Twomey",
              Oct 2003, <http://www.icann.org/en/news/correspondence/
              lewis-to-twomey-13oct03-en.htm>.

   [MUSEUMIDN]
              Musedoma, "Internationalized Domain Names (IDN) in .museum
              - Policies and terms of use", Jan 2009,
              <http://about.museum/idn/idnpolicy.html>.

   [PIRIDN]   Public Interest Registry, "Expanding Multi-Lingual Options
              in Domain Name Versatility", Jan 2009,
              <http://www.pir.org/idn>.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC3743]  Konishi, K., Huang, K., Qian, H., and Y. Ko, "Joint
              Engineering Team (JET) Guidelines for Internationalized
              Domain Names (IDN) Registration and Administration for
              Chinese, Japanese, and Korean", RFC 3743, April 2004.

   [RFC5890]  Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Definitions and Document Framework",
              RFC 5890, August 2010.

   [RFC6672]  Rose, S. and W. Wijngaards, "DNAME Redirection in the
              DNS", RFC 6672, June 2012.

   [TELPOLICY]
              Telnic, ".TEL Policies", Jan 2009,
              <http://www.telnic.org/policies.html>.

   [TURNER03]
              Turner, B., "Letter from Ben Turner to Paul Twomey",
              Nov 2003, <http://www.icann.org/en/news/correspondence/
              turner-to-twomey-17nov03-en.htm>.

   [TWOMEY03]
              Twomey, P., "Letter from Paul Twomey to Ram Mohan",
              Aug 2003, <http://www.icann.org/en/news/correspondence/
              twomey-to-mohan-19aug03-en.htm>.

   [TWOMEY03A]
              Twomey, P., "Letter from Paul Twomey to Edward Viltz",
              Oct 2003, <http://www.icann.org/en/news/correspondence/
              twomey-to-viltz-21oct03-en.htm>.

   [TWOMEY04]
              Twomey, P., "Letter from Paul Twomey to Cary Karp",
              Jan 2004, <http://www.icann.org/en/news/correspondence/
              twomey-to-karp-20jan04-en.htm>.

   [TWOMEY04A]
              Twomey, P., "Letter from Paul Twomey to Richard Tindal",
              Jul 2004, <http://www.icann.org/en/correspondence/
              twomey-to-tindal-29jul04.pdf>.

   [TWOMEY04B]
              Twomey, P., "Letter from Paul Twomey to Geir Rasmussen",
              Aug 2004, <http://www.icann.org/correspondence/
              twomey-to-rasmussen-15aug04.pdf>.

   [VARIANTTLDS]
              ICANN, "IDN Variant TLDs",
              <http://www.icann.org/en/resources/idn/variant-tlds>.

   [VIPREPORT]
              ICANN, "A Study of Issues Related to the Management of IDN
              Variant TLDs (Integrated Issues Report)", <http://
              www.icann.org/en/topics/idn/
              idn-vip-integrated-issues-final-clean-20feb12-en.pdf>.

   [VRSNADDL]
              Verisign, "Additional Logic", <http://www.verisigninc.com/
              en_US/products-and-services/domain-name-services/
              domain-information-center/idn-code-points/
              additional-logic/index.xhtml>.

   [VRSNCHAR]
              Verisign, "Character Variants", <http://
              www.verisigninc.com/en_US/products-and-services/
              domain-name-services/domain-information-center/
              idn-resources/character-variants/index.xhtml>.

   [VRSNLANG]
              Verisign, "Scripts and Languages", <http://
              www.verisigninc.com/en_US/products-and-services/
              domain-name-services/domain-information-center/
              idn-resources/scripts-languages/index.xhtml>.

   [VRSNRULES]
              Verisign, "Registration Rules", <http://
              www.verisigninc.com/en_US/products-and-services/
              domain-name-services/domain-information-center/
              idn-code-points/registration-rules/index.xhtml>.


Authors' Addresses

   John Levine
   Taughannock Networks
   PO Box 727
   Trumansburg, NY  14886

   Phone: +1 831 480 2300
   Email: standards@taugh.com
   URI:   http://jl.ly


   Paul Hoffman
   VPN Consortium

   Email: paul.hoffman@vpnc.org