Skip to main content

Variant Names in Top Level Domains
draft-levine-tld-variant-00

The information below is for an old version of the document.
Document Type
This is an older version of an Internet-Draft that was ultimately published as RFC 6927.
Authors John R. Levine , Paul E. Hoffman
Last updated 2012-10-07
RFC stream (None)
Formats
IETF conflict review conflict-review-levine-tld-variant, conflict-review-levine-tld-variant, conflict-review-levine-tld-variant, conflict-review-levine-tld-variant, conflict-review-levine-tld-variant, conflict-review-levine-tld-variant
Additional resources
Stream Stream state (No stream defined)
Consensus boilerplate Unknown
RFC Editor Note (None)
IESG IESG state Became RFC 6927 (Informational)
Telechat date (None)
Responsible AD (None)
Send notices to (None)
draft-levine-tld-variant-00
Network Working Group                                          J. Levine
Internet-Draft                                      Taughannock Networks
Intended status: Informational                                P. Hoffman
Expires: April 02, 2013                                   VPN Consortium
                                                            October 2012

                   Variant Names in Top Level Domains
                      draft-levine-tld-variant-00

Abstract

   IDNA [RFC5890] provides a method to map a subset of names written in
   Unicode into the DNS.  Some languages allow a particular name to be
   written in multiple ways that are represented differently in IDNA,
   known as "variants".  We survey the approaches that ICANN-managed top
   level domains have taken to the registration and provisioning of
   variant names.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 02, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (http://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

Levine & Hoffman         Expires April 02, 2013                 [Page 1]
Internet-Draft                TLD variants                  October 2012

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  2
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  3
   3.  Base documents . . . . . . . . . . . . . . . . . . . . . . . .  3
   4.  Domain practices . . . . . . . . . . . . . . . . . . . . . . .  4
     4.1.  AERO . . . . . . . . . . . . . . . . . . . . . . . . . . .  4
     4.2.  ASIA . . . . . . . . . . . . . . . . . . . . . . . . . . .  4
     4.3.  BIZ  . . . . . . . . . . . . . . . . . . . . . . . . . . .  4
     4.4.  CAT  . . . . . . . . . . . . . . . . . . . . . . . . . . .  4
     4.5.  COM  . . . . . . . . . . . . . . . . . . . . . . . . . . .  4
     4.6.  COOP . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.7.  INFO . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.8.  JOBS . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.9.  MOBI . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.10. MUSEUM . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.11. NAME . . . . . . . . . . . . . . . . . . . . . . . . . . .  5
     4.12. NET  . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
     4.13. ORG  . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
     4.14. POST . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
     4.15. PRO  . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
     4.16. TEL  . . . . . . . . . . . . . . . . . . . . . . . . . . .  6
     4.17. TRAVEL . . . . . . . . . . . . . . . . . . . . . . . . . .  7
     4.18. XXX  . . . . . . . . . . . . . . . . . . . . . . . . . . .  7
   5.  Note about the references  . . . . . . . . . . . . . . . . . .  7
   6.  References . . . . . . . . . . . . . . . . . . . . . . . . . .  7
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . .  9

1.  Introduction

   IDNA [RFC5890] provides a method to map a subset of names written in
   Unicode into the DNS [RFC1035].  Some languages allow a particular
   name to be written in multiple ways that are represented differently
   in IDNA, known as "variants".  In some cases, the variants are
   multiple equally valid ways of writing the same thing, such as
   traditional and simplified Chinese characters.  Some languages
   written in Latin characters with accents and diacritical marks allow
   the marks to be omitted in some situations, such as French which
   often omits accents on capital letters.  Due to the difficulty of
   representing accented characters in ASCII systems, many users have
   informally used unaccented characters in DNS names, even when they
   are not linguistically equivalent to the accented versions.

   The proper handing of variant names has been a topic of extensive
   debate and research, with little consensus reached on how to handle
   them, or even what characters are variants of each other.  Many
   people would like variant names to behave "the same", for a diverse
   range of meanings of "same."  In some cases it is a textual
   similarity, such as variants having corresponding DNS records, in
   some it is functional similarity, such as variant names resolving to
   the same web server, or the same page in a web server, while in
   others it is user experience similarity, such as names resolving to
   web pages which while not identical are perceived by human users as
   equivalent.

Levine & Hoffman         Expires April 02, 2013                 [Page 2]
Internet-Draft                TLD variants                  October 2012

   This document provides a snapshot of variant handling in the top
   level domains managed by ICANN, so called gTLDs (generic TLDs) and
   sTLDs (sponsored TLDs), as of late 2012.  We chose those domains
   because ICANN requires each TLD to describe its IDN and variant
   practices, and the TLD zone files are available for inspection, to
   verify what actually goes into the zones.

2.  Terminology

   We use some terminology that has become fairly well agreed when
   discussing variant names.

   Bundle: A set of names that are considered equivalent.  The rules for
      what constitutes a bundle vary among languages, and for the same
      language can vary among TLDs.

   Preferred variant: In a bundle, one variant is identified as
      preferred.  Typically it is the variant that best matches the way
      that words are written in natural language.  Preferred variants
      are both language and country specific.  For example, in some
      Chinese-speaking countries, the preferred variant is simplified
      characters, while in others it is traditional characters.

   Blocking: When one name (or sometimes several names) in a bundle are
      registered in a TLD, the rest of the names in the bundle are often
      blocked, meaning that nobody can register them, or sometimes that
      nobody other than the same registrant can register them.

   Parallel NS: Multiple names in a bundle are provisioned in the TLD
      with identical NS records, so they all are handled by the same
      name servers.

   DNAME: The DNAME [RFC6672] DNS record creates a shadow tree of DNS
      records, roughly as though there were a CNAME in the shadow tree
      pointing to each name in the target tree.  DNAMEs have been used
      both as second-level names, to provide resolution for several
      names in a bundle, and as first-level names, to provide resolution
      for every name under a TLD.

3.  Base documents

   ICANN has published a variety of documents on variant management.
   The most important are the "Guidelines for the Implementation of
   Internationalized Domain Names" issued in Version 1.0 [G1] and
   Version 3.0 [G3].

   TLDs are supposed to register an IDN practices document with IANA for
   each language in which the TLD accepts IDN registrations, to be
   entered in an IANA registry [IANAIDN].  The practices document lists
   the Unicode characters allowed in names in the language, which
   characters are considered equivalent, and which of an equivalent
   group is preferred.  Some TLDs have been more diligent than others at

Levine & Hoffman         Expires April 02, 2013                 [Page 3]
Internet-Draft                TLD variants                  October 2012

   keeping the registry up to date.

   Some of the ICANN agreements with each TLD [ICANNAGREE] describe the
   TLD's IDN practices, but most don't.

4.  Domain practices

4.1.  AERO

   The .AERO TLD has no IDNs, and no rules or practices for them.

4.2.  ASIA

   The .ASIA domain accepts registrations in many Asian languages.  They
   have IANA tables for Japanese, Korean, and Chinese.  The IANA tables
   refer to their CJK IDN policies [ASIACJK], which say that applied-for
   and preferred IDN variants are "active and included in the zone."  No
   IDN publication mechanism is described in the documentation, but from
   inspection of the zonefile, it is clear that the zone is using
   parallel NS for variants.

4.3.  BIZ

   ICANN gave the registry (Neustar) non-specific permission to register
   in a letter in 2004 [TWOMEY04A].  The IDN rules were apparently
   discussed with ICANN, but not defined; see Appendix 9 of the registry
   agreement [ICANNBIZ9].

   They have about a dozen IANA tables.  No IDN publication mechanism is
   described, but from inspection it appears that variants are blocked.

4.4.  CAT

   The IDN rules are described in Appendix S Part VII.2 [ICANNCATS] of
   the ICANN agreement.  "Registry will take a very cautious approach in
   its IDN offerings.  IDNs will be bundled with the equivalent ASCII
   domains."  The only language is Catalan.  No IDN publication
   mechanism is described.

   Bundles consist of names with accented and unaccented vowels, and
   "ll" and the Catalan letter written as two L's with a dot in between.

   When a registrant registers an IDN, the registry also includes the
   ASCII version.  From inspection of the zonefile, the ASCII version is
   provisioned with NS, and the IDN is a DNAME pointing to the ASCII
   version.

4.5.  COM

   ICANN and Verisign have extensive correspondence about IDNs and
   variants, at including letters from Ben Turner [TURNER03] and Ed
   Lewis [LEWIS03].

Levine & Hoffman         Expires April 02, 2013                 [Page 4]
Internet-Draft                TLD variants                  October 2012

   The IANA registry has tables for several dozen languages, including
   archaic languages such as hieroglyphics and Aramaic.  Verisign
   publishes documents describing Scripts and Languages [VRSNLANG],
   Character Variants [VRSNCHAR], Registration Rules [VRSNRULES], and
   additional registration logic [VRSNADDL].

   In Chinese, variants are blocked (see [VRSNADDL].) In other languages
   there appears to be no bundling or blocking.

4.6.  COOP

   The .COOP TLD has no IDNs, and no rules or practices for them.

4.7.  INFO

   The IANA registry has tables for Danish, Hungarian, Lithuanian,
   Latvian, and Swedish from 2005.  The domain also has names in Greek,
   Russian, Arabic, and other languages but no IANA tables.

   The registry agreement Appendix 9 [ICANNINFO9] refers to a 2003
   letter from Paul Twomey [TWOMEY03] that refers to blocking variants.

4.8.  JOBS

   The .JOBS TLD has no IDNs, and no rules or practices for them.

4.9.  MOBI

   The zone file has about 22,000 IDNs.  The domain has no tables at
   IANA.  The registry agreement Appendix S [ICANNMOBIS] says that IDNs
   are provisioned according to [G1].

4.10.  MUSEUM

   The zone file has many of IDNs, spot checks find that many lame or
   dead.  A 2004 letter from Paul Twomey [TWOMEY04] refers to [G1].

   The registry has a detailed policy page [MUSEUMIDN].  IDNs are
   accepted in Latin and Hebrew scripts, with plans for Arabic, Chinese,
   Japanese, Korean, Cyrillic, and Greek.  They do no bundling or
   blocking, but names that may be confusable due to visual similarity
   are not allowed, apparently determined by manual inspection, which is
   practical due to the very small size of the domain.

4.11.  NAME

   The .NAME domain is now owned by Verisign, and has same long list of
   scripts as .COM and .NET.  http://www.icann.org/correspondence/
   twomey-to-rasmussen-15aug04.pdf refers to Appendix K of the
   agreement, but appendices are numbered.  Appendix 11 [ICANNNAME11] is
   about restrictions on names, but says nothing about IDNs.  The Letter
   above refers to [G1].

Levine & Hoffman         Expires April 02, 2013                 [Page 5]
Internet-Draft                TLD variants                  October 2012

4.12.  NET

   The domain is managed the same as .COM.

4.13.  ORG

   A 2003 letter from Paul Twomey [TWOMEY03A] refers to [G1].  The
   registry has a list of IDN languages [PIRIDN], all written in Latin
   script.  The practices for some but not all are registered with IANA,
   Since none of the languages do bundling, there is presumably no
   blocking.

4.14.  POST

   The .POST TLD appears to have no registrations at all yet.

4.15.  PRO

   The .PRO TLD has no IDNs, and no rules or practices for them.

4.16.  TEL

   The zone has many IDNs.  It is probably operating according to a 2004
   letter from Paul Twomey [TWOMEY04A] which didn not mention specific
   TLDs.  Its policy page [TELPOLICY] has links to IDN practices for 17
   languages, all but one of which are registered with IANA.  None of
   the Latin scripts do bundling or blocking.  The Japanese practices
   say that variants are blocked.  The Chinese table says:

      Therefore, in addition to the blocking mechanism, bundling is also
      implemented for the Chinese language IDNs.  When registering a
      Chinese language IDN (primary domain name) up to two additional
      variant domain names will be automatically registered.  The first
      variant will consist entirely of simplified Chinese characters
      that correspond to those comprising the primary domain name.  The
      second variant will consist exclusively of traditional Chinese
      characters that correspond to those comprising the primary domain
      name.

      The primary domain name together with the requested variants
      constitutes a bundle on which all operations are atomic.  For
      example, if the registrant adds a name server to the primary
      domain name, all names in the bundle will be associated with that
      new name server.

   The zone has no DNAME records, so the second paragraph strongly
   suggests parallel NS.

Levine & Hoffman         Expires April 02, 2013                 [Page 6]
Internet-Draft                TLD variants                  October 2012

   The .TEL TLD, intended as an online directory, does not allow
   registrants to enter arbitrary RR's in the zone.  Nearly all names
   have NS records pointing to Telnic's own name servers.  The A records
   all point to Telnic's own web server that shows directory
   information, A NAPTR record provides the telephone number for
   registrants for whom they have one.  Users can only directly
   provision MX records.  Except that there are 16 domains, none IDNs,
   that point to random other name servers and mostly appear to be
   parked.

4.17.  TRAVEL

   The .TRAVEL TLD has no IDNs, and no rules or practices for them.

4.18.  XXX

   The .XXX TLD has no IDNs, and no rules or practices for them.

5.  Note about the references

   Many of the references may appear to be incomplete.  This is due to
   bugs in the current version of XML2RFC.  Consult the XML for full
   names and URLs.

6.  References

   [ASIACJK]  ".ASIA CJK (Chinese Japanese Korean) IDN Policies", May
              2011.

   [G1]       "Guidelines for the Implementation of Internationalized
              Domain Names, Version 1.0", June 2003.

   [G3]       "Guidelines for the Implementation of Internationalized
              Domain Names, Version 3.0", Sept 2011.

   [IANAIDN]  "Repository of IDN Practices", .

   [ICANNAGREE]
              "ICANN Registry agreements", .

   [ICANNBIZ9]
              "Appendix 9 of ICANN .BIZ Registry agreement", Dec 2006.

   [ICANNCATS]
              "Appendix S of ICANN .CAT Registry agreement", Mar 2006.

   [ICANNINFO9]
              "Appendix 9 of ICANN .INFO Registry agreement", Dec 2006.

   [ICANNMOBIS]
              "Appendix S of ICANN .MOBI Registry agreement", Nov 2005.

   [ICANNNAME11]

Levine & Hoffman         Expires April 02, 2013                 [Page 7]
Internet-Draft                TLD variants                  October 2012

              "Appendix 11 of ICANN .NAME Registry agreement", Mar 2011.

   [LEWIS03]  Lewis, E., "Letter from Ed Lewis to Paul Twomey", Oct
              2003.

   [MUSEUMIDN]
              "Internationalized Domain Names (IDN) in .museum -
              Policies and terms of use", Jan 2009.

   [PIRIDN]   "Expanding Multi-Lingual Options in Domain Name
              Versatility", Jan 2009.

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, November 1987.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC5890]  Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Definitions and Document Framework",
              RFC 5890, August 2010.

   [RFC6672]  Rose, S. and W. Wijngaards, "DNAME Redirection in the
              DNS", RFC 6672, June 2012.

   [TELPOLICY]
              ".TEL Policies", Jan 2009.

   [TURNER03]
              Turner, B., "Letter from Ben Turner to Paul Twomey", Nov
              2003.

   [TWOMEY03A]
              Twomey, P., "Letter from Paul Twomey to Edward Viltz", Oct
              2003.

   [TWOMEY03]
              Twomey, P., "Letter from Paul Twomey to Ram Mohan", Aug
              2003.

   [TWOMEY04A]
              Twomey, P., "Letter from Paul Twomey to Richard Tindal",
              July 2004.

   [TWOMEY04B]
              Twomey, P., "Letter from Paul Twomey to Geir Rasmussen",
              Aug 2004.

   [TWOMEY04]
              Twomey, P., "Letter from Paul Twomey to Cary Karp", Jan
              2004.

   [VRSNADDL]
              "Additional Logic", .

Levine & Hoffman         Expires April 02, 2013                 [Page 8]
Internet-Draft                TLD variants                  October 2012

   [VRSNCHAR]
              "Character Variants", .

   [VRSNLANG]
              "Scripts and Languages", .

   [VRSNRULES]
              "Registration Rules", .

Authors' Addresses

   John Levine
   Taughannock Networks
   PO Box 727
   Trumansburg, NY 14886
   
   Phone: +1 831 480 2300
   Email: standards@taugh.com
   URI:   http://jl.ly

   Paul Hoffman
   VPN Consortium
   
   Email: paul.hoffman@vpnc.org>

Levine & Hoffman         Expires April 02, 2013                 [Page 9]