Those Troublesome Characters: A Registry of Unicode Code Points Needing Special Consideration When Used in Network Identifiers

Document Type Expired Internet-Draft (individual)
Authors Asmus Freytag, John Klensin, Andrew Sullivan
Last updated 2019-01-01 (latest revision 2018-06-30)
Stream (None)
Intended RFC status (None)
Expired & archived
Stream state (No stream defined)
Consensus Boilerplate Unknown
RFC Editor Note (None)
IESG state Expired
Telechat date
Responsible AD (None)
Send notices to (None)

This Internet-Draft is no longer active. A copy of the expired Internet-Draft can be found at


Unicode's design goal is to be the universal character set for all applications. That goal entails the inclusion of a very large number of characters. Unicode is also focused on written language in general; special provisions have always been needed for identifiers. The sheer size of the repertoire increases the possibility of accidental or intentional use of characters that can cause confusion among users, particularly where linguistic context is ambiguous, unavailable, or impossible to determine. A registry of code points that can sometimes be especially problematic may be useful to guide system administrators in setting parameters for allowable code points or combinations in an identifier system, and to aid applications in creating security aids for users.
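
As a rough illustration of how an administrator or application might consult such a registry, the following Python sketch flags code points in a candidate identifier. The registry contents, the names TROUBLESOME, flag_code_points, and is_allowed, and the reject-on-any-hit policy are all assumptions made for this example; they are not taken from the draft, and a real deployment would load entries from the actual registry and apply its own policy.

```python
import unicodedata

# Hypothetical registry: code point -> short note on why it may need care.
# These entries are illustrative assumptions, not data from the draft.
TROUBLESOME = {
    0x0131: "dotless i, easily confused with Latin i",
    0x0430: "Cyrillic a, visually close to Latin a",
    0x200C: "zero width non-joiner, invisible in most renderings",
    0x200D: "zero width joiner, invisible in most renderings",
}


def flag_code_points(identifier):
    """Return (index, code point, character name, note) for each registry hit."""
    hits = []
    for index, char in enumerate(identifier):
        note = TROUBLESOME.get(ord(char))
        if note is not None:
            name = unicodedata.name(char, "<unnamed>")
            hits.append((index, f"U+{ord(char):04X}", name, note))
    return hits


def is_allowed(identifier):
    """Assumed policy: reject an identifier containing any flagged code point."""
    return not flag_code_points(identifier)
```

An application could instead use the list returned by flag_code_points to warn the user while still accepting the identifier, which is closer to the "security aids" use the abstract mentions.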


Asmus Freytag
John Klensin
Andrew Sullivan

(Note: The e-mail addresses provided for the authors of this Internet-Draft may no longer be valid.)