Preparation of Internationalized Strings ("stringprep")
RFC 3454
Field | Value
---|---
Type | RFC - Proposed Standard (January 2003; Errata)
Obsoleted by | RFC 7564
Was | draft-hoffman-stringprep (individual in int area)
Authors | Paul Hoffman, Marc Blanchet
Last updated | 2020-01-21
Stream | IETF
WG state | (None)
Document shepherd | No shepherd assigned
IESG state | RFC 3454 (Proposed Standard)
Consensus Boilerplate | Unknown
Responsible AD | Erik Nordmark
IESG note | published
Send notices to | (None)
Network Working Group                                         P. Hoffman
Request for Comments: 3454                                    IMC & VPNC
Category: Standards Track                                    M. Blanchet
                                                                Viagenie
                                                           December 2002


          Preparation of Internationalized Strings ("stringprep")

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

Abstract

   This document describes a framework for preparing Unicode text
   strings in order to increase the likelihood that string input and
   string comparison work in ways that make sense for typical users
   throughout the world.  The stringprep protocol is useful for protocol
   identifier values, company and personal names, internationalized
   domain names, and other text strings.

   This document does not specify how protocols should prepare text
   strings.  Protocols must create profiles of stringprep in order to
   fully specify the processing options.

Table of Contents

   1. Introduction
     1.1 Terminology
     1.2 Using stringprep in protocols
   2. Preparation Overview
   3. Mapping
     3.1 Commonly mapped to nothing
     3.2 Case folding
   4. Normalization
   5. Prohibited Output
     5.1 Space characters
     5.2 Control characters
     5.3 Private use
     5.4 Non-character code points
     5.5 Surrogate codes
     5.6 Inappropriate for plain text
     5.7 Inappropriate for canonical representation
     5.8 Change display properties or deprecated
     5.9 Tagging characters
   6. Bidirectional Characters
   7. Unassigned Code Points in Stringprep Profiles
     7.1 Categories of code points
     7.2 Reasons for difference between stored strings and queries
     7.3 Versions of applications and stored strings
   8. References
     8.1 Normative references
     8.2 Informative references
   9. Security Considerations
     9.1 Stringprep-specific security considerations
     9.2 Generic Unicode security considerations
   10. IANA Considerations
   11. Acknowledgements
   A. Unicode repertoires
     A.1 Unassigned code points in Unicode 3.2
   B. Mapping Tables
     B.1 Commonly mapped to nothing
     B.2 Mapping for case-folding used with NFKC
     B.3 Mapping for case-folding used with no normalization
   C. Prohibition tables
     C.1 Space characters
       C.1.1 ASCII space characters
       C.1.2 Non-ASCII space characters
     C.2 Control characters
       C.2.1 ASCII control characters
       C.2.2 Non-ASCII control characters
     C.3 Private use
     C.4 Non-character code points
     C.5 Surrogate codes
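
As a rough illustration of how a profile might chain the steps named in the table of contents (mapping, normalization, prohibited output), the following Python sketch uses the standard-library `stringprep` module, whose functions are named after the appendix tables. The function name `prepare` and the particular selection of tables are assumptions made for this example only; they do not correspond to any profile defined by this document. The bidirectional and unassigned-code-point checks of sections 6 and 7 are omitted.

```python
import stringprep
from unicodedata import ucd_3_2_0  # Unicode 3.2 database, matching the tables

# Hypothetical profile choice: which prohibition tables to enforce.
PROHIBITED_TABLES = (
    stringprep.in_table_c12,  # C.1.2 Non-ASCII space characters
    stringprep.in_table_c21,  # C.2.1 ASCII control characters
    stringprep.in_table_c22,  # C.2.2 Non-ASCII control characters
    stringprep.in_table_c3,   # C.3 Private use
    stringprep.in_table_c4,   # C.4 Non-character code points
    stringprep.in_table_c5,   # C.5 Surrogate codes
)


def prepare(text: str) -> str:
    """Illustrative only: map, normalize, then reject prohibited output."""
    # Step 1 (section 3): drop characters listed in table B.1 and
    # case-fold the rest with table B.2 (the variant meant for NFKC).
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in text
        if not stringprep.in_table_b1(ch)
    )
    # Step 2 (section 4): normalize with NFKC.
    normalized = ucd_3_2_0.normalize("NFKC", mapped)
    # Step 3 (section 5): fail if any prohibited code point remains.
    for ch in normalized:
        if any(in_table(ch) for in_table in PROHIBITED_TABLES):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")
    return normalized


print(prepare("Ex\u00adAMPLE"))
```

Under these assumptions the soft hyphen (U+00AD, listed in table B.1) is removed and the remaining letters are case-folded, while a string containing an ASCII control character such as U+0007 is rejected with `ValueError`. A real profile would state its own table selection and would also address sections 6 and 7.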