Japanese Character Encoding for Internet Messages
RFC 1468

Document Type RFC - Informational (June 1993)
Authors Erik M. van der Poel, Mark Crispin, Dr. Jun Murai Ph.D.
Last updated 2013-03-02
RFC stream Internet Engineering Task Force (IETF)
Additional resources ietf.cnri.reston.va.us%3A~/ietf-mail-archive/822ext/%2A
Network Working Group                                           J. Murai
Request for Comments: 1468                               Keio University
                                                              M. Crispin
                                                       Panda Programming
                                                         E. van der Poel
                                                               June 1993

           Japanese Character Encoding for Internet Messages

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard.  Distribution of this memo is
   unlimited.

Introduction

   This document describes the encoding used in electronic mail [RFC822]
   and network news [RFC1036] messages in several Japanese networks. It
   was first specified by and used in JUNET [JUNET]. The encoding is now
   also widely used in Japanese IP communities.

   The name given to this encoding is "ISO-2022-JP", which is intended
   to be used in the "charset" parameter field of MIME headers (see
   [MIME1] and [MIME2]).

Description

   The text starts in ASCII [ASCII], and switches to Japanese characters
   through an escape sequence. For example, the escape sequence ESC $ B
   (three bytes, hexadecimal values: 1B 24 42) indicates that the bytes
   following this escape sequence are Japanese characters, which are
   encoded in two bytes each.  To switch back to ASCII, the escape
   sequence ESC ( B is used.
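
   The switching described above can be observed with a short Python
   sketch.  It assumes the "iso2022_jp" codec from the Python standard
   library, which is one implementation of this encoding and is not
   part of this memo; the sample string and the constant names are
   illustrative only.

```python
# Sample text: "Hello, " followed by three Kanji and "!".
text = "Hello, \u65e5\u672c\u8a9e!"

# Encode with Python's standard-library ISO-2022-JP codec (an assumption
# of this sketch; any conforming encoder could be substituted).
encoded = text.encode("iso2022_jp")

ESC_TO_JISX0208_1983 = b"\x1b$B"  # ESC $ B, hexadecimal 1B 24 42
ESC_TO_ASCII = b"\x1b(B"          # ESC ( B, hexadecimal 1B 28 42

# The Kanji run is bracketed by the two escape sequences, and every
# octet of the result is a 7-bit value.
print(encoded)
print(all(octet < 0x80 for octet in encoded))
```

   Each Kanji contributes two bytes between the escape sequences, so
   the three Kanji above occupy six bytes.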

   The following table gives the escape sequences and the character sets
   used in ISO-2022-JP messages. The ISOREG number is the registration
   number in ISO's registry [ISOREG].

       Esc Seq    Character Set                  ISOREG

       ESC ( B    ASCII                             6
       ESC ( J    JIS X 0201-1976 ("Roman" set)    14
       ESC $ @    JIS X 0208-1978                  42
       ESC $ B    JIS X 0208-1983                  87
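
   The table can be turned into a small lookup that splits a byte
   string into runs by character set.  This is a minimal sketch; the
   names ESCAPES and split_segments are hypothetical and do not come
   from this memo.

```python
import re

# Escape sequence -> (character set name, bytes per character).
ESCAPES = {
    b"\x1b(B": ("ASCII", 1),
    b"\x1b(J": ('JIS X 0201-1976 ("Roman" set)', 1),
    b"\x1b$@": ("JIS X 0208-1978", 2),
    b"\x1b$B": ("JIS X 0208-1983", 2),
}

# Matches exactly the four escape sequences in the table.
_ESC_RE = re.compile(rb"\x1b(?:\([BJ]|\$[@B])")


def split_segments(data: bytes):
    """Yield (character set name, payload bytes); text starts in ASCII."""
    charset = "ASCII"
    pos = 0
    for match in _ESC_RE.finditer(data):
        if match.start() > pos:
            yield charset, data[pos:match.start()]
        charset = ESCAPES[match.group()][0]
        pos = match.end()
    if pos < len(data):
        yield charset, data[pos:]


for name, payload in split_segments(b'Hi \x1b$B$"\x1b(B!'):
    print(name, payload)
```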

   Note that JIS X 0208 was called JIS C 6226 until the name was changed
   on March 1st, 1987. Likewise, JIS C 6220 was renamed JIS X 0201.

   The "Roman" character set of JIS X 0201 [JISX0201] is identical to
   ASCII except for backslash (\) and tilde (~). The backslash is
   replaced by the Yen sign, and the tilde is replaced by overline. This
   set is Japan's national variant of ISO 646 [ISO646].
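
   The difference between the two single-byte sets can be illustrated
   as follows.  The sketch assumes the behaviour of CPython's
   "iso2022_jp" codec, which maps the two differing positions to
   U+00A5 (Yen sign) and U+203E (overline); other decoders may map
   them differently.

```python
# The same two octets (0x5C and 0x7E) under each single-byte set.
as_ascii = b"\\~".decode("iso2022_jp")              # text starts in ASCII
as_roman = b"\x1b(J\\~\x1b(B".decode("iso2022_jp")  # ESC ( J selects "Roman"

print(as_ascii)  # backslash and tilde
print(as_roman)  # Yen sign and overline
```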

   The JIS X 0208 [JISX0208] character sets consist of Kanji, Hiragana,
   Katakana and some other symbols and characters. Each character takes
   up two bytes.

   For further details about the JIS Japanese national character set
   standards, refer to [JISX0201] and [JISX0208].  For further
   information about the escape sequences, see [ISO2022] and [ISOREG].

   If there are JIS X 0208 characters on a line, there must be a switch
   to ASCII or to the "Roman" set of JIS X 0201 before the end of the
   line (i.e., before the CRLF). This means that the next line starts in
   the character set that was switched to before the end of the previous
   line.

   Also, the text must end in ASCII.

   Other restrictions are given in the Formal Syntax below.
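
   The two rules above (a single-byte set must be selected before each
   CRLF, and the text must end in ASCII) can be checked with a small
   state scan.  This is a sketch only; the function name
   follows_line_rules is hypothetical, and the check does not cover
   the other restrictions of the Formal Syntax.

```python
import re

_ESC_RE = re.compile(rb"\x1b(?:\([BJ]|\$[@B])")
_ASCII = b"\x1b(B"
_SINGLE_BYTE = {b"\x1b(B", b"\x1b(J"}


def follows_line_rules(body: bytes) -> bool:
    """Return True if no line ends in JIS X 0208 and the text ends in ASCII."""
    mode = _ASCII  # the text starts in ASCII
    for line in body.split(b"\r\n"):
        for match in _ESC_RE.finditer(line):
            mode = match.group()
        if mode not in _SINGLE_BYTE:
            return False  # still in a double-byte set at the end of the line
    return mode == _ASCII


print(follows_line_rules(b'\x1b$B$"\x1b(B\r\nok'))
print(follows_line_rules(b'\x1b$B$"\r\n\x1b(Bok'))
```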

Formal Syntax

   The notational conventions used here are identical to those used in
   RFC 822 [RFC822].

   The * (asterisk) convention is as follows:

       l*m something

   meaning at least l and at most m somethings, with l and m taking
   default values of 0 and infinity, respectively.

   message             = headers 1*( CRLF *single-byte-char *segment
                         single-byte-seq *single-byte-char )
                                           ; see also [MIME1] "body-part"
                                           ; note: must end in ASCII

   headers             = <see [RFC822] "fields" and [MIME1] "body-part">

   segment             = single-byte-segment / double-byte-segment

   single-byte-segment = single-byte-seq 1*single-byte-char

   double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )

   single-byte-seq     = ESC "(" ( "B" / "J" )

   double-byte-seq     = ESC "$" ( "@" / "B" )

   CRLF                = CR LF

                                                    ; ( Octal, Decimal.)

   ESC                 = <ISO 2022 ESC, escape>     ; (    33,      27.)

   SI                  = <ISO 2022 SI, shift-in>    ; (    17,      15.)

   SO                  = <ISO 2022 SO, shift-out>   ; (    16,      14.)

   CR                  = <ASCII CR, carriage return>; (    15,      13.)

   LF                  = <ASCII LF, linefeed>       ; (    12,      10.)

   one-of-94           = <any one of 94 values>     ; (41-176, 33.-126.)

   7BIT                = <any 7-bit value>          ; ( 0-177,  0.-127.)

   single-byte-char    = <any 7BIT, including bare CR & bare LF, but NOT
                          including CRLF, and not including ESC, SI, SO>
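
   Two of the productions above translate directly into regular
   expressions.  The following sketch covers only "single-byte-segment"
   and "double-byte-segment"; the names are illustrative, and the
   condition that a single-byte-char run contains no CRLF pair is not
   expressed by the character class used here.

```python
import re

# one-of-94: octets 0x21-0x7E (41-176 octal).
ONE_OF_94 = rb"[\x21-\x7e]"
# single-byte-char: any 7-bit value except SO (0x0E), SI (0x0F), ESC (0x1B).
SINGLE_BYTE_CHAR = rb"[\x00-\x0d\x10-\x1a\x1c-\x7f]"

# double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )
DOUBLE_BYTE_SEGMENT = re.compile(rb"\x1b\$[@B](?:" + ONE_OF_94 + rb"{2})+")
# single-byte-segment = single-byte-seq 1*single-byte-char
SINGLE_BYTE_SEGMENT = re.compile(rb"\x1b\([BJ]" + SINGLE_BYTE_CHAR + rb"+")


def is_double_byte_segment(data: bytes) -> bool:
    return DOUBLE_BYTE_SEGMENT.fullmatch(data) is not None


def is_single_byte_segment(data: bytes) -> bool:
    return SINGLE_BYTE_SEGMENT.fullmatch(data) is not None


print(is_double_byte_segment(b'\x1b$B$"'))  # one complete two-byte character
print(is_double_byte_segment(b"\x1b$B$"))   # odd number of bytes
```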

MIME Considerations

   The name given to the JUNET character encoding is "ISO-2022-JP". This
   name is intended to be used in MIME messages as follows:

       Content-Type: text/plain; charset=iso-2022-jp

   The ISO-2022-JP encoding is already in 7-bit form, so it is not
   necessary to use a Content-Transfer-Encoding header. It should be
   noted that applying the Base64 or Quoted-Printable encoding will
   render the message unreadable in current JUNET software.

   ISO-2022-JP may also be used in MIME Part 2 headers.  The "B"
   encoding should be used with ISO-2022-JP text.
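
   As an illustration of both points, the following sketch builds a
   message with Python's standard "email" package.  That package and
   its defaults are assumptions of this example and are not part of
   this memo; the sample strings are illustrative.

```python
from email.header import Header, decode_header
from email.mime.text import MIMEText

subject_text = "\u65e5\u672c\u8a9e"         # three Kanji
body_text = "\u3053\u3093\u306b\u3061\u306f"  # five Hiragana

# Header: an encoded-word using the "B" encoding, as recommended above.
encoded_subject = Header(subject_text, "iso-2022-jp").encode()

# Body: text/plain with charset=iso-2022-jp.  The payload is already in
# 7-bit form, so no Base64 or Quoted-Printable encoding is applied.
msg = MIMEText(body_text, "plain", "iso-2022-jp")
msg["Subject"] = encoded_subject

print(encoded_subject)
print(msg["Content-Transfer-Encoding"])

# Decoding the encoded-word recovers the original subject.
raw, charset = decode_header(encoded_subject)[0]
print(raw.decode(charset) == subject_text)
```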

Background Information

   The JUNET encoding was described in the JUNET User's Guide [JUNET]
   (JUNET Riyou No Tebiki Dai Ippan).

   The encoding is based on the particular usage of ISO 2022 announced
   by 4/1 (see [ISO2022] for details). However, the escape sequence
   normally used for this announcement is not included in ISO-2022-JP
   messages.

   The Kana set of JIS X 0201 is not used in ISO-2022-JP messages.

   In the past, some systems erroneously used the escape sequence ESC (
   H in JUNET messages. This escape sequence is officially registered
   for a Swedish character set [ISOREG], and should not be used in ISO-
   2022-JP messages.

   Some systems do not distinguish between ESC ( B and ESC ( J or
   between ESC $ @ and ESC $ B for display. However, when relaying a
   message to another system, the escape sequences must not be altered
   in any way.
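
   One way a relay can alter escape sequences unintentionally is by
   decoding and re-encoding the text.  The sketch below assumes
   CPython's "iso2022_jp" codec, which accepts ESC $ @ on input but
   designates JIS X 0208-1983 on output; a relay should therefore pass
   the original bytes through unchanged.

```python
# One Hiragana character designated with ESC $ @ (JIS X 0208-1978).
original = b'\x1b$@$"\x1b(B'

# Decoding and re-encoding keeps the character but not the designation.
decoded = original.decode("iso2022_jp")
reencoded = decoded.encode("iso2022_jp")

print(original)
print(reencoded)
print(reencoded == original)
```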

   The human user (not implementor) should try to keep lines within 80
   display columns, or, preferably, within 75 (or so) columns, to allow
   insertion of ">" at the beginning of each line in excerpts. Each JIS
   X 0208 character takes up two columns, and the escape sequences do
   not take up any columns. The implementor is reminded that JIS X 0208
   characters take up two bytes and should not be split in the middle to
   break lines for displaying, etc.
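
   The column arithmetic can be sketched on decoded text, where a
   two-byte character is a single unit and so cannot be split.  The
   example assumes that characters with East Asian Width "W" or "F"
   (as reported by Python's unicodedata module) occupy two columns;
   the function names are hypothetical.

```python
import unicodedata


def char_width(ch: str) -> int:
    """Two columns for wide or fullwidth characters, otherwise one."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def display_width(text: str) -> int:
    return sum(char_width(ch) for ch in text)


def wrap_columns(text: str, limit: int = 75) -> list:
    """Break text into lines of at most `limit` display columns."""
    lines, current, width = [], "", 0
    for ch in text:
        w = char_width(ch)
        if current and width + w > limit:
            lines.append(current)
            current, width = "", 0
        current += ch
        width += w
    if current:
        lines.append(current)
    return lines


print(display_width("abc\u65e5\u672c"))
print(wrap_columns("\u3042" * 5, limit=4))
```

   Escape sequences never appear in the decoded text, which matches
   the statement that they take up no columns.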

   The JIS X 0208 standard was revised in 1990, to add two characters at
   the end of the table. Although ISO 2022 specifies special additional
   escape sequences to indicate the use of revised character sets, it is
   suggested here not to make use of this special escape sequence in
   ISO-2022-JP text, even if the two characters added to JIS X 0208 in
   1990 are used.

   For further information about Japanese character encodings such as PC
   codes, FTP locations of implementations, etc., see "Electronic
   Handling of Japanese Text" [JPN.INF].

References

   [ASCII] American National Standards Institute, "Coded character set
   -- 7-bit American national standard code for information
   interchange", ANSI X3.4-1986.

   [ISO646] International Organization for Standardization (ISO),
   "Information technology -- ISO 7-bit coded character set for
   information interchange", International Standard, Ref. No. ISO/IEC
   646:1991.

   [ISO2022] International Organization for Standardization (ISO),
   "Information processing -- ISO 7-bit and 8-bit coded character sets
   -- Code extension techniques", International Standard, Ref. No. ISO
   2022-1986 (E).

   [ISOREG] International Organization for Standardization (ISO),
   "International Register of Coded Character Sets To Be Used With
   Escape Sequences".

   [JISX0201] Japanese Standards Association, "Code for Information
   Interchange", JIS X 0201-1976.

   [JISX0208] Japanese Standards Association, "Code of the Japanese
   graphic character set for information interchange", JIS X 0208-1978,
   -1983 and -1990.

   [JPN.INF] Ken R. Lunde <lunde@adobe.com>, "Electronic Handling of
   Japanese Text", March 1992,
   msi.umn.edu(128.101.24.1):pub/lunde/japan[123].inf

   [JUNET] JUNET Riyou No Tebiki Sakusei Iin Kai (JUNET User's Guide
   Drafting Committee), "JUNET Riyou No Tebiki (Dai Ippan)" ("JUNET
   User's Guide (First Edition)"), February 1988.

   [MIME1] Borenstein N., and N. Freed, "MIME (Multipurpose
   Internet Mail Extensions): Mechanisms for Specifying and
   Describing the Format of Internet Message Bodies", RFC 1341,
   Bellcore, Innosoft, June 1992.

   [MIME2] Moore, K., "Representation of Non-ASCII Text in Internet
   Message Headers", RFC 1342, University of Tennessee, June 1992.

   [RFC822] Crocker, D., "Standard for the Format of ARPA Internet
   Text Messages", STD 11, RFC 822, UDEL, August 1982.

   [RFC1036] Horton M., and R. Adams, "Standard for Interchange of USENET
   Messages", RFC 1036, AT&T Bell Laboratories, Center for Seismic
   Studies, December 1987.

Acknowledgements

   Many people assisted in drafting this document. The authors wish to
   thank in particular Akira Kato, Masahiro Sekiguchi and Ken'ichi
   Handa.

Security Considerations

   Security issues are not discussed in this memo.

   might have multiple locale-specific values in some client
   registrations to facilitate use in different locations.

   To specify the languages and scripts, BCP47 [RFC5646] language tags
   are added to client metadata member names, delimited by a #
   character.  Since JSON [RFC7159] member names are case sensitive, it
   is RECOMMENDED that language tag values used in Claim Names be
   spelled using the character case with which they are registered in
   the IANA Language Subtag Registry [IANA.Language].  In particular,
   normally language names are spelled with lowercase characters, region
   names are spelled with uppercase characters, and script names are
   spelled with mixed case characters.  However, since BCP47 language
   tag values are case insensitive, implementations SHOULD interpret the
   language tag values supplied in a case insensitive manner.  Per the
   recommendations in BCP47, language tag values used in metadata member
   names should only be as specific as necessary.  For instance, using
   "fr" might be sufficient in many contexts, rather than "fr-CA" or
   "fr-FR".

Richer, et al.         Expires September 25, 2015              [Page 11]
Internet-Draft       OAuth 2.0 Dynamic Registration           March 2015

   For example, a client could represent its name in English as
   ""client_name#en": "My Client"" and its name in Japanese as
   ""client_name#ja-Jpan-JP":
   "\u30AF\u30E9\u30A4\u30A2\u30F3\u30C8\u540D"" within the same
   registration request.  The authorization server MAY display any or
   all of these names to the resource owner during the authorization
   step, choosing which name to display based on system configuration,
   user preferences or other factors.

   If any human-readable field is sent without a language tag, parties
   using it MUST NOT make any assumptions about the language, character
   set, or script of the string value, and the string value MUST be used
   as-is wherever it is presented in a user interface.  To facilitate
   interoperability, it is RECOMMENDED that clients and servers use a
   human-readable field without any language tags in addition to any
   language-specific fields, and it is RECOMMENDED that any human-
   readable fields sent without language tags contain values suitable
   for display on a wide variety of systems.

   Implementer's Note: Many JSON libraries make it possible to reference
   members of a JSON object as members of an object construct in the
   native programming environment of the library.  However, while the
   "#" character is a valid character inside of a JSON object's member
   names, it is not a valid character for use in an object member name
   in many programming environments.  Therefore, implementations will
   need to use alternative access forms for these claims.  For instance,
   in JavaScript, if one parses the JSON as follows, "var j =
   JSON.parse(json);", then as a workaround the member
   "client_name#en-us" can be accessed using the JavaScript syntax
   "j["client_name#en-us"]".
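
   The language-tag handling described in this section can be sketched
   in Python, where a parsed JSON object is a dictionary and the "#"
   member names are reached by subscript.  The function name
   pick_localized and the preference list are hypothetical; the
   fallback to the untagged member follows the recommendation above.

```python
import json


def pick_localized(metadata: dict, field: str, preferred_tags: list):
    """Return the value of `field` for the first matching language tag.

    Tags are compared case-insensitively; if none matches, the member
    without a language tag is returned (or None if it is absent).
    """
    tagged = {}
    for key, value in metadata.items():
        name, sep, tag = key.partition("#")
        if name == field and sep:
            tagged[tag.lower()] = value
    for tag in preferred_tags:
        if tag.lower() in tagged:
            return tagged[tag.lower()]
    return metadata.get(field)


registration = json.loads(
    '{"client_name": "My Client",'
    ' "client_name#ja-Jpan-JP":'
    ' "\\u30AF\\u30E9\\u30A4\\u30A2\\u30F3\\u30C8\\u540D"}'
)

print(pick_localized(registration, "client_name", ["ja-jpan-jp"]))
print(pick_localized(registration, "client_name", ["fr"]))
```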

2.3.  Software Statement

   A software statement is a JSON Web Token (JWT) [JWT] that asserts
   metadata values about the client software as a bundle.  A set of
   claims that can be used in a software statement is defined in
   Section 2.  When presented to the authorization server as part of a
   client registration request, the software statement MUST be digitally
   signed or MACed using JWS [JWS] and MUST contain an "iss" (issuer)
   claim denoting the party attesting to the claims in the software
   statement.  It is RECOMMENDED that software statements be digitally
   signed using the "RS256" signature algorithm, although particular
   applications MAY specify the use of different algorithms.  It is
   RECOMMENDED that software statements contain the "software_id" claim
   to allow authorization servers to correlate different instances of
   software using the same software statement.

   For example, a software statement could contain the following claims:

   {
    "software_id": "4NRB1-0XZABZI9E6-5SM3R",
    "client_name": "Example Statement-based Client",
    "client_uri": "https://client.example.net/"
   }

   The following non-normative example JWT includes these claims and has
   been asymmetrically signed using RS256:

   Line breaks are for display purposes only

   eyJhbGciOiJSUzI1NiJ9.
   eyJzb2Z0d2FyZV9pZCI6IjROUkIxLTBYWkFCWkk5RTYtNVNNM1IiLCJjbGll
   bnRfbmFtZSI6IkV4YW1wbGUgU3RhdGVtZW50LWJhc2VkIENsaWVudCIsImNs
   aWVudF91cmkiOiJodHRwczovL2NsaWVudC5leGFtcGxlLm5ldC8ifQ.
   GHfL4QNIrQwL18BSRdE595T9jbzqa06R9BT8w409x9oIcKaZo_mt15riEXHa
   zdISUvDIZhtiyNrSHQ8K4TvqWxH6uJgcmoodZdPwmWRIEYbQDLqPNxREtYn0
   5X3AR7ia4FRjQ2ojZjk5fJqJdQ-JcfxyhK-P8BAWBd6I2LLA77IG32xtbhxY
   fHX7VhuU5ProJO8uvu3Ayv4XRhLZJY4yKfmyjiiKiPNe-Ia4SMy_d_QSWxsk
   U5XIQl5Sa2YRPMbDRXttm2TfnZM1xx70DoYi8g6czz-CPGRi4SW_S2RKHIJf
   IjoI3zTJ0Y2oe0_EJAiXbL6OyF9S5tKxDXV8JIndSA
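
   The first two segments of the example above can be inspected with a
   few lines of Python.  This sketch only decodes the base64url text;
   it does not verify the RS256 signature, which an authorization
   server would need to do before trusting any claim.  The helper name
   decode_segment is hypothetical.

```python
import base64
import json


def decode_segment(segment: str) -> dict:
    """Decode one base64url JWT segment (padding restored) as JSON."""
    padding = "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment + padding))


# Header and payload segments copied from the example, line wraps removed.
header_segment = "eyJhbGciOiJSUzI1NiJ9"
payload_segment = (
    "eyJzb2Z0d2FyZV9pZCI6IjROUkIxLTBYWkFCWkk5RTYtNVNNM1IiLCJjbGll"
    "bnRfbmFtZSI6IkV4YW1wbGUgU3RhdGVtZW50LWJhc2VkIENsaWVudCIsImNs"
    "aWVudF91cmkiOiJodHRwczovL2NsaWVudC5leGFtcGxlLm5ldC8ifQ"
)

header = decode_segment(header_segment)
claims = decode_segment(payload_segment)
print(header)
print(claims)
```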

   The means by which a client or developer obtains a software statement
   are outside the scope of this specification.  Some common methods
   could include a client developer generating a client-specific JWT by
   registering with a software API publisher to obtain a software
   statement for a class of clients.  The software statement is
   typically distributed with all instances of a client application.

   The criteria by which authorization servers determine whether to
   trust and utilize the information in a software statement are beyond
   the scope of this specification.

   In some cases, authorization servers MAY choose to accept a software
   statement value directly as a client identifier in an authorization
   request, without a prior dynamic client registration having been
   performed.  The circumstances under which an authorization server
   would do so, and the specific software statement characteristics
   required in this case, are beyond the scope of this specification.

3.  Client Registration Endpoint

   The client registration endpoint is an OAuth 2.0 endpoint defined in
   this document that is designed to allow a client to be registered
   with the authorization server.  The client registration endpoint MUST
   accept HTTP POST messages with request parameters encoded in the
   entity body using the "application/json" format.  The client
   registration endpoint MUST be protected by a transport-layer security
   mechanism, as described in Section 5.

   The client registration endpoint MAY be an OAuth 2.0 protected
   resource and accept an initial access token in the form of an OAuth
   2.0 [RFC6749] access token to limit registration to only previously
   authorized parties.  The method by which the initial access token is
   obtained by the client or developer is generally out-of-band and is
   out of scope for this specification.  The method by which the initial
   access token is verified and validated by the client registration
   endpoint is out of scope for this specification.

   To support open registration and facilitate wider interoperability,
   the client registration endpoint SHOULD allow registration requests
   with no authorization (which is to say, with no initial access token
   in the request).  These requests MAY be rate-limited or otherwise
   limited to prevent a denial-of-service attack on the client
   registration endpoint.

3.1.  Client Registration Request

   This operation registers a client with the authorization server.  The
   authorization server assigns this client a unique client identifier,
   optionally assigns a client secret, and associates the metadata
   provided in the request with the issued client identifier.  The
   request includes any client metadata parameters being specified for
   the client during the registration.  The authorization server MAY
   provision default values for any items omitted in the client
   metadata.

   To register, the client or developer sends an HTTP POST to the client
   registration endpoint with a content type of "application/json".  The
   HTTP Entity Payload is a JSON [RFC7159] document consisting of a JSON
   object and all requested client metadata values as top-level members
   of that JSON object.

   For example, if the server supports open registration (with no
   initial access token), the client could send the following
   registration request to the client registration endpoint:

   The following is a non-normative example request not using an initial
   access token (with line wraps within values for display purposes
   only):

     POST /register HTTP/1.1
     Content-Type: application/json
     Accept: application/json
     Host: server.example.com

     {
      "redirect_uris":[
        "https://client.example.org/callback",
        "https://client.example.org/callback2"],
      "client_name":"My Example Client",
      "client_name#ja-Jpan-JP":
         "\u30AF\u30E9\u30A4\u30A2\u30F3\u30C8\u540D",
      "token_endpoint_auth_method":"client_secret_basic",
      "logo_uri":"https://client.example.org/logo.png",
      "jwks_uri":"https://client.example.org/my_public_keys.jwks",
      "example_extension_parameter": "example_value"
     }

   Alternatively, if the server supports authorized registration, the
   developer or the client will be provisioned with an initial access
   token.  (The method by which the initial access token is obtained is
   out of scope for this specification.)  The developer or client sends
   the following authorized registration request to the client
   registration endpoint.  Note that the initial access token is sent in
   this example as an OAuth 2.0 Bearer Token [RFC6750], but any OAuth
   2.0 token type could be used by an authorization server.

   The following is a non-normative example request using an initial
   access token and registering a JWK set by value (with line wraps
   within values for display purposes only):

     POST /register HTTP/1.1
     Content-Type: application/json
     Accept: application/json
     Authorization: Bearer ey23f2.adfj230.af32-developer321
     Host: server.example.com

     {
      "redirect_uris":["https://client.example.org/callback",
         "https://client.example.org/callback2"],
      "client_name":"My Example Client",
      "client_name#ja-Jpan-JP":
         "\u30AF\u30E9\u30A4\u30A2\u30F3\u30C8\u540D",
      "token_endpoint_auth_method":"client_secret_basic",
      "policy_uri":"https://client.example.org/policy.html",
      "jwks":{"keys":[{
         "e": "AQAB",
         "n": "nj3YJwsLUFl9BmpAbkOswCNVx17Eh9wMO-_AReZwBqfaWFcfG
   HrZXsIV2VMCNVNU8Tpb4obUaSXcRcQ-VMsfQPJm9IzgtRdAY8NN8Xb7PEcYyk
   lBjvTtuPbpzIaqyiUepzUXNDFuAOOkrIol3WmflPUUgMKULBN0EUd1fpOD70p
   RM0rlp_gg_WNUKoW1V-3keYUJoXH9NztEDm_D2MQXj9eGOJJ8yPgGL8PAZMLe
   2R7jb9TxOCPDED7tY_TU4nFPlxptw59A42mldEmViXsKQt60s1SLboazxFKve
   qXC_jpLUt22OC6GUG63p-REw-ZOr3r845z50wMuzifQrMI9bQ",
         "kty": "RSA"
      }]},
      "example_extension_parameter": "example_value"
     }
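
   Both request forms can be assembled with the Python standard
   library.  The following sketch builds the request object without
   sending it; the function name, the endpoint URL, and the token
   value are illustrative assumptions, and a real client would also
   need to handle the response.

```python
import json
import urllib.request


def build_registration_request(endpoint, metadata, initial_access_token=None):
    """Build an HTTP POST carrying client metadata as a JSON object."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    if initial_access_token is not None:
        # Authorized registration: token sent as an OAuth 2.0 Bearer Token.
        headers["Authorization"] = "Bearer " + initial_access_token
    body = json.dumps(metadata).encode("utf-8")
    return urllib.request.Request(
        endpoint, data=body, headers=headers, method="POST"
    )


metadata = {
    "redirect_uris": ["https://client.example.org/callback"],
    "client_name": "My Example Client",
    "client_name#ja-Jpan-JP": "\u30af\u30e9\u30a4\u30a2\u30f3\u30c8\u540d",
    "token_endpoint_auth_method": "client_secret_basic",
}

open_request = build_registration_request(
    "https://server.example.com/register", metadata
)
authorized_request = build_registration_request(
    "https://server.example.com/register", metadata, "example-token"
)
print(open_request.get_method(), open_request.full_url)
```

   Because json.dumps escapes non-ASCII characters by default, the
   Japanese client name is serialized in the same "\uXXXX" form used
   in the examples above.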

3.1.1.  Client Registration Request Using a Software Statement

   In addition to JSON elements, client metadata values MAY also be
   provided in a software statement, as described in Section 2.3.  The
   authorization server MAY ignore the software statement if it does not
   support this feature.  If the server supports software statements,
   client metadata values conveyed in the software statement MUST take
   precedence over those conveyed using plain JSON elements.

   Software statements are included in the requesting JSON object using
   this OPTIONAL member:

   software_statement
      A software statement containing client metadata values about the
      client software as claims.
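
   The precedence rule can be sketched as a dictionary merge.  The
   example assumes that the software statement has already been
   verified and that its claims have been decoded into a dictionary;
   the function name effective_metadata is hypothetical, and the
   merge is a simplification that copies every claim.

```python
def effective_metadata(request_members: dict, statement_claims: dict) -> dict:
    """Combine plain JSON members with software statement claims.

    Claims from the software statement take precedence over members
    of the same name supplied directly in the request.
    """
    merged = {
        name: value
        for name, value in request_members.items()
        if name != "software_statement"
    }
    merged.update(statement_claims)
    return merged


request_members = {
    "redirect_uris": ["https://client.example.org/callback"],
    "client_name": "Name sent as a plain member",
    "software_statement": "header.payload.signature",
}
statement_claims = {
    "software_id": "4NRB1-0XZABZI9E6-5SM3R",
    "client_name": "Example Statement-based Client",
}

print(effective_metadata(request_members, statement_claims))
```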

   In the following example, some registration parameters are conveyed
   as claims in a software statement from the example in Section 2.3,
   while some values specific to the client instance are conveyed as
   regular parameters (with line wraps within values for display
   purposes only):

     POST /register HTTP/1.1
     Content-Type: application/json
     Accept: application/json
     Host: server.example.com

     {
       "redirect_uris

Authors' Addresses

   Jun Murai
   Keio University
   5322 Endo, Fujisawa
   Kanagawa 252 Japan

   Fax: +81 466 49 1101
   EMail: jun@wide.ad.jp

   Mark Crispin
   Panda Programming
   6158 Lariat Loop NE
   Bainbridge Island, WA 98110-2098
   USA

   Phone: +1 206 842 2385
   EMail: MRC@PANDA.COM

   Erik M. van der Poel
   A-105 Park Avenue
   4-4-10 Ohta, Kisarazu
   Chiba 292 Japan

   Phone: +81 438 22 5836
   Fax:   +81 438 22 5837
   EMail: erik@poel.juice.or.jp
