Unified User-Agent String
draft-karcz-uuas-01

Document	Type	Active Internet-Draft (individual)
	Author	Mateusz Karcz
	Last updated	2021-11-24 (Latest revision 2014-11-10)
	RFC stream	(None)
	Intended RFC status	(None)
Stream	Stream state	(No stream defined)
	Consensus boilerplate	Unknown
	RFC Editor Note	(None)
IESG	IESG state	I-D Exists
	Telechat date	(None)
	Responsible AD	(None)
	Send notices to	(None)
Independent                                                     M. Karcz
Internet-Draft                                                UKLO Tczew
Updates: 7231 (if approved)                            November 10, 2014
Intended status: Experimental
Expires: May 14, 2015

                       Unified User-Agent String
                          draft-karcz-uuas-01

Abstract

   User-Agent is an HTTP request header field.  It contains information
   about the user agent originating the request, which is often used by
   servers to help identify the scope of reported interoperability
   problems, to work around or tailor responses to avoid particular user
   agent limitations, and for analytics regarding browser or operating
   system use.  Over the years, the contents of this field have become
   complicated and ambiguous.  This was a reaction to servers sending
   altered versions of websites to all but the most popular web
   browsers.  During the development of the WWW, authors of new web
   browsers constructed User-Agent strings that resembled Netscape's.
   As a result, the contents of the User-Agent field are now much
   longer than they were 15 years ago.  This Memo proposes the Unified
   User-Agent String as a way to simplify the contents of the
   User-Agent field while preserving their existing uses.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 14, 2015.

Karcz                     Expires May 14, 2015                  [Page 1]
Internet-Draft          Unified User-Agent String          November 2014

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may not be modified, and derivative works of it may not
   be created, except to format it for publication as an RFC or to
   translate it into languages other than English.

Table of Contents

   1.  Introduction
     1.1.  Conformance
     1.2.  Syntax Notation
       1.2.1.  Whitespaces
   2.  Use of the User-Agent strings
   3.  Definition of Proposed Format
     3.1.  Standard String
     3.2.  Regular String
     3.3.  Web Browser String
   4.  ABNF Definition of UUAS
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgments
   8.  References
   Author's Address

1.  Introduction

   Nowadays User-Agent strings are long, complicated, and often
   ambiguous.  For example, "Mozilla/4.0 (compatible; MSIE 6.0; X11;
   Linux i686; en) Opera 8.01" identifies the Opera browser, but it can
   be read as Internet Explorer or Netscape Navigator.  This document
   specifies a new, simple, and clear format, the Unified User-Agent
   String (UUAS), which makes it easy to distinguish between user
   agents while maintaining most of the features of the existing
   solutions.

1.1.  Conformance

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1.2.  Syntax Notation

   This specification uses the Augmented Backus-Naur Form (ABNF)
   notation of [RFC5234]. Section 4 contains a full syntax definition
   of the Unified User-Agent String.

1.2.1.  Whitespaces

   This specification uses two rules to denote the use of linear
   whitespace: OWS (optional whitespace) and RWS (required whitespace).
   They are defined in Section 3.2.3 of [RFC7230].

2.  Use of the User-Agent strings

   Generally, the User-Agent header field was intended for statistical
   purposes.  However, in the mid-1990s, during the "browser wars", the
   data provided by this field came to be used to alter the content of
   resources before sending them to the user, or even to deny users of
   particular browsers access to resources.  To circumvent these
   restrictions, software vendors started to change their identifiers
   so that they resembled the User-Agent strings of the most popular
   browsers.  Over the years, this has made these identifiers much more
   complicated, ambiguous, and difficult to parse.

   Nowadays User-Agent strings are still used for statistical purposes,
   but also for working around limitations of particular
   implementations.  However, in modern browsers these limitations have
   greatly decreased and "user agent spoofing" is now unnecessary.
   Unfortunately, many websites still discriminate against particular
   web browsers.

   The Unified User-Agent String is intended as a way to simplify,
   clarify, and standardize the content of the User-Agent HTTP header
   field.  Furthermore, if it becomes widespread, it could reduce the
   practice of "user agent spoofing" and discrimination against
   particular groups of Internet users.

3.  Definition of Proposed Format

   This document proposes a formal definition of three types of User-
   Agent string: standard string, regular string and web browser string.

     User-Agent = uuas
     uuas = standard-string / regular-string / browser-string

   The standard string is intended to maintain backward compatibility
   with existing implementations; it uses the same simple format as
   defined in [RFC7231].

   Regular string introduces a degree of standardization making every
   theoretical UUAS parser able to obtain information from it.

   Web browser string is designed for modern graphical web browsers and
   proposes a set of signatures, which should form together a clear and
   unequivocal application identifier.

3.1.  Standard String

   The standard User-Agent string MUST be generated in conformance with
   Section 5.5.3 of [RFC7231]. The standard User-Agent string consists
   of one or more product identifiers, each followed by zero or more
   comments (Section 3.2 of [RFC7230]), which together identify the user
   agent software.

   Standard string syntax definition:

     standard-string = product *( RWS ( product / comment ) )

   The product identifiers and comments SHOULD be listed in decreasing
   order of their significance.  Each product identifier consists of a
   name and an OPTIONAL version number.

   In the standard string a sender SHOULD limit generated product
   identifiers to what is necessary to identify the product; a sender
   MUST NOT generate advertising or other nonessential information
   within the product identifier.  A sender SHOULD NOT place non-
   version-related information in the version number part of a product
   identifier.  In the standard string successive versions of the same
   product SHOULD differ only in the version part of the identifier.

   Example:

     CERN-LineMode/2.15 libwww/2.17b3
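
   The standard-string rule above can be illustrated with a minimal
   Python sketch.  It is not part of this specification: the function
   name split_standard_string is hypothetical, the token character set
   is taken from the tchar rule of [RFC7230], and comment contents are
   only scanned for nesting and quoted pairs, not validated against
   ctext.

```python
import re

# tchar set from RFC 7230, Section 3.2.6
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
PRODUCT_RE = re.compile(rf"({TOKEN})(?:/({TOKEN}))?")
RWS_RE = re.compile(r"[ \t]+")


def split_standard_string(value):
    """Split a User-Agent value into ("product", name, version) and
    ("comment", text) items; raise ValueError on malformed input."""
    items = []
    pos = 0
    while pos < len(value):
        if value[pos] == "(":
            # Comments may nest, so track the parenthesis depth.
            depth = 0
            end = pos
            while end < len(value):
                ch = value[end]
                if ch == "\\":
                    end += 1  # skip the escaped character (quoted-pair)
                elif ch == "(":
                    depth += 1
                elif ch == ")":
                    depth -= 1
                    if depth == 0:
                        break
                end += 1
            if depth != 0:
                raise ValueError("unterminated comment")
            items.append(("comment", value[pos + 1:end]))
            pos = end + 1
        else:
            m = PRODUCT_RE.match(value, pos)
            if m is None:
                raise ValueError(f"unexpected character at offset {pos}")
            items.append(("product", m.group(1), m.group(2)))
            pos = m.end()
        if pos < len(value):
            # Items are separated by RWS (one or more SP / HTAB).
            ws = RWS_RE.match(value, pos)
            if ws is None or ws.end() == len(value):
                raise ValueError(f"bad whitespace at offset {pos}")
            pos = ws.end()
    if not items or items[0][0] != "product":
        raise ValueError("a standard string must start with a product")
    return items
```

   Applied to the example above, the sketch yields two product items,
   ("product", "CERN-LineMode", "2.15") and ("product", "libwww",
   "2.17b3"), in the order in which they were sent.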

3.2.  Regular String

   The Regular Unified User-Agent String is intended for request senders
   other than graphical web browsers and general web crawlers.  It MUST
   provide a signature of the operating system or platform (e.g., in case
   of runtime environments) used to generate the request at the first
   position in the comment after the first product identifier. After
   this signature, the regular string MAY contain any further comments
   and product identifiers.  Only this information MUST be provided,
   because this format is designed for cases in which the server does
   not need to know the exact parameters of the application originating
   the request.  In such cases, the string can be used for statistical
   purposes or for adapting the server's response to the capabilities of
   particular software platforms (e.g., to indicate the need to add
   carriage returns before newlines).

   Regular string syntax definition:

     regular-string = product RWS "(" os [ sc 1*ctext ] ")"
                      *( RWS ( product / comment ) )

   Regular Unified User-Agent Strings are syntactically compliant with
   the standard definition.

   Example:

     Wget/1.11.1 (Red Hat modified)
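
   As an illustration only, the leading part of a regular string
   (the first product and the platform signature) could be extracted
   with the following Python sketch.  The function name
   parse_regular_prefix and the returned dictionary keys are
   hypothetical.  The pattern approximates ctext as "anything except
   parentheses and backslash", which is looser than the ctext rule of
   [RFC7230], and it assumes that the field value has been decoded so
   that obs-text octets map to code points 0x80-0xFF.

```python
import re

TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# schar = tchar / HTAB / SP / obs-text
SCHAR = r"[!#$%&'*+\-.^_`|~0-9A-Za-z \t\x80-\xff]"

REGULAR_PREFIX_RE = re.compile(
    rf"(?P<name>{TOKEN})(?:/(?P<version>{TOKEN}))?"  # first product
    rf"[ \t]+\((?P<os>{SCHAR}+)"                      # RWS "(" os
    rf"(?:;[ \t]*(?P<extra>[^()\\]+))?\)"             # [ sc 1*ctext ] ")"
)


def parse_regular_prefix(value):
    """Return the first product, the platform signature, and the
    unparsed remainder, or None if value is not a regular string."""
    m = REGULAR_PREFIX_RE.match(value)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "version": m.group("version"),
        "os": m.group("os"),
        "extra": m.group("extra"),
        "rest": value[m.end():],
    }
```

   For the example above, the sketch reports the product "Wget" with
   version "1.11.1" and the platform signature "Red Hat modified".  A
   string without a comment directly after the first product, such as
   the standard string example in Section 3.1, is rejected.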

3.3.  Web Browser String

   The Web Browser User-Agent String is a format of this field value
   intended for identifying modern graphical web browsers that are
   compatible with HTML5, CSS3, or other modern web technologies.  For
   historical reasons, a web browser string MUST begin with the
   "Mozilla/5.0" tag.  This helps prevent browsers from being
   recognized as very old ones.  A Web Browser UUAS MUST also contain
   the "Gecko" tag.  This helps avoid the delivery of impaired versions
   of websites to modern client applications that are not Gecko-based.
   It is also in conformance with Section 6.6.1.1 of
   [W3C.REC-html5-20141028].

   Web browser string syntax definition:

     browser-string = Mozilla-tag
                      RWS "(" *( signature sc ) os
                      *( sc signature ) [ sc language ]
                      *( sc signature ) [ sc rvtag ] ")"
                      RWS Gecko-string
                      *( RWS ( product / comment ) )

   Like the regular string, the Web Browser Unified User-Agent String
   MUST provide information about the software platform.  Fields
   contained between parentheses (comments) SHOULD be separated by
   semicolons with optional whitespace.  An application MAY also include
   a language tag in its User-Agent string; if present, it MUST be a
   Language-Tag in accordance with [RFC5646].

   Because the application originating the request cannot provide its
   version information in the first product identifier, it SHOULD place
   its version number in a separate revision tag.

   A sender can, of course, add any valid product identifiers and
   comments to the string, but this Memo is intended to simplify and
   clarify this element of the protocol.  The web browser string MUST
   contain at least one signature that identifies the particular client
   application product.  In addition, the order of the platform,
   language, and revision signatures MUST NOT be changed.

   This type of UUAS SHOULD also be used by general web crawlers.  It
   can help avoid certain unfair practices that rely on delivering
   different resources to web browsers than to web crawlers.

   Example:

     Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko
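
   The following Python sketch shows one way a receiver might pick
   apart a web browser string.  It is an approximation under stated
   assumptions, not a validator for the browser-string rule: the names
   parse_browser_string, BROWSER_RE, and GECKO_TAG_RE are
   hypothetical, the Gecko tag is searched for anywhere after the first
   comment, and the fields inside the comment are returned in order
   without deciding which one is the platform signature, because the
   grammar does not make the os, signature, and language fields
   distinguishable by their content alone.

```python
import re

BROWSER_RE = re.compile(
    r"Mozilla/5\.0[ \t]+\((?P<fields>[^()]*)\)[ \t]+(?P<tail>.+)"
)
# Gecko-tag = ["like "] "Gecko" ["/20100101"]
GECKO_TAG_RE = re.compile(r"(?:like )?\bGecko\b(?:/20100101)?")


def parse_browser_string(value):
    """Return the comment fields, the revision from the rv tag, and the
    Gecko tag, or None if value does not look like a browser string."""
    m = BROWSER_RE.fullmatch(value)
    if m is None:
        return None
    gecko = GECKO_TAG_RE.search(m.group("tail"))
    if gecko is None:
        return None
    # Fields are separated by ";" followed by optional whitespace.
    fields = [field.strip() for field in m.group("fields").split(";")]
    revision = None
    if fields[-1].startswith("rv:"):
        # rvtag = "rv:" OWS token, and it is always the last field.
        revision = fields.pop()[len("rv:"):].strip()
    return {"fields": fields, "revision": revision, "gecko": gecko.group(0)}
```

   For the example above, the sketch returns the fields "Windows NT 6.3"
   and "Trident/7.0", the revision "11.0", and the Gecko tag
   "like Gecko".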

4.  ABNF Definition of UUAS

   ; Unified User-Agent String general definition
   User-Agent = uuas
   uuas = standard-string / regular-string / browser-string

   ; Standard string, as described in [RFC7231]
   standard-string = product *( RWS ( product / comment ) )

   ; Regular string, recommended for non-browsers
   regular-string = product RWS "(" os [ sc 1*ctext ] ")"
                    *( RWS ( product / comment ) )

   ; String recommended for web browsers and crawlers
   browser-string = Mozilla-tag
                    RWS "(" *( signature sc ) os
                    *( sc signature ) [ sc language ]
                    *( sc signature ) [ sc rvtag ] ")"
                    RWS Gecko-string
                    *( RWS ( product / comment ) )

   ; Tags and signatures definitions
   signature = product / 1*schar
   os = 1*schar
   language = <Language-Tag, see [RFC5646], Section 2.1>
   rvtag = "rv:" OWS token
   Mozilla-tag = "Mozilla/5.0"
   Gecko-string = Gecko-tag
                  / ( product RWS "(" *ctext
                  RWS Gecko-tag [ RWS 1*ctext ] ")" )
   Gecko-tag = ["like "] "Gecko" ["/20100101"]

   ; Additional definitions
   product = <product, see [RFC7231], Section 5.5.3>
   comment = <comment, see [RFC7230], Section 3.2.6>
   ctext = <ctext, see [RFC7230], Section 3.2.6>
   schar = tchar / HTAB / SP / obs-text
   token = <token, see [RFC7230], Section 3.2.6>
   tchar = <tchar, see [RFC7230], Section 3.2.6>
   obs-text = <obs-text, see [RFC7230], Section 3.2.6>
   sc = ";" OWS
   OWS = <OWS, see [RFC7230], Section 3.2.3>
   RWS = <RWS, see [RFC7230], Section 3.2.3>

5.  Security Considerations

   Implementations are encouraged not to use the product tokens of other
   implementations in order to declare compatibility or identity with
   them beyond the scope prescribed in this document, as this
   circumvents the purpose of the User-Agent field.

   A user agent SHOULD NOT generate a User-Agent field containing
   needlessly fine-grained detail and SHOULD limit the addition of
   subproducts by third parties. Overly detailed User-Agent strings
   increase request latency and the risk of a user being identified
   against their wishes. In theory, this can make it easier for an
   attacker to exploit known security holes; in practice, attackers tend
   to try all potential holes regardless of the software being used.
   However, when the User-Agent string is combined with other
   characteristics of the application, particularly if the client
   application sends excessive details about the user's system or
   extensions, the risk of a successful attack increases.

   As User-Agent strings are text data, they can be used to carry out
   attacks by causing buffer overflows or altering format strings.
   Implementers should secure their applications against such attacks.
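
   One possible precaution on the receiving side is sketched below in
   Python.  The length limit of 512 characters and the names
   MAX_UA_LENGTH, sanitize_user_agent, and log_user_agent are
   assumptions made for this illustration; this document does not
   define a maximum length.  The sketch bounds the value, replaces
   every character outside printable ASCII (including HTAB and
   obs-text) with "?", and passes the result to the logger as an
   argument rather than as the format string.

```python
import logging

MAX_UA_LENGTH = 512  # assumed limit; the draft does not specify one


def sanitize_user_agent(raw, max_length=MAX_UA_LENGTH):
    """Return a bounded copy of raw in which every character outside
    printable ASCII (0x20-0x7E) is replaced by "?"."""
    bounded = raw[:max_length]
    return "".join(ch if " " <= ch <= "~" else "?" for ch in bounded)


def log_user_agent(raw):
    # Pass the value as an argument; never use it as the format string.
    logging.getLogger("uuas").info("User-Agent: %s", sanitize_user_agent(raw))
```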

   Data provided by the User-Agent header field can be used to
   discriminate against the users of particular client applications by
   preventing them from accessing the requested resources or by
   replacing those resources with false ones.

6.  IANA Considerations

   This document has no actions for IANA.

7.  Acknowledgments

   I would like to thank my English teacher, who devoted her time to
   conduct a linguistic revision of this Memo.

8.  References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", RFC 2119, BCP 14, March 1997.

   [RFC5646]  Phillips, A. and M. Davis, "Tags for Identifying
              Languages", RFC 5646, September 2009.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", RFC 5234, STD 68, January 2008.

   [RFC7230]  Fielding, R. and J. Reschke, "Hypertext Transfer Protocol
              (HTTP/1.1): Message Syntax and Routing", RFC 7230, June
              2014.

   [RFC7231]  Fielding, R. and J. Reschke, "Hypertext Transfer Protocol
              (HTTP/1.1): Semantics and Content", RFC 7231, June 2014.

   [W3C.REC-html5-20141028]
              Hickson, I., Berjon, R., Faulkner, S., Leithead, T., Doyle
              Navara, E., O'Connor, E., and S. Pfeiffer, "HTML5", World
              Wide Web Consortium Recommendation REC-html5-20141028,
              October 2014,
              <http://www.w3.org/TR/2014/REC-html5-20141028/>.

Author's Address

   Mateusz Karcz
   Uniwersyteckie Katolickie Liceum Ogolnoksztalcace w Tczewie
   6 Wodna Street
   Tczew, PM  83-100
   PL

   Email: mateusz.karcz(at)interia.eu
