Network Working Group                                Ned Freed
Internet Draft                                     Keith Moore
<draft-ietf-mailext-acc-url-01.txt>   Allan Cargille, WG Chair

                    Definition of the URL
                MIME External-Body Access-Type

                        November 1995


                     Status of this Memo

This document is an Internet-Draft.  Internet-Drafts are
working documents of the Internet Engineering Task Force
(IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-
Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months. Internet-Drafts may be updated, replaced, or obsoleted
by other documents at any time.  It is not appropriate to use
Internet-Drafts as reference material or to cite them other
than as a "working draft" or "work in progress".

To learn the current status of any Internet-Draft, please
check the 1id-abstracts.txt listing contained in the
Internet-Drafts Shadow Directories on ds.internic.net (US East
Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast),
or munnari.oz.au (Pacific Rim).


1.  Abstract

This memo defines a new access-type for message/external-body
MIME parts for Uniform Resource Locators (URLs).  URLs provide
schemes to access external objects via a growing number of
protocols, including HTTP, Gopher, and TELNET.  An initial set
of URL schemes are defined in RFC 1738.


2.  Introduction

The Multipurpose Internet Message Extensions (MIME) define a
facility whereby an object can contain a reference or pointer
to some form of data rather than the actual data itself. This











Internet Draft         URL Access-Type           November 1995


facility is embodied in the message/external-body media type
defined in RFC 1521.  Use of this facility is growing as a
means of conserving bandwidth when large objects are sent to
large mailing lists.

Each message/external-body reference must specify a mechanism
whereby the actual data can be retrieved.  These mechanisms
are called access types, and RFC 1521 defines an initial set
of access types: "FTP", "ANON-FTP", "TFTP", "LOCAL-FILE", and
"MAIL-SERVER".

Uniform Resource Locators, or URLs, also provide a means by
which remote data can be retrieved automatically.  Each URL
string begins with a scheme specification, which in turn
specifies how the remaining string is to be used in
conjunction with some protocol to retrieve the data. However,
URL schemes exist for protocol operations that have no
corresponding MIME message/external-body access type.
Registering an access type for URLs therefore provides
message/external-body with access to the retrieval mechanisms
of URLs that are not currently available as access types.  It
also provides access to any future mechanisms for which URL
schemes are developed.

This access type is only intended for use with URLs that
actually retreive something. Other URL mechansisms, e.g.
mailto, may not be used in this context.


3.  Definition of the URL Access-Type

The URL access-type is defined as follows:

 (1)   The name of the access-type is URL.

 (2)   A new message/external-body content-type parameter is
       used to actually store the URL string. The name of the
       parameter is also "URL", and this parameter is
       mandatory for this access-type. The syntax and use of
       this parameter is specified in the next section.

 (3)   The phantom body area of the message/external-body is
       not used and should be left blank.







                       Expires May 1996               [Page 2]


Internet Draft         URL Access-Type           November 1995


For example, the following message illustrates how the URL
access-type is used:

  Content-type: message/external-body; access-type=URL;
                URL="http://www.foo.com/file"

  Content-type: text/html
  Content-Transfer-Encoding: binary

  THIS IS NOT REALLY THE BODY!


3.1.  Syntax and Use of the URL parameter

Using the ANBF notations and definitions of RFC 822 and RFC
1521, the syntax of the URL parameter Is as follows:

  URL-parameter := <"> URL-word *(*LWSP-char URL-word) <">

  URL-word := token
              ; Must not exceed 40 characters in length

The syntax of an actual URL string is given in RFC 1738.  URL
strings can be of any length and can contain arbitrary
character content.  This presents problems when URLs are
embedded in MIME body part headers that are wrapped according
to RFC 822 rules. For this reason they are transformed into a
URL-parameter for inclusion in a message/external-body
content-type specification as follows:

 (1)   A check is made to make sure that all occurrences of
       SPACE, CTLs, double quotes, backslashes, and 8-bit
       characters in the URL string are already encoded using
       the URL encoding scheme specified in RFC 1738. Any
       unencoded occurrences of these characters must be
       encoded.  Note that the result of this operation is
       nothing more than a different representation of the
       original URL.

 (2)   The resulting URL string is broken up into substrings
       of 40 characters or less.

 (3)   Each substring is placed in a URL-parameter string as a
       URL-word, separated by one or more spaces.  Note that
       the enclosing quotes are always required since all URLs





                       Expires May 1996               [Page 3]


Internet Draft         URL Access-Type           November 1995


       contain one or more colons, and colons are tspecial
       characters [RFC 1521].

Extraction of the URL string from the URL-parameter is even
simpler: The enclosing quotes and any linear whitespace are
removed and the remaining material is the URL string.

The follow example shows how a long URL is handled:

  Content-type: message/external-body; access-type=URL;
                URL="ftp://ftp.deepdirs.org/1/2/3/4/5/6/7/
                     8/9/10/11/12/13/14/15/16/17/18/20/21/
                     file.html"

  Content-type: text/html
  Content-Transfer-Encoding: binary

  THIS IS NOT REALLY THE BODY!

Some URLs may provide access to multiple versions of the same
object in different formats. The HTTP URL mechanism has this
capability, for example.  However, applications may not expect
to receive something whose type doesn't agree with that
expressed in the message/external-body, and may in fact have
already made irrevocable choices based on this information.

Due to these considerations, the following restriction is
imposed:  When URLs are used in the context of an access-type
only those versions of an object whose content-type agrees
with that specified by the inner message/external-body header
can be retrieved and used.


4.  Security Considerations

The security considerations of using URLs in the context of a
MIME access-type are no different from the concerns that arise
from their use in other contexts. The specific security
considerations associated with each type of URL are discussed
in the URL's defining document.

Note that the Content-MD5 field can be used in conjunction
with any message/external-body access-type to provide an
integrity check. This insures that the referenced object
really is what the message originator intended it to be. This





                       Expires May 1996               [Page 4]


Internet Draft         URL Access-Type           November 1995


is not a signature service and should not be confused with
one, but nevetheless is quite useful in many situations.


5.  Acknowledgements

The authors are grateful for the feedback and review provided
by John Beck and John Klensin.


6.  References

[RFC-822]
     Crocker, D., "Standard for the Format of ARPA Internet
     Text Messages", STD 11, RFC 822, UDEL, August 1982.

[RFC-1521]
     Borenstein, N. and Freed, N., "MIME (Multipurpose
     Internet Mail Extensions): Mechanisms for Specifying and
     Describing the Format of Internet Message Bodies", RFC
     1521, Bellcore, Innosoft, September, 1993.

[RFC-1590]
     Postel, J., "Media Type Registration Procedure", RFC
     1590, USC/Information Sciences Institute, March 1994.

[RFC-1738]
     Berners-Lee, T., Masinter, L., McCahill, M., "Uniform
     Resource Locators (URL)", December 1994.





















                       Expires May 1996               [Page 5]


Internet Draft         URL Access-Type           November 1995


7.  Author Addresses

Ned Freed
Innosoft International, Inc.
1050 East Garvey Avenue South
West Covina, CA 91790
USA
 tel: +1 818 919 3600           fax: +1 818 919 3614
 email: ned@innosoft.com

Keith Moore
Computer Science Dept.
University of Tennessee
107 Ayres Hall
Knoxville, TN 37996-1301
USA
 email: moore@cs.utk.edu

































                       Expires May 1996               [Page 6]