CIP Index Object Format for SOIF Objects
draft-ietf-find-cip-soif-02

Document type: Older version of an Internet-Draft that was ultimately published as RFC 2655
Authors: Mic Bowman, Ted Hardie, Darren R. Hardy, Dr. Mike F. Schwartz, Duane Wessels
Last updated: 2020-01-21 (latest revision 1997-10-31)
RFC stream: Internet Engineering Task Force (IETF)
Intended RFC status: Experimental
IESG state: Became RFC 2655 (Experimental)
's
     information.

Gatherer-Version
     Version number of the Gatherer.

Update-Time
     The time that the Gatherer updated the content summary for the object.

Keywords
     Searchable keywords extracted from the object.

Last-Modification-Time
     The time that the object was last modified.

MD5
     MD5 16-byte checksum of the object.

Refresh-Rate
     The number of seconds after Update-Time when the summary object is
     to be re-generated.  Defaults to 1 month.

Time-to-Live
     The number of seconds after Update-Time when the summary object is
     no longer valid.  Defaults to 6 months.

Title
     Title of the object.

Type
     The object's type. Some example types are:

             Archive
             Audio
             Awk
             Backup
             Binary
             C
             CHeader
             Command
             Compressed
             CompressedTar
             Configuration
             Data
             Directory
             DotFile
             Dvi
             FAQ
             FYI
             Font
             FormattedText
             GDBM
             GNUCompressed
             GNUCompressedTar
             HTML
             Image
             Internet-Draft
             MacCompressed
             Mail
             Makefile
             ManPage
             Object
             OtherCode
             PCCompressed
             Patch
             Perl
             PostScript
             RCS
             README
             RFC
             SCCS
             ShellArchive
             Tar
             Tcl
             Tex
             Text
             Troff
             Uuencoded
             WaisSource

Update-Time
     The time that the summary object was last updated.
     REQUIRED field, no default.

URL-References
     Any URL references present within HTML objects.
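
The relationship between Update-Time, Refresh-Rate, and Time-to-Live
can be sketched as follows.  This is a minimal illustration, not part
of the format: the function name summary_status is hypothetical,
Update-Time is assumed to hold seconds since the Unix epoch, and the
"1 month" and "6 months" defaults are assumed to mean 30-day months,
since the text above does not give a second count.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "1 month" and "6 months" are taken as 30-day months.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60
DEFAULT_REFRESH_RATE = 1 * SECONDS_PER_MONTH
DEFAULT_TIME_TO_LIVE = 6 * SECONDS_PER_MONTH


def summary_status(attrs, now):
    """Classify a summary object as 'fresh', 'stale', or 'expired'.

    attrs maps attribute names to string values.  Update-Time is
    assumed (not stated in the text) to be seconds since the epoch.
    """
    if "Update-Time" not in attrs:
        raise ValueError("Update-Time is REQUIRED and has no default")
    updated = datetime.fromtimestamp(int(attrs["Update-Time"]), tz=timezone.utc)
    refresh = int(attrs.get("Refresh-Rate", DEFAULT_REFRESH_RATE))
    ttl = int(attrs.get("Time-to-Live", DEFAULT_TIME_TO_LIVE))
    # Past Time-to-Live the summary is no longer valid.
    if now >= updated + timedelta(seconds=ttl):
        return "expired"
    # Past Refresh-Rate the summary is due to be re-generated.
    if now >= updated + timedelta(seconds=refresh):
        return "stale"
    return "fresh"
```

A caller would pass the parsed attributes and a timezone-aware current
time, for example summary_status(attrs, datetime.now(timezone.utc)).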

Appendix B.

Proposed Attributes for a "CIP-HINT" Template Type

Attribute-Identifier-List
     A comma-delimited list whose entries take the form
     Template-Type:Attribute.  This list identifies the
     attributes against which queries are supported.  Because
     of the current limitation on Identifiers, this list
     must be in ASCII.

Source
     The URI of the service which created some or all of the
     index objects to which this hint applies.  Note that this
     service may be and often is distinct from the server which
     provides query access to those objects.

Total-Object-Count
     The total number of index objects in the collection for
     which the Hint applies.  This should be a positive integer.

Weightlist-[Attribute-Identifier]
     This construction allows the HINT to contain a weighted
     list of values for a specific Attribute-Identifier.  There
     may be as many Weightlist entries as there are Attribute-Identifiers
     in the Attribute-Identifier-List.  Each Weightlist entry takes
     the form of Value;Object-Count, where the object count is
     a positive integer representing the number of objects within
     the collection which contain that value. Weightlists are comma-
     delimited.  Should a Value contain a comma, it should be escaped 
     when incorporated into the weightlist.
     
Threshold-[Attribute-Identifier]
     If a server wishes not to report infrequently occurring Values in
     a specific Weightlist, it may declare a threshold under which it
     will not report Values.

Certification-Type
     The type of Certification used for this object.

Certification
     The Value of the Certification.

Date
     The Date at which the hint was generated.

Example:

@CIP-HINT{ http://nic.nasa.gov:80/Harvest/brokers/NASA/
Attribute-Identifier-List{49}:    DOCUMENT:Author, DOCUMENT:Keywords, IMAGE:Subject
Source-1{45}: http://nic.nasa.gov/Harvest/gatherers/Eureka/
Source-2{46}: http://techreports.larc.nasa.gov/cgi-bin/NTRS/
Total-Object-Count{5}:    10000
Weightlist-[IMAGE:Subject]{40}:   Shuttle;100, Planet;227, Moon;15, Sun;33
Threshold-[IMAGE:Subject]{2}:     10
Weightlist-[DOCUMENT:Author]{49}: Grizzard;12, Aldrin\, Buzz;15, Aldrin\, James;45,
Threshold-[DOCUMENT:Author]{1}:   5
Certification-Type{13}:   PGP-Signature
Certification{51}: mQCNAzFNm5QAAEEALUBOolOWKpby+=YtmtBxUZWQgSGFyZGllID
Date{29}:  Sun, 05 Jan 1997 08:33:33 GMT
}
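
A consumer of a hint might split a Weightlist value into its
Value;Object-Count entries along the following lines.  This is a
sketch under stated assumptions: the function names parse_weightlist
and above_threshold are hypothetical, the backslash escape is inferred
from the example above (the attribute description only says a comma
"should be escaped"), and a Value whose count equals the threshold is
assumed to be reported.

```python
def parse_weightlist(text):
    """Split 'Value;Count, Value;Count' into (value, count) pairs.

    Assumption: a backslash escapes the character that follows it, so
    backslash-comma yields a literal comma inside a Value.
    """
    entries, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i + 1])  # keep the escaped character
            i += 2
            continue
        if ch == ",":
            entries.append("".join(current))  # unescaped comma ends an entry
            current = []
        else:
            current.append(ch)
        i += 1
    entries.append("".join(current))

    pairs = []
    for entry in entries:
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        value, sep, count = entry.rpartition(";")
        if not sep:
            raise ValueError(f"missing ';' in weightlist entry: {entry!r}")
        pairs.append((value, int(count)))
    return pairs


def above_threshold(pairs, threshold):
    """Keep entries whose object count is not under the threshold."""
    return [(value, count) for value, count in pairs if count >= threshold]
```

Splitting on the last semicolon (rpartition) leaves any semicolon
inside a Value untouched; whether such Values are permitted is not
stated in the text above.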

Appendix C.

A "Dublin-Core" Template Type [Ref. 8,9]

TITLE
     The name given to the resource by the CREATOR or PUBLISHER.

CREATOR
     The person(s) or organization(s) primarily responsible for the
     intellectual content of the resource.  For example, authors in the
     case of written documents, artists, photographers, or illustrators
     in the case of visual resources.

SUBJECT
     The topic of the resource, or keywords or phrases that describe
     the subject or content of the resource.  The intent of the
     specification of this element is to promote the use of controlled
     vocabularies and keywords.  This element might well include
     scheme-qualified classification data (for example, Library of
     Congress Classification Numbers or Dewey Decimal numbers) or
     scheme-qualified controlled vocabularies (such as Medical Subject
     Headings or Art and Architecture Thesaurus descriptors) as well.

DESCRIPTION
     A textual description of the content of the resource, including
     abstracts in the case of document-like objects or content
     descriptions in the case of visual resources.  Future metadata
     collections might well include computational content description
     (spectral analysis of a visual resource, for example) that may not
     be embeddable in current network systems.  In such a case this
     field might contain a link to such a description rather than the
     description itself.

PUBLISHER
     The entity responsible for making the resource available in its
     present form, such as a publisher, a university department, or a
     corporate entity.   The intent of specifying this field is to
     identify the entity that provides access to the resource.
     
CONTRIBUTOR 
     Person(s) or organization(s) in addition to those specified in the
     CREATOR element who have made significant intellectual contributions
     to the resource but whose contribution is secondary to the
     individuals or entities specified in the CREATOR element (for
     example, editors, transcribers, illustrators, and convenors).

DATE
     The date the resource was made available in its present form.  The
     recommended best practice is an 8 digit number in the form YYYYMMDD
     as defined by ANSI X3.30-1985. In this scheme, the date element for
     the day this is written would be 19961203, or December 3, 1996.
     Many other schemes are possible, but if used, they should be
     identified in an unambiguous manner.
   
TYPE
     The category of the resource, such as home page, novel, poem, working
     paper, technical report, essay, dictionary.  It is expected that
     RESOURCE TYPE will be chosen from an enumerated list of types.

FORMAT
     The data representation of the resource, such as text/html, ASCII,
     PostScript file, executable application, or JPEG image.  The intent
     of specifying this element is to provide information necessary to
     allow people or machines to make decisions about the usability of
     the encoded data (what hardware and software might be required to
     display or execute it, for example).  As with RESOURCE TYPE, FORMAT
     will be assigned from enumerated lists such as registered Internet
     Media Types (MIME types).  In principle, formats can include
     physical media such as books, serials, or other non-electronic media. 

IDENTIFIER
     String or number used to uniquely identify the resource.  Examples
     for networked resources include URLs and URNs (when implemented).
     Other globally-unique identifiers, such as International Standard
     Book Numbers (ISBN) or other formal names would also be candidates
     for this element.

SOURCE 
     The work, either print or electronic, from which this resource
     is derived, if applicable. For example, an HTML encoding of a
     Shakespearean sonnet might identify the paper version of the
     sonnet from which the electronic version was transcribed.

LANGUAGE
     Language(s) of the intellectual content of the resource.  Where
     practical, the content of this field should coincide with the
     NISO Z39.53 three character codes for written languages. 

RELATION
     Relationship to other resources.  The intent of specifying this
     element is to provide a means to express relationships among
     resources that have formal relationships to others, but exist as
     discrete resources themselves.  For example, images in a document,
     chapters in a book, or items in a collection.  A formal
     specification of RELATION is currently under development.  Users
     and developers should understand that use of this element should
     be currently considered experimental.

COVERAGE
     The spatial locations and temporal durations characteristic of the
     resource.  Formal specification of COVERAGE is currently under
     development. Users and developers should understand that use of
     this element should be currently considered experimental.

RIGHTS
     The content of this element is intended to be a link (a URL or
     other suitable URI as appropriate) to a copyright notice, a
     rights-management statement, or perhaps a server that would
     provide such information in a dynamic way.  The intent of
     specifying this field is to allow providers a means to associate
     terms and conditions or copyright statements with a resource or
     collection of resources.   No assumptions should be made by users
     if such a field is empty or not present.
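
The recommended YYYYMMDD form for DATE can be produced and checked as
in the sketch below.  The helper names to_dc_date and from_dc_date are
hypothetical and are not defined by this template type.

```python
from datetime import date, datetime


def to_dc_date(d):
    """Format a date as the 8-digit YYYYMMDD form recommended for DATE."""
    return d.strftime("%Y%m%d")


def from_dc_date(text):
    """Parse an 8-digit YYYYMMDD string, rejecting any other scheme."""
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"not an 8-digit YYYYMMDD date: {text!r}")
    return datetime.strptime(text, "%Y%m%d").date()
```

For the day cited in the DATE description, to_dc_date(date(1996, 12, 3))
gives "19961203".  Values in another scheme are rejected by
from_dc_date rather than guessed at.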

Example:

@Dublin-Core-1 { ftp://ds.internic.net/internet-drafts/draft-kunze-dc-00.txt
TITLE{52}:      Dublin Core Metadata for Simple Resource Description    
CREATOR-1{9}:   S. Weibel
CREATOR-2{8}:   J. Kunze
CREATOR-3{9}:   C. Lagoze
SUBJECT{44}:    The Dublin Core Set of Elements for Metadata
DESCRIPTION{46}:        Reference description of Dublin Core elements.
PUBLISHER{31}:  Internet Engineering Task Force
CONTRIBUTOR-1{11}:      Nick Arnett
CONTRIBUTOR-2{15}:      Eliot Christian
CONTRIBUTOR-3{14}:      Martijn Koster
CONTRIBUTOR-4{18}:      Christian Mogensen
CONTRIBUTOR-5{14}:      Timothy Niesen
CONTRIBUTOR-6{11}:      Andrew Wood
CONTRIBUTOR-7{10}:      Mic Bowman
CONTRIBUTOR-8{12}:      Dan Connolly
CONTRIBUTOR-9{15}:      Michael Mauldin
CONTRIBUTOR-10{12}:     Wick Nichols
DATE{16}:       February 9, 1997
TYPE{14}:       Internet draft
FORMAT{4}:      Text
IDENTIFIER{21}: draft-kunze-dc-00.txt
SOURCE{41}:     http://purl.oclc.org/metadata/dublin_core
LANGUAGE{3}:    eng
RELATION{24}:   Draft Reference Standard
COVERAGE{22}:   Expires August 8, 1997
RIGHTS{58}:     Unlimited Distribution; readers must not cite as standard.
}
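
The number in braces after each attribute name gives the size of the
value.  A reader of such objects might verify those sizes with a sketch
like the one below.  It rests on simplifying assumptions that the
examples do not guarantee: each value sits on a single line, the size
counts the UTF-8 bytes of the value after surrounding whitespace is
stripped, and the names parse_soif and ATTR_LINE are hypothetical.

```python
import re

# Name{size}: value   (the name may contain brackets and colons)
ATTR_LINE = re.compile(r"^(?P<name>[^{}\s]+)\{(?P<size>\d+)\}:\s*(?P<value>.*)$")
HEADER = re.compile(r"^@(?P<template>[^\s{]+)\s*\{\s*(?P<url>\S+)\s*$")


def parse_soif(text):
    """Parse one single-line-per-attribute SOIF object.

    Returns (template, url, attrs, mismatches), where attrs is a list of
    (name, value) pairs and mismatches lists the attribute names whose
    declared size differs from the byte length of the value.
    """
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = HEADER.match(lines[0]) if lines else None
    if header is None or lines[-1].strip() != "}":
        raise ValueError("not a SOIF object")
    attrs, mismatches = [], []
    for line in lines[1:-1]:
        m = ATTR_LINE.match(line)
        if m is None:
            raise ValueError(f"malformed attribute line: {line!r}")
        value = m["value"].rstrip()
        attrs.append((m["name"], value))
        if len(value.encode("utf-8")) != int(m["size"]):
            mismatches.append(m["name"])
    return header["template"], header["url"], attrs, mismatches
```

Attributes are kept as an ordered list of pairs rather than a
dictionary so that numbered repeats such as CREATOR-1 and CREATOR-2
keep their order.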