IMAP4 Extension for Fuzzy Search
RFC 6203
|
Document |
Type |
|
RFC - Proposed Standard
(March 2011; No errata)
|
|
Author |
|
Timo Sirainen
|
|
Last updated |
|
2015-10-14
|
|
Stream |
|
IETF
|
|
Formats |
|
plain text
html
pdf
htmlized
bibtex
|
|
Reviews |
|
|
Stream |
WG state
|
|
(None)
|
|
Document shepherd |
|
No shepherd assigned
|
IESG |
IESG state |
|
RFC 6203 (Proposed Standard)
|
|
Consensus Boilerplate |
|
Unknown
|
|
Telechat date |
|
|
|
Responsible AD |
|
Alexey Melnikov
|
|
Send notices to |
|
(None)
|
Internet Engineering Task Force (IETF) T. Sirainen
Request for Comments: 6203 March 2011
Category: Standards Track
ISSN: 2070-1721
IMAP4 Extension for Fuzzy Search
Abstract
This document describes an IMAP protocol extension enabling a server
to perform searches with inexact matching and assigning relevancy
scores for matched messages.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 5741.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc6203.
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Sirainen Standards Track [Page 1]
RFC 6203 IMAP4 FUZZY Search March 2011
1. Introduction
When humans perform searches in IMAP clients, they typically want to
see the most relevant search results first. IMAP servers are able to
do this in the most efficient way when they're free to internally
decide how searches should match messages. This document describes a
new SEARCH=FUZZY extension that provides such functionality.
2. Conventions Used in This Document
In examples, "C:" indicates lines sent by a client that is connected
to a server. "S:" indicates lines sent by the server to the client.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [KEYWORDS].
3. The FUZZY Search Key
The FUZZY search key takes another search key as its argument. The
server is allowed to perform all matching in an implementation-
defined manner for this search key, including ignoring the active
comparator as defined by [RFC5255]. Typically, this would be used to
search for strings. For example:
C: A1 SEARCH FUZZY (SUBJECT "IMAP break")
S: * SEARCH 1 5 10
S: A1 OK Search completed.
Besides matching messages with a subject of "IMAP break", the above
search may also match messages with subjects "broken IMAP", "IMAP is
broken", or anything else the server decides that might be a good
match.
This example does a fuzzy SUBJECT search, but a non-fuzzy FROM
search:
C: A2 SEARCH FUZZY SUBJECT work FROM user@example.com
S: * SEARCH 1 4
S: A2 OK Search completed.
How the server handles multiple separate FUZZY search keys is
implementation-defined.
Fuzzy search algorithms might change, or the results of the
algorithms might be different from search to search, so that fuzzy
searches with the same parameters might give different results for
1) the same user at different times, 2) different users (searches
Sirainen Standards Track [Page 2]
RFC 6203 IMAP4 FUZZY Search March 2011
executed simultaneously), or 3) different users (searches executed at
different times). For example, a fuzzy search might adapt to a
user's search habits in an attempt to give more relevant results (in
a "learning" manner). Such differences can also occur because of
operational decisions, such as load balancing. Clients asking for
"fuzzy" really are requesting search results in a not-necessarily-
deterministic way and need to give the user appropriate warning about
that.
4. Relevancy Scores for Search Results
Servers SHOULD assign a search relevancy score for each matched
message when the FUZZY search key is given. Relevancy scores are
given in the range 1-100, where 100 is the highest relevancy. The
relevancy scores SHOULD use the full 1-100 range, so that clients can
show them to users in a meaningful way, e.g., as a percentage value.
As the name already indicates, relevancy scores specify how relevant
to the search the matched message is. It's not necessarily the same
Show full document text