IMAP4 Extension for Fuzzy Search
Draft of message to be sent after approval:
From: The IESG <email@example.com>
To: IETF-Announce <firstname.lastname@example.org>
Cc: Internet Architecture Board <email@example.com>,
    RFC Editor <firstname.lastname@example.org>,
    morg mailing list <email@example.com>,
    morg chair <firstname.lastname@example.org>
Subject: Protocol Action: 'IMAP4 Extension for Fuzzy Search' to Proposed Standard (draft-ietf-morg-fuzzy-search-03.txt)

The IESG has approved the following document:
- 'IMAP4 Extension for Fuzzy Search' (draft-ietf-morg-fuzzy-search-03.txt) as a Proposed Standard

This document is the product of the Message Organization Working Group.

The IESG contact persons are Alexey Melnikov and Peter Saint-Andre.

A URL of this Internet Draft is:
http://datatracker.ietf.org/doc/draft-ietf-morg-fuzzy-search/
Technical Summary

This document describes an IMAP protocol extension that enables servers to perform searches with inexact matching and to assign relevancy scores to matched messages. This allows more flexible searching, as well as optimization in server processing.

Working Group Summary

One working group participant thinks there are too many IMAP extensions already, and that this one is not needed. That view has been considered, but several working group participants plan to implement this extension, and some already have. There is broad agreement in the working group that this extension has value.

Document Quality

At least two working group participants have implemented this, and others have said they plan to. Many working group participants have reviewed and discussed it; none merit special mention.

Personnel

Barry Leiba is the document shepherd. Alexey Melnikov is the Responsible Area Director.

RFC Editor Note

Please add the following paragraph at the end of Section 3:

   Fuzzy search algorithms might change, or the results of the algorithms might be different from search to search, so that fuzzy searches with the same parameters might give different results at different times, for different users, or both. For example, a fuzzy search might adapt to a user's search habits in an attempt to give more relevant results (in a "learning" manner). Such differences can also occur because of operational decisions, such as load balancing. Clients asking for "fuzzy" really are requesting search results in a not necessarily deterministic way, and need to give the user appropriate warning about that.

In Section 4, please update the 2nd line of the example to read:

OLD:
   S: * ESEARCH (TAG "B2") ALL 1,5,10 RELEVANCY (4 99 42)
NEW:
   S: * ESEARCH (TAG "B1") ALL 1,5,10 RELEVANCY (4 99 42)

(i.e. the TAG value should be "B1", not "B2")

Please rename the title of Section 6 to read:

OLD:
   6. Extensions to SORT
NEW:
   6. Extensions to SORT and SEARCH

In Section 6, please replace the 6th paragraph with the following:

OLD:
   To limit the number of returned messages, use the PARTIAL return option. For example this returns the 10 most relevant messages:
NEW:
   Furthermore, if the server advertises the CONTEXT=SORT (or CONTEXT=SEARCH) capability, then the client can limit the number of returned messages to a SORT (or a SEARCH) by using the PARTIAL return option. For example this returns the 10 most relevant messages:

In Section 8, 1st paragraph:

OLD:
   Implementation of this extension might enable a denial-of-service attack if the implementation isn't careful to prevent them. Fuzzy search engines are often complex with non-obvious disk space, memory and/or CPU usage patterns. Implementors should test at least the behavior of large messages that contain very long words and/or unique random strings. Also very long search keys might cause excessive memory or CPU usage.
NEW:
   Implementation of this extension might enable denial-of-service attacks against server resources. Servers MAY limit the resources that a single search (or a single user) may use. Additionally, implementors should be aware of the following: Fuzzy search engines are often complex with non-obvious disk space, memory and/or CPU usage patterns. Server implementors should at least test the fuzzy-search behavior with large messages that contain very long words and/or unique random strings. Also very long search keys might cause excessive memory or CPU usage.
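
Illustrative sketch (not part of the approved text): the corrected Section 4 response line carries an ALL result and a RELEVANCY list, and a client might pair the two as shown below. This is a minimal Python sketch under stated assumptions: the function names `expand_sequence_set` and `parse_relevancy` are hypothetical, the scores are assumed to correspond positionally to the message numbers in ALL, and only the exact response shape from the example is handled (no `*` in the sequence set, no other return items).

```python
import re
from typing import Dict, List


def expand_sequence_set(seq_set: str) -> List[int]:
    """Expand a simple sequence-set such as '1,5,10' or '2:4,9' into numbers.

    Assumption: '*' does not appear, as is the case in the example response.
    """
    numbers: List[int] = []
    for part in seq_set.split(","):
        if ":" in part:
            start, end = (int(x) for x in part.split(":"))
            low, high = min(start, end), max(start, end)
            numbers.extend(range(low, high + 1))
        else:
            numbers.append(int(part))
    return numbers


# Matches only the response shape shown in the Section 4 example.
_ESEARCH_RE = re.compile(
    r'^\* ESEARCH \(TAG "(?P<tag>[^"]*)"\)'
    r"(?: UID)?"
    r" ALL (?P<all>[0-9:,]+)"
    r" RELEVANCY \((?P<scores>[0-9 ]+)\)\s*$"
)


def parse_relevancy(line: str) -> Dict[int, int]:
    """Map each message number in ALL to its relevancy score.

    Assumption: the i-th score belongs to the i-th message number in ALL.
    """
    match = _ESEARCH_RE.match(line)
    if match is None:
        raise ValueError("not an ESEARCH response with ALL and RELEVANCY")
    messages = expand_sequence_set(match.group("all"))
    scores = [int(s) for s in match.group("scores").split()]
    if len(messages) != len(scores):
        raise ValueError("number of scores does not match number of messages")
    return dict(zip(messages, scores))
```

Called on the corrected line without the "S: " documentation prefix, `parse_relevancy('* ESEARCH (TAG "B1") ALL 1,5,10 RELEVANCY (4 99 42)')` would pair message 1 with score 4, message 5 with 99, and message 10 with 42.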