IPFIX Working Group                                            E. Boschi
Internet-Draft                                               B. Trammell
Intended status: Experimental                             Hitachi Europe
Expires: July 16, 2009                                  January 12, 2009


                     IP Flow Anonymisation Support
                     draft-boschi-ipfix-anon-02.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 16, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract

   This document describes anonymisation techniques for IP flow data and
   the export of anonymised data using the IPFIX protocol.  It provides



Boschi & Trammell         Expires July 16, 2009                 [Page 1]


Internet-Draft        IP Flow Anonymisation Support         January 2009


   a categorisation of common anonymisation schemes and defines the
   parameters needed to describe them.  It provides guidelines for the
   implementation of anonymised data export and storage over IPFIX, and
   describes an Options-based method for anonymisation metadata export
   within the IPFIX protocol, providing the basis for the definition of
   information models for configuring anonymisation techniques within an
   IPFIX Metering or Exporting Process, and for reporting the technique
   in use to an IPFIX Collecting Process.


Table of Contents

   1.  Introduction
     1.1.  IPFIX Protocol Overview
     1.2.  IPFIX Documents Overview
   2.  Terminology
   3.  Categorisation of Anonymisation Techniques
   4.  Anonymisation of IP Flow Data
     4.1.  IP Address Anonymisation
       4.1.1.  Truncation
       4.1.2.  Random Permutation
       4.1.3.  Prefix-preserving Pseudonymisation
     4.2.  Timestamp Anonymisation
       4.2.1.  Precision Degradation
       4.2.2.  Enumeration
       4.2.3.  Random Time Shifts
     4.3.  Counter Anonymisation
       4.3.1.  Precision Degradation
       4.3.2.  Binning
       4.3.3.  Random Noise Addition
     4.4.  Anonymisation of Other Flow Fields
   5.  Applying Anonymisation Techniques to IPFIX Export and Storage
     5.1.  Arrangement of Processes in IPFIX Anonymisation
     5.2.  IPFIX-Specific Anonymisation Guidelines
       5.2.1.  Anonymisation of Header Data
       5.2.2.  Anonymisation of Options Data
   6.  Parameters for the Description of Anonymisation Techniques
   7.  Anonymisation Metadata Support in IPFIX
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgments
   11. References
     11.1. Normative References
     11.2. Informative References
   Authors' Addresses





1.  Introduction

   The standardisation of an IP flow information export protocol
   [RFC5101] and associated representations removes a technical barrier
   to the sharing of IP flow data across organizational boundaries and
   with network operations, security, and research communities for a
   wide variety of purposes.  However, with wider dissemination come
   greater risks to the privacy of the users of networks under
   measurement, and to the security of those networks.  While it is not
   a complete solution to the issues posed by distribution of IP flow
   information, anonymisation is an important tool for the protection of
   privacy within network measurement infrastructures.

   This document presents a mechanism for representing anonymised data
   within IPFIX and guidelines for using it.  It begins with a
   categorisation of anonymisation techniques.  It then describes the
   applicability of each technique to commonly anonymisable fields of IP
   flow data, organized by information element data type and semantics
   as in [RFC5102]; enumerates the parameters required by each of the
   applicable anonymisation techniques; and provides guidelines for the
   use of each of these techniques in accordance with best practices in
   data protection.  Finally, it specifies a mechanism for exporting
   anonymised data and binding anonymisation metadata to templates using
   IPFIX Options.

1.1.  IPFIX Protocol Overview

   In the IPFIX protocol, { type, length, value } tuples are expressed
   in templates containing { type, length } pairs, specifying which {
   value } fields are present in data records conforming to the
   Template, giving great flexibility as to what data is transmitted.
   Since Templates are sent very infrequently compared with Data
   Records, this results in significant bandwidth savings.  Various
   different data formats may be transmitted simply by sending new
   Templates specifying the { type, length } pairs for the new data
   format.  See [RFC5101] for more information.
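
   As a rough illustration of the Template mechanism, the Python sketch
   below decodes one Data Record using a list of { type, length } pairs.
   It is a simplification, not the IPFIX wire format: Message and Set
   headers are omitted, and the names TEMPLATE and decode_record are
   hypothetical.  The Information Element numbers are taken from
   [RFC5102] (8 = sourceIPv4Address, 12 = destinationIPv4Address,
   2 = packetDeltaCount).

```python
import ipaddress

# A Template is an ordered list of (type, length) pairs; the values are
# carried separately in Data Records.  Headers are omitted in this sketch.
TEMPLATE = [(8, 4), (12, 4), (2, 8)]  # srcIPv4, dstIPv4, packetDeltaCount


def decode_record(template, data):
    """Split one Data Record into {type: value} using the Template."""
    record, offset = {}, 0
    for ie_type, length in template:
        record[ie_type] = int.from_bytes(data[offset:offset + length], "big")
        offset += length
    return record


# Build a sample record that conforms to TEMPLATE, then decode it.
data = (ipaddress.IPv4Address("192.0.2.1").packed
        + ipaddress.IPv4Address("198.51.100.7").packed
        + (42).to_bytes(8, "big"))
rec = decode_record(TEMPLATE, data)
```

   Sending a different Template changes how the same octets are
   interpreted, without any change to the Data Record encoding itself.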

   The IPFIX information model [RFC5102] defines a large number of
   standard Information Elements which provide the necessary { type }
   information for Templates.  The use of standard elements enables
   interoperability among different vendors' implementations.
   Additionally, non-standard enterprise-specific elements may be
   defined for private use.

1.2.  IPFIX Documents Overview

   "Specification of the IPFIX Protocol for the Exchange of IP Traffic
   Flow Information" [RFC5101] and its associated documents define the
   IPFIX Protocol, which provides network engineers and administrators
   with access to IP traffic flow information.

   "Architecture for IP Flow Information Export"
   [I-D.ietf-ipfix-architecture] defines the architecture for the export
   of measured IP flow information out of an IPFIX Exporting Process to
   an IPFIX Collecting Process, and the basic terminology used to
   describe the elements of this architecture, per the requirements
   defined in "Requirements for IP Flow Information Export" [RFC3917].
   The IPFIX Protocol document [RFC5101] then covers the details of the
   method for transporting IPFIX Data Records and Templates via a
   congestion-aware transport protocol from an IPFIX Exporting Process
   to an IPFIX Collecting Process.

   "Information Model for IP Flow Information Export" [RFC5102]
   describes the Information Elements used by IPFIX, including details
   on Information Element naming, numbering, and data type encoding.
   Finally, "IPFIX Applicability" [I-D.ietf-ipfix-as] describes the
   various applications of the IPFIX protocol and their use of
   information exported via IPFIX, and relates the IPFIX architecture to
   other measurement architectures and frameworks.

   Additionally, the "Specification of the IPFIX File Format"
   [I-D.ietf-ipfix-file] describes a file format based upon the IPFIX
   Protocol for the storage of flow data.

   This document references the Protocol and Architecture documents for
   terminology, and extends the IPFIX Information Model to provide new
   Information Elements for anonymisation metadata.  The anonymisation
   techniques described herein are equally applicable to the IPFIX
   Protocol and data stored in IPFIX Files.


2.  Terminology

   Terms used in this document that are defined in the Terminology
   section of the IPFIX Protocol [RFC5101] document are to be
   interpreted as defined there.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].


3.  Categorisation of Anonymisation Techniques

   Anonymisation modifies a data set in order to protect the identity of
   the people or entities described by the data set from disclosure.
   With respect to network traffic data, anonymisation generally
   attempts to preserve some set of properties of the network traffic
   useful for a given application or applications, while ensuring the
   data cannot be traced back to the specific networks, hosts, or users
   generating the traffic.

   Anonymisation may be broadly classified according to two properties:
   recoverability and countability.  All anonymisation techniques map
   the real space of identifiers or values into a separate, anonymised
   space, according to some function.  A technique is said to be
   recoverable when the function used is invertible or can otherwise be
   reversed and a real identifier can be recovered from a given
   replacement identifier.

   Countability compares the dimension of the anonymised space (N) to
   the dimension of the real space (M), and denotes how the count of
   unique values is preserved by the anonymisation function.  If the
   anonymised space is smaller than the real space, then the function is
   said to generalise the input, mapping more than one input point to
   each anonymous value (e.g., as with aggregation).  By definition,
   generalisation is not recoverable.

   If the dimensions of the anonymised and real spaces are the same,
   such that the count of unique values is preserved, then the function
   is said to be a direct substitution function.  If the dimension of
   the anonymised space is larger, such that each real value maps to a
   set of anonymised values, then the function is said to be a set
   substitution function.  Note that with set substitution functions,
   the sets of anonymised values are not necessarily disjoint.  Either
   direct or set substitution functions are said to be one-way if there
   exists no method for recovering the real data point from an
   anonymised one.

   This classification is summarised in the table below.

   +------------------------+-----------------+------------------------+
   | Recoverability /       | Recoverable     | Non-recoverable        |
   | Countability           |                 |                        |
   +------------------------+-----------------+------------------------+
   | N < M                  | N.A.            | Generalisation         |
   | N = M                  | Direct          | One-way Direct         |
   |                        | Substitution    | Substitution           |
   | N > M                  | Set             | One-way Set            |
   |                        | Substitution    | Substitution           |
   +------------------------+-----------------+------------------------+
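
   The table can also be read as a small decision procedure.  The
   following Python sketch is illustrative only; the function name
   classify and its arguments are assumptions, with n and m standing for
   the sizes of the anonymised and real spaces.

```python
def classify(n, m, recoverable):
    """Name the category for anonymised-space size n and real-space size m."""
    if n < m:
        if recoverable:
            # Several inputs share one output, so no inverse can exist.
            raise ValueError("generalisation cannot be recoverable")
        return "Generalisation"
    kind = "Direct Substitution" if n == m else "Set Substitution"
    return kind if recoverable else "One-way " + kind
```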






4.  Anonymisation of IP Flow Data

   Due to the restricted semantics of IP flow data, there is a
   relatively limited set of specific anonymisation techniques available
   for flow data, though each falls into the broad categories above.
   Each type of field that may commonly appear in a flow record may have
   its own applicable specific techniques.

   While anonymisation is generally applied at the resolution of single
   fields within a flow record, attacks against anonymisation use entire
   flows and relationships between hosts and flows within a given data
   set.  Therefore, fields which may not necessarily be identifying by
   themselves may be anonymised in order to increase the anonymity of
   the data set as a whole.

   Of all the fields in an IP flow record, only IP addresses directly
   identify entities in the real world.  Each IP address is associated
   with an interface on a network host, and can potentially be
   identified with a single user.  Additionally, IP addresses are
   structured identifiers; that is, partial IP address prefixes may be
   used to identify networks just as full IP addresses identify hosts.
   This makes anonymisation of IP addresses particularly important.

   Port numbers identify abstract entities (applications) as opposed to
   real-world entities, but they can be used to classify hosts and user
   behavior.  Passive port fingerprinting, both of well-known and
   ephemeral ports, can be used to determine the operating system
   running on a host.  Relative data volumes by port can also be used to
   determine the host's function (workstation, web server, etc.); this
   information can be used to identify hosts and users.

   While not identifiers in and of themselves, timestamps and counters
   can reveal the behavior of the hosts and users on a network.  Any
   given network activity is recognizable by a pattern of relative time
   differences and data volumes in the associated sequence of flows,
   even without host address information.  They can therefore be used to
   identify hosts and users.  Timestamps and counters are also
   vulnerable to traffic injection attacks, where traffic with a known
   pattern is injected into a network under measurement, and this
   pattern is later identified in the anonymised data set.

   The simplest and most extreme form of anonymisation, which can be
   applied to any field of a flow record, is black-marker anonymisation,
   or complete deletion of a given field.  Note that black-marker
   anonymisation is equivalent to simply not exporting the field(s) in
   question.

   While black-marker anonymisation completely protects the data in the
   deleted fields from the risk of disclosure, it also reduces the
   utility of the anonymised data set as a whole.  Techniques that
   retain some information while reducing (though not eliminating) the
   disclosure risk will be extensively discussed in the following
   sections; note that the techniques specifically applicable to IP
   addresses, timestamps, and counters will be discussed in separate
   sections.

4.1.  IP Address Anonymisation

   The following table gives an overview of the schemes for IP address
   anonymisation described in this document and their categorisation.

   +-------------------------------+-------------------+---------------+
   | Scheme                        | Action            | Reversibility |
   +-------------------------------+-------------------+---------------+
   | Truncation                    | Generalisation    | N             |
   | Random Permutation            | Direct            | Y/N           |
   |                               | Substitution      |               |
   | Prefix-preserving             | Direct            | Y             |
   | Pseudonymisation              | Substitution      |               |
   +-------------------------------+-------------------+---------------+

   Note that random permutations might be either reversible or not,
   depending on the function used.

4.1.1.  Truncation

   Truncation removes the "n" least significant bits from an IP address,
   replacing them with zeroes.  For example, truncating 8 bits replaces
   an IPv4 address with the address of its enclosing /24 network
   (historically, the corresponding class C network address).
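
   A minimal Python sketch of truncation follows, assuming the removed
   bits are set to zero and using the standard ipaddress module.  The
   function name truncate is hypothetical.

```python
import ipaddress


def truncate(address, n):
    """Zero the n least significant bits of an IPv4 or IPv6 address."""
    ip = ipaddress.ip_address(address)
    # Clear the low n bits and rebuild an address of the same family.
    return type(ip)(int(ip) & ~((1 << n) - 1))
```

   For example, truncate("192.0.2.77", 8) yields 192.0.2.0, so every
   host in that /24 network maps to the same anonymised value.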

4.1.2.  Random Permutation

   Random permutation replaces each IP address with a unique address
   randomly selected from the set of possible IP addresses.  The
   permutation function can be implemented using a hash table to ensure
   uniqueness.
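
   One possible realisation of this idea is sketched below in Python for
   32-bit IPv4 addresses held as integers.  The class name and its
   attributes are hypothetical.  The mapping is built lazily, and a set
   of already-assigned values guarantees uniqueness; whether the result
   can be reversed depends on whether the table is retained.

```python
import secrets


class RandomPermutation:
    """Lazily built random one-to-one mapping over 32-bit addresses."""

    def __init__(self):
        self.forward = {}   # real address -> anonymised address
        self.used = set()   # anonymised addresses already assigned

    def anonymise(self, addr):
        if addr not in self.forward:
            candidate = secrets.randbits(32)
            while candidate in self.used:   # redraw on collision
                candidate = secrets.randbits(32)
            self.forward[addr] = candidate
            self.used.add(candidate)
        return self.forward[addr]
```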

4.1.3.  Prefix-preserving Pseudonymisation

   Prefix-preserving pseudonymisation preserves the structure of subnets
   at each level while anonymising IP addresses.  If two real IP
   addresses match on a prefix of "n" bits, the two anonymised IP
   addresses will match on a prefix of "n" bits as well.
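
   The prefix-preserving property can be obtained by XORing each address
   bit with a keyed pseudorandom function of the bits that precede it.
   The Python sketch below follows that general approach; the use of
   HMAC-SHA-256 as the pseudorandom function, the message encoding, and
   the function name pp_anonymise are assumptions made for illustration,
   not a scheme specified by this document.

```python
import hashlib
import hmac


def pp_anonymise(addr, key, bits=32):
    """Prefix-preserving mapping of a `bits`-wide integer address."""
    out = 0
    for i in range(bits):
        prefix = addr >> (bits - i)          # the i most significant bits
        msg = i.to_bytes(1, "big") + prefix.to_bytes(16, "big")
        # One pseudorandom bit that depends only on the key and the prefix.
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        bit = (addr >> (bits - 1 - i)) & 1
        out = (out << 1) | (bit ^ flip)
    return out
```

   Because each output bit depends only on the corresponding input bit
   and the bits before it, two addresses that share an n-bit prefix map
   to outputs that share an n-bit prefix, and differ at the same bit.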






4.2.  Timestamp Anonymisation

   [TODO: introductory text]

   +-----------------------+---------------------------+---------------+
   | Scheme                | Action                    | Reversibility |
   +-----------------------+---------------------------+---------------+
   | Precision Degradation | Generalisation            | N             |
   | Enumeration           | Direct or Set             | Y             |
   |                       | Substitution              |               |
   | Random Time Shifts    | Direct Substitution       | Y             |
   +-----------------------+---------------------------+---------------+

4.2.1.  Precision Degradation

   Precision Degradation removes the most precise components of a
   timestamp, treating all events occurring within a given interval
   (e.g., one millisecond for millisecond-level degradation) as
   simultaneous.  This has the effect of potentially collapsing many
   timestamps into one.  With this technique, time precision is reduced
   and sequencing may be lost, but the approximate time at which each
   event occurred is preserved.
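
   A minimal Python sketch, assuming timestamps are held as integer
   milliseconds; the function name degrade_timestamp is hypothetical.

```python
def degrade_timestamp(ts_ms, interval_ms):
    """Round a millisecond timestamp down to the start of its interval."""
    return ts_ms - (ts_ms % interval_ms)
```

   With interval_ms set to 1000, all events within the same second
   receive the same timestamp, so their relative order is lost.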

4.2.2.  Enumeration

   Enumeration keeps the chronological order in which events occurred
   while eliminating time information.  Timestamps are substituted by
   equidistant timestamps (or numbers) starting from a randomly chosen
   start value.
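
   Enumeration might be sketched in Python as follows.  The function
   name, the step parameter, and the choice to map equal timestamps to
   the same value are assumptions made for illustration.

```python
import secrets


def enumerate_timestamps(timestamps, step=1, start=None):
    """Replace timestamps with equidistant values that keep their order."""
    if start is None:
        start = secrets.randbelow(2**32)    # randomly chosen start value
    order = sorted(set(timestamps))
    mapping = {ts: start + i * step for i, ts in enumerate(order)}
    return [mapping[ts] for ts in timestamps]
```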

4.2.3.  Random Time Shifts

   Random Time Shifts keep the information on how far apart two events
   are from each other.  This is achieved by shifting all timestamps by
   the same random number.  Note that random time shifts also preserve
   chronological order.
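
   A minimal Python sketch of a random time shift, assuming integer
   millisecond timestamps; the function name and the max_shift_ms
   parameter are hypothetical.

```python
import secrets


def shift_timestamps(timestamps, max_shift_ms=86_400_000):
    """Shift every timestamp by one shared random offset (milliseconds)."""
    offset = secrets.randbelow(2 * max_shift_ms + 1) - max_shift_ms
    return [ts + offset for ts in timestamps], offset
```

   Since a single offset is applied to the whole data set, differences
   between timestamps are unchanged, and anyone who learns the offset
   can reverse the shift.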

4.3.  Counter Anonymisation

   Counters (such as packet and octet volumes per flow) are subject to
   fingerprinting and injection attacks against anonymisation, as
   timestamps are, but relative magnitudes of activity can be useful for
   certain analysis tasks.  [TODO: more intro text]

   +-----------------------+---------------------------+---------------+
   | Scheme                | Action                    | Reversibility |
   +-----------------------+---------------------------+---------------+
   | Precision Degradation | Generalisation            | N             |
   | Binning               | Generalisation            | N             |
   | Random noise addition | Direct or Set             | N             |
   |                       | Substitution              |               |
   +-----------------------+---------------------------+---------------+

4.3.1.  Precision Degradation

   As with precision degradation in timestamps, precision degradation of
   counters removes lower-order bits of the counters, treating all the
   counters in a given range as having the same value.  Depending on the
   precision reduction, this loses information about the relationships
   between sizes of similarly-sized flows, but keeps relative magnitude
   information.
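
   A minimal Python sketch, assuming counters are non-negative integers;
   the function name degrade_counter is hypothetical.

```python
def degrade_counter(value, bits):
    """Zero the `bits` lowest-order bits of a non-negative counter."""
    return (value >> bits) << bits
```

   With bits set to 8, octet counts of 1300 and 1500 both become 1280,
   while a count of 100000 becomes 99840: similar sizes collapse, but
   the difference in magnitude remains visible.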

4.3.2.  Binning

   Binning is closely related to precision degradation; the operation is
   identical, except that in precision degradation the counter ranges
   are uniform, and in binning they need not be.  For example, a common
   counter binning scheme for packet counters could be to bin values 1-2
   together, and 3-infinity together, thereby separating potentially
   completely-opened TCP connections from unopened ones.  Binning
   schemes are generally chosen to keep precisely the amount of
   information required in a counter for a given analysis task.
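
   The packet-counter example can be sketched in Python as follows.
   Representing each bin by its lower bound, and the names bin_counter
   and PACKET_BINS, are assumptions made for illustration.

```python
import bisect

PACKET_BINS = [1, 3]  # lower bounds: 1-2 packets, and 3 or more packets


def bin_counter(value, lower_bounds):
    """Replace a counter with the lower bound of the bin containing it."""
    idx = bisect.bisect_right(lower_bounds, value) - 1
    if idx < 0:
        raise ValueError("value below the first bin")
    return lower_bounds[idx]
```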

4.3.3.  Random Noise Addition

   Random noise addition adds a random amount to a counter in each flow;
   this is used to keep relative magnitude information and minimize the
   disruption to size relationship information while avoiding
   fingerprinting attacks against anonymisation.
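
   A minimal Python sketch of noise addition follows.  The uniform
   distribution, the clamping of results at zero, and the names
   add_noise and max_noise are assumptions; this document does not
   prescribe a particular noise distribution.

```python
import random

_rng = random.SystemRandom()  # operating-system entropy source


def add_noise(value, max_noise):
    """Add uniform integer noise in [-max_noise, max_noise], floored at 0."""
    return max(0, value + _rng.randint(-max_noise, max_noise))
```

   Keeping max_noise small relative to typical counter values limits the
   disruption to size relationships between flows.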

4.4.  Anonymisation of Other Flow Fields

   [TODO: as section 4.1]


5.  Applying Anonymisation Techniques to IPFIX Export and Storage

   When exporting or storing anonymised flow data using IPFIX, certain
   interactions between the IPFIX Protocol and the anonymisation
   techniques in use must be considered; these are treated in the
   subsections below.



5.1.  Arrangement of Processes in IPFIX Anonymisation

   Anonymisation may be applied to IPFIX data at three stages within the
   collection infrastructure: on initial export, at a mediator, or after
   collection, as shown in Figure 1.  Each of these locations has
   specific considerations and applicability.


                       +--------------------+
                       | IPFIX File Storage |
                       +--------------------+
                         ^
                         | (Anonymised after collection)
                         |
               +=======================================+
               | Collecting Process                    |
               +=======================================+
                 ^                                   ^
                 | (Anonymised at mediator)          |
                 |                                   |
               +=============================+       |
               | Mediator                    |       |
               +=============================+       |
                 ^                                   |
                 |    (Anonymised on initial export) |
                 |                                   |
               +=======================================+
               | Exporting Process                     |
               +=======================================+

                Figure 1: Potential Anonymisation Locations

   Anonymisation is generally performed before the wider dissemination
   or repurposing of a flow data set, e.g., adapting operational
   measurement data for research.  Therefore, direct anonymisation of
   flow data on initial export is only applicable in certain restricted
   circumstances: when the Exporting Process is "publishing" data to a
   Collecting Process directly, and the Exporting Process and Collecting
   Process are operated by different entities.  Note that certain
   guidelines in Section 5.2.1 with respect to timestamp anonymisation
   may not apply in this case, as the Collecting Process may be able to
   deduce certain timing information from the time at which each Message
   is received.

   A much more flexible arrangement is to anonymise data within a
   Mediator [I-D.ietf-ipfix-mediators-framework].  Here, original data
   is sent to a Mediator, which performs the anonymisation function and
   re-exports the anonymised data.  Such a Mediator could be located at
   the administrative domain boundary of the initial Exporting Process
   operator, exporting anonymised data to other consumers outside the
   organisation.  In this case, the original Exporter SHOULD use TLS as
   specified in [RFC5101] to secure the channel to the Mediator, and the
   Mediator should follow the guidelines in Section 5.2, to mitigate the
   risk of original data disclosure.

   When data is to be published as an anonymised data set in an IPFIX
   File [I-D.ietf-ipfix-file], the anonymisation may also be done at the
   final Collecting Process before storage and dissemination.  In this
   case, the Collector should follow the guidelines in Section 5.2,
   especially as regards File-specific Options in Section 5.2.2.

   Note that anonymisation may occur at more than one location within a
   given collection infrastructure, to provide varying levels of
   anonymisation reversal risk and utility for specific purposes.

5.2.  IPFIX-Specific Anonymisation Guidelines

   In implementing and deploying the anonymisation techniques described
   in this document, care must be taken that data structures supporting
   the operation of the protocol itself do not leak data that could be
   used to reverse the anonymisation applied to the flow data.  Such
   data structures may appear in the header, or within the data stream
   itself, especially as options data.  Each of these and their impact
   on specific anonymisation techniques is noted in a separate
   subsection below.

5.2.1.  Anonymisation of Header Data

   Each IPFIX Message contains a Message Header; within this Message
   Header are contained two fields which may be used to break certain
   anonymisation techniques: the Export Time and the Observation Domain
   ID.

   Processes exporting IPFIX Messages containing anonymised timestamp
   data, where the original Export Time Message Header field has some
   relationship to the anonymised timestamps, SHOULD anonymise the
   Export Time header field using an equivalent technique, if possible.
   Otherwise, relationships between export and flow time could be used
   to partially or totally reverse timestamp anonymisation.
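
   As an illustration, if flow timestamps were anonymised with a
   constant shift, the same shift could be applied to the Export Time.
   The Python sketch below assumes the 16-octet Message Header layout of
   [RFC5101] (Version, Length, Export Time in seconds, Sequence Number,
   Observation Domain ID); the function name shift_export_time is
   hypothetical.

```python
import struct

# Version, Length, Export Time, Sequence Number, Observation Domain ID
HEADER = struct.Struct("!HHIII")


def shift_export_time(message, offset_seconds):
    """Return a copy of an IPFIX Message with Export Time shifted."""
    version, length, export_time, seq, odid = HEADER.unpack_from(message)
    shifted = (export_time + offset_seconds) % 2**32   # stay in 32 bits
    header = HEADER.pack(version, length, shifted, seq, odid)
    return header + message[HEADER.size:]
```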

   The similarity in size between an Observation Domain ID and an IPv4
   address (32 bits) may lead to a temptation to use an IPv4 interface
   address on the Metering or Exporting Process as the Observation
   Domain ID.  If this address bears some relation to the IP addresses
   in the flow data (e.g., shares a network prefix with internal
   addresses) and the IP addresses in the flow data are anonymised in a
   structure-preserving way, then the Observation Domain ID may be used
   to break the IP address anonymisation.  Use of an IPv4 interface
   address on the Metering or Exporting Process as the Observation
   Domain ID is NOT RECOMMENDED in this case.
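
   An operator could check for this situation before export.  The Python
   sketch below is one hypothetical way to do so: it reads the 32-bit
   Observation Domain ID as an IPv4 address and tests it against a list
   of internal networks.  The function name and the internal_networks
   parameter are assumptions.

```python
import ipaddress


def odid_leaks_prefix(observation_domain_id, internal_networks):
    """True if the ID, read as an IPv4 address, lies in any listed network."""
    as_addr = ipaddress.IPv4Address(observation_domain_id)
    return any(as_addr in ipaddress.ip_network(net)
               for net in internal_networks)
```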

   [EDITOR'S NOTE: We might want to see if anyone is actually doing this
   with IPFIX.  The example comes from other network measurement tools
   (e.g.  Argus) which default to using an IPv4 address as a sensor ID.]

5.2.2.  Anonymisation of Options Data

   IPFIX uses the Options mechanism to export, among other things,
   metadata about exported flows and the flow collection infrastructure.
   As with the IPFIX Message Header, certain Options recommended in
   [RFC5101] and the IPFIX File Format [I-D.ietf-ipfix-file] containing
   flow timestamps and network addresses of Exporting and Collecting
   Processes may be used to break certain anonymisation techniques; care
   should be taken while using them with anonymised data export and
   storage.

   The Exporting Process Reliability Statistics Options Template,
   recommended in [RFC5101], contains an Exporting Process ID field,
   which may be an exportingProcessIPv4Address Information Element or an
   exportingProcessIPv6Address Information Element.  If the Exporting
   Process address bears some relation to the IP addresses in the flow
   data (e.g., shares a network prefix with internal addresses) and the
   IP addresses in the flow data are anonymised in a structure-
   preserving way, then the Exporting Process address may be used to
   break the IP address anonymisation.  Exporting Processes exporting
   anonymised data in this situation SHOULD mitigate the risk of attack
   either by omitting Options described by the Exporting Process
   Reliability Statistics Options Template, or by anonymising the
   Exporting Process address using a similar technique to that used to
   anonymise the IP addresses in the exported data.

   Similarly, the Export Session Details Options Template and Message
   Details Options Template specified for the IPFIX File Format
   [I-D.ietf-ipfix-file] may contain the exportingProcessIPv4Address
   Information Element or the exportingProcessIPv6Address Information
   Element to identify an Exporting Process from which a flow record was
   received, and the collectingProcessIPv4Address Information Element or
   the collectingProcessIPv6Address Information Element to identify the
   Collecting Process which received it.  If the Exporting Process or
   Collecting Process address bears some relation to the IP addresses in
   the flow data (e.g., shares a network prefix with internal addresses)
   and the IP addresses in the flow data are anonymised in a structure-
   preserving way, then the Exporting Process or Collecting Process
   address may be used to break the IP address anonymisation.  Since
   these Options Templates are primarily intended for storing IPFIX
   Transport Session data for auditing, replay, and testing purposes, it
   is NOT RECOMMENDED that storage of anonymised data include these
   Options Templates; omitting them mitigates the risk of attack.

   The Message Details Options Template specified for the IPFIX File
   Format [I-D.ietf-ipfix-file] also contains the
   collectionTimeMilliseconds Information Element.  As with the Export
   Time Message Header field, if the exported flow data contains
   anonymised timestamp information, and the collectionTimeMilliseconds
   Information Element in a given Message has some relationship to the
   anonymised timestamp information, then this relationship can be
   exploited to reverse the timestamp anonymisation.  This Options
   Template is primarily intended for storing IPFIX Transport Session
   data for auditing, replay, and testing purposes.  To mitigate the
   risk of attack, it is therefore NOT RECOMMENDED that stored
   anonymised data include this Options Template.
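
   To illustrate how such a relationship could be exploited, the
   following Python sketch assumes, purely for the example, that flow
   end timestamps were anonymised by adding one constant shift and that
   each Message is collected shortly after its last flow ends.  Neither
   assumption comes from this document, and the function name is
   hypothetical.  Under those assumptions, an unmodified
   collectionTimeMilliseconds value reveals the shift to within the
   export delay.

```python
def estimate_time_shift_ms(collection_time_ms, anonymised_flow_end_ms):
    """Estimate the correction that undoes a constant timestamp shift.

    Assumes the latest flow in the Message ended just before the
    Message was collected, so the gap between the collection time and
    the latest anonymised end time approximates the applied shift.
    """
    return collection_time_ms - max(anonymised_flow_end_ms)


# Hypothetical data: flows shifted one day into the past.
real_flow_ends = [1_231_761_600_000, 1_231_761_600_450]
applied_shift = -86_400_000
anonymised_ends = [t + applied_shift for t in real_flow_ends]

# Collection time left unmodified, 50 ms after the last real flow end.
collection_time = 1_231_761_600_500

estimate = estimate_time_shift_ms(collection_time, anonymised_ends)
recovered_ends = [t + estimate for t in anonymised_ends]
```

   In this example the recovered timestamps differ from the real ones
   only by the 50 ms export delay, which is why the Options Template
   should be omitted rather than stored alongside anonymised data.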

   Since the Time Window Options Template specified for the IPFIX File
   Format [I-D.ietf-ipfix-file] refers to the timestamps within the flow
   data to provide partial table of contents information for an IPFIX
   File, care must be taken to ensure that Options described by this
   template are written using the anonymised timestamps instead of the
   original ones.


6.  Parameters for the Description of Anonymisation Techniques

   [TODO: see corresponding section of draft-ietf-psamp-sample-tech for
   the proposed structure of this section.]


7.  Anonymisation Metadata Support in IPFIX

   [TODO: Here we'll describe how the information specified above can be
   transmitted on the wire using an Options Template.  The idea is to
   scope the option to the Template ID and to specify which fields are
   anonymised, with information on the output characteristics of the
   technique applied, and which are not.]

   [EDITOR'S NOTE: Multiple anonymisation techniques applied to an IE at
   the same time are indicated with multiple elements of the same type
   (in application order, as in PSAMP).]

   [EDITOR'S NOTE: For blackmarking, we'll recommend not exporting the
   information at all, following the data protection law principle that
   only necessary information should be exported.]


8.  Security Considerations

   [TODO: write this section.]


9.  IANA Considerations

   This document contains no actions for IANA.


10.  Acknowledgments

   We thank Paul Aitken for his comments and insight, and the PRISM
   project for its support of this work.


11.  References

11.1.  Normative References

   [RFC5101]  Claise, B., "Specification of the IP Flow Information
              Export (IPFIX) Protocol for the Exchange of IP Traffic
              Flow Information", RFC 5101, January 2008.

   [RFC5102]  Quittek, J., Bryant, S., Claise, B., Aitken, P., and J.
              Meyer, "Information Model for IP Flow Information Export",
              RFC 5102, January 2008.

11.2.  Informative References

   [I-D.ietf-ipfix-as]
              Zseby, T., "IPFIX Applicability", draft-ietf-ipfix-as-12
              (work in progress), July 2007.

   [I-D.ietf-ipfix-architecture]
              Sadasivan, G., "Architecture for IP Flow Information
              Export", draft-ietf-ipfix-architecture-12 (work in
              progress), September 2006.

   [I-D.ietf-ipfix-file]
              Trammell, B., Boschi, E., Mark, L., Zseby, T., and A.
              Wagner, "Specification of the IPFIX File Format",
              draft-ietf-ipfix-file-03 (work in progress), October 2008.

   [I-D.ietf-ipfix-mediators-framework]
              Kobayashi, A., Nishida, H., and B. Claise, "IPFIX
              Mediation: Framework",
              draft-ietf-ipfix-mediators-framework-01 (work in
              progress), November 2008.

   [RFC3917]  Quittek, J., Zseby, T., Claise, B., and S. Zander,
              "Requirements for IP Flow Information Export (IPFIX)",
              RFC 3917, October 2004.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.


Authors' Addresses

   Elisa Boschi
   Hitachi Europe
   c/o ETH Zurich
   Gloriastrasse 35
   8092 Zurich
   Switzerland

   Phone: +41 44 632 70 57
   Email: elisa.boschi@hitachi-eu.com


   Brian Trammell
   Hitachi Europe
   c/o ETH Zurich
   Gloriastrasse 35
   8092 Zurich
   Switzerland

   Phone: +41 44 632 70 13
   Email: brian.trammell@hitachi-eu.com
