SIPCLF                                                      G. Salgueiro
Internet-Draft                                             Cisco Systems
Intended status: Standards Track                              V. Gurbani
Expires: September 6, 2011                     Bell Labs, Alcatel-Lucent
                                                             A. B. Roach
                                                                 Tekelec
                                                           March 5, 2011


Format for the Session Initiation Protocol (SIP) Common Log Format (CLF)
                draft-salgueiro-sipclf-indexed-ascii-03

Abstract

   The SIPCLF Workgroup has defined a common log format framework for
   Session Initiation Protocol (SIP) servers.  This common log format
   mimics the wildly successful event logging mechanism found in well-
   known web servers like Apache and web proxies like Squid.  This
   document proposes an indexed text encoding format for the SIP Common
   Log Format (CLF) that retains the key advantages of a text-based
   format, while significantly increasing processing performance over a
   purely text-based implementation.  This file format adheres to the
   SIP CLF data model and provides an effective encoding scheme for all
   mandatory and optional fields that appear in a SIP CLF record.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 6, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal



Salgueiro, et al.       Expires September 6, 2011               [Page 1]


Internet-Draft             Format for SIP CLF                 March 2011


   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   3.  Format . . . . . . . . . . . . . . . . . . . . . . . . . . . .  4
   4.  Example Record . . . . . . . . . . . . . . . . . . . . . . . .  9
   5.  Text Tool Considerations . . . . . . . . . . . . . . . . . . . 10
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 11
   7.  Operational Guidance . . . . . . . . . . . . . . . . . . . . . 11
   8.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 11
   9.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 11
   10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11
     10.1.  Normative References  . . . . . . . . . . . . . . . . . . 11
     10.2.  Informative References  . . . . . . . . . . . . . . . . . 12
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12


























Salgueiro, et al.       Expires September 6, 2011               [Page 2]


Internet-Draft             Format for SIP CLF                 March 2011


1.   Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   [RFC3261] defines additional terms used in this document that are
   specific to the SIP domain such as "proxy"; "registrar"; "redirect
   server"; "user agent server" or "UAS"; "user agent client" or "UAC";
   "back-to-back user agent" or "B2BUA"; "dialog"; "transaction";
   "server transaction".

   This document uses the term "SIP Server" that is defined to include
   the following SIP entities: user agent server, registrar, redirect
   server, a SIP proxy in the role of user agent server, and a B2BUA in
   the role of a user agent server.


2.  Introduction

   The extensive list of benefits and the widespread adoption of the
   Apache Common Log Format (CLF) has prompted the development of a
   functionally equivalent event logging mechanism for the Session
   Initiation Protocol [RFC3261] (SIP).  Implementing a logging scheme
   for SIP is a considerable challenge.  This is due in part to the fact
   that the behavior of a SIP entity is more complex as compared to an
   HTTP entity.  Additionally, there are shortcomings to the purely
   text-based HTTP Common Log Format that need to be addressed in order
   to allow for real-time inspection of SIP log files.  Experience with
   Apache Common Log Format has shown that dealing with large quantities
   of log data can be very processor intensive, as doing so necessarily
   requires reading and parsing every byte in the log file(s) of
   interest.

   An implementation independent framework for the SIP CLF has been
   defined in [I-D.ietf-sipclf-problem-statement].  This memo describes
   an indexed text file format for logging SIP messages received and
   sent by SIP clients, servers, and proxies that adheres to the data
   model presented in Section 8 of [I-D.ietf-sipclf-problem-statement].
   This document defines a format that is no more difficult to generate
   by logging entities, while being radically faster to process.  In
   particular, the format is optimized for both rapidly scanning through
   log records, as well as quickly locating commonly accessed data
   fields.

   Further, the format proposed by this document retains the key
   advantage of being human readable and able to be processed using the
   various Unix text processing tools, such as sed, awk, perl, cut, and



Salgueiro, et al.       Expires September 6, 2011               [Page 3]


Internet-Draft             Format for SIP CLF                 March 2011


   grep.


3.  Format

   The Common Log Format for the Session Initiation Protocol
   [I-D.ietf-sipclf-problem-statement] defines a data model to which
   this format adheres.  Each SIP CLF record MUST consist of all the
   mandatory data model elements outlined in Section 8.1 of
   [I-D.ietf-sipclf-problem-statement].

   The following figure depicts the format whereby all SIP CLF records
   are encoded:


      0          7 8        15 16       23 24         31
      +-----------+-----------+-----------+-----------+
      |  Version  |           Record Length           | 0 - 3
      +-----------+-----------+-----------+-----------+
      |       Record Length (cont)        |    0x2C   | 4 - 7
      +-----------+-----------+-----------+-----------+
      |            Flags Field            |    0x2C   | 8 - 11
      +-----------+-----------+-----------+-----------+
      |              CSeq Pointer (Hex)               | 12 - 15
      +-----------+-----------+-----------+-----------+
      |      Response Status-Code Pointer (Hex)       | 16 - 19
      +-----------+-----------+-----------+-----------+
      |              R-URI Pointer (Hex)              | 20 - 23
      +-----------+-----------+-----------+-----------+
      |   Destination IP address:port Pointer (Hex)   | 24 - 27
      +-----------+-----------+-----------+-----------+
      |     Source IP address:port Pointer (Hex)      | 28 - 31
      +-----------+-----------+-----------+-----------+
      |             To URI Pointer (Hex)              | 32 - 35
      +-----------+-----------+-----------+-----------+
      |             To Tag Pointer (Hex)              | 36 - 39
      +-----------+-----------+-----------+-----------+
      |            From URI Pointer (Hex)             | 40 - 43
      +-----------+-----------+-----------+-----------+
      |            From Tag Pointer (Hex)             | 44 - 47
      +-----------+-----------+-----------+-----------+
      |             Call-Id Pointer (Hex)             | 48 - 51
      +-----------+-----------+-----------+-----------+
      |           Server-Txn Pointer (Hex)            | 52 - 55
      +-----------+-----------+-----------+-----------+
      |           Client-Txn Pointer (Hex)            | 56 - 59
      +-----------+-----------+-----------+-----------+
      |            TLV Start Pointer (Hex)            | 60 - 63



Salgueiro, et al.       Expires September 6, 2011               [Page 4]


Internet-Draft             Format for SIP CLF                 March 2011


      +-----------+-----------+-----------+-----------+
      |    0x0A   |                                   | 64 - 67
      +-----------+                                   +
      |                   Timestamp                   | 68 - 71
      +                                   +-----------+
      |                                   |    0x2E   | 72 - 75
      +-----------+-----------+-----------+-----------+
      |         Fractional Seconds        |    0x09   | 76 - 79
      +-----------+-----------+-----------+-----------+
      |                                               |
      |                                               |
      |                Mandatory Fields               |
      |                                               |
      |                                               |
      +-----------+-----------+-----------+-----------+ \
      |    0x09   |             Tag (Hex)             |  \
      +-----------+-----------+-----------+-----------+   \    Repeated
      | Tag (cont)|    0x2C   |     Length (Hex)      |    \   as many
      +-----------+-----------+-----------+-----------+     >  times as
      |     Length (cont)     |    0x2C   |           |    /   necessary
      +-----------+-----------+-----------+           +   /
      |                     Value                     |  /
      +-----------+-----------+-----------+-----------+ /
      |    0x0A   |
      +-----------+


   Figure 1: SIP Common Log Format

   Note that indications of "hexadecimal encoded" indicate that the
   value is to be written out in human-readable base-16 numbers using
   the ASCII characters 0x30 through 0x39 ('0' through '9') and 0x41
   through 0x46 ('A' through 'F').  Similarly, indications of "decimal
   encoded" indicate that the value is to be written out in human
   readable base-10 number using the ASCII characters 0x30 through 0x39
   ('0' through '9').  In both encodings, numbers always take up the
   number of bytes indicated, and are padded on the left with ASCII '0'
   characters to fill the entire space.

   First, a 64-byte header indicates meta-data about the record.

   Version (1 byte):  0x41 for this document; hexadecimal encoded.

   Record Length (6 bytes):  Hexadecimal encoded total length of this
      log record, including "Flags" and "Record Length" fields, and
      terminating line-feed.





Salgueiro, et al.       Expires September 6, 2011               [Page 5]


Internet-Draft             Format for SIP CLF                 March 2011


   Flags Field (3 bytes):

      byte 1 -   Request/Response flag

         R = request
         r = response

      byte 2 -   Retransmission flag

         o = original transmission
         d = duplicate transmission
         s = server is stateless [i.e., retransmissions are not
         detected]

      byte 3 -   Sent/Received flag

         u = received UDP mesage
         t = received TCP mesage
         l = received TLS mesage
         U = sent UDP mesage
         T = sent TCP mesage
         L = sent TLS mesage

   Bytes 12 through 59 contain hexadecimal encoded pointers that point
   to the values of variable-length mandatory fields.  Note that there
   are no delimiters between these pointer values -- they are packed
   together as a single, 52-character hexadecimal encoded string.The
   "Pointer" fields indicate absolute byte values within the record, and
   MUST be >=80.  They point to the start of the corresponding value
   within the "Mandatory Fields" area.

   When a given mandatory field is not applicable in the SIP CLF record
   (i.e. a particular entity is not able to log that field because it
   does not make sense for the role the entity is playing in the SIP
   ecosystem) then it MUST be encoded as a single ASCII dash (0x2D) and
   will consequently have a corresponding length field value of 1.  The
   final pointer, "TLV Start Pointer," points to the ASCII Tab (0x09)
   character for the first entry in the Tag/Length/Value area; if no
   such entries are present, this value is set to zero.

   CSeq:  The Command Sequence header field, including the CSeq number
      and method name.

   Response Status-Code:  Set to the value of the SIP response status
      code for responses.  Set to a single ASCII dash (0x2D) for
      requests.





Salgueiro, et al.       Expires September 6, 2011               [Page 6]


Internet-Draft             Format for SIP CLF                 March 2011


   R-URI:  The Request-URI in the start line (mandatory in request),
      including any URI parameters.

   Destination IP address:port  The IP address of the downstream server,
      including the port number.  The port number MUST be separated from
      the IP address by a single ':'.

   Source IP address:port  The IP address of the upstream client,
      including the port number over which the SIP message was received.
      The port number MUST be separated from the IP address by a single
      ':'.

   To URI:  Value of the URI in the To header field.

   To Tag:  Value of the tag parameter (if present) in the To header
      field.

   From URI:  Value of the URI in the From header field.

   From Tag:  Value of the tag parameter in the From header field.

   Whilst one may question the value of the From URI in light of
   [RFC4474], the From URI, nonetheless, imparts some information.  For
   one, the From tag is important and, in the case of a REGISTER
   request, the From URI can provide information on whether this was a
   third-party registration or a first-party one.

   Call-Id:  The value of the Call-ID header field.

   Server-Txn:  Server transaction identification code - the transaction
      identifier associated with the server transaction.
      Implementations can reuse the server transaction identifier (the
      topmost branch-id of the incoming request, with or without the
      magic cookie), or they could generate a unique identification
      string for a server transaction (this identifier needs to be
      locally unique to the server only.)  This identifier is used to
      correlate ACKs and CANCELs to an INVITE transaction; it is also
      used to aid in forking.  (See Section 9 of
      [I-D.ietf-sipclf-problem-statement] for usage.)

   Client-Txn:  Client transaction identification code - this field is
      used to associate client transactions with a server transaction
      for forking proxies or B2BUAs.  Upon forking, implementations can
      reuse the value they inserted into the topmost Via header's branch
      parameter, or they can generate a unique identification string for
      the client transaction.  (See Section 9 of
      [I-D.ietf-sipclf-problem-statement] for usage.)




Salgueiro, et al.       Expires September 6, 2011               [Page 7]


Internet-Draft             Format for SIP CLF                 March 2011


   Following the pointers, several fixed-length fields are encoded.  As
   before, all fields are completely filled, pre-pending values with '0'
   characters as necessary.

   Timestamp (10 bytes):  Date and time of the request or response
      represented as the number of seconds since the Unix epoch (i.e.
      seconds since midnight, January 1st, 1970, GMT).  Decimal encoded.

   Fractional Seconds (6 bytes):  Fractional seconds portion of the
      Timestamp field to millisecond accuracy.  Decimal encoded.

   Mandatory Field Data:  Contains actual values for the mandatory
      fields.  This data MUST appear in the order listed, and each field
      MUST be present.  Fields are separated by a single ASCII Tab
      character (0x09).  Any tab characters present in the data to be
      written will be replaced by an ASCII space character (0x20) prior
      to being logged.  If a given mandatory field is not present then
      it MUST be encoded as a horizontal dash ("-").

   After the "Mandatory Fields" section, the OPTIONAL Tag/Length/Value
   groups appear zero or more times.  These TLV groups allow SIP CLF
   implementers the flexibility to extend the logging capability of the
   indexed-ASCII representation beyond just the mandatory log elements.
   The location within the SIP CLF record is indicated by the "TLV Start
   Pointer" field.  This "TLV Start Pointer" field MUST be set to 0x0000
   if the OPTIONAL TLV groups are not implemented.

   Tag Field (4 bytes):  indicates the type of value coded by this TLV;
      hexadecimal encoded.  Currently defined tags are:

      0x0000 -  Contact value (can be repeated) Contains entire value of
         Contact header field

      0x0001 -  Remote Host (mandatory in request) The DNS name of the
         IP address from which the message was received (if "Sent/
         Received flag" is set to "r").  The DNS name of the IP address
         to which the message is being sent (if "Sent/Received flag" is
         set to "s")

      0x0002 -  Authenticated User Contains the user name by which the
         user has been authenticated

      0x0003 -  Complete SIP Message (SHOULD be omitted by default)
         Contains complete SIP message.  Can be repeated multiple times
         to accommodate SIP messages that exceed 65535 bytes in length.






Salgueiro, et al.       Expires September 6, 2011               [Page 8]


Internet-Draft             Format for SIP CLF                 March 2011


   Length Field (2 bytes):  indicates the length of the value coded in
      this TLV, hexadecimal encoded.  This length does NOT include the
      TLV header.

   Value Field (0 to 65535 bytes):  contains the actual value of this
      TLV.  As with the mandatory fields, ASCII Tab characters (0x09)
      are replaced with ASCII space characters (0x20).


4.  Example Record

   The following SIP message is an INVITE request sent by a SIP client:


       INVITE sip:192.168.217.74 SIP/2.0
       To: <sip:192.168.217.74>
       Call-ID: DL70dff590c1-1079051554@petermac.magor.local.
       From: "PeterM" <sip:1001@petermac.magor.local.:5060>;
               tag=DL88360fa5fc;epid=0x34619b0
       CSeq: 1 INVITE
       Max-Forwards: 70
       Via: SIP/2.0/TCP 192.168.217.117:5060;
               branch=z9hG4bK-1f6be070c4-DL
       Contact: "1001" <sip:1001@192.168.217.117:5060>
       Allow: INVITE,CANCEL,ACK,OPTIONS,INFO,SUBSCRIBE,NOTIFY,BYE,
               MESSAGE,UPDATE,REFER
       Supported: replaces,norefersub
       User-Agent: Dylogic Mirial 7.0.33
       Content-Type: application/sdp
       Content-Length: 418

       v=0
       o=1001 1456139204 0 IN IP4 192.168.217.117
       s=-
       i=Dylogic Mirial 7.0.33
       c=IN IP4 192.168.217.117
       b=AS:2048
       t=0 0
       m=audio 13756 RTP/AVP 0 101
       a=rtpmap:0 PCMU/8000
       a=rtpmap:101 telephone-event/8000
       a=fmtp:101 0-16
       a=x-mpdp:192.168.217.117:13756
       m=video 13758 RTP/AVP 96
       a=rtpmap:96 H264/90000
       a=fmtp:96 profile-level-id=420015; max-mbps=47520; max-fs=1584;
               max-dpb=7680
       a=x-mpdp:192.168.217.117:13758



Salgueiro, et al.       Expires September 6, 2011               [Page 9]


Internet-Draft             Format for SIP CLF                 March 2011


   Shown below is approximately how this message would appear as a
   single record in a SIP CLF logging file if encoded according to the
   syntax described in this document.  Due to internet-draft
   conventions, this log entry has been split into seven lines, instead
   of the two lines that actually appear in a log file; and the tab
   characters have been padded out using spaces to simulate their
   appearance in a text terminal.


   A000120,Rou,0051005A005C006F0083009900AC00AE00D200DF010D01170000

   0000000000.010  1 INVITE        -       sip:192.168.217.74
   192.168.217.74:5060     192.168.217.117:56485
   sip:192.168.217.74      -       sip:1001@petermac.magor.local.:5060
   DL88360fa5fc    DL70dff590c1-1079051554@petermac.magor.local.
   server-tx       client-tx


   A Base64 encoded version of this log entry (without the changes
   required to format it for an internet-draft) is shown below:


        begin-base64 644 clf_record
QTAwMDEyMCxSb3UsMDA1MTAwNUEwMDVDMDA2RjAwODMwMDk5MDBBQzAwQUUwMEQyMDBERjAx
MEQwMTE3MDAwMAowMDAwMDAwMDAwLjAxMAkxIElOVklURQktCXNpcDoxOTIuMTY4LjIxNy43
NAkxOTIuMTY4LjIxNy43NDo1MDYwCTE5Mi4xNjguMjE3LjExNzo1NjQ4NQlzaXA6MTkyLjE2
OC4yMTcuNzQJLQlzaXA6MTAwMUBwZXRlcm1hYy5tYWdvci5sb2NhbC46NTA2MAlETDg4MzYw
ZmE1ZmMJREw3MGRmZjU5MGMxLTEwNzkwNTE1NTRAcGV0ZXJtYWMubWFnb3IubG9jYWwuCXNl
cnZlci10eAljbGllbnQtdHgK
        ====



5.  Text Tool Considerations

   This format has been designed to allow text tools to easily process
   logs without needing to understand the indexing format.  Index lines
   may be rapidly discarded by checking the first character of the line:
   index lines will always start with an alphabetical character, while
   field lines will start with a numerical character.

   Within a field line, script tools can quickly split fields at the tab
   characters.  The first 12 fields are positional, and the meaning of
   any subsequent fields can be determined by checking the first four
   characters of the field.  Alternately, these non-positional fields
   can be located using a regular expression.  For example, the "Contact
   value" in a request can be found by searching for the perl regex
   /\t0000,....,([^\t]*)/.



Salgueiro, et al.       Expires September 6, 2011              [Page 10]


Internet-Draft             Format for SIP CLF                 March 2011


   Note also that requests can be distinguished from responses by
   checking the third positional field -- for requests, it will always
   be set to "000"; any other value indicates a response.


6.  Security Considerations

   This document does not introduce any new security considerations
   beyond those in [I-D.ietf-sipclf-problem-statement].


7.  Operational Guidance

   SIP CLF log files will take up substantive amount of disk space
   depending on traffic volume at a processing entity and the amount of
   information being logged.  As such, any enterprise using SIP CLF
   should establish operational procedures for file rollovers as
   appropriate to the needs of the organization.

   Listing such operational guidelines in this document is out of scope
   for this work.


8.  IANA Considerations

   This document does not require any considerations from IANA.


9.  Acknowledgements

   TBD


10.  References

10.1.  Normative References

   [I-D.ietf-sipclf-problem-statement]
              Gurbani, V., Burger, E., Anjali, T., Abdelnur, H., and O.
              Festor, "The Common Log Format (CLF) for the Session
              Initiation Protocol (SIP)",
              draft-ietf-sipclf-problem-statement-04 (work in progress),
              October 2010.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.





Salgueiro, et al.       Expires September 6, 2011              [Page 11]


Internet-Draft             Format for SIP CLF                 March 2011


10.2.  Informative References

   [I-D.gurbani-sipping-clf]
              Gurbani, V., Burger, E., Anjali, T., Abdelnur, H., and O.
              Festor, "The Common Log File (CLF) format for the Session
              Initiation Protocol (SIP)", draft-gurbani-sipping-clf-01
              (work in progress), March 2009.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

   [RFC4474]  Peterson, J. and C. Jennings, "Enhancements for
              Authenticated Identity Management in the Session
              Initiation Protocol (SIP)", RFC 4474, August 2006.


Authors' Addresses

   Gonzalo Salgueiro
   Cisco Systems
   7200-12 Kit Creek Road
   Research Triangle Park, NC  27709
   US

   Email: gsalguei@cisco.com


   Vijay Gurbani
   Bell Labs, Alcatel-Lucent
   1960 Lucent Lane
   Rm 9C-533
   Naperville, IL  60563
   US

   Email: vkg@bell-labs.com


   Adam Roach
   Tekelec
   17210 Campbell Rd.
   Suite 250
   Dallas, TX  75252
   US

   Email: adam@nostrum.com




Salgueiro, et al.       Expires September 6, 2011              [Page 12]