The Common Log Format (CLF) for the Session Initiation Protocol (SIP): Framework and Information Model
draft-ietf-sipclf-problem-statement-13

Note: This ballot was opened for revision 10 and is now closed.

(Ralph Droms) Discuss

Discuss (2012-01-03)
The exact purpose and intended use of this document is not clear to
me.  The document name includes "problem-statement," the title includes
"Framework and Data Model," and the abstract concludes with the
sentence:

   We propose a common log file format
   for SIP servers that can be used uniformly by user agents, proxies,
   registrars, redirect servers as well as back-to-back user agents.

I don't know if the text in section 4 is referring to the standards
track CLF as defined in this document or a hypothetical CLF to be
defined based on the problem statement in section 3 and the data model
in section 8.  In fact, it seems that section 8 defines something more
than just a data model, as it defines mandatory elements in CLF
records, etc.

The document needs a clear statement about whether or not it is
defining the operational abstraction for the standards CLF.  If it is
defining that standards CLF, it needs to be a standards track document.

Comment (2012-01-03)
Editorial observation: "CLF format" is redundant.

Process comment - the IESG Writeup Working Group Summary consists of
one sentence: "The problem statement was not contentious."  I can't
tell if this sentence refers to just the problem statement in section
3 or the entire document.  I note that the document name includes
"problem-statement" while the title includes "Framework and Data
Model."  Perhaps the goals and purposes of the document changed during
its development?  It would be helpful if the Working Group Summary gave
more detail about the background, development and purpose of the
document.

Niggling irritation - the first couple of motivations listed in
section 6 are relevant to the development of a CLF.  The remainder
don't actually depend on a CLF; a CLF might ease the development of
solutions to those problems.

Which representation format is used in the example in section 9.1, or
is the example an abstract representation independent of any specific
format like the one defined in draft-ietf-sipclf-format-03?

(David Harrington) Discuss

Discuss (2012-03-19 for -11)
updated for -11-

I don't think my concerns have been addressed.

1) I agree with the other IESG members who think this exceeds the scope of a problem statement. 

2) I agree with the DISCUSS that the shepherd writeup is lacking in useful detail. In my opinion this writeup is woefully inadequate to reflect the nature of the discussions that occurred, some contentious.

3) The document fails to include any summary of WG discussions of using existing IETF protocols. 

There was contention about whether the logging should be designed only for logging locally, or for local use and for being transported between systems. The WG discussed the need for secure transport of logged information. Existing standards already address this, e.g. syslog/TLS [RFC5425] and ipfix [RFC5101]. This should be documented as part of the problem space.

The WG debated whether pre-filtering was a desirable feature, to control log growth rates, and bandwidth needed for transporting logs, and whether existing standards could be utilized for this purpose (ipfix offers a template approach that controls what data gets logged, and what data gets transported; syslog supports facility and severity values, and a config file can specify which records should be forwarded to a receiver). This need should be documented as part of the problem space.

4) The document describes wireshark as inappropriate because the wireshark libraries would need to have sipclf functionality implemented across multiple OSes and configurations. But the proposed solution seems to be to develop a whole new format, so any tools for parsing this new format would need to be developed from scratch, across multiple OSes and configurations. That strikes me as an odd problem statement to justify a new sipclf approach.

5) I support the Discusses about whether the sipclf mandatory fields in section 8 are normative. If they are not normative, then RFC2119 language is probably inappropriate. If they are normative, then they belong in a standards track document.

6) Security considerations discusses the threats associated with stored logs. Existing standards for logging include capabilities for signing logs and detecting deletions and modifications [RFC5848], whether at rest or in transit. The potential need to secure logs at rest and in transit is part of the problem with logging SIP, and should be documented as part of the problem space.

7) New text to address my discuss has some inaccurate information that should be corrected. 
"A new problem
   arises due to the general nature of syslog: the disk file will
   contain log messages from many originators, not just SIP entities.
   This imposes an additional burden of discarding all extraneous
   records when analyzing the disk file for SIP CLF records of interest." 
and under drawbacks,
"because of the frequency and size of SIP log messages, it is not
   desirable to send every SIP CLF log message to the collector."

syslog is designed to allow specification of facility and severity for filtering purposes, and the entries for specific facility/severity combinations can be directed to a local file or remote file. SIP messages can be directed to a SIP log file; specific event types can be selected for logging. So an operator can configure syslog to avoid the problem of having all syslog messages dumped into a single file that then becomes difficult to search. It seems to me that syslog actually handles this better than a sipclf that dumps all sip messages into a file. 
I note that there is nothing in the problem statement section of this document that specifies that a sipclf must be able to filter which information gets logged or does not. There is no mention of such filtering in the problem statement, except to disqualify existing standards.

The text disqualifies syslog because it cannot be easily parsed, but syslog messages are in text format (typically the ASCII subset of UTF-8), and each message begins with a priority value that encodes the facility (application) and message severity, explicitly to allow fast discarding of records that are not of interest when parsing logs of mixed application entries and/or logs of different event types.
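
To make the parsing argument concrete, here is a minimal Python sketch of discarding records by priority. In RFC 5424 the priority is a decimal PRI value in angle brackets at the start of each message, computed as facility * 8 + severity. The sample log lines, the host and application names, and the choice of facility 16 (local0) for a SIP entity are assumptions made only for illustration; they do not come from the draft or from this ballot.

```python
import re

# RFC 5424: a message starts with "<PRI>", where PRI = facility * 8 + severity.
PRI_RE = re.compile(r"^<(\d{1,3})>")

def decode_pri(line):
    """Return (facility, severity) for a syslog line, or None if PRI is absent or invalid."""
    match = PRI_RE.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # highest valid PRI is facility 23, severity 7
        return None
    return divmod(pri, 8)

def has_facility(line, facility):
    """True if the line carries a valid PRI with the given facility."""
    decoded = decode_pri(line)
    return decoded is not None and decoded[0] == facility

# Hypothetical mixed log; assumes the SIP entity logs under facility 16 (local0).
lines = [
    "<134>1 2012-03-19T10:00:00Z proxy1 sipd - - - INVITE received",
    "<38>1 2012-03-19T10:00:01Z proxy1 sshd - - - login accepted",
]
sip_lines = [line for line in lines if has_facility(line, 16)]
```

Only the leading PRI token is examined, so records from other facilities can be skipped without parsing the rest of the line.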

The text implies syslog is not easily searchable by command line tools. Under the syslog section, the text says "SIP CLF records are best stored in a log file that is easily searchable by command line tools." Syslog messages are deliberately written in a utf-8 text format, with clear delimiters defined in the ABNF, and most implementations store the messages in a text file that is easily accessible to command line tools. 

Syslog might not be the best choice for sipclf purposes, but this document seems full of misinformation to justify a solution that was decided on before any analysis of existing standards was done. I think it is a disservice to the community to publish a document with such misinformation. It is also a disservice to the community to waste resources reinventing wheels, such as a TLS transport for securely transporting sipclf files between hosts, when existing logging standards already offer these features. This document doesn't identify that need as part of the problem statement for sipclf, but apparently feels it is important enough to mention it in the security considerations. If logs need to be transferred between hosts, why is this not mentioned as an aspect of the problem to be solved?

As with many problem statement documents coming through the IESG, I think this problem statement does a poor job of clearly describing the problems to be solved to justify a new standard protocol.

(Ron Bonica) Yes

(Dan Romascanu) Yes

Comment (2012-01-04)
In the 'Operational guidance' section: 

> SIP CLF log files will take up substantive amount of disk space
   depending on traffic volume at a processing entity and the amount of
   information being logged.  As such, any enterprise using SIP CLF
   should establish operational procedures for file rollovers as
   appropriate to the needs of the organization.


I suggest replacing the word 'enterprise' with 'organization'. The issue is certainly present in all types of networks, not only in enterprise deployments, as it may be misunderstood here. 

(Robert Sparks) Yes

(Jari Arkko) No Objection

Comment (2012-01-05)
I think we need a log file format. And it needs to work on application level, because, as noted, layer 3 or 4 security makes it impractical to read off the contents of SIP messages.

I'm less convinced about the specific proposal here. The text on wireshark for instance seems to indicate some lack of information on various logging mechanisms in the Internet. (Wireshark is just a tool that operates on the more general pcap interface, http://en.wikipedia.org/wiki/Pcap.)

I would guess that there are multiple needs in this space. One is for detailed SIP message logging for debugging and statistics purposes. For that purpose, a pcap-like recording of exact messages at the application layer might be more appropriate. Another is for higher-level needs such as CDR, billing, and summary statistics collection. There, some summary of SIP events would be more appropriate. This would not necessarily record every message, and could even record some non-message events such as when the SIP entity gives up on trying to contact someone. The current design seems to be some mixture of these two kinds of approaches.

(Stewart Bryant) (was Discuss) No Objection

(Wesley Eddy) No Objection

Comment (2012-01-04)
I support Stephen & Ralph's DISCUSS points

(Adrian Farrel) No Objection

Comment (2012-01-02)
Although this document does not define a CLF for SIP, I am not clear why
the data model here is not normative as a Standards Track document.

---

I wonder if you could consider adding to Section 7 a discussion of the
migration / backward-compatibility issues. Maybe these are no worse than
today, but it will certainly be the case that a log file will need to
contain some indication that it is in the CLF.

(Stephen Farrell) (was Discuss) No Objection

Comment (2011-12-31 for -11)
- What does "trace a call from one entity to another" mean?  I do
hope we're not proposing that lawful intercept is the primary
reason for this work. 

- You say a few things the CLF is not at the end of section 4;
would it be reasonable to add "The SIP CLF is not a tool for
supporting lawful intercept."?

- Public access to the log is worse than network sniffing in at
least two respects - log access trumps TLS and also network 
sniffing is more restricted in time and place (topology).

(Russ Housley) No Objection

(Pete Resnick) No Objection

Comment (2012-01-04)
Seems like others have the DISCUSS points well in hand.

It seems a bit goofy in section 8.1 to call out fields as "mandatory" and even say that they are "minimal information that MUST appear in any SIP CLF record", and then go on in section 9 to point out for some items: "When a given mandatory field is not applicable to a SIP entity, we use the horizontal dash ("-") to represent it." That would make the field pretty clearly non-mandatory.
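
The tension can be sketched in a few lines of Python: every field position must be present in a record, yet the value in any position may be the "-" placeholder, so a consumer still has to treat each field as optional. The field names and the tab-separated layout below are hypothetical and chosen only for illustration; they are not taken from the draft.

```python
NOT_APPLICABLE = "-"  # placeholder quoted from section 9 of the draft

# Hypothetical subset of "mandatory" fields; names and order are illustrative only.
FIELDS = ("timestamp", "message_type", "status", "server_txn", "client_txn")

def parse_record(line):
    """Split a tab-separated record and map the "-" placeholder to None."""
    values = line.split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return {
        name: (None if value == NOT_APPLICABLE else value)
        for name, value in zip(FIELDS, values)
    }

# A request has no status code, so that position carries the placeholder.
record = parse_record("1328821153.010\tR\t-\ts-1-tr\t-")
missing = [name for name, value in record.items() if value is None]
```

Under these assumptions, "mandatory" constrains only the number of positions in a record, not whether each position holds a usable value.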

(Peter Saint-Andre) (was Discuss) No Objection

Comment (2012-01-04 for -11)
The document states that the Wireshark format cannot be used because "if the SIP messages are exchanged over a TLS-oriented transport, Wireshark will be unable to decrypt them and render them as individual SIP headers." Is there a reason why SIP servers cannot use the Wireshark *format* as a CLF even if they cannot use the Wireshark *application* on the data sent over an encrypted channel?

A true nit: there is no such thing as "12:00 PM".

Another nit: I have never seen "pend" as a verb. I suggest "can be in a pending state" or somesuch.

When talking about URIs in Section 8, an informational reference to RFC 3986 would be appropriate.

The allowable values for the Message type field are 'R' (for Request) and 'r' (for response). Is it really a good idea to use values that differ only by case? (Also, the Directionality field has a value of 'r' -- another source of possible confusion; you might consider "o" and "i" for outbound and inbound instead of "s" and "r" for sent and received.)
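
A small Python sketch of the two sources of confusion, using only the field values quoted above. The dictionary and function names are hypothetical.

```python
# Values as quoted in the comment above; variable names are illustrative only.
MESSAGE_TYPE = {"R": "request", "r": "response"}
DIRECTION = {"s": "sent", "r": "received"}

def describe(message_type, direction):
    """Render a (message type, directionality) pair in words."""
    return f"{MESSAGE_TYPE[message_type]} {DIRECTION[direction]}"

# The token "r" is valid in both fields but means something different in each.
ambiguous = set(MESSAGE_TYPE) & set(DIRECTION)

# Any case-insensitive handling (for example, lowercasing before comparison)
# collapses the two message types into a single value.
folded = {key.lower() for key in MESSAGE_TYPE}
```

Here `describe("r", "r")` yields "response received", a record in which the same letter carries two unrelated meanings.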

(Martin Stiemerling) No Objection

Comment (2012-03-29 for -11)
I support Ralph's DISCUSS points.

(Sean Turner) No Objection