A Session Initiation Protocol (SIP) Response Code for Rejected Calls

Note: This ballot was opened for revision 08 and is now closed.

Warren Kumari Yes

Comment (2019-06-12 for -08)
I generally approach SIP documents with a sense of foreboding — they are often long, expect a large amount of knowledge outside my area of expertise, and require lots of reference chasing — but this was a really good read. It describes the problem well, it lays out the protocol clearly, and contains enough humor / snark to make reading it actually enjoyable. 

"Another value of the 607 rejection is presuming the proxy forwards
the response code to the User Agent Client (UAC), the calling UAC or
intervening proxies will also learn the user is not interested in
receiving calls from that sender."
I found this sentence really hard to parse -- I think adding a comma after "is" fixes it.

"An algorithm can be vulnerable to an algorithm subject to the base rate fallacy [BaseRate] rejecting the call."
Unparsable -- duplication? Perhaps just "An algorithm can be vulnerable to the base rate fallacy [BaseRate] rejecting the call."?

(Adam Roach) Yes

Comment (2019-06-11 for -08)

(Ignas Bagdonas) No Objection

Deborah Brungard No Objection

Alissa Cooper No Objection

Comment (2019-06-11 for -08)
Please respond to the Gen-ART review.

= Section 3 =

I'm wondering about the case where I have an AI-driven assistant on my client that listens for me to say "Please take me off your call list" and blocks all future calls from that caller. It seems like the 608 use case would apply (for the case of false positive voice recognition), but since the definition here limits the intermediary to an entity "in the network," this scenario is out of scope. Should it be?

= Section 3.5 =

"It is
   important to note the network element should be mindful of the media
   type requested by the UAC as it formulates the announcement.  For
   example, it would make sense for an INVITE that only indicated audio
   codecs in the Session Description Protocol (SDP) [RFC4566] to result
   in an audio announcement.  However, if the INVITE only indicated a
   real-time text codec and the network element can render the
   information in the requested media format, the network element MUST
   send the information in a text format, not an audio format."

I think the normative guidance here could be crisper, e.g., replacing the first sentence with "The network element SHOULD use a media format for its announcement for which the caller indicates support, if possible." I also don't understand why the second example uses normative MUST but the first example doesn't use normative language at all.
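
As an illustration of the suggested rule, here is a minimal sketch that picks an announcement format from the media types the caller offered. It assumes the offer is available as raw SDP text and looks only at the m= lines; the names offered_media and choose_announcement, and the can_render parameter, are hypothetical and do not come from the draft.

```python
def offered_media(sdp):
    """Return the media types named on the m= lines of an SDP body, in order."""
    media = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            fields = line[2:].split()
            if fields:
                media.append(fields[0])
    return media


def choose_announcement(sdp, can_render=("audio", "text")):
    """Pick the first offered media type the network element can render.

    Returns None when nothing the caller offered can be rendered.
    """
    for media in offered_media(sdp):
        if media in can_render:
            return media
    return None


# Hypothetical offer that indicates only real-time text.
text_only_offer = "v=0\r\nm=text 11000 RTP/AVP 98\r\na=rtpmap:98 t140/1000\r\n"
announcement = choose_announcement(text_only_offer)
```

Under these assumptions an offer with only an m=text line yields a text announcement, and one with only an m=audio line yields an audio announcement, so both examples in the quoted paragraph fall out of the same rule.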

= Section 4.1 =

Using "bitbucket" in the examples seems like it sends the wrong message about the utility of the contact address.

Roman Danyliw No Objection

Comment (2019-06-11 for -08)
(1) Per Section 6 (Security Considerations), the discussion of the risks of TEL URIs in the jCard given a malicious intermediary is helpful.  I’d recommend adding language around comparable risks with the url in the jCard (e.g., that this url could point to malicious content).

(2) Per Section 1, Nit.  [RFC7340] is referred to as a technology.  However, that specific document is a requirements document.

Benjamin Kaduk No Objection

Comment (2019-06-11 for -08)
Do we want to give any references/examples for "some jurisdictions" or
"many jurisdictions"?

Section 1

                     An algorithm can be vulnerable to an algorithm
   subject to the base rate fallacy [BaseRate] rejecting the call.  [...]

nit: It sounds like these are different algorithms, so that "One
algorithm can be vulnerable to a separate algorithm, subject to the base
rate fallacy, erroneously rejecting the call" would be more clear.

Section 3.1

   If there is a Call-Info header field, it MUST have the 'purpose'
   parameter of 'jwscard'.  The value of the Call-Info header field MUST
   refer to a valid JSON Web Signature (JWS [RFC7515]) encoding of a
   jCard [RFC7095] object.

Do we need to say anything about what entity('s key) generates the
signature and/or what affects signature algorithm selection (e.g.,
a forward reference to Sections 3.2.x)?

Section 3.2.2

   The payload contains two JSON values.  The first JSON Web Token (JWT)
   claim that MUST be present is the iat (issued at) claim [RFC7519].
   The "iat" MUST be set to the date and time of the issuance of the 608
   response.  This mandatory component protects the response from replay

nit(?): Perhaps this protection is only "outside the scope of a narrow
window of time corresponding to the allowed RTT and any permitted time
skew", per Section 3.3.
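
To make the window in question concrete, here is a minimal sketch of a freshness check on the "iat" claim. The function name iat_is_fresh and the allowed_rtt and allowed_skew values are assumptions chosen for illustration; they are not taken from Section 3.3.

```python
def iat_is_fresh(iat, now, allowed_rtt=10, allowed_skew=5):
    """Return True if iat falls inside the acceptance window around now.

    All values are in seconds. A response is accepted when it was issued no
    more than allowed_rtt + allowed_skew seconds ago and no more than
    allowed_skew seconds in the future (to tolerate clock differences).
    """
    age = now - iat
    return -allowed_skew <= age <= allowed_rtt + allowed_skew


# A response replayed inside the window is still accepted by this check.
replayed_inside_window = iat_is_fresh(iat=1000, now=1012)
replayed_after_window = iat_is_fresh(iat=1000, now=1100)
```

The sketch shows why the claim limits replay rather than preventing it: any copy of the response presented while the window is open passes the check.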

                                      Call originators (at the UAC) can
   use the information returned by the jCard to contact the intermediary
   that rejected the call to appeal the intermediary's blocking of the
   call attempt.  What the intermediary does if the blocked caller
   contacts the intermediary is outside the scope of this document.

It seems like it is permissible for the intermediary to reject this new
call as well; can we get into some sort of recursion-like situation?

Section 3.5

                              However, if the INVITE only indicated a
   real-time text codec and the network element can render the
   information in the requested media format, the network element MUST
   send the information in a text format, not an audio format.

This usage of 2119 language seems odd to me, like it's calling out a
single special case for normative treatment but ignoring the general

Section 4.1

                                                   As such,
   implementations MUST NOT insert line breaks into the base64url
   encodings of the JOSE header or JWT.  This also means UACs MUST be
   prepared to receive arbitrarily long octet streams from the URI
   referenced by the Call-Info SIP header.

These (especially the MUST NOT) seem to just be restating requirements
from elsewhere and would not ordinarily need normative language to do

Section 6

   Another risk is for an attacker to flood a proxy that supports the
   sip.608 feature with INVITE requests that lack the sip.608 feature
   capability to direct the SDP to a victim's device.  [...]

This sentence is pretty long/convoluted and could probably be reworded
for clarity.

   Yet another risk is a malicious intermediary that generates a
   malicious 608 response with a jCard referring to a malicious agent.
   For example, the recipient of a 608 may receive a TEL URI in the
   vCard.  When the recipient calls that address, the malicious agent
   could ask for personally identifying information.  However, instead
   of using that information to verify the recipient's identity, they
   are phishing the information for nefarious ends.  As such, we
   strongly recommend the recipient validates to whom they are
   communicating with if asking to adjudicate an erroneously rejected
   call attempt.  Since we may also be concerned about intermediate
   nodes modifying contact information, we can address both issues with
   a single solution.  The remediation is to require the intermediary to
   sign the jCard.  [...]

The signature is not a panacea -- the recipient needs to verify that the
signature comes from a trustworthy (in some sense) entity, and that the
person they contact based on the jCard is the same entity or affiliated
with the entity that generated the signature.  I think this is not quite
the same thing as the SHAKEN/SHAKEN-like mechanisms for validating that
the signing entity matches the called entity that are mentioned in the
subsequent text.

                  However, if the intermediary does go that route, the
   intermediary MUST use a non-deterministic reference mechanism and be
   prepared to return dummy responses so that attackers attempting to
   glean call metadata by guessing calls will not get any actionable
   information from the HTTPS GET.

Thanks for mentioning this side channel!  I'd suggest clarifying that
the dummy responses are in response to URIs that might be (but are not)
URIs that would be found in the "url" field of the jCard.  (Assuming I'm
understanding the attack correctly, of course.)
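
A minimal sketch of that reading, assuming the intermediary keeps an in-memory table of issued references: references are unguessable random tokens, and a GET for any token that was never issued gets a dummy body instead of an error. The names publish_jws, fetch_jws, and DUMMY_JWS are hypothetical, and what a dummy response should contain is not specified in the quoted text.

```python
import secrets

# Placeholder body returned for unknown references (an assumption; a real
# deployment would want this to be indistinguishable from a genuine response).
DUMMY_JWS = "dummy.placeholder.value"

_issued = {}


def publish_jws(compact_jws):
    """Store a signed jCard and return a non-deterministic reference to it."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    _issued[token] = compact_jws
    return token


def fetch_jws(token):
    """Return the stored JWS, or the dummy body for a guessed reference."""
    return _issued.get(token, DUMMY_JWS)


reference = publish_jws("header.payload.signature")
genuine = fetch_jws(reference)
guessed = fetch_jws("guessed-reference")
```

Because a guessed reference gets a body rather than a distinguishable failure, the status of the GET alone does not reveal whether a given call was rejected.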

(Suresh Krishnan) No Objection

(Mirja Kühlewind) No Objection

Comment (2019-06-05 for -08)
1) A couple of remarks on this sentence in Sec 3.1: 
“If an intermediary issues a 608 code and there are not indicators the
   calling party will use the contents of the Call-Info header field for
   malicious purposes (see Section 6), the intermediary MUST include a
   Call-Info header field in the response.”
  a) -> s/not/no/
  b) Maybe also add a “that” as it would make this long sentence easier to read:
“If an intermediary issues a 608 code and there are no indicators that the
   calling party will use the contents of the Call-Info header field for
   malicious purposes (see Section 6), …
  c) After having read Section 6, I find this MUST rather strong. I was expecting more “concrete” instructions. I understand why you want to have a MUST here, but section 6 reads very much like a SHOULD.

2) Editorial: In this sentence in 3.4 I also think a “that” would help:
“The degenerate case is the intermediary is the only element that
   understands the semantics of the 608 response code.“

3) One more purely editorial comment: Short title is “Status Reject”, however, to be closer to the long title I would rather recommend something like “SIP Response Code for Rejected Calls”.

Barry Leiba No Objection

Comment (2019-06-12 for -08)
Thanks for this document.  I have only some editorial comments:

— Section 1 —

Here and in other sections you use “As such,” several times and always in a way that seems quite odd.  I suggest removing it (or maybe sometimes replacing it with “therefore” or “thus”).

   Some call blocking services may return responses such as 604

There needs to be a hyphen in “call-blocking”.

   other network elements might also interpret this to mean the user
   truly does not exist and might result in the user not being able to
   receive calls from anyone, even if wanted.

The “and” is wrong because the things it conjoins aren’t parallel.  Change “exist and” to “exist, which” to fix that.  I would also say, “even if the calls are wanted,” to make that clearer.

   Another value of the 607 rejection is presuming the proxy forwards
   the response code to the User Agent Client (UAC), the calling UAC or
   intervening proxies will also learn the user is not interested in
   receiving calls from that sender.

I found that odd and hard to understand until I realized that there’s a comma missing before “presuming”.  But I think a better fix is to change “presuming” to “that if”.

   downstream from the intermediary might interpret the 607 as coming
   from a user (human) that has marked the call as unwanted

We customarily refer to a human as “who”, rather than “that”.

   Integrity protecting the jCard with a cryptographic signature

Hyphenate “integrity-protecting”.

— Section 3 —

   For clarity, this section uses the term 'intermediary'

I don’t understand why the term adds clarity.  Why so?

   the user's UAS (colloquially, but not
   necessarily, their phone).

I don’t think you mean “colloquially” here.  Maybe “commonly” or “usually”?

— Section 3.4 —

   life is good as the UAC will receive

You need a comma after “good”.

— Section 6 —

   remediation for this is for devices that insert a sip.608 feature
   capability only transmit media to what is highly likely to be the

Either change “is for” to “is that” or insert “to” before “only”.

   media in response to a STIR [RFC8224]-signed INVITE

It seems really weird to have a citation in the middle of the hyphenated compound “STIR-signed”.  Given that you cite “STIR [RFC8224]” in the previous paragraph, I would just remove the citation here.

   Presumably if the target did not request
   the media, the check will fail.

Why “presumably”?  Is the statement true or not?  (If the word stays, it needs a comma after it.)

(Alexey Melnikov) No Objection

Comment (2019-06-13 for -08)
Thank you for a well-written document. I have one nit I would like to discuss:

4.1.  Full Exchange

   Given an INVITE (shamelessly taken from [SHAKEN]):

   INVITE sip:+12155550113@tel.one.example.net SIP/2.0
   Max-Forwards: 69
   Contact: <sip:+12155550112@[2001:db8::12]:50207;rinstance=9da3088f3>
   To: <sip:+12155550113@tel.one.example.net>
   From: "Alice" <sip:+12155550112@tel.two.example.net>;tag=614bdb40
   P-Asserted-Identity: "Alice"<sip:+12155550112@tel.two.example.net>,
   CSeq: 2 INVITE
   Content-Type: application/sdp
   Date: Tue, 16 Aug 2016 19:23:38 GMT
   Feature-Caps: *;+sip.608

This is not syntactically valid. Either you need a space in front of the above line, or maybe it would be better if you change the above 2 lines to read:

     Identity: eyJhbGciOiJFUzI1NiIsInR5cCI6InBhc3Nwb3J0IiwicHB0Ijoic2hha2VuIiwieDV1I

If the base64 encoded value is one line with no line breaks, you should consider pointing this out.
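
For reference, a minimal sketch of base64url encoding that produces a single line with no line breaks, using Python's standard library. The helper names b64url and b64url_decode are hypothetical, and the header object below is an illustrative assumption rather than the header from the draft's example.

```python
import base64
import json


def b64url(data):
    """Encode bytes as base64url text with no padding and no line breaks."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text):
    """Decode unpadded base64url text back to bytes."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


# Illustrative JOSE-style header (an assumption, not the draft's exact header).
header = {"alg": "ES256", "typ": "passport", "ppt": "shaken"}
header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
encoded = b64url(header_bytes)
```

However long the input is, the encoded value stays on one line, so any line break seen in a printed example comes from document formatting rather than from the encoding.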

   Content-Length: 153

Alvaro Retana No Objection

Martin Vigoureux No Objection

Éric Vyncke No Objection

Comment (2019-06-10 for -08)
Thank you all for the work put into this document. I have 2 comments below.

-- Section 1 --

The last paragraph of the introduction describes an attack that should probably be better located in the security section.

-- Section 4.1 --

In the example, I wonder whether the IPv6 syntax of "Via: SIP/2.0/UDP [2001:db8::177:60012];branch=z9hG4bK-524287-1" is correct. I would have expected "Via: SIP/2.0/UDP [2001:db8::17]:760012;branch=z9hG4bK-524287-1" but I am not a SIP expert.
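
One way to check such a value mechanically is sketched below. It assumes the RFC 3261 form for the Via sent-by part, in which an IPv6 literal is enclosed in brackets and any port follows the closing bracket; the helper name parse_sent_by is hypothetical.

```python
import ipaddress


def parse_sent_by(sent_by):
    """Split a Via sent-by value into (host, port); port is None if absent.

    Raises ValueError for a malformed IPv6 reference or an invalid port.
    """
    if sent_by.startswith("["):
        end = sent_by.find("]")
        if end == -1:
            raise ValueError("unterminated IPv6 reference")
        host = sent_by[1:end]
        ipaddress.IPv6Address(host)  # raises ValueError if not a valid literal
        rest = sent_by[end + 1:]
        if rest == "":
            return host, None
        if not rest.startswith(":"):
            raise ValueError("unexpected text after ']'")
        port_text = rest[1:]
    else:
        host, sep, port_text = sent_by.partition(":")
        if not sep:
            return host, None
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError("port is not a decimal number")
    port = int(port_text)
    if not 0 < port <= 65535:
        raise ValueError("port out of range")
    return host, port


host, port = parse_sent_by("[2001:db8::177]:60012")
```

With this sketch, "[2001:db8::177]:60012" parses, while "[2001:db8::177:60012]" is rejected because the bracketed text is not a valid IPv6 literal, and a port such as 760012 is rejected because it exceeds 65535.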

Magnus Westerlund No Objection