Minutes interim-2022-scim-02: Tue 08:00
minutes-interim-2022-scim-02-202209200800-01

Meeting Minutes System for Cross-domain Identity Management (scim) WG
Date and time 2022-09-20 15:00
Title Minutes interim-2022-scim-02: Tue 08:00
State Active
Last updated 2022-09-20

SCIM Virtual Interim
September 20, 2022

Chairs: Aaron Parecki, Nancy Cam-Winget

Notetakers: Joel Franusic

  • 08:04 Aaron Parecki: Introductions
    • Pam to present when she arrives

Presentations

times in PST

  • [08:06] Use Cases - 25min - Pamela Dingle

    • Has been going through the specs to find the terms. If we can
      use the terms in the spec in the use-cases, that's the best way
      to get the people reading the use-cases into the spirit of the
      spec
    • Shared a presentation on the research
    • Looking for guidance on where the research should end
    • Issue is external ID and how it's defined

      • "Provisioning Domain"
    • The external ID is defined as being relative and scoped to the
      provisioning domain

    • Issue: When the provisioning client pushes a resource to the
      SCIM server, the SCIM server may return a new external ID (?)

      • What is the provisioning ID that is returned?
    • Phillip Hunt: There wasn't a consensus on defining the procedure
      for this

      • Some people will want to do something sophisticated
      • The contract is saying that the entity that sets the
        external id can keep using it, so it doesn't have to keep
        using the SCIM server.
      • The contract is between the client and the server
      • You can't list multiple external identifiers
      • If a client is trying to contact SCIM in another domain,
        they should be able to use the external ID
      • It's not the greatest solution, but it's what was chosen
    • If client "A" defines an external id, can client "C" discover
      that external id?

      • Phillip Hunt: That behavior is undefined

        • You can count that the URI for a SCIM resource never
          changes
        • You can share that identifier between clients
        • We want identifiers to survive delete and resurrection
        • His experience has been that external ID is rarely used
      • Matt Peterson: external ID has been confusing to implement

        • It doesn't add utility for the client
    • Has never heard of a SCIM server managing multiple external IDs
      per client

      • Phillip Hunt: It's a failsafe

        • There's a parallel group that went through this
        • Open ID ___ (Shared Signals?)
        • They needed a way to reference objects over time
        • Google had a need for an agreement on identifiers
      • Danny Zollner: To the question of different clients
        negotiating different external IDs

        • Has seen external IDs be negotiated by enterprise
          applications where the external ID comes from the IDP
        • If you're moving from IDP "A" to "B" you may have
          different ways to identify a user, the external ID might
          be used to help
        • Agrees that it is a failsafe
        • Has never seen different clients return different
          external IDs
        • Has always seen it as a "straight string"
    • Feels like we can do a "revisionist approach" where we define
      the use cases.

      • Wenting Tang: We see cases where a user profile is populated
        from multiple sources. The external ID makes it easier for
        clients to identify the user
      • Danny Mayer: The organization would always have a unique ID.
        It didn't matter how many clients he had, they would all
        share the same external ID.
        • You can only do this within the same ecosystem
        • If you have multiple ecosystems, it's a different ball
          game
    • That's the issue

    • Sounds like Okta is using it more faithfully to how it's defined
    • Will try to summarize this and send it to the list. Perhaps give
      a survey.
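
For readers less familiar with how a provisioning client uses the external ID discussed in this item: a client that set `externalId` can later locate the resource with a filter query instead of storing the server-assigned `id`. The sketch below only builds such a query URL. The base URL and the identifier value are made-up examples, and the helper name `external_id_lookup_url` is hypothetical; nothing here settles the questions raised above about multiple clients or provisioning domains.

```python
from urllib.parse import quote, urlencode


def external_id_lookup_url(base_url: str, external_id: str) -> str:
    """Build a SCIM filter query that looks up a user by externalId.

    base_url is an assumed SCIM service root such as
    "https://example.com/scim/v2" (hypothetical).
    """
    # Filter string values use JSON-style escaping for backslash and quote.
    escaped = external_id.replace("\\", "\\\\").replace('"', '\\"')
    # Percent-encode the filter expression for use in a query string.
    query = urlencode({"filter": f'externalId eq "{escaped}"'}, quote_via=quote)
    return f"{base_url.rstrip('/')}/Users?{query}"


if __name__ == "__main__":
    print(external_id_lookup_url("https://example.com/scim/v2", "701984"))
```

Whether a second client would see the same value in response to this query is exactly the behavior described above as undefined.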
  • [08:28] Cursor-Based Pagination - 20min - Danny Zollner and Matt
    Peterson

    • https://datatracker.ietf.org/doc/draft-peterson-scim-cursor-pagination/
    • Matt Peterson presenting
    • The only draft published is the first rev.
    • Reason for the draft

      • Born from multiple SP implementations
      • Pagination strategy was already defined by existing
        implementations
      • If the underlying API supported index/offset it was easy
      • If the API supported only cursor, it was hard
      • No opportunity as a SCIM SP to change the code in the
        underlying implementation
      • The draft is a proposal for cursor based pagination
      • What are the scenarios where pagination is important? Large
        result sets. Client requesting all of the resources for
        users/groups so they can keep a local "cache"
      • As they implemented, the client would request all objects,
        then query from time to time, to keep this up to date
      • If all they need is to keep a client/server up to date, can
        they get away with just change notification? Many people had
        continued interest in pagination in addition to change
        notification.
      • If we need pagination in SCIM, we need cursor based
        pagination
      • Two new query parameters proposed
      • Two new response metadata attributes proposed
    • Revision -01

      • wording of introduction
      • Minor grammar/punctuation
      • Make cursor query parameter mandatory (if you want the first
        page, supply an empty cursor)
    • Questions

      • Danny Mayer: If you do change notification, the change set
        may be large, so you'd need a cursor on the change set.
      • Phillip Hunt: Confused, change sets are always per object.
        There's no "set mode"

        • Matt: You're saying that if change sets are per object,
          there'd never be a set of changes
        • Phillip: It may only get complex with a group
        • Matt: What if a client didn't want to implement event
          notification? There's still interest in pagination
        • Phillip: Still confused. "max results" may be in
          conflict. If the objective is to sync data. Assuming
          that it's allowed. The best is to stream to disk. SQL
          server uses virtual memory so if several clients are making a
          cursor based request, that may crash the SQL server.
          Concern is that a lot of systems aren't doing monolithic
          architectures using a database
      • Aaron Parecki: Wants to hear feedback on other vendors'
        experience with SCIM implementations, especially on
        database (store/access) considerations.

      • Matt Peterson: We just reuse the cursor from the underlying
        database. It ended up being very useful not to have to
        translate from cursor based to index based. If you
        implement index based pagination on top of cursor based
        pagination, it can introduce the concern from Phillip in
        terms of resource starvation / memory depletion attacks.
      • Danny Mayer: Agrees with Matt. If you have 100k employees,
        you're going to have a problem. We shouldn't depend on the
        translation between server and indexing and how it's
        implemented behind the scenes.
      • Danny Zollner: (Originally raised hand for something else.)
        An engineering team at Microsoft had requested [...] if they
        built a service provider. They'd prefer, or want to require,
        cursor based pagination. (Original question) They'd like to
        find a way for an SP to support only cursor based
        pagination. (Back to the original question) If they were to
        implement a SCIM server today based on what they've done,
        they'd want to go with the cursor based pagination model.
      • Pamela Dingle: We have to keep in mind that we are
        creating/architecting an interface, not an implementation.
        There are ways to DDOS on both sides. We have a default
        method, we are adding a second method that may be
        lighterweight in some cases. Is that right?
      • Matt Peterson: That's right, looking for interoperability
        when using cursor based pagination. If they had an
        underlying application that was cursor based, they would
        have no way of adding SCIM support. That's why they proposed
        the draft, to tell the world "here's what we did". He isn't
        sure who's implemented the draft, but it hasn't just been
        them.
      • Nancy Cam-Winget: Would be good to know which other vendors
        have implemented cursor pagination based on this draft.

        • Knows that WSO2 has implemented
        • Has had enough questions on the mailing list to believe
          that others have implemented too
      • Aaron: Curious if someone has implemented without underlying
        database support for cursors

      • Matt: Unclear if that's been done
      • Danny Mayer: Dealing with 100k users is a big chore
    • Aaron Parecki: Time check [08:51] we have another 5 minutes or
      so

    • Matt Peterson: With no further questions, will submit the draft
      and get feedback on the mailing list.

      • (Matt) Closing: He can understand an argument that we don't
        need pagination at all. Using only max results. If you have
        a query that is larger than max results, you'll get data
        only up to max results. Boils down to: If we're going to
        have pagination, then we need index based and cursor based.
      • Phillip Hunt: Max results is an issue either way, if you can
        get the server to waive max results. The argument he heard
        was "I don't have enough memory to do that" - in that case,
        you have a disk. Cursor based paging means "I may take up
        to days, and you're going to need to keep that cursor."
        When he did LDAP, the biggest issue was clients that
        took days to run a query. If the use case was "I want
        pagination because it's better for the server" then he would
        support it. If it's to download the entire dataset, then
        this isn't the tool. There's the same issue with events.
        There's a need for import/export en masse. Cursor seems to be
        the easiest way. One important concession from him: comparing
        index based paging vs. cursor based paging, he agrees that
        cursor based paging is easier.
      • Matt: Index based pagination isn't optional in the spec. If
        an SP wants to implement pagination at all, it has to be
        index based. Example for Azure AD: if they want to put SCIM
        in front of an existing database, then there's often a
        strong preference for one type of pagination
      • Phillip Hunt: You're talking about a breaking change
      • Matt: Wants to do something that's interoperable. Hasn't
        written in a requirement for only cursor based pagination
        into the draft.
      • Phillip Hunt: That would be a whole new protocol
      • Matt: It's still hard for some service providers to
        implement index based pagination
    • Aaron Parecki: Any action items?

      • Matt Peterson: Resubmit the draft with the revisions that
        eliminate the ambiguity. Old one is 7 years old.
      • We are now behind schedule. Phillip, can you do 18 minutes?
      • Phillip: That's fine
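
To make the cursor based exchange discussed in this item concrete, here is a minimal sketch of a client loop that follows an opaque cursor until the server stops returning one. It is an illustration under assumptions, not the draft's definition: the minutes only say that two query parameters and two response attributes are proposed, so the `nextCursor` attribute name, the `make_fake_server` helper, and the in-memory "server" are all hypothetical. The empty cursor for the first page follows the revision -01 change noted above.

```python
from typing import Any, Callable, Dict, Iterator, List

Page = Dict[str, Any]


def make_fake_server(resources: List[str], page_size: int) -> Callable[[str], Page]:
    """Hypothetical in-memory stand-in for a SCIM service provider."""

    def fetch(cursor: str) -> Page:
        # In this fake, the "opaque" cursor is just a stringified offset.
        # A real SP would hand back whatever its backing store uses.
        start = int(cursor) if cursor else 0
        end = start + page_size
        page: Page = {"Resources": resources[start:end]}
        if end < len(resources):
            page["nextCursor"] = str(end)  # assumed attribute name
        return page

    return fetch


def iterate_all(fetch: Callable[[str], Page]) -> Iterator[str]:
    """Yield every resource by following cursors page by page."""
    cursor = ""  # an empty cursor asks for the first page
    while True:
        page = fetch(cursor)
        yield from page["Resources"]
        next_cursor = page.get("nextCursor")
        if not next_cursor:
            return  # no further pages
        cursor = next_cursor


if __name__ == "__main__":
    users = [f"user{i}" for i in range(7)]
    print(list(iterate_all(make_fake_server(users, page_size=3))))
```

The client never interprets the cursor, which is why an SP can pass through the cursor of its underlying store. Index based pagination in the existing SCIM protocol instead has the client compute `startIndex` itself, which is the translation step the discussion above describes as costly for cursor-only backends.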
  • [09:00] SCIM Profile for Security Event Tokens - 20min - Phillip
    Hunt

    • https://datatracker.ietf.org/doc/draft-ietf-scim-events/
    • Phillip Hunt: There's been an update to the draft.
      • Removed a lot of discussion on Kafka and message bus
        systems.
      • In reality a service provider domain and client domain will
        each have their own bus based systems or master/slave (jf:
        leader/follower) hierarchy.
      • May have 100s of servers out there. Across admin boundaries,
        systems need to be able to share change events and security
        events from SCIM. Using pushing or polling to
        deliver events. There will be a single gateway between
        domains, once the events cross the domains, something will
        determine where to send the event. Example: Salesforce
        sending events to Microsoft Azure. Salesforce won't need to
        know what's happening in Azure at all
      • Danny sent out an email saying there's no recovery mechanism
        in events
      • Only a short term mechanism
      • What should happen is that the systems would manage
        themselves. Example: If you're using Kafka, you could
        acknowledge that the message was received.
      • That was the design for recovery, it's under complete
        control of the client. Could keep it simple, or otherwise
      • Once you move from 50k entries to 500k entries, change
        management becomes harder and harder, just like it becomes
        harder and harder with cursor pagination. What do you do
        with changes? Probably drop them on a message bus, then
        republish to clients that are listening.
      • On the security events side, to maintain a change log on the
        SP side, or a recovery mechanism, it would blow up quickly.
        Not only do you have 500m events in your database, you have
        500m * (every client) ... if the receiver needs their own
        recovery mechanism, they can implement it themselves.
      • Idea is that SCIM events fit with the shared events
        framework. Then SCIM becomes a schema definition.
      • You still have use cases that neither paging nor SCIM events
        address: How do I get a full copy of the full dataset?
      • Is that something that should be addressed in this group at
        all?
      • This may seem complex, this was the decision of the SCIM
        working group to move it to the ___ group. People were
        defining their own security signals. It's a set of shared
        specs. Using the OpenID shared signals framework, in
        particular the one that Cisco/Duo developed.
      • Turning over the rest of the time to questions, or other
        things
      • Question:

        • Wenting Tang: His understanding is that SCIM is for
          identities. Wants to understand the scope

          • Phillip Hunt: It's not for different resource types.
            SCIM events match the SCIM protocol. A "create" in
            SCIM would make a "create" event. You can decide
            which types of events you issue.
        • Wenting Tang: ___ (jf: I missed the question)

          • Phillip: If you look at the RFC for security event
            token, it gives an example of a password reset. A
            password reset is a high level event. SCIM events
            match the PUT/PATCH/etc events.
          • Phillip: The spec just defines SCIM events.
          • [...]
          • If you read the spec, there's a way to pass some
            data, or all of the data. It shows how to pass
            standardized data or also custom data. It's a matter
            of how much you want it standardized. Similar to how
            SCIM has different user profile types, like the
            enterprise user profile.
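
As a rough illustration of the mapping described in this item (a SCIM "create" producing a "create" event, with some or all of the resource data attached), the sketch below builds the claims of a Security Event Token. The top-level claim names (`iss`, `aud`, `iat`, `jti`, `events`) are the standard SET claims. Everything else is an assumption made for illustration: the `urn:example:scim:event:` prefix, the `resource` and `data` payload members, and the helper name are hypothetical and are not taken from the SCIM events draft. Signing the token is omitted.

```python
import json
import time
import uuid
from typing import Any, Dict, Optional

# Hypothetical event-type prefix; the real URIs are defined by the draft.
HYPOTHETICAL_EVENT_PREFIX = "urn:example:scim:event:"

# SCIM protocol operation -> event name, mirroring "a create makes a create event".
METHOD_TO_EVENT = {"POST": "create", "PUT": "put", "PATCH": "patch", "DELETE": "delete"}


def build_event_claims(
    method: str,
    resource_uri: str,
    issuer: str,
    audience: str,
    data: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return unsigned SET claims for one SCIM operation."""
    event_uri = HYPOTHETICAL_EVENT_PREFIX + METHOD_TO_EVENT[method.upper()]
    payload: Dict[str, Any] = {"resource": resource_uri}  # assumed member name
    if data is not None:
        payload["data"] = data  # the issuer may pass some, all, or none of the data
    return {
        "iss": issuer,
        "aud": audience,
        "iat": int(time.time()),
        "jti": uuid.uuid4().hex,  # unique token identifier
        "events": {event_uri: payload},
    }


if __name__ == "__main__":
    claims = build_event_claims(
        "POST",
        "/Users/2819c223",
        issuer="https://sp.example.com",
        audience="https://client.example.com",
        data={"userName": "bjensen"},
    )
    print(json.dumps(claims, indent=2))
```

A receiver that wants its own recovery mechanism, as discussed above, could persist the `jti` of each token it has processed.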
  • [09:14] Roles and Entitlements - 10min - Danny Zollner

    • https://datatracker.ietf.org/doc/draft-zollner-scim-roles-entitlements-extension/
    • At IETF 114 he wrote a rev to the draft and sent it to the
      working group mailing list.
    • Has had some feedback.
    • Looking for more feedback on a way to fetch roles.
    • Does anybody have thoughts on the draft? Comments/concerns?

      • Wenting Tang: Entitlement draft, right?

        • Question about the scope of the problem he's trying to
          address
        • Different downstream systems might have different
          roles/entitlements
        • A single attribute might not work for a system with lots
          of downstream systems.
      • Danny: The problem he was trying to address with the draft:
        for roles or entitlements, there is usually a finite set of
        values. You can't just make one up, it would get rejected.

      • It's an interop problem when a client and SP are starting to
        work together. The SCIM client wants a better chance of
        making successful requests for creating/updating users
        with roles/entitlements
      • There's no way to get a list of acceptable values for
        roles/entitlements.
      • Goal was entirely to allow, without human interventions,
        connecting to hundreds of systems. Allow the SCIM client to
        know what the possible values are.
      • SCIM SP can partially accept values. There are lots of SP
        implementations where, if you don't provide valid data in a
        request, the request gets dropped.
      • It's a way to surface values that are already restricted
        elsewhere
    • Wenting Tang: Will it only address coarse grained roles?

      • Danny: Roles tend to be coarse grained, as he's seen
        • Entitlements have been more like "feature assignments"
        • Entitlements might also be for finer grained
        • In his experience these aren't well defined or well
          understood
        • Draft is about discoverability of allowed values
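
The discoverability goal described in this item can be sketched from the client side: fetch the list of allowed role values once, then check a planned request against it before sending. This is a minimal sketch under assumptions. The shape of the discovery response (a list response whose entries carry a `value` member) and both helper names are hypothetical and are not taken from the draft.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple


def allowed_values(list_response: Dict[str, Any]) -> Set[str]:
    """Collect allowed values from an assumed discovery response."""
    return {entry["value"] for entry in list_response.get("Resources", [])}


def split_roles(requested: Iterable[str], allowed: Set[str]) -> Tuple[List[str], List[str]]:
    """Partition requested roles into (accepted, rejected), preserving order."""
    accepted: List[str] = []
    rejected: List[str] = []
    for role in requested:
        (accepted if role in allowed else rejected).append(role)
    return accepted, rejected


if __name__ == "__main__":
    # Hypothetical response from a role discovery endpoint.
    response = {"Resources": [{"value": "admin"}, {"value": "auditor"}]}
    ok, bad = split_roles(["admin", "superuser"], allowed_values(response))
    print("send:", ok, "would be rejected:", bad)
```

Checking locally lets the client avoid the case mentioned above where an SP drops the whole request because one value is invalid.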
  • [09:22] Referential Values - 10min - Danny Zollner

    • https://datatracker.ietf.org/doc/draft-zollner-scim-referential-value-location/
    • Not sure if this has been sent to the mailing list. Wrote this
      during IETF 114.
    • Trying to solve the same set of problems as the
      roles/entitlements
    • How to increase the rate of successful requests.
    • Not to be confused with reference attributes.
    • Example: The value for manager.value should be the "id" value
      for another user in the directory. Programmatically, there's no
      way to determine that. There may be custom attributes (schema
      extensions) like "job title" or even "role"

      • "For this attribute we will only accept this sort of values"
      • Cost center is another example
    • If there is a finite set of values that are acceptable for an
      attribute, it should be discoverable

    • Those are to say yes/no or true/false
    • There are sub-attributes to house those values
    • The schema URI for where those attributes are
    • Example: For the manager attribute, it's constrained by the
      user resources that exist, so give a reference to the schema ID
      value.
    • Has seen ample use for this in the SCIM implementations that
      he's had to deal with. This would be useful at scale.
    • Will make sure to email the mailing list
    • "I yield my time"
  • [09:26] Aaron Parecki: For any open drafts, has submitted a
    proposal for IETF 115. That will be the next time we get
    together to talk in person.

    • Any closing thoughts?
    • Nancy: Thanks for running this today