Minutes IETF116: pearg: Wed 04:00
minutes-116-pearg-202303290400-00
| Meeting Minutes | Privacy Enhancements and Assessments Research Group (pearg) RG |
| --- | --- |
| Date and time | 2023-03-29 04:00 |
| Title | Minutes IETF116: pearg: Wed 04:00 |
| State | Active |
| Other versions | markdown |
| Last updated | 2023-04-06 |
Chair welcome ("PearG"):
Note well / Wear masks, in person.
Draft updates (5 mins)

- RG draft statuses
  - IP Address Privacy Considerations:
    - No recent updates since the last meeting, but updates coming soon
  - Censorship:
    - Recent update
  - Numeric IDs:
    - Sent to RFC editor
  - Safe Internet measurements:
    - Review
    - Maybe interesting for PPM, as well
Presentations (100 mins)

Interoperable Private Attribution (Martin Thomson) - 30 mins

- Attribution: important piece of the ad industry
- Trains!
- Let's talk about the Tokyo subway system
- Actually, let's talk about identifiers, like access cards (e.g., PASMO)
  - Using passenger tracking for the purpose of capacity planning, performance, etc.
  - Specifically, for systems that track when a person enters the system and when the person exits
  - But logs are a privacy risk and can be used for other purposes, even if they are inherently pseudonymous - identities could be linked
  - Can we create a design that aggregates the data that's interesting, and provides individual privacy?
  - One design is using tokens with buckets
- Tokens need to be:
  - anonymous
  - authenticated
  - time-delayed "opening"/redemption
  - ephemeral
- Moving on to advertising
  - Attribution: taking information from one context and linking it in a different context
  - Answer a question: "How many people saw the ad, then came to the show?"
- Understanding whether certain advertising is working:
  - good placement
  - creatives
  - how much to spend
  - how long to run campaigns
- Currently, cross-context attribution allows linking people across contexts
- With advertising, the context is everything:
  - Whether an ad was shown, and if that ad was clicked
  - Was a product purchased, or not
  - Where was the ad shown
- Interoperable Private Attribution (IPA)
  - People have an identifier (significant protections against revealing the identifier)
  - Sites can request an encrypted secret share of that identifier
  - Sites have a view of the identifier, but it's not linkable cross-site
- Attribution in MPC (multi-party computation)
  - Sites gather events
  - MPC decrypts identifiers and performs attribution
  - Aggregated results are the output (histogram)
- MPC does not, itself, see the original query
- MPC:
  - Any computation is possible if you only need addition and multiplication
  - It can be expensive
  - IPA uses a three-party, honest-majority threat model
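The "addition and multiplication" building block mentioned above can be illustrated with additive secret sharing, a minimal sketch only (not the IPA specification; the prime and party count here are illustrative):

```python
import secrets

P = 2**61 - 1  # prime field modulus, chosen here only for illustration

def share(value: int, n_parties: int = 3) -> list[int]:
    """Split `value` into n additive shares; any n-1 shares reveal nothing."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def reconstruct(shares: list[int]) -> int:
    return sum(shares) % P

# Addition is "free" in this scheme: each party adds its shares locally,
# and the reconstructed result is the sum of the original values.
a, b = 42, 100
a_sh, b_sh = share(a), share(b)
sum_sh = [(x + y) % P for x, y in zip(a_sh, b_sh)]
assert reconstruct(sum_sh) == (a + b) % P
```

Multiplication of shared values requires interaction between the parties, which is one source of the communication cost discussed later in the Q&A.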
- Differential Privacy
  - (epsilon, delta)-DP for hiding individual contributions
  - Every site gets a query budget that renews each epoch (e.g., week)
    - This does permit some leakage across time (epochs); more research is needed in this area
  - Parameters are not fixed yet
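One common way to realize (epsilon, delta)-DP on an aggregate histogram is the Gaussian mechanism, sketched below. This is a hedged illustration: the minutes note that IPA's parameters are not fixed, and the noise distribution here is an assumption, not IPA's design.

```python
import math
import random

def gaussian_sigma(epsilon: float, delta: float, sensitivity: float = 1.0) -> float:
    # Classic analytic bound (valid for epsilon < 1):
    # sigma >= sqrt(2 * ln(1.25/delta)) * sensitivity / epsilon
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon

def noisy_histogram(hist: list[int], epsilon: float, delta: float) -> list[float]:
    """Add calibrated Gaussian noise to each bin to hide any one contribution."""
    sigma = gaussian_sigma(epsilon, delta)
    return [count + random.gauss(0.0, sigma) for count in hist]

noisy = noisy_histogram([120, 45, 3], epsilon=0.5, delta=1e-6)
```

The per-site query budget then tracks how much epsilon each site has spent within an epoch.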
- Clients' encrypted identifiers are bound to:
  - the site that requested them
  - the epoch/week in which they are requested
  - the type of event: source (ad), trigger (purchase)
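One way such binding can work is to derive the cryptographic context from the site, epoch, and event type, so material requested for one context is unusable in another. The derivation below is purely hypothetical, an assumption for illustration, not IPA's wire format:

```python
import hashlib
import hmac

def bound_context_key(root_key: bytes, site: str, epoch: int, event_type: str) -> bytes:
    """Hypothetical: derive a per-(site, epoch, event-type) key via HMAC."""
    info = f"{site}|{epoch}|{event_type}".encode()
    return hmac.new(root_key, info, hashlib.sha256).digest()

# Different context -> different key, so ciphertexts cannot be replayed
# across sites, epochs, or event types.
k1 = bound_context_key(b"root", "news.example", 2717, "source")
k2 = bound_context_key(b"root", "shop.example", 2717, "trigger")
assert k1 != k2
```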
- IPA: advances and challenges
  - IPA's flexibility provides somewhat of a drop-in replacement for current anti-fraud systems
  - IPA's flexibility hurts accountability
    - Existing challenge in making the system auditable
  - MPC performance is a challenge, especially at the scale of 10s of billions
- Status: good progress overall, but still requires research in some areas
  - Currently running some synthetic trials
- Ongoing work in W3C working groups; the protocol may come to PPM in the future
- Brian Trammell: MPC performance is a challenge. Computation or communication complexity?
  - MT: A lot is algorithmic (linear), and some of that will likely improve, but much of it is communication cost. Originally, records were working on the order of ~40GB, but it's still multi-gigabytes in size
- Chris Wood: 1) What was the MPC functionality you needed (as defined by the existing adtech industry)? 2) Now that the functionality is defined, how do you implement it? How did you reach this design?
  - MT: Need more time. Lots of people took the steps to get here. Apple's PCM took an initial approach. This is mostly about understanding how the advertising industry uses measurement as a core part of its processes. There is a "need" vs. "want" difference of perspective among the parties, and those discussions are ongoing. If you add cross-device attribution, it gets more complicated.
  - CW: There is an academic research community that has spent a lot of time designing MPC protocols. There seems to be some overlap and collaboration opportunity here.
- Shivan: Who would run the servers in the MPC protocol?
  - MT: We need to trust them not to collude - to be determined
- Jonathan Hoyland: If it's run by a third party that is running an auction, what are the guarantees that they're actually running the MPC protocol?
  - MT: Currently leaning on oversight / auditing.
  - JH: Can the response include a proof?
  - MT: Recently asked if Verifiable MPC was considered - but VMPC is not ready yet. So, "trust and verify" is the current approach
Secure Partitioning Protocols (Phillipp Schoppmann) - 20 mins

- Let's go into more detail on scaling aggregation computations
  - Billions of impressions from billions of clients
  - All clients submit their reports to the MPC cluster
  - MPC outputs the aggregate results
- Goals
  - When sharding the MPC cluster, every client must use the same shard
  - We need a private mechanism for mapping one client to the same shard
  - This should have low communication cost
  - "Correctness" must not be affected
- Assumptions:
  - Bound on the number of contributions
  - Many clients, fewer shards
- Blueprint: partitioning from distributed OPRFs
  - Client has an index (i) and a payload (v)
  - One server (server 1) has an OPRF key
  - The other server (server 2) will learn the result of the OPRF computation
    - Server 1 must add some padding queries
  - Server 2's output of the OPRF is used for mapping the client to the target partition
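The blueprint above can be sketched with a plain keyed PRF standing in for the OPRF. This is a simplification, an assumption for illustration only: in a real OPRF, server 1 evaluates its key obliviously and never sees client indices in the clear, and the shard count here is made up.

```python
import hashlib
import hmac
import secrets

NUM_SHARDS = 8  # assumed small shard count (dense-partitioning case)

def prf(key: bytes, client_index: bytes) -> bytes:
    """Stand-in for the OPRF evaluation: a keyed PRF over the client index."""
    return hmac.new(key, client_index, hashlib.sha256).digest()

def shard_of(prf_output: bytes) -> int:
    """Dense partitioning: reduce the PRF output to a shard ID."""
    return int.from_bytes(prf_output, "big") % NUM_SHARDS

key = secrets.token_bytes(32)  # server 1's (O)PRF key
real = [prf(key, b"client-%d" % i) for i in range(5)]
# Padding queries on random inputs hide how many real clients there are:
padding = [prf(key, secrets.token_bytes(16)) for _ in range(3)]
shards = [shard_of(out) for out in real + padding]

# The same client deterministically maps to the same shard:
assert shard_of(prf(key, b"client-0")) == shards[0]
```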
- Dense Partitioning: OPRF output = shard ID
  - If there is only a small set of shards, then this is reasonable
- Sparse Partitioning: OPRF output = random client ID
  - Can the client's reports be aggregated before the MPC computation?
  - This doesn't result in creating a client identifier, because server 1 pads the set of known client identifiers with dummy values, so server 2 can't distinguish between real users and fake users
- How can the sparse histogram be private without seeing the actual histogram?
  - View the output of the OPRF as a histogram
  - Make sure frequency can't be linked to specific users
  - Choose a threshold; below the threshold, add dummy values; above the threshold [..] (?)
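The thresholding idea can be sketched as follows. This is a loose illustration under stated assumptions: the threshold value and the padding rule are invented here (the minutes leave the above-threshold behavior unspecified), and real designs would add calibrated noise rather than flat padding.

```python
import random
from collections import Counter

THRESHOLD = 5  # assumed threshold, for illustration only

def pad_histogram(hist: Counter) -> Counter:
    """Pad low-frequency bins and add fake bins so rare real users
    are indistinguishable from padding."""
    padded = Counter(hist)
    for bucket, count in hist.items():
        if count < THRESHOLD:
            padded[bucket] = THRESHOLD  # raise rare bins up to the threshold
    # Also insert entirely fake buckets so the *set* of IDs is hidden:
    for _ in range(3):
        padded[f"dummy-{random.getrandbits(32):08x}"] = THRESHOLD
    return padded

hist = Counter({"id-a": 12, "id-b": 2})
padded = pad_histogram(hist)
assert padded["id-b"] == THRESHOLD and padded["id-a"] == 12
```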
- Conclusion: efficient for these use cases
- Next steps: Is there general interest? Are there other protocols where this might be useful? Are there other properties that are needed?
- Chris Patton: Definitely interesting, but maybe not as an independent draft
  - PS: So, add this into individual drafts, instead of making a general-purpose protocol?
  - CP: Yes
- Martin Thomson: The bounds seem to be fundamental. How confident are you that these are required costs?
  - PS: The numbers are not the absolute lower bound; they are based on the current design described in this presentation
  - MT: IPA may not be able to set an upper bound on the number of contributions, for example due to a Sybil attack
  - PS: While any party can create reports, fraudulent reports may be able to be filtered downstream
DP3T: Deploying decentralized, privacy-preserving proximity tracing (Wouter Lueks) - 25 mins

- DP-3T started back in March 2020: first draft in May 2020; September 2020 - Summer 2021 working on presence tracing
- Non-traditional academic environment - scaling to millions of users on a small timescale
- Relying on existing infrastructure had a large impact
- The system was designed to be purpose-built, so it couldn't be re-used for other purposes
- Risks associated with digital contact tracing:
  - Must embed the social contact graph
  - location tracing
  - medical information
  - social interactions
  - social control risk
- Time has shown what can go wrong with designs/deployments like this:
  - police departments using the data for crime solving
  - data leaks
  - harassment of specific subgroups
- It is very important that systems be designed with purpose limitations in mind, so they can't be easily abused in other ways
- Relying on existing infrastructure: phones with BTLE sending beacons
  - Proximity can be derived based on the beacons they saw
  - Exposure notification works by the set intersection of the beacons the person (who tested positive) saw and all of the identifiers that another person broadcast
  - The design of these beacon broadcasts required that the OS vendor be involved
  - While the design was relatively simple, relying on existing hardware made the situation more difficult/complicated
  - The result of the collaboration with Google/Apple was the Google/Apple Exposure Notification (GAEN) Framework/API
- For full effect, you need privacy at all layers of the stack, including the Bluetooth protocol stack
  - MAC address must rotate at the same time as the beacons
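The set-intersection matching described above can be sketched as follows. This is a simplified illustration, not the real DP-3T/GAEN key schedule: the beacon derivation below is hypothetical (DP-3T derives rotating ephemeral identifiers from daily keys).

```python
import hashlib

def beacons_from_seed(seed: bytes, n: int = 4) -> set[bytes]:
    """Hypothetical derivation of rotating beacons from a daily seed."""
    return {hashlib.sha256(seed + bytes([i])).digest()[:16] for i in range(n)}

# Beacons the positive user broadcast (derivable by anyone who learns the seed):
positive_broadcast = beacons_from_seed(b"daily-seed-of-positive-user")

# Beacons my phone observed nearby (one of them came from the positive user):
observed_nearby = {next(iter(positive_broadcast))} | {b"\x00" * 16}

# Non-empty intersection -> exposure notification is raised locally.
exposed = bool(positive_broadcast & observed_nearby)
assert exposed
```

Because matching happens on the device, the server only ever learns the uploaded seeds of positive users, never the contact graph.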
- Similarly, at the network layer, a network adversary can detect the upload of the report of seen beacon identifiers (when reporting COVID-positive)
  - CH used dummy uploads to hide this
- Lessons learned:
  - Purpose limitations
  - Context matters (how/where systems are deployed)
  - Privacy at all layers
- Tommy Pauly: More comment than question: for privacy at all layers, Apple is routing the upload report through iCloud Private Relay
  - WL: While this is great, there might be other side channels we need to look at
- XXX: How do you authenticate IDs?
  - WL: There isn't any binding, but the upload requires knowing the underlying seed from which the beacon was derived
- Chris Wood: What would an ideal interface have looked like, and how would you have designed it differently?
  - WL: The strictness provided protections, but it introduced challenges as well. There isn't an easy answer.
LogPicker: Strengthening Certificate Transparency Against Covert Adversaries (Alexandra Dirksen) - 25 mins

- HTTPS is mostly a default now (90%+ of all page loads are HTTPS in Chrome)
- CAs are the trust anchors of the Web PKI
- There are recent illicit certificate creations, and they seem to be increasing:
  - WoSign
  - DigiCert
  - DigiNotar
  - Comodo
  - TurkTrust
- Rogue certificates: where you get a certificate for a domain that you don't own (e.g., HTTPS interception)
  - In the attacker scenario, a covert attacker obtains a rogue certificate
- Certificate Transparency overview
- CT is still vulnerable to this attack
  - All logs belong to a CA vendor
  - First compromise was in 2020
  - Vulnerable to collaboration attacks
  - Vulnerable to split-view attacks
- Gossip is proposed as a mitigation for split-view attacks
- LogPicker: a decentralized approach
  - CA contacts one log (the leader) from a large set of logs (the log pool)
  - The leader then contacts the other logs in the pool
  - The pool then selects one log at random
  - The selected log includes the certificate in its Merkle tree
  - The logs that participated in choosing the log create a proof, and that proof is aggregated and sent back to the CA for inclusion in the certificate
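The "pool selects one log at random" step could be realized with commit-then-reveal coin flipping, sketched below. This is an assumption for illustration: the minutes do not specify LogPicker's election protocol, only that the pool jointly picks a log so no single party controls the outcome.

```python
import hashlib
import secrets

def commit(value: bytes) -> bytes:
    """Hash commitment: binds a party to its value before anyone reveals."""
    return hashlib.sha256(value).digest()

pool = ["log-a", "log-b", "log-c", "log-d"]
contributions = [secrets.token_bytes(32) for _ in pool]  # each log's secret
commitments = [commit(c) for c in contributions]         # broadcast first

# After all commitments are exchanged, values are revealed and verified...
assert all(commit(c) == com for c, com in zip(contributions, commitments))

# ...then combined: as long as one log is honest, the result is unpredictable.
combined = hashlib.sha256(b"".join(contributions)).digest()
selected = pool[int.from_bytes(combined, "big") % len(pool)]
assert selected in pool
```

Committing before revealing prevents the last log from choosing its contribution to steer the selection.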
- This design meets the goals
- Chris Wood: The log pool uses an election protocol?
  - AD: Yes, two protocols
  - CW: Have you looked at alternative solutions that use threshold signing?
  - AD: The aggregated signature uses BLS, but which signature scheme is used is not strictly defined