Challenges to Privacy in New Internet Applications: VoIP,


Challenges to Privacy in New Internet Applications: VoIP, IM, location-based services
Prof. Henning Schulzrinne
Computer Science, Columbia University, New York
ATIS Network Security Symposium and Workshop
Washington, DC, September 2004

Overview
- Email spam: a history of failed miracle cures
- New challenges emerging:
  - VoIP unsolicited calls
  - instant messaging
- Location and presence privacy
- Reputation management

Not just email
- Email is just the first large-scale, open communication medium
  - initially also "closed user groups" (DECmail, PROFS, UUnet, Fido, ...)
- When does UBC (unsolicited bulk communication) occur?
  - move from a single domain to a large number of independently operated domains
    - removes easy remote authentication
  - published or guessable addresses
    - as old as unlisted numbers
    - conflict between usability and using addresses as communication keys
- Others emerging from closed user groups:
  - instant messaging (IM)
  - VoIP and multimedia calls
  - presence queries

The universe of message senders
[Figure: senders arranged by degree of human involvement. Labels: human user; opt-in bulk communications (known and unknown); mailing lists (forwarder); robots (event notification); machine to human; machine to machine (EDI).]

The problem is easy if you’re willing to make some minor assumptions:
- single administrative domain
- only previously-known senders (but how?)
- global public key infrastructure (PKI)
- only real human users, no lists

Communication challenges
- Joe-job: “The act of faking a spam so that it appears to be from an innocent third party, in order to damage their reputation and possibly to trick their provider into revoking their Internet access. Named after Joes.com, which was victimized in this way by a spammer some years ago.”
- Phishing: “The act of sending an e-mail to a user falsely claiming to be an established legitimate enterprise in an attempt to scam the user into surrendering private information that will be used for identity theft.”
- Spam, spim (unsolicited bulk communications)
- Nuisance communications

Tools: available countermeasures
- From address blacklisting
- IP sender blacklisting
- Content filtering (Bayesian filters)
[Figure: mail delivery path. Labels: MUA, mail sender, SMTP, marked with header, POP, IMAP, DNS, SPF, SBL, MTA, spam folder.]

Miracle cures
Method | does | fails
postage for email | sender pays receiver for reading mail | collection; socially unacceptable (job offer); mailing lists
Haiku (e.g., Habeas) | include copyrighted haiku in non-spam | enforcement
computation (e.g., Microsoft Penny Black) | sender solves computational puzzle | lists, bots
Turing test (challenge-response) | | automated senders
Graylisting | return temporary failure; UBC gives up | unreliable, delay
Address hiding & spoofing | prevents web crawlers from picking up addresses | existing addresses; single failure
Bayesian content analysis | detect spam/non-spam terms | pictures, word spoofing, poisoning, IM, VoIP
Bonding | third party (e.g., BSP) promises bond if domain spams | enforcement
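The "sender solves computational puzzle" row can be illustrated with a hashcash-style proof of work. This is a minimal sketch under assumptions, not the Penny Black scheme itself: the SHA-256 hash, the `message:nonce` encoding, and the function names are all chosen here for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve_puzzle(message: str, difficulty_bits: int) -> int:
    """Sender side: try nonces until the hash has enough leading zero bits.

    Expected work grows as 2**difficulty_bits hash evaluations.
    """
    for nonce in count():
        digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce


def verify_puzzle(message: str, nonce: int, difficulty_bits: int) -> bool:
    """Receiver side: a single hash evaluation checks the sender's work."""
    digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits
```

The asymmetry is the point: solving costs the sender many hashes, verifying costs the receiver one. The "fails" column still applies, since legitimate mailing lists pay the same per-message cost and bot armies supply the computation for free.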

The UBC arms race
[Figure: defenses and the attacks that answer them. Labels: IP blacklisting; open relays; From blacklisting; RBL, SPEWS; sender faking; SPF, DMP; bot armies; Bayesian filters; pictures; dictionary attacks.]

“We need a new mail/IM protocol”
- True: SMTP not designed for today’s hostile Internet
  - no sender authentication
  - no easy policy inclusion
- False: a new mail protocol is going to fix UBE/UBC
- Hard problems are ecosystem, not protocol:
  - authentication: domains and individuals
    - PKI (S/MIME, PGP) has never scaled
    - current email certificates just certify ownership of email address
      - help with whitelist, but not with unknown users
    - too costly for true verification
  - reputation
  - accreditation

IETF MARID
- IETF working group for verifying sender
- “It would be useful for those maintaining domains and networks to be able to specify that individual hosts or nodes are authorized to act as MTAs for messages sent from those domains or networks. This working group will develop a DNS-based mechanism for storing and distributing information associated with that authorization.”
- related to IRTF ASRG (Anti-spam Research Group)
- DNS extensions, “purported responsible address”

MARID processing
- “Given an email message, and given an IP address from which it has been (or will be) received, is the SMTP client at that IP address authorized to send that email message?”
[Figure: flowchart. Steps in slide order: client SMTP validation (CSV): client authenticated, authorized and accredited? (Y); extract purported responsible address (PRA); extract purported responsible domain (PRD); SPF: IP legal for PRD? (Y / N).]

MARID: Client SMTP Validation (CSV)
Example: EHLO aol.com from 64.12.187.24
- authentication: EHLO domain real?
  - A(aol.com): IN A 64.12.187.24 (draft-ietf-marid-csv-intro)
- authorization: host authorized to be MTA?
  - SRV(_client._smtp.aol.com): SRV weight 2 (draft-ietf-marid-csv-csa)
- accreditation: domain reputation?
  - PTR(aol.com): vouch.smtp.isgood.com
  - TXT(aol.com.isgood.com): IN TXT MARID,1,A (draft-ietf-marid-csv-dna)

PRA (Purported responsible address)
- “Allows one to determine who appears to have most recently caused an e-mail message to be delivered. It does this by inspecting the headers in the message.” (draft-ietf-marid-pra)
- uses Resent-Sender, Resent-From, Sender, From RFC 2822 headers
- draft-ietf-marid-submitter defines new MAIL parameter for SMTP
[Figure labels: [email protected]; almamater.edu; [email protected]]
    S: 220 company.com.example ESMTP server ready
    C: EHLO almamater.edu.example
    S: 250-company.com.example
    S: 250-DSN
    S: 250-AUTH
    S: 250-SUBMITTER
    S: 250 SIZE
    C: MAIL FROM: [email protected] SUBMITTER [email protected]
    S: 250 [email protected] sender ok
    C: RCPT TO: [email protected]
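The header inspection can be sketched as a lookup in the order the slide lists the headers. This is a simplified illustration, not the full algorithm from the draft, which has additional rules for malformed or repeated headers; the function names and sample addresses are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr

# Header precedence as listed on the slide (simplifying assumption:
# the first non-empty header in this order wins).
PRA_HEADER_ORDER = ("Resent-Sender", "Resent-From", "Sender", "From")


def purported_responsible_address(raw_message: str):
    """Return the address from the highest-priority header, or None."""
    msg = message_from_string(raw_message)
    for header in PRA_HEADER_ORDER:
        value = msg.get(header)
        if value:
            _display_name, address = parseaddr(value)
            if address:
                return address
    return None


def purported_responsible_domain(raw_message: str):
    """Return the domain part (PRD) of the PRA, or None if there is none."""
    address = purported_responsible_address(raw_message)
    if address is None or "@" not in address:
        return None
    return address.rsplit("@", 1)[1].lower()
```

For a message forwarded by a mailing list, the list's Sender header takes precedence over the original author's From header, so the domain that gets checked is the forwarder's, not the author's.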

SPF, Sender-ID
- SPF (sender policy framework)
- Verifies that most recent sender (e.g., mailing list forwarder) is authorized for its domain
- Does not prevent spam, but enables white and blacklisting
- Adds DNS TXT or SPF resource record (RR) for domain:
    spf2.0/mfrom,pra mx a:192.1.2.0/28 -all
  “mail from MX server for example.com and from IP 192.1.2.0 are ok; all others are bad”
[Figure: SMTP stages. Labels: SMTP connection; HELO or EHLO; MAIL FROM; body delivery; From:]
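The record check can be sketched as matching the connecting IP address against the terms of the policy, left to right. This is a minimal sketch under assumptions: it handles only the `ip4:` and `all` mechanisms (the published SPF syntax for address ranges is `ip4:`, where the slide writes `a:`), it silently skips mechanisms such as `mx` that need DNS lookups, and the function name is hypothetical.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_ip_against_policy(policy: str, client_ip: str) -> str:
    """Evaluate a simplified SPF-style policy for one client IP address."""
    ip = ipaddress.ip_address(client_ip)
    for term in policy.split():
        # Skip the version tag, e.g. "v=spf1" or "spf2.0/mfrom,pra".
        if term.startswith(("v=", "spf2.0")):
            continue
        qualifier = "+"  # the default qualifier is "pass"
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        if term == "all":
            return RESULTS[qualifier]
        if term.startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return RESULTS[qualifier]
        # Assumption: mx, a, include, ... would need DNS and are ignored here.
    return "neutral"  # no term matched
```

With the policy `spf2.0/mfrom,pra ip4:192.1.2.0/28 -all`, addresses 192.1.2.0 through 192.1.2.15 pass and everything else fails. As the slide notes, a pass only says the host is authorized for the domain; whether the domain itself sends spam is a separate reputation question.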

Putting the tools together
- transitive trust model: intra-domain: user; inter-domain: domain/host-only authentication
[Figure labels: bpm.com SMTP server; SPF; CSV; SMTP; SMTP AUTH submission (password); [email protected]; [email protected]]
- accreditation: aol.com does not host spammers; bpm.com verifies user identities (not yet)

What’s different about IM and VoIP?
- Higher nuisance factor
  - combine the worst of email and phone telemarketing
- Close to zero cost
  - call origination has no capacity limitation (unlike PSTN line limitation)
  - can be originated in volume from residential broadband; no T1 required
    - T1: 2.4 call attempts/second @ $1000/month LD
    - 500 kb/s DSL: 9 call attempts/second @ $50/month
  - easy to get addresses: SIP address is an email address or E.164 number
  - non-US origin: cheap labor, no DNC laws
- Privacy invasion
  - know user is actually there
- Nuisance calls
  - possibly no good way to trace
  - already a problem with Skype
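The two line items for T1 and DSL can be turned into a cost per call attempt. The sketch below uses only the slide's figures; the 30-day month and the assumption that a line is kept busy around the clock are simplifications made here.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month, line always busy


def cost_per_million_attempts(attempts_per_second: float, monthly_cost: float) -> float:
    """Monthly line cost divided by call attempts, scaled to one million attempts."""
    attempts_per_month = attempts_per_second * SECONDS_PER_MONTH
    return monthly_cost / attempts_per_month * 1_000_000


t1_cost = cost_per_million_attempts(2.4, 1000)   # T1 figures from the slide
dsl_cost = cost_per_million_attempts(9, 50)      # DSL figures from the slide

print(f"T1:  {t1_cost:.2f} per million call attempts")
print(f"DSL: {dsl_cost:.2f} per million call attempts")
print(f"ratio: {t1_cost / dsl_cost:.0f}x")
```

Under these assumptions a million attempts cost about 161 over the T1 and about 2.1 over DSL, a factor of 75, which is the arithmetic behind "close to zero cost".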

SIP spam
- Call spam
  - telemarketing
  - content filtering likely ineffective
- IM spam
  - SIP MESSAGE or message sessions
  - spam intent may not be obvious in first message
    - get attention first with “Hello”
    - short messages harder to analyze with content filters
  - but typically requires white-listing based on presence subscription
- Presence spam (request addition to watcher list)
  - mostly nuisance: user may need to manually deny request
Source: J. Rosenberg, C. Jennings, draft-rosenberg-sipping-spam, July 2004

SIP spam prevention
- All earlier mechanisms apply, with largely the same caveats
- Black lists
- White list
  - domain-level: within domain, only if domain practices sound user management
  - may use buddy list as white list
  - stronger user authentication
- Consent-based communication
  - needs to subscribe first
  - but may not be able to recognize address (“is [email protected] a spammer or some long-lost friend?”)

SIP spam prevention
- Use of MARID-like DNS domain verification possible
  - may not be needed, due to usage of TLS for interdomain communications
  - but doesn’t preclude rogue sub-domains, e.g., “is hgs10.columbia.edu allowed to route SIP calls for columbia.edu?”
- transitive trust principle: trust that previous hop applied identity management principles
- longer term, use S/MIME certificates for user-level authentication
  - but doesn’t improve spam prevention much
  - not widely available now
  - if S/MIME certificates are cheap, spammers can mint new identities

SIP authentication
[Figure: SIP trapezoid. Labels: Digest auth over TLS; outbound proxy; destination proxy (identified by SIP URI domain); TLS mutual host verification; insert crypto-signed identity assertion (AIB, sip-identity); registrar; [email protected]: 128.59.16.1; voice traffic (S)RTP.]

From domain to user policies
- Not all domains can be classified as “good” or “bad” as a whole
- Many different domain types:
  - Employer
  - ISP
  - Associations (IEEE, ACM, ATIS, ...)
  - Personal domains
  - Mailbox providers
- Divide domains by their user policy:
  - Admission-controlled domains
  - Bonded domains
  - Membership domains
  - Open, rate-limited domains
  - Open domains
  (annotations on the slide: most employers; e.g., credit card)
Source: Kumar Srivastava, Henning Schulzrinne, “Preventing Spam for SIP-based Instant Messaging and Sessions”, Columbia University Technical Report, September 2004.

Reputation and domain descriptions
- Need to define mechanism to obtain domain user verification policy
- Individual user reputation:
  - deposit positive or negative feedback information based on calls
  - depends on cooperation of domain
  - limit user feedback rate to avoid ballot-stuffing
- Fortunately, there seem to be few part-time spammers
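Limiting the feedback rate can be sketched as a sliding-window counter per reporting user. The class name, the window size, and the simple plus-one/minus-one scoring are all assumptions for illustration; the slides do not specify a mechanism.

```python
from collections import defaultdict, deque


class FeedbackLedger:
    """Collects feedback about users, capped per reporter per time window."""

    def __init__(self, max_reports: int, window_seconds: float):
        self.max_reports = max_reports
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # reporter -> timestamps of accepted reports
        self.scores = defaultdict(int)     # subject -> net feedback score

    def report(self, reporter: str, subject: str, positive: bool, now: float) -> bool:
        """Record one feedback item; return False if the reporter is over the limit."""
        recent = self._recent[reporter]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_reports:
            return False  # rejected: would allow ballot-stuffing
        recent.append(now)
        self.scores[subject] += 1 if positive else -1
        return True
```

A reporter allowed two reports per minute can lower a target's score by at most two in that minute, however many reports are submitted. The limit only helps if reporter identities are hard to mint, which is why the slide ties reputation to the cooperation of the domain.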

Using social networks for spam control
[Figure: social graph with “is a friend of” edges]
- strength of knowledge: 0.3
- trust in good behavior: 0.5
- total trust = strength * trust
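The total-trust product can be worked through with the slide's numbers. The acceptance threshold in the second function is a hypothetical addition to show how such a score might feed a white-list decision; the slides give no threshold.

```python
def total_trust(strength: float, trust: float) -> float:
    """Total trust as defined on the slide: strength * trust, both in [0, 1]."""
    for name, value in (("strength", strength), ("trust", trust)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return strength * trust


def accept_sender(strength: float, trust: float, threshold: float = 0.1) -> bool:
    """Hypothetical policy: accept if total trust reaches the threshold."""
    return total_trust(strength, trust) >= threshold


score = total_trust(0.3, 0.5)  # values from the slide
print(f"total trust: {score:.2f}")
```

With strength 0.3 and trust 0.5 the total is 0.15: a weakly known contact contributes little even when judged fairly trustworthy.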

Privacy: Context
- context = “the interrelated conditions in which something exists or occurs”
- anything known about the participants in the (potential) communication relationship
- both at caller and callee
- kinds of context and where they are used:
  - time: CPL
  - capabilities: caller preferences
  - location: location-based call routing, location events
  - activity/availability: presence
  - sensor data (mood, bio): not yet, but similar in many aspects to location data

Architectures for (geo) information access
- Claim: all protocols using (geo) information fall into one of these categories
- Presence or event notification
  - “circuit-switched” model
  - subscription: binary decision
- Messaging
  - email, SMS
  - basically, event notification without (explicit) subscription
  - but often out-of-band subscription (mailing list)
- Request-response
  - RPC, HTTP; also DNS, LDAP
  - typically, already has session-level access control (if any at all)
- Presence is superset of other two
- GEOPRIV: IETF working group looking generically at location services (privacy)
- SIMPLE and SIP: event notification, presence

GEOPRIV and SIMPLE architectures
[Figure: the three architectures side by side.
 GEOPRIV: rule maker; rule interface; target; publication interface; location server; notification interface; location recipient.
 SIP presence: presentity; PUBLISH; presence agent; SUBSCRIBE; NOTIFY; watcher.
 SIP call: caller; INVITE; callee.]

GEOPRIV and SIMPLE Policy rules
- There is no sharp geospatial boundary
- Discussed in both GEOPRIV (geospatial) and SIMPLE (SIP IM)
- Presence contains other sensitive data (activity, icons, ...) and others may be added
- Example: future extensions to personal medical data
  - “only my cardiologist may see heart rate, but notify everybody in building if heart rate = 0”
- Thus, generic policies are necessary

Presence/Event notification
- Three places for policy enforcement:
  - subscription: binary
  - notification: content filtering, suppression
    - only policy, no geo information
    - subscriber may provide filter
    - could reject based on filter (“sorry, you only get county-level information”)
    - greatly improves scaling since no event-level checks needed
    - only policy, no geo information
  - third-party notification
    - e.g., event aggregator
    - can convert models: gateway subscribes to event source, distributes by email
    - both policy and geo data

Presence policy
- XML rules managed via XCAP
[Figure: notification pipeline, in slide order: SUBSCRIBE; subscription policy; subscriber (watcher); for each watcher: event generator; policy; subscriber filter; rate limiter; change to previous notification?; NOTIFY.]

Policy relationships
[Figure labels: common policy; geopriv-specific; presence-specific; RPID; CIPID; future.]

PIDF-LO (location object)
- Basic location object
  - civic and geospatial
- typically, in conjunction with presence
- contains source and authority
- basic privacy rules:
  - retention period
  - redistribution allowed

    <?xml version="1.0" encoding="UTF-8"?>
    <presence xmlns="urn:ietf:params:xml:ns:pidf"
        xmlns:gp="urn:ietf:params:xml:ns:pidf:geopriv10"
        xmlns:gml="urn:opengis:specification:gml:schema-xsd:feature:v3.0"
        entity="pres:[email protected]">
      <tuple id="sg89ae">
        <status>
          <gp:geopriv>
            <gp:location-info>
              <gml:location>
                <gml:Point gml:id="point1" srsName="epsg:4326">
                  <gml:coordinates>37:46:30N 122:25:10W</gml:coordinates>
                </gml:Point>
              </gml:location>
            </gp:location-info>
            <gp:usage-rules>
              <gp:retransmission-allowed>no</gp:retransmission-allowed>
              <gp:retention-expiry>2003-06-23T04:57:29Z</gp:retention-expiry>
            </gp:usage-rules>
          </gp:geopriv>
        </status>
        <timestamp>2003-06-22T20:57:29Z</timestamp>
      </tuple>
    </presence>
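A recipient has to read the usage rules along with the coordinates. The sketch below parses a copy of the location object with Python's standard XML library; the `entity` value `pres:user@example.com` is a placeholder, the function name is hypothetical, and treating a missing rule as "no" is a conservative assumption made here.

```python
import xml.etree.ElementTree as ET

NS = {
    "pidf": "urn:ietf:params:xml:ns:pidf",
    "gp": "urn:ietf:params:xml:ns:pidf:geopriv10",
    "gml": "urn:opengis:specification:gml:schema-xsd:feature:v3.0",
}

# Same structure as the slide's example; the entity is a placeholder.
PIDF_LO = """
<presence xmlns="urn:ietf:params:xml:ns:pidf"
    xmlns:gp="urn:ietf:params:xml:ns:pidf:geopriv10"
    xmlns:gml="urn:opengis:specification:gml:schema-xsd:feature:v3.0"
    entity="pres:user@example.com">
  <tuple id="sg89ae">
    <status>
      <gp:geopriv>
        <gp:location-info>
          <gml:location>
            <gml:Point gml:id="point1" srsName="epsg:4326">
              <gml:coordinates>37:46:30N 122:25:10W</gml:coordinates>
            </gml:Point>
          </gml:location>
        </gp:location-info>
        <gp:usage-rules>
          <gp:retransmission-allowed>no</gp:retransmission-allowed>
          <gp:retention-expiry>2003-06-23T04:57:29Z</gp:retention-expiry>
        </gp:usage-rules>
      </gp:geopriv>
    </status>
    <timestamp>2003-06-22T20:57:29Z</timestamp>
  </tuple>
</presence>
"""


def read_location_object(xml_text: str) -> dict:
    """Extract the coordinates and the basic privacy rules from a PIDF-LO."""
    root = ET.fromstring(xml_text.strip())
    geopriv = root.find("pidf:tuple/pidf:status/gp:geopriv", NS)
    coordinates = geopriv.findtext(
        "gp:location-info/gml:location/gml:Point/gml:coordinates", namespaces=NS
    )
    rules = geopriv.find("gp:usage-rules", NS)
    # Assumption: an absent retransmission rule is treated as "no".
    retransmit = rules.findtext("gp:retransmission-allowed", default="no", namespaces=NS)
    return {
        "coordinates": coordinates,
        "retransmission_allowed": retransmit.strip().lower() == "yes",
        "retention_expiry": rules.findtext("gp:retention-expiry", namespaces=NS),
    }


info = read_location_object(PIDF_LO)
```

The privacy rules travel inside the object itself, so any recipient that parses the location also sees that it may not be retransmitted and when it must be discarded.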

Privacy rule sets
- Conditions, such as:
  - identity of requestor
  - time-of-day
  - sphere
- Actions
  - e.g., allow subscription
- Transformation
  - e.g., reduce accuracy of geo data

    <rule id="f3g44r1">
      <conditions>
        <identity>
          <uri>[email protected]</uri>
        </identity>
        <validity>
          <from>2003-12-24T17:00:00+01:00</from>
          <to>2003-12-24T19:00:00+01:00</to>
        </validity>
      </conditions>
      <actions></actions>
    </rule>
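Evaluating the conditions of such a rule amounts to matching the requestor's identity and checking the validity window. This is a minimal sketch under assumptions: the `Rule` class, the default-deny behavior when no rule matches, and the requestor URI `sip:watcher@example.com` are illustrative and not taken from the slides.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Rule:
    rule_id: str
    identity: str          # URI of the requestor the rule applies to
    valid_from: datetime   # start of the validity window (timezone-aware)
    valid_to: datetime     # end of the validity window (timezone-aware)

    def matches(self, requestor: str, when: datetime) -> bool:
        """All conditions must hold: identity and validity window."""
        return requestor == self.identity and self.valid_from <= when <= self.valid_to


def permitted(rules, requestor: str, when: datetime) -> bool:
    """Assumption: access is denied unless at least one rule matches."""
    return any(rule.matches(requestor, when) for rule in rules)


rules = [
    Rule(
        rule_id="f3g44r1",
        identity="sip:watcher@example.com",  # placeholder identity
        valid_from=datetime.fromisoformat("2003-12-24T17:00:00+01:00"),
        valid_to=datetime.fromisoformat("2003-12-24T19:00:00+01:00"),
    )
]
```

Because the timestamps carry a UTC offset, a request at 17:30 UTC (18:30 at +01:00) falls inside the window, while one at 18:30 UTC does not.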

Conclusion
- Protocol and technical means as a complement to legal actions
- Identity-based techniques more promising than content-based approaches
- New applications (VoIP, IM, presence) vulnerable to unsolicited communications
  - with possibly larger impact due to lower cost, legal barriers
  - content-based techniques fail altogether
- New applications do not lend themselves to current content-based spam prevention techniques
- Domain-based rather than person-based mechanisms appear promising
- Need policy languages for sharing private data
