Digital Identity Management on the Internet Al Gutierrez and Will Tsui

CPSC 457
Spring 2006
Intro
• Internet: simple end-to-end design
– Dumb, minimal network: only connects
devices
• Improving technology =>
– More high value transactions
– More accounts at online services
• Propagation of sensitive information
• Consequence of existing identity
systems: Safety and convenience of
conducting Internet transactions not ideal
What is digital
identity?
Identity in the Physical World
• We are unique, irreplicable individuals,
right?
• iden·ti·ty “the condition of being the
same with something described or
asserted.” (Merriam-Webster)
• Identity is how one is described either
by self-assertions or by assertions of
another.
• Real-world example: buying alcohol
Identity in the Physical World
• How well can we identify someone in real
life?
• Identification is never perfect
• Authentication: three factors
– Something you are
– Something you have
– Something you know
Digital Identity and Its Limitations
• Digital identity is a set of characteristics
asserted “by one digital subject about
itself or by another digital subject, in a
digital realm.” (Microsoft)
• As in the real world
– Identity need not be human
– Limited by authentication factors
• Authentication inherently more difficult on
the Internet
Digital Identity Management
• Focused on maintaining these asserted
characteristics of subjects, a.k.a. claims
• Why is digital identity management
important?
– Inventory
– Access control
• Out of scope: authentication
Digital Identity on the
Internet:
Current Problems
Problems with Current (Non-) Solutions
1. Unreliability
2. Inconvenience
3. Inconsistency
4. Impermanence / Intransitivity
5. Insecurity
6. Propagation
7. Intrusion
1. Unreliability
Unreliable identification of people.
– It's possible to identify machines (with caveats).
• It's not possible to secure remote machines.
– Perhaps this is provably so.
• One-way protocols can be spoofed.
– To wit: SMTP, the default outgoing mail protocol.
– It's possible to secure the (network) channel
between machines.
– It's possible to conduct transactions between
machines.
– Currently, it's not possible to identify the parties
concretely.
– Very poor management of identification information.
2. Inconvenience
• People currently have to register and create multiple
'accounts.'
• People have to create strong, independent passwords.
This is usually not even done properly.
• People have to remember or securely store all this info.
• Many sites require CAPTCHAs.
– Completely Automated Public Turing test to tell Computers
and Humans Apart
– Determine whether a ‘user’ is ‘human’ for a period of time.
• Login systems are primitive and rely on browsers.
– Cookies
– URL Query Strings
– HTTP 1.1 Basic Auth
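To see why these browser-based mechanisms count as primitive, here is a minimal sketch of what HTTP Basic Auth actually puts on the wire. The function names are hypothetical and chosen for illustration; the point is only that the header is a reversible encoding, not encryption.

```python
import base64
from typing import Tuple


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic 'Authorization' header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value: str) -> Tuple[str, str]:
    """Recover the credentials from the header value; no key is needed."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic auth header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("alice", "s3cret")
print(header)                     # sent with every request to the site
print(decode_basic_auth(header))  # anyone who sees the header can do this
```

Without a secured channel underneath, anyone who can observe the request can read the password.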
3. Inconsistency
• There are various types of registration/login systems
around.
• Many, many different authentication ‘schemes’ and
associated GUIs that vary across:
– Servers (Apache / IIS / …)
– Languages (PHP, Perl, Ruby, C/CGI…)
– Frameworks (Rails, Struts, Form systems, CMSs…)
• Functionality greatly varies across these systems.
e.g. Can I reset my password?
or Can I delete my account?
• This is not necessarily the site creators’ fault:
- There is a great burden of work on sites.
- There is a burden on the user too, to learn too much.
4. Impermanence / Intransitivity
Online identity is not meaningfully transitive.
• An account in domain A is useless at domain B.
• So is Reputation/Credibility/Credit/Experience
across domains.
– E.g. Two different MMOs where the same person wants
to keep a single character / persona.
– Identity is also “ephemeral”
• HTTP is a stateless protocol
• Therefore, everything on top resembles this.
• After a period of time, IDs usually “expire.”
5. Insecurity
Current infrastructure is basically insecure.
• People lose/leak passwords.
• People choose weak passwords.
• Cookies are vulnerable to XSS attacks.
• Machines can be compromised.
» Trojans.
» Keyloggers.
» Viruses/Spyware/Malware.
• Protocols/Ciphers become outdated / breakable:
– e.g. early SSL versions, MD4, and MD5 (collisions demonstrated in 2004).
5. Insecurity (contd.)
• The security of the system is a chain.
– It's subject to 'the weakest link.'
– When that link is broken, a person's identity can be
compromised.
– Not too hard, given some very insecure public systems
out there.
» e.g. Yale's SSN fiasco.
» Servers can be compromised.
» e.g. the Lexis-Nexis massive leak.
» etc., etc.
6. Propagation
• There is a vast propagation of sensitive
information.
– Very prone to leaking.
• Leaks are also vulnerable to weakest link.
• E.g.
– Amazon (likely secure) => Shady Vendor
– Amazon (likely secure) => Shady Shipper
– The current paradigm is leak, then secure.
– A better paradigm would be based around
‘prevention.’
7. Intrusion
• Essentially involuntary actions.
– Lots of unsolicited communication
• Commercial
• Anonymous / Ambiguous
– A lot of spam belongs here.
• Religious
• Political
– Can result in privacy violations
• e.g. Hidden HTTP requests in HTML email.
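The hidden-request problem can be made concrete with a small sketch that lists the remote resources an HTML email would fetch when opened. The class and function names are hypothetical, and checking only the `src` and `background` attributes is a simplifying assumption rather than a complete filter.

```python
from html.parser import HTMLParser


class RemoteResourceFinder(HTMLParser):
    """Collects (tag, url) pairs for attributes that trigger an HTTP request."""

    def __init__(self):
        super().__init__()
        self.remote_urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "background") and value:
                if value.lower().startswith(("http://", "https://")):
                    self.remote_urls.append((tag, value))


def find_remote_resources(html_body):
    """Return every remote URL the mail client would load for this body."""
    finder = RemoteResourceFinder()
    finder.feed(html_body)
    finder.close()
    return finder.remote_urls


sample = ('<p>Hello!</p>'
          '<img src="http://tracker.example/pixel.gif?id=42" width="1" height="1">')
print(find_remote_resources(sample))
```

Each URL found this way tells the sender that the message was opened, along with the reader's IP address, without the reader doing anything.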
Legal Situation
Legal Situation
• The law cannot target “general
improvement.”
• It must aim for specific problems, and
make those things punishable.
– E.g. Identity Theft
• The current environment is:
– Certain federal agencies.
– The individual states’ id-related laws.
Federal Level
• The Federal Trade Commission (FTC)
takes care of overall complaints.
• The FTC can also take care of issues
with unassigned agencies:
– Credit Cards, Debt. (FDC Act)
• For Bankruptcy: U.S. Trustee (UST).
• For Passports: U.S. State Dept.
• Tax fraud: IRS.
• Drivers’ Licenses: state DMV
Federal Level (Contd.)
• Mail theft: USPS.
• Phone fraud: Depends on utility.
• Financial Crimes: U.S. Secret Service
• Bank fraud: Office of the Comptroller of the
Currency (OCC)
– Only “National” banks.
• Social Security Numbers: SS Administration
• Student Loans: U.S. Dept. Of Education
• Prosecution is done by the U.S. DOJ.
Federal Level (contd.)
• If you suffer even one instance of ID theft
involving multiple pieces of information,
you’re in for:
– a lot of work.
– small chance of success of recovery.
– Thus, people are less likely to do anything.
• Dozens of federal agencies doing
piecemeal work.
– ID is an afterthought in general, relegated to
some “Customer Relations” dept.
Federal Level (contd.)
• There is also the Identity Theft and
Assumption Deterrence Act (1998), -> 18
U.S.C. §1028
– For all intents and purposes it’s pre-internet.
– Makes certain violations a felony.
– Allows the FBI to get involved.
– Somewhat strong “in theory” (up to 15
years).
– Discrepancies between businesses and
individuals.
State Level Legislation
• Mostly a patchwork of laws.
• About 16 states have financial freeze laws.
– Prevents thieves from obtaining new credit.
• About 23 states have “security” breach
notification statutes.
– All passed in 2005, effective in 2006.
– California led the way, starting in 2003.
– Alerts victims (usually) only when there is
harm.
Best & Worst States
• Best:
– North Dakota
– South Dakota
– Maine
• Worst:
– Arizona
– Nevada
– California
• (All deserts…?)
Legislative Problems
• The law tends to move slowly.
• It is very difficult for the govt. to follow
technology closely.
– Witness DOJ v. Microsoft, where it was clueless.
– On the internet, the problem is a fast arms race.
• Spammers vs. Email Filters
• Viruses vs. Anti-Viruses
• Phishers vs. Phishing Databases
• The law usually can’t get technical enough to
be practical.
– Results in vagueness.
– Thus may not be enforceable.
Original Solution Proposed
Client-Side Transactions
• End user controls the flow of personal
information, not the relying party (online
service that relies on identity claims)
• Example: ordering a book from Amazon
– temporary financial transaction IDs
– shipping transaction ID
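A rough sketch of how temporary transaction IDs might work, assuming a hypothetical client-side `TransactionBroker` that the payment processor or shipper contacts to redeem an ID. The class, its methods, and the ten-minute default lifetime are all illustrative assumptions; the slides do not specify a mechanism.

```python
import secrets
import time


class TransactionBroker:
    """Hypothetical client-side agent that hands out single-use transaction IDs."""

    def __init__(self):
        # transaction ID -> (purpose, sensitive detail, expiry timestamp)
        self._pending = {}

    def issue(self, purpose, detail, ttl_seconds=600.0):
        """Create an opaque ID the merchant can pass along instead of the detail."""
        tx_id = secrets.token_urlsafe(16)
        self._pending[tx_id] = (purpose, detail, time.time() + ttl_seconds)
        return tx_id

    def redeem(self, tx_id, purpose):
        """Release the detail once; any redemption attempt uses up the ID."""
        entry = self._pending.pop(tx_id, None)
        if entry is None:
            return None
        stored_purpose, detail, expires_at = entry
        if stored_purpose != purpose or time.time() > expires_at:
            return None
        return detail


broker = TransactionBroker()
payment_id = broker.issue("payment", "card ending 1234")
shipping_id = broker.issue("shipping", "123 Elm St, New Haven")
# The bookseller only ever sees payment_id and shipping_id.
print(broker.redeem(shipping_id, "shipping"))
```

The merchant stores nothing worth stealing: the IDs are random, expire, and cannot be redeemed twice.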
Client-Side Transactions
• Addresses:
– (2) Inconvenience: client-side interface would
mediate all sensitive information transactions;
manage multiple accounts in one place; no need to
remember (strong) passwords for each account
– (3) Inconsistency: standardized means of
disseminating personal information
– (6) Propagation: only supply relying parties with
necessary information
– (5) Insecurity: doesn’t rely on weak, user-created
usernames & passwords
Client-Side Personas
• Relies on Client-side Transactions
• Create multiple personas
– Locally or on ‘naïve,’ encrypted stores on
remote servers (not restricted to local
machine)
• Limit the propagation of sensitive
information by generating unique GUIDs
and strong passwords
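As an illustration of the last bullet, a persona can be modeled as a record whose GUID and password are generated by the client rather than chosen by the user. The `Persona` type and the 24-character password length are assumptions made for this sketch.

```python
import secrets
import string
import uuid
from dataclasses import dataclass, field

ALPHABET = string.ascii_letters + string.digits + string.punctuation


def strong_password(length=24):
    """Draw each character from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


@dataclass
class Persona:
    """Hypothetical persona: one per context, never reused across services."""
    label: str
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    password: str = field(default_factory=strong_password)


shopping = Persona("shopping")
forums = Persona("forums")
print(shopping.guid, forums.guid)  # unrelated random identifiers
```

Because the identifiers are random and unrelated, a leak at one service gives no handle for linking the user to another persona.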
Client-Side Personas
• Addresses:
– (1) ID Unreliability: personas can be
government-trusted
– (4) ID Persistency: unique GUID can
automatically authenticate sessions
– (7) Intrusion (via ‘participation’): incoming
communications only from trusted users
Oh, wait…uh…wow.
Existing Technological
Solutions
Microsoft .NET Passport
Microsoft .NET Passport: Problems
• Online services had to
pay a subscription fee
• Single point-of-failure
• Do we trust Microsoft
to take part in all of
our online
transactions?
• No context-based
identity
Enter: The Liberty Alliance
• 2001: Sun, Sprint, Sony,
Verisign, eBay…
• Single sign-on system based on a “circle of
trust”
• Federated identity
– Aggregating personal information across multiple
systems
– Authenticating a user across multiple systems
– Exchanging claims via SAML, the Security Assertions
Markup Language
• Focus on identity systems for corporate
SAML Tokens
• Represent security credentials using XML
– A way of creating and distributing authentication and
authorization assertions
• Three distinct types of assertions:
– SAML authentication assertions: subject, method,
time
– SAML attribute assertions: associates subject with
attributes
– SAML authorization assertions: associates subject
with resource permissions
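The three assertion types can be pictured as three small record types. This is a simplified model for illustration only: the class and field names are assumptions and do not follow the SAML XML schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Tuple


@dataclass(frozen=True)
class AuthenticationAssertion:
    """Who was authenticated, by what method, and when."""
    subject: str
    method: str
    time: datetime


@dataclass(frozen=True)
class AttributeAssertion:
    """Associates a subject with attributes."""
    subject: str
    attributes: Dict[str, str]


@dataclass(frozen=True)
class AuthorizationAssertion:
    """Associates a subject with permissions on a resource."""
    subject: str
    resource: str
    permitted_actions: Tuple[str, ...]


def may_access(assertion, subject, resource, action):
    """Check one request against one authorization assertion."""
    return (assertion.subject == subject
            and assertion.resource == resource
            and action in assertion.permitted_actions)


authn = AuthenticationAssertion("alice", "password", datetime.now(timezone.utc))
attrs = AttributeAssertion("alice", {"frequent_flyer_tier": "gold"})
authz = AuthorizationAssertion("alice", "rentalcar.com/reservations", ("read", "create"))
print(may_access(authz, "alice", "rentalcar.com/reservations", "create"))
```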
Federated Identity with SAML - Pull Profile
[Diagram: numbered steps 1-5 among User, airline.com, and rentalcar.com]
Federated Identity with SAML - Push Profile
[Diagram: numbered steps 1-3 among User, airline.com, and rentalcar.com]
Secure Transfer of SAML Tokens
• A secure communication between two
authenticated parties must follow the
principles of:
– Non-repudiation
– Data integrity
– Confidentiality
• XML Encryption: confidentiality
– Sender generates random shared key
1. Sender encrypts message using shared key
2. Sender encrypts shared key using recipient’s public
key
– Sender sends (1) and (2) to recipient
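The hybrid pattern in these steps can be sketched with toy primitives. Everything here is an assumption for illustration: the RSA key uses tiny textbook primes, and the symmetric cipher is a hash-derived XOR keystream. Neither is secure, and neither is what XML Encryption actually specifies; only the shape of the exchange matches the slide.

```python
import hashlib
import secrets

# Textbook RSA key built from tiny primes (p=61, q=53): trivially breakable.
RSA_N, RSA_E, RSA_D = 3233, 17, 2753


def keystream_xor(key, data):
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def encrypt_for_recipient(message):
    shared_key = secrets.token_bytes(16)                      # random shared key
    ciphertext = keystream_xor(shared_key, message)           # (1) message under shared key
    wrapped_key = [pow(b, RSA_E, RSA_N) for b in shared_key]  # (2) shared key under public key
    return ciphertext, wrapped_key                            # sender sends (1) and (2)


def decrypt_as_recipient(ciphertext, wrapped_key):
    # Only the holder of the private exponent can unwrap the shared key.
    shared_key = bytes(pow(c, RSA_D, RSA_N) for c in wrapped_key)
    return keystream_xor(shared_key, ciphertext)


ciphertext, wrapped_key = encrypt_for_recipient(b"<Assertion>...</Assertion>")
print(decrypt_as_recipient(ciphertext, wrapped_key))
```

The design choice is the usual one: the slow public-key operation touches only the short key, while the fast symmetric cipher handles the message itself.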
Secure Transfer of SAML Tokens
• XML Signature: non-repudiation + data
integrity
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
  <SignedInfo>
    <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
    <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1"/>
    <Reference URI="http://www.yale.edu/index.html">
      <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
      <DigestValue>j6lwx3rvEPO0vKtMup4NbeVu8nk=</DigestValue>
    </Reference>
  </SignedInfo>
  <SignatureValue>MC0E~LE=</SignatureValue>
  <KeyInfo>
    <X509Data>
      <X509SubjectName>CN=Ed Simon,O=XMLSec Inc.,ST=OTTAWA,C=CA</X509SubjectName>
      <X509Certificate> MIID5jCCA0+gA...lVN </X509Certificate>
    </X509Data>
  </KeyInfo>
</Signature>
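The data-integrity half of the signature comes from the DigestValue: a Base64-encoded SHA-1 hash of the referenced content. The sketch below shows only that step. It assumes the bytes are already in canonical form, and it skips both canonicalization and the DSA signature over SignedInfo; the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def sha1_digest_value(canonical_bytes):
    """Base64 of the SHA-1 digest, the form a DigestValue element carries."""
    digest = hashlib.sha1(canonical_bytes).digest()
    return base64.b64encode(digest).decode("ascii")


def digest_matches(content, expected_digest_value):
    """Recompute the digest and compare it in constant time."""
    actual = sha1_digest_value(content)
    return hmac.compare_digest(actual, expected_digest_value)


page = b"<html><body>Welcome to Yale</body></html>"
recorded = sha1_digest_value(page)
print(digest_matches(page, recorded))                 # content unchanged
print(digest_matches(page + b"<!-- x -->", recorded)) # content modified
```

A changed byte in the referenced content changes the digest, so the signed SignedInfo block no longer matches the content.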
More Recent Developments
URL-Based Identity Management:
OpenID
1. User enters identity URL at the relying party
2. Relying party redirects browser to identity URL
3. User logs in at identity URL
4. Identity URL verifies relying party by checking access control list
5. Identity URL sends security token back to browser
6. Browser redirects security token to relying party
7. Relying party verifies security token directly with identity URL
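The seven steps can be simulated in-process to show who checks what. This is a sketch of the flow as described on this slide, not of the OpenID wire protocol: `IdentityProvider`, `sign_in`, and the single-use token scheme are hypothetical, and the browser redirects are collapsed into function calls.

```python
import hmac
import secrets


class IdentityProvider:
    """Hypothetical stand-in for the server behind the user's identity URL."""

    def __init__(self, identity_url, password, allowed_relying_parties):
        self.identity_url = identity_url
        self._password = password.encode("utf-8")
        self._allowed = set(allowed_relying_parties)  # access control list
        self._issued = {}                             # token -> relying party

    def login_and_issue_token(self, password, relying_party):
        """Steps 3-5: check the login, check the ACL, hand back a token."""
        if not hmac.compare_digest(password.encode("utf-8"), self._password):
            return None
        if relying_party not in self._allowed:
            return None
        token = secrets.token_urlsafe(16)
        self._issued[token] = relying_party
        return token

    def verify_token(self, token, relying_party):
        """Step 7: the relying party asks the provider directly; tokens are single use."""
        issued_for = self._issued.pop(token, None)
        return issued_for is not None and issued_for == relying_party


def sign_in(provider, relying_party, password_typed_at_provider):
    """Steps 1-7 end to end; the password goes to the provider, never to the relying party."""
    token = provider.login_and_issue_token(password_typed_at_provider, relying_party)
    if token is None:
        return False
    return provider.verify_token(token, relying_party)


provider = IdentityProvider("https://alice.example/", "correct horse", ["blog.example"])
print(sign_in(provider, "blog.example", "correct horse"))
print(sign_in(provider, "shady.example", "correct horse"))
```

The relying party never sees the password. It trusts only the provider's answer to the direct verification request in step 7.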
URL-Based Identity Management:
SXIP
• Similar to OpenID, but adds
functionality for profile
exchange
• Centralized way of managing
personal information
– Multiple personas
– Updating personal
information
• SXIP 2.0 extension: support
for trusted claims
– i.e. verified e-mail address
Pros and Cons of URL-Based Identity
+ Uses existing web & browser
technologies
+ Easy to adopt: no new software needed
+ Accessible from anywhere
— Inconvenient typing of URLs
— Open to phishing attacks
— Trusted claims?
The WS-* Architecture: An Identity
Metasystem
• IBM and Microsoft, working with OASIS
• An “Identity Metasystem” to create an open
identity architecture that allows older identity
management systems to work alongside new
advances in identity technology
• Set of protocols for distributing claims
• Components
– Negotiating protocols
– Transforming claims
Implementing WS-*: Microsoft
InfoCard
• InfoCard identity selector GUI: a client
application allowing user control of digital
identities (which are comprised of claims)
• InfoCards are encrypted XML documents
– No actual identity information is stored in them
• Identity information stored with Identity Providers
– Contains means of accessing claims
• Metadata that describes claims associated with the digital
identity
• Identity technology (SAML, X.509, Kerberos?)
• Issuer (Verisign, Thawte, self-issued?)
• Unique identifier
InfoCard Demo
InfoCard Typical Usage Scenario
InfoCard Typical Usage Scenario
(cont’d)
InfoCard Benefits and
Problems
InfoCard Benefits
• 1. Unreliability:
– Infocard makes it possible to identify people,
and is agnostic of physical authentication.
• Roughly 2 levels:
– Self-issued ID -> weak
– ID from a certified provider -> strong
• 2. Inconvenience
– No need to memorize passwords, create
multiple accounts, register manually at sites.
InfoCard Benefits (contd.)
• 3. Inconsistency:
– InfoCard provides one unified, clean
interface for managing identity.
– The interface is rooted in the OS, not the
browser.
• Protected from assault via MPAPI.
• It is not clear whether the interface will style itself
over user-themes.
– Basic concepts to understand: Cards &
Claims.
InfoCard Benefits (contd.)
• 4. Impermanence
– Infocard automatically handles things like log-in and
expiration, so there’s no need to do it manually.
– The system is independent of local HTTP requests; it
runs in its own protected process space.
Transitivity:
– Infocard opens the door to the possibility of ID
transitivity, via ID federation.
– If you have an ID at provider 1 (e.g. Yale) and it is
compatible with provider 2 (e.g. MS), they can
federate the information.
– The combined information will result in a stronger ID.
• I.e. the person is a certified student and an employee.
InfoCard Benefits (contd.)
• 5. Insecurity
– Depends on the strength of the WS-*
implementation.
– Replacing passwords with security tokens
eliminates a huge security problem.
– System protected from a lot of local-machine
hazards via OS-kernel level memory
protection, process protection.
– Implementation can keep up to date via
automatic user-independent updates.
InfoCard Benefits (contd.)
• 6. Propagation
– Infocard helps reduce the propagation of
sensitive information by preventing the leak
in the first place.
– Uses a system of “claims”
– You don’t send the information they don’t
need.
– The result is less of your data flying around.
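The "send only what they need" idea amounts to filtering the user's claims against the relying party's request. A minimal sketch, with a hypothetical `select_claims` helper and made-up claim names:

```python
def select_claims(available_claims, requested):
    """Return only the requested claims; refuse if any cannot be supplied."""
    missing = [name for name in requested if name not in available_claims]
    if missing:
        raise KeyError(f"cannot supply claims: {missing}")
    return {name: available_claims[name] for name in requested}


my_claims = {
    "name": "Alice Example",
    "shipping_address": "123 Elm St, New Haven",
    "date_of_birth": "1985-02-14",
    "ssn": "000-00-0000",
}

# A bookseller asks only for what it needs to ship an order.
disclosed = select_claims(my_claims, ["name", "shipping_address"])
print(disclosed)
```

Everything not requested, such as the date of birth and SSN here, never leaves the client.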
InfoCard Benefits (contd.)
• 7. Intrusion / Participation
– Infocard does not address intrusion directly.
– You can use infocard and still get email spam.
– However, let’s say you set a blog up, and it is
Infocard compatible.
• You can add as many claims as you want.
– E.g. commercial interest, political affiliation.
• You can ask for ID certified by educational
organizations.
– E.g. only college students may post.
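The blog example can be sketched as a policy check over presented claims, where the issuer matters as much as the claim itself. The `Claim` type, the `may_post` function, and the issuer names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    name: str
    value: str
    issuer: str  # "self" marks a self-issued (weak) claim


def may_post(claims, required_name, required_value, trusted_issuers):
    """Allow posting only if a trusted issuer vouches for the required claim."""
    return any(
        c.name == required_name
        and c.value == required_value
        and c.issuer in trusted_issuers
        for c in claims
    )


trusted = {"yale.edu", "mit.edu"}
certified = [Claim("role", "student", "yale.edu")]
self_issued = [Claim("role", "student", "self")]

print(may_post(certified, "role", "student", trusted))
print(may_post(self_issued, "role", "student", trusted))
```

A self-issued card can assert the same claim, but the policy rejects it because no trusted issuer stands behind it.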
Potential InfoCard problems
• Perhaps a false sense of security
• Infocard doesn’t address the issue of Trust
directly.
• If you honor the claims of an unscrupulous
vendor, your information still falls in their hands.
• It might be difficult to reconcile strong
organizational ID with weak individual ID.
• Introduces deep OS-integration.
– Last one didn’t work so well => IE6
Infocard and Anonymity
• Infocard does not address anonymity.
– The goal is the opposite: good ID.
• But it does allow it – sort of:
– You can use “bogus” (and weak) infocards.
• However, this is problematic.
– Inconvenient,
– Pseudonymous
– Still traceable at the network/IP level.
Infocard and Anonymity (contd.)
• Interface
It would be good to support anonymous
communication at the interface level:
– Allow automatic client generation of bogus
cards that aren’t linked. Use a different (new)
one each time.
– Allow individual infocard-level granularity for
proxy support.
• E.g. tell it to always use Tor/Privoxy when using one
of your bogus cards, or combine with above.
Other Infocard Improvements
• Computer Automation
One should be able to add CAPTCHA claims for weak /
anonymous IDs.
• Trust
It would be highly convenient to link the claims of a
vendor with trust-information providers. E.g.:
- Community Sites.
- Government databases?
And especially with organizations that people trust “in real
life.” E.g.:
- The Better Business Bureau (BBB).
- Consumer Reports
Other Infocard Improvements (contd.)
• Authentication
– Infocard is more or less agnostic about physical authentication.
– The assumption is that if you’re properly logged on to your
machine, you are the person who owns those InfoCards.
– Vista will help with this issue, but there is no provision for things
like recovery (e.g. if your system is cracked).
– Additional built-in support for smart cards, specialized hardware
tokens, biometrics, etc. would be desirable.
• Peer Review
– Information is still changing quickly and is not widely available.
Once finalized, experts should evaluate it to see its defects
before it goes live.
• What does Bruce Schneier have to say about it?