Tor, Attacks Against, and NSA’s Toolbox

Ercan Ozturk, Prof. Dr. Ali Aydin Selcuk
Introduction
Tor is free software that lets people browse the Web anonymously
and circumvent the censorship imposed in some countries. Tor was
initially developed at the U.S. Naval Research Laboratory to protect
government communications, and was later released as open source
for public use. Today, Tor has about 2 million users, and the Tor
network consists of roughly 6000 relays, most of which are run by
volunteers.
Tor Basics
Onion routing is a technique that allows anonymous
communication over a network of computers. In onion routing,
messages are routed through a circuit (a path of Tor relays, Figure 1)
after being encrypted multiple times, like the layers of an onion
(Figure 2), with the encryption keys of the relays in the circuit.
When a message reaches a relay, the relay removes one layer of
encryption and forwards the message to the next relay. Since a node
knows only its previous and next nodes, this ensures that the real
source of the message stays anonymous.
A circuit contains three Tor relays by default. These relays are
often referred to as the entry node, the middleman, and the exit node.
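The layering can be illustrated with a small sketch. This is not Tor's actual cell format or cipher: the XOR keystream below is a toy stand-in for real encryption, and the names `wrap_onion`, `peel_layer`, and the 8-byte next-hop field are assumptions made only for this illustration. It shows that each relay, after removing its own layer, learns only the next hop.

```python
import hashlib

HOP_FIELD = 8  # fixed-width next-hop label (hypothetical layout, not Tor's)


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256(key || counter). NOT secure; illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, key: bytes) -> bytes:
    """Add or remove one layer (XOR with the keystream is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def wrap_onion(message: bytes, path, destination: str) -> bytes:
    """Wrap `message` for a path [(relay_name, relay_key), ...] from entry to exit.

    The exit's layer is applied first and the entry's layer last, so the
    entry's layer ends up outermost.
    """
    cell = message
    next_hop = destination
    for name, key in reversed(path):
        header = next_hop.encode().ljust(HOP_FIELD, b"\0")
        cell = _xor(header + cell, key)
        next_hop = name
    return cell


def peel_layer(cell: bytes, key: bytes):
    """What one relay does: remove its layer, read the next hop, forward the rest."""
    plain = _xor(cell, key)
    next_hop = plain[:HOP_FIELD].rstrip(b"\0").decode()
    return next_hop, plain[HOP_FIELD:]
```

For a three-relay path, the entry node's `peel_layer` call reveals only the middleman's name, the middleman's call reveals only the exit node, and only the exit node recovers the destination and the original message.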
Johnson et al. (CCS ’13) study traffic correlation attacks using a
Tor path simulator (TorPS). Their analysis shows that 80% of Tor
users may be de-anonymized by a relay adversary within 6 months, and
that an adversary controlling an autonomous system (AS) may
de-anonymize nearly all Tor users in some common locations within
3 months. The Tor Project introduced entry guards (entry nodes that
are changed infrequently) to mitigate this problem.
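A rough calculation suggests why keeping the entry node fixed helps. The sketch below is a simplification and not the TorPS model: it assumes that a fraction `f` of relays is malicious, that relays are picked uniformly (real Tor weights by bandwidth), that a user has a single guard that never rotates, and that a circuit is compromised when both its entry and its exit are malicious. The function names are hypothetical.

```python
def p_compromised_fresh_entry(f: float, circuits: int) -> float:
    """Entry and exit are both re-picked for every circuit.

    Each circuit is compromised with probability f * f, so the chance of
    at least one compromised circuit approaches 1 as circuits accumulate.
    """
    return 1 - (1 - f * f) ** circuits


def p_compromised_with_guard(f: float, circuits: int) -> float:
    """One fixed entry guard; only the exit is re-picked for every circuit.

    With probability 1 - f the guard is honest and no circuit is ever
    compromised, so the overall probability can never exceed f.
    """
    return f * (1 - (1 - f) ** circuits)
```

With `f = 0.1` and 1000 circuits, the first function gives a probability above 0.99, while the second stays at about 0.1. Under these assumptions, guards do not protect a user whose guard is malicious, but they cap the fraction of users who are ever exposed.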
Chakravarty et al. (“On the Effectiveness of Traffic Analysis against
Anonymity Networks Using Flow Records”, 2014) present another
attack that uses the flow records of routers. To correlate the
traffic, they have a corrupted server send two distinctive traffic
patterns to the client for 5-7 minutes. In the real-world
experiments, 81.6% of the users were de-anonymized, with a false
positive rate of 5.5%. The Tor Project responded that this false
positive rate is quite high: if an adversary watches a relay and sees
100000 flows, around 5500 of them will be false positives. The
correct user is then only one among roughly 5500 matches, and it is
hard to determine which one.
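The arithmetic behind this objection can be written out as a short sketch. It assumes that exactly one of the observed flows belongs to the target and treats the reported 81.6% as the chance that the true flow is flagged; both are simplifications, and `match_statistics` is a hypothetical helper name.

```python
def match_statistics(flows: int, true_positive_rate: float,
                     false_positive_rate: float):
    """Return (expected false positives, chance that a flagged flow is the target).

    Assumes exactly one of the `flows` observed flows is the real target.
    """
    innocent_flows = flows - 1
    expected_false_positives = innocent_flows * false_positive_rate
    expected_true_positives = true_positive_rate  # at most one real target
    precision = expected_true_positives / (
        expected_true_positives + expected_false_positives
    )
    return expected_false_positives, precision


false_positives, precision = match_statistics(100_000, 0.816, 0.055)
```

Under these assumptions `false_positives` is about 5500, and `precision` is well below 0.1%: a flagged flow is almost always an innocent one.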
1.2 Website Fingerprinting
Website fingerprinting is based on training a classifier with
machine learning algorithms on previously gathered data, and then
applying the classifier to newly gathered data to find out which
user is visiting which website. Features such as packet sizes and
packet timestamps can be used to train the classifier. However,
recent research such as Juarez et al. (CCS ’14) indicates that the
size of the Tor network makes it impractical to identify users by
website fingerprinting. Moreover, the paper shows that even slight
changes in the assumptions made by earlier studies greatly affect
the success of fingerprinting attacks.
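The train-then-classify idea can be sketched with a minimal nearest-neighbour classifier. This is not the method of any cited paper: the feature set, the trace format (a list of `(timestamp, signed_size)` pairs, positive for outgoing and negative for incoming), and the function names are assumptions chosen for illustration.

```python
import math


def features(trace):
    """Summarize a trace as (packet count, outgoing bytes, incoming bytes, duration)."""
    sizes = [size for _, size in trace]
    outgoing = sum(s for s in sizes if s > 0)
    incoming = -sum(s for s in sizes if s < 0)
    duration = trace[-1][0] - trace[0][0]
    return (len(sizes), outgoing, incoming, duration)


def classify(trace, training):
    """Return the label of the training trace whose features are closest.

    `training` is a list of (label, trace) pairs gathered beforehand.
    """
    target = features(trace)
    label, _ = min(training, key=lambda item: math.dist(features(item[1]), target))
    return label


training = [
    ("site-a", [(0.0, 500), (0.1, -1500), (0.2, -1500), (0.3, -1500)]),
    ("site-b", [(0.0, 500), (0.1, -600), (0.5, 500), (0.6, -600)]),
]
observed = [(0.0, 520), (0.1, -1500), (0.25, -1400), (0.3, -1500)]
guess = classify(observed, training)
```

Here `guess` is `"site-a"`. The sketch works only because the attacker's training set contains every site the user might visit, which is one of the assumptions that becomes unrealistic at the scale of the real Web.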
Snowden and NSA’s Tor Toolbox
Whistleblower Edward Snowden disclosed documents about the NSA’s
efforts to break Tor. In one of them, a presentation titled “Tor
Stinks”, the NSA admits that it will never be able to de-anonymize
all Tor users, or a specific Tor user on demand. Nevertheless, it has
some tools and techniques for attacking the Tor network:
Attacks against Tor Browser Bundle: These attacks try to exploit
vulnerabilities in Firefox. For example, EgotisticalGiraffe exploits a
vulnerability in E4X, an XML extension of JavaScript. Firefox
versions 11.0-16.0.2 were vulnerable to this attack.
FoxAcid Servers: Web servers designed to launch prepared attacks
against visitors who are directed to them with a specific tag. By using
these servers, the NSA aims to take control of the visitors’ computers.
Circuit Reconstruction: By inserting high-bandwidth nodes into the
network and applying traffic correlation techniques, the NSA can reveal
the identities of Tor users whose circuits pass through the corrupted
nodes. However, the NSA admits that it is hard to own all of the relays
in a circuit, and that it does not have enough nodes to apply this
attack. It notes that GCHQ also owns some nodes, and that by working
together they may be able to apply it. GCHQ is also working on
de-anonymizing Tor users. In a presentation, GCHQ says that it tried
tracking packets through the circuits, but the method was unsuccessful.
It is considering traffic correlation attacks that rely on owning or
observing both the guard node and the exit node (Figure 3).
Figure 1: A circuit
Figure 2: Onion encryption
Circuit Construction: In Tor, circuits are built incrementally, one
hop at a time. First, a list of Tor relays is obtained from a Tor
directory server, and a random path of relays is chosen between the
source and the destination. Then, at every iteration, the source
negotiates session keys with the next relay on the path. These keys
are used to encrypt the data sent through the circuit; applications
hand their data to the Tor client via the SOCKS protocol.
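The hop-by-hop key negotiation can be sketched as follows. This is a toy model and not Tor's real handshake: it uses a small Diffie-Hellman group (`P`, `G`) that is far too weak for real use, simulates both the client and the relay side in one process, and picks relays uniformly. All names are hypothetical.

```python
import hashlib
import random
import secrets

P = 2**127 - 1  # a Mersenne prime; toy-sized, NOT a secure group
G = 3


def dh_keypair():
    """Generate a (private, public) Diffie-Hellman pair in the toy group."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def dh_shared(private: int, other_public: int) -> bytes:
    """Derive a session key from our private value and the peer's public value."""
    secret = pow(other_public, private, P)
    return hashlib.sha256(str(secret).encode()).digest()


def build_circuit(relays, hops: int = 3, rng=random):
    """Pick a random path and negotiate one session key per hop, one hop at a time.

    Returns (path, client_keys, relay_keys); the two key lists should match.
    """
    path = rng.sample(relays, hops)
    client_keys, relay_keys = [], []
    for _relay in path:
        client_private, client_public = dh_keypair()
        relay_private, relay_public = dh_keypair()  # relay side, simulated locally
        client_keys.append(dh_shared(client_private, relay_public))
        relay_keys.append(dh_shared(relay_private, client_public))
    return path, client_keys, relay_keys
```

In a real circuit, the exchange with the second and third relay would travel through the part of the circuit that is already built, so that only the entry node ever sees the source's address. The sketch leaves that forwarding out.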
Attacks Against Tor
1. Passive Attacks
1.1 Traffic Correlation/Analysis
Since Tor is a low-latency anonymous communication system, an
adversary watching the two ends of a circuit can correlate the traffic
by examining the arrival and departure times of the packets.
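A minimal sketch of such timing correlation is given below. It is not the method of any particular paper: it assumes the adversary has packet timestamps for one flow at the entry side and for several candidate flows at the exit side, counts packets per time window, and picks the candidate with the highest Pearson correlation. The function names and the window size are assumptions.

```python
import math


def bin_counts(timestamps, window: float, duration: float):
    """Count packets in consecutive windows of `window` seconds."""
    bins = [0] * math.ceil(duration / window)
    for t in timestamps:
        if 0 <= t < duration:
            bins[int(t / window)] += 1
    return bins


def pearson(x, y) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_match(entry_times, exit_flows, window: float = 1.0, duration: float = 10.0):
    """Return (name of the best-correlated exit flow, all scores)."""
    entry_bins = bin_counts(entry_times, window, duration)
    scores = {
        name: pearson(entry_bins, bin_counts(times, window, duration))
        for name, times in exit_flows.items()
    }
    return max(scores, key=scores.get), scores
```

Because Tor adds little delay, a flow that leaves the exit node shortly after it entered the network keeps almost the same per-window packet counts, which is what the correlation picks up.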
2. Active Attacks
2.1 Iterated Compromise
If an adversary compromises a relay in the circuit, and then
compromises the next one until all relays in the circuit are
compromised, the adversary may de-anonymize the user. However, the
adversary must complete the iteration within the lifetime of the
circuit (the default lifetime of a circuit is 10 minutes).
2.2 Distributing Hostile Code
All Tor releases are signed by the Tor Project with an official
public key; hence, Tor users can verify each release. However, an
attacker can still trick some users into running Tor look-alike
software, and thereby degrade their anonymity.
2.3 Blocking Access to the Tor Network
Blocking access to the Tor network is mostly done by governments
to prevent people from reaching censored websites or resources.
The Great Firewall of China (GFC), for example, initially used simple
IP blacklisting to block access to the Tor network. However, users
were still able to use Tor via bridges (unpublished Tor relays).
Then, GFC used the distinctive cipher list in the TLS client hello
message sent by Tor clients to identify and block Tor connections.
Tor solved the problem by imitating Firefox’s cipher list in the
TLS client hello.
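The cipher-list check and its evasion can be sketched in a few lines. The cipher lists below are hypothetical placeholders built from standard TLS cipher suite names; they are not the lists that Tor or Firefox actually sent, and `censor_blocks` is an assumed name for the censor's matching rule.

```python
# Hypothetical lists for illustration; real lists are longer and version-specific.
DISTINCTIVE_LIST = (
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
)
BROWSER_LIKE_LIST = (
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
)


def censor_blocks(client_hello_ciphers, blocked_fingerprints) -> bool:
    """Block a connection if its ordered cipher list matches a known fingerprint."""
    return tuple(client_hello_ciphers) in blocked_fingerprints


blocked = {DISTINCTIVE_LIST}
old_client_blocked = censor_blocks(DISTINCTIVE_LIST, blocked)
mimicking_client_blocked = censor_blocks(BROWSER_LIKE_LIST, blocked)
```

In this sketch the client that sends the distinctive list is blocked, while the client that sends the browser-like list is not. The censor cannot add the browser-like list to its fingerprints without also blocking the ordinary browsers that send it.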
Figure 3: GCHQ’s traffic correlation mechanism (Source: Spiegel)
Capitalize on Human Error: Instead of going after Tor and its
implementation, intelligence agencies mostly exploit human error.
For example, in the conviction of Ross Ulbricht, the founder of Silk
Road (an illegal drug market operated as a Tor hidden service), law
enforcement officers used the unencrypted data on his computer, his
non-anonymous activity on the Internet, and even the photographs he
shared on social media.
Conclusion
Tor is a robust system, and its current network size makes most of
the attacks infeasible in practice. Moreover, the disclosed documents
show that intelligence agencies lag behind the technology of Tor. They
mostly try to de-anonymize Tor users by exploiting vulnerabilities in
the browser bundle or by capitalizing on the mistakes of targeted people.