How to Detect Tor Users with Ease?

Tor (short for The Onion Router) is open-source software that enables anonymous communication. For instance, you can use it to anonymize your Internet traffic. Many journalists and political activists use Tor to avoid being persecuted. Unfortunately, that same anonymity means Tor is also used for illicit activities. Here we discuss how the Tor network works, how Tor users can be detected, and when it makes sense to filter Tor traffic.

How does Tor work?

Onion routing was originally developed at the U.S. Naval Research Laboratory to protect government communications, but Tor eventually turned into a tool that anybody can use. To browse the Internet with anonymity and privacy, you need to access the Tor network. This is done with a client such as Tor Browser. What does Tor Browser do? When you use it, your traffic passes through several Tor servers, called nodes or relays, before it reaches its destination.

To better understand how Tor works, let's walk through two scenarios. First, when you use a "classic" browser such as Google Chrome or Firefox, the flow to access Web pages on the Internet is as follows:

The classic flow to get a Web resource on the Internet

The usual Internet traffic is shown in red. For the sake of the explanation, it goes directly from you to the websites (and other resources) you connect to. Your Internet Service Provider (ISP) transmits all your packets for you. Sometimes your traffic is encrypted (HTTPS, TLS, etc.), but sometimes it is not. When the traffic travels in the clear, everything is visible to your ISP. Said differently, your ISP knows which sites you connect to and can read all the unencrypted traffic you exchange.

Now, imagine Tor as a tunnel passing through several nodes. The entry and exit points are separate nodes, and the relays in between are separate systems as well. The purpose of Tor is to route traffic so that each node (relay) knows only the preceding and the following node, which makes connections nearly impossible to trace. We call it Onion Routing:

The Tor flow to get a Web resource on the Internet

The traffic itself is encrypted from you to the exit node and re-encrypted for each node along the way (this adds some latency to page loading times, which is the price of staying anonymous). From the exit node on, the same rules apply as with the ISP in the classic scenario: all unencrypted traffic is visible and the destinations of your connections are known (green line). What the exit node should not know is where the traffic is coming from, which is you.
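To make the layering idea concrete, here is a minimal sketch that uses only the Node.js `crypto` module. It is a simplification and not Tor's actual protocol: the cipher (AES-256-GCM), the random per-relay keys, and the names `seal`, `open`, `wrap`, `route` and `circuit` are all assumptions made for illustration. In the real network, keys are negotiated with each relay when the circuit is built.

```javascript
const crypto = require('crypto');

// Encrypt `data` with AES-256-GCM. Output layout: iv (12 bytes) + auth tag (16 bytes) + ciphertext.
function seal(key, data) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const body = Buffer.concat([cipher.update(data), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

// Reverse of seal(). Throws if the key is wrong or the data was modified.
function open(key, sealed) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, sealed.subarray(0, 12));
  decipher.setAuthTag(sealed.subarray(12, 28));
  return Buffer.concat([decipher.update(sealed.subarray(28)), decipher.final()]);
}

// Hypothetical three-hop circuit: one key shared between the client and each relay.
const circuit = ['entry', 'middle', 'exit'].map((name) => ({
  name,
  key: crypto.randomBytes(32),
}));

// The client adds one layer per relay. The innermost layer is for the exit node,
// the outermost layer is for the entry node.
function wrap(request) {
  return [...circuit]
    .reverse()
    .reduce((payload, relay) => seal(relay.key, payload), Buffer.from(request));
}

// Each relay on the path peels exactly one layer and forwards the rest.
function route(onion) {
  return circuit.reduce((payload, relay) => open(relay.key, payload), onion);
}

const onion = wrap('GET / HTTP/1.1');
const delivered = route(onion).toString();
console.log(delivered);
```

Only the exit node ends up with the original request, and a relay that tries to open a layer that was not sealed with its key gets an authentication error instead of readable data.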

One way to misuse Tor is to make unencrypted connections to personal resources or sites, for example by authenticating with your usual credentials on a site such as Facebook. The exit node could then guess your real identity and become the link between you and the other sites you visit through Tor.

Both the strength and the weakness of the Tor network come from the number of nodes operated by independent entities. Think of the Tor network as a connect-the-dots puzzle: each node sees only the node before it and the node after it. An entity that operates enough nodes could connect enough dots to get a more precise idea of what is happening on the Tor network.

The Dark Side of Tor

While Tor may circumvent censorship and defend people against tracking or surveillance, not everyone uses Tor for what one might call a noble cause. For instance, many cybercriminals and hackers use it to stay anonymous while conducting their illegal business.

Tor is especially useful to criminals because it provides access to the Dark Web. This part of the Internet hosts illegal marketplaces such as the now-defunct Silk Road, where people bought and sold all sorts of illegal items, including drugs, weapons, and stolen credit card details. In short, many criminals use Tor to avoid getting caught while going about their illegal activities.

How to detect Tor users and block the traffic?

As explained above, a website only ever sees connections coming from exit nodes. To detect Tor traffic, the only information you need is therefore the IP addresses of Tor exit nodes. The IP addresses of entry nodes and middle relays are not required, since these nodes never connect to pages on the public Internet on behalf of users.

The Tor project publishes an official list of exit node IP addresses to simplify the task of identifying Tor exit nodes. The list includes about 1,500 entries. Here is an example that shows how to load and use the public list in a Node.js Express app:

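const axios = require('axios');
const express = require('express');
const app = express();

async function main() {
    // The list is fetched once, when the service starts
    const response = await axios.get('https://check.torproject.org/torbulkexitlist');
    // One IP address per line: trim whitespace and drop empty lines
    const torExitNodeIps = new Set(
      response.data.split('\n').map((line) => line.trim()).filter(Boolean)
    );

    app.use((req, res, next) => {
      // req.connection is deprecated in favor of req.socket.
      // IPv4 clients can show up as "::ffff:1.2.3.4" on dual-stack sockets.
      const ip = (req.socket.remoteAddress || '').replace(/^::ffff:/, '');
      if (torExitNodeIps.has(ip)) {
        res.status(403).send('Tor clients are not allowed.');
        return;
      }
      next();
    });

    app.get('/', (req, res) =>
      res.send('Hello World!')
    );

    app.listen(3000);
}

main().catch((err) => {
  console.error('Failed to start the service:', err);
  process.exit(1);
});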

The approach above catches most Tor connections, but it has two drawbacks. Firstly, the list isn't updated automatically: you need to restart the service to refresh the list of Tor exit nodes. Secondly, some unofficial exit nodes (e.g. IPv6 exit nodes) are not in the official public list.

You may wonder how to address both issues. We created Ipregistry for this purpose. Ipregistry aggregates an up-to-date list of the official exit nodes with lists of unofficial Tor IPs, so that you can check an IP address against an accurate and current Tor list with a simple HTTP request. Here is a sample using the Ipregistry API:
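If you stay with the public list, the first drawback can be reduced by refreshing the list on a timer instead of restarting the service. The following is a minimal sketch under stated assumptions: `parseExitList`, `createExitList` and the injected `fetchText` function are hypothetical names, and the 30-minute default interval is an arbitrary choice, not a recommendation from the Tor project.

```javascript
// Parse the bulk exit list: one IP address per line.
function parseExitList(text) {
  return new Set(text.split('\n').map((line) => line.trim()).filter(Boolean));
}

// Hypothetical helper that keeps an in-memory set of exit IPs and refreshes it
// periodically. `fetchText` is injected so that any HTTP client can be used.
function createExitList(fetchText, refreshMs = 30 * 60 * 1000) {
  let ips = new Set();

  async function refresh() {
    try {
      const fresh = parseExitList(await fetchText());
      // Never replace a good list with an empty one
      if (fresh.size > 0) ips = fresh;
    } catch (err) {
      console.error('Could not refresh the Tor exit list, keeping the previous one:', err.message);
    }
  }

  // unref() lets the process exit even though the timer is still scheduled
  const timer = setInterval(refresh, refreshMs);
  timer.unref();

  return {
    refresh,
    has: (ip) => ips.has(ip),
    stop: () => clearInterval(timer),
  };
}
```

With axios, this could be wired up as `createExitList(() => axios.get('https://check.torproject.org/torbulkexitlist').then((r) => r.data))`, followed by one `await list.refresh()` before `app.listen()`. The middleware then calls `list.has(ip)`. A failed download keeps the previous list, so a temporary outage of the list endpoint does not disable detection.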

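const axios = require('axios');
const express = require('express');
const app = express();

// Replace with your own Ipregistry API key
const YOUR_API_KEY = 'YOUR_API_KEY';

app.use(async (req, res, next) => {
  try {
    // req.connection is deprecated in favor of req.socket
    const ip = req.socket.remoteAddress;
    const ipInfo = await axios.get(`https://api.ipregistry.co/${ip}?key=${YOUR_API_KEY}`);

    if (ipInfo.data.security.is_tor_exit) {
      res.status(403).send('Tor clients are not allowed.');
      return;
    }
    next();
  } catch (err) {
    // Express 4 does not catch rejected promises from async middleware
    next(err);
  }
});

app.get('/', (req, res) =>
  res.send('Hello World!')
);

app.listen(3000);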

When to block Tor traffic?

It is important to understand that Tor is not only used for fraudulent activities. Tor can be used to bypass Internet censorship and other restrictions to allow the free movement of information. For instance, many Web pages and services are blocked in China. With Tor, people get anonymity, privacy, and a way around such blocks.

The main question you should ask yourself before filtering Tor traffic is whether you risk blocking legitimate traffic. If you manage an e-commerce store, a customer who uses Tor should be flagged as a potential fraud risk, so you may want to block Tor traffic. Unless you're selling products that require anonymity, blocking Tor should have minimal impact on your business and may improve payment success rates.

On the other hand, if you're running a news site, blocking Tor connections could prevent people in certain regions of the world from accessing the information they need. Unless you're taking payments on your site, there's minimal risk of fraud, so it is usually best to allow Tor traffic but monitor it closely for fraudulent activities.
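These two situations can coexist on the same site: block Tor where payments happen, and only flag it elsewhere so that it can be monitored. Here is a minimal sketch of such a policy as Express-style middleware. The names `torPolicy`, `isTorExit`, `mode` and `req.isTor` are hypothetical, and `isTorExit(ip)` can be backed by either of the detection methods shown earlier.

```javascript
// Hypothetical middleware factory.
// `isTorExit(ip)` returns (or resolves to) true when the IP is a Tor exit node.
// `mode` is 'block' to reject Tor clients or 'flag' to only mark the request.
function torPolicy(isTorExit, mode) {
  return async (req, res, next) => {
    try {
      // IPv4 clients can show up as "::ffff:1.2.3.4" on dual-stack sockets
      const ip = (req.socket.remoteAddress || '').replace(/^::ffff:/, '');
      req.isTor = Boolean(await isTorExit(ip));

      if (req.isTor && mode === 'block') {
        res.status(403).send('Tor clients are not allowed.');
        return;
      }
      // In 'flag' mode the request continues; later handlers can read req.isTor
      next();
    } catch (err) {
      next(err);
    }
  };
}
```

For example, `app.use('/checkout', torPolicy(isTorExit, 'block'))` would reject Tor clients on payment pages only, while `app.use(torPolicy(isTorExit, 'flag'))` lets everything else through and leaves `req.isTor` available for logging or fraud scoring.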

Conclusion

Whether to block all Tor traffic depends on your use case. Tor traffic can be filtered using a list of IP addresses or by using the Ipregistry API. The latter option has the benefit of always being up to date and of covering unofficial Tor exit nodes as well. It also exposes other data, such as the user's location and additional threat data.

What's your experience with Tor? Have you had issues with users hitting your website via Tor? Let us know in the comments.
