Before, I quite couldn't understand how anonymity could be reached in the modern networks. However, onion routing is actually quite simple but powerful way to do that. All genius is simple!
Onion routing (OR) is one of the ways to send secure and anonymous messages over the public network. Onion routing is in base of a popular nowadays network called Tor. Basically, onion routing is an overlay over computer network (ie internet), where message is not sent as plain text using standard routing techniques, but instead the route is chosen by the sender, and the message is encrypted in a way that only single relay node can work with it properly at a time.
Sender builds the route it wants to use to send the message, gets the public key for each relay node in the route, and uses public keys to encrypt message incrementally, adding more and more security, so the message looks like the onion. That's why it is called onion routing.
Assume that sender
S
defined the route to target node T
using intermediate relay nodes A
, B
and C
, so the message m
should be routed using path S -> A -> B -> C -> T
. Sender S
knows the public keys for all of these hosts. Lets say that public key for A
is a
, for B
it is b
and for C
it is c
. Sender prepares the message so it contains the payload (input message) and the next node in the path. Target node T
would see only the payload, without any routing instruction, as it is a final node in the path. Then S
encrypts message in the reverse (to the route) order, ie: a( b( c(m, T), C), B)
, where, for instance, a(m, T)
means encrypt message m
with route to next node T
using public key a
. So
A
decrypts received message using own private key. In this case, A
sees the encrypted message that only B
can decrypt, but also sees the routing instruction: send message to B
. Thus A
sends the message to B
. B
receives the message, decrypts it, sees the next node in the path and sends the message to it. This repeats till the message is delivered to the last relay node, ie C
. Depending on implementation, C
could see either the original message, or encrypted message (thus, only T
can decrypt it).The more relay nodes are in the routing path, the better security and anonymity level. However, this comes with a price - increased latency. The OR protocol helps with risk that some or few nodes in the path are compromised. However, if all nodes are compromised, then OR does not bring any value, except encrypting message. This is because, the main reason why it is thought to deliver anonymity is that it's hard to say where is located the sender of specified message.
There are multiple improvements that makes OR even better:
- message is encrypted always, so even the latest node in the routing path never knows the plain content of the message
- there are plenty of hosts that could be chosen as routers, b/c this gives more and more different paths chosen for each message
- use different paths for each message
- send the return path with the request message, so the
T
won't send the response the same way. - there is a trust to the seed node (which used to provide the list of available relay nodes).
Finding relay nodes (or proxy servers as they also called) could be done using via some central service or DHT that provides the list of available relay nodes. OR could be done in P2P manner, where any node can be both sender, recipient and relay node. Also gossip protocols could be used as alternative to DHT or central service, so the information about available relay nodes is known by everyone.