1
00:00:00,200 --> 00:00:03,612
So, we've talked a lot about
Bitcoin's anonymity in this lecture.

2
00:00:03,612 --> 00:00:07,240
But, Bitcoin's anonymity becomes even
more powerful when combined with other

3
00:00:07,240 --> 00:00:08,240
technologies.

4
00:00:08,240 --> 00:00:11,460
In particular,
anonymous communication technologies.

5
00:00:11,460 --> 00:00:14,880
We've talked about Tor a little bit,
we've alluded to it several times, but

6
00:00:14,880 --> 00:00:16,030
now let's go into more detail.

7
00:00:17,210 --> 00:00:20,440
Let's first set up the problem of
anonymous communication, though.

8
00:00:20,440 --> 00:00:22,323
So this is what the system looks like.

9
00:00:22,323 --> 00:00:23,508
There are a bunch of senders.

10
00:00:23,508 --> 00:00:24,967
There are a bunch of recipients.

11
00:00:24,967 --> 00:00:29,040
And messages are routed from senders
to recipients through this network

12
00:00:29,040 --> 00:00:29,660
over here.

13
00:00:30,930 --> 00:00:32,860
And of course,
there's going to be an attacker.

14
00:00:32,860 --> 00:00:34,730
This attacker, and
this is called a threat model,

15
00:00:34,730 --> 00:00:36,670
the attacker controls several things.

16
00:00:36,670 --> 00:00:39,143
Some of these nodes in red
are compromised by the attacker.

17
00:00:39,143 --> 00:00:43,790
Some of these edges, some of these links
from these nodes to the network,

18
00:00:43,790 --> 00:00:47,620
are also controlled by the attacker
even if the nodes themselves are not.

19
00:00:48,760 --> 00:00:52,790
Similarly, some of the recipient nodes
over here and some of these links

20
00:00:52,790 --> 00:00:56,490
from the network to the recipient node
are also controlled by the attacker.

21
00:00:56,490 --> 00:01:00,830
And finally some of the internal nodes
of the anonymous communication network

22
00:01:00,830 --> 00:01:02,865
are under the control of the attacker, but

23
00:01:02,865 --> 00:01:07,560
crucially, not all of the communication
network is controlled by the attacker.

24
00:01:07,560 --> 00:01:11,180
We want to achieve anonymity in this
hostile environment and as before

25
00:01:11,180 --> 00:01:14,670
anonymity refers to unlinkability
between the sender and the receiver.

26
00:01:16,050 --> 00:01:17,800
So how does Tor accomplish this?

27
00:01:17,800 --> 00:01:20,530
It's the same old pattern
of picking a chain of

28
00:01:20,530 --> 00:01:23,472
intermediaries to route
your messages through.

29
00:01:23,472 --> 00:01:25,836
And here it is in a nice visual form, and

30
00:01:25,836 --> 00:01:29,942
I have to thank the Electronic
Frontier Foundation for this slide.

31
00:01:29,942 --> 00:01:32,926
So what's going on? Alice over here
wants to talk to Bob over here.

32
00:01:32,926 --> 00:01:36,602
So she pre-selects a path
through this set of routers and

33
00:01:36,602 --> 00:01:39,290
the number of routers is fixed in the Tor protocol.

34
00:01:39,290 --> 00:01:40,390
It's always three.

35
00:01:40,390 --> 00:01:44,110
But conceptually you can imagine that
it would be any number you want, and

36
00:01:44,110 --> 00:01:47,830
the more nodes you route through the more
anonymity you get, or the harder it is,

37
00:01:47,830 --> 00:01:50,050
I should say, to breach anonymity.

38
00:01:51,150 --> 00:01:53,700
So these nodes denoted with
a plus are all the Tor nodes, and

39
00:01:53,700 --> 00:01:58,380
she picks some subset of three nodes
randomly in order to route her message.

40
00:01:59,620 --> 00:02:01,970
And the security property
that we get is that,

41
00:02:01,970 --> 00:02:06,830
as long as at least one of these three
nodes that she picks is not compromised or

42
00:02:06,830 --> 00:02:10,902
colluding with the attacker,
then she is sort of safe here.

43
00:02:10,902 --> 00:02:14,890
In that Alice cannot be linked
to Bob by somebody who's

44
00:02:14,890 --> 00:02:16,510
observing some of
the nodes in the network.

45
00:02:16,510 --> 00:02:20,550
I should say that there are many
attacks possible on Tor.

46
00:02:20,550 --> 00:02:24,530
One of them, for example, is called
an end-to-end traffic correlation attack.

47
00:02:24,530 --> 00:02:28,688
So there are going to be timing patterns
in the flow of traffic between Alice and

48
00:02:28,688 --> 00:02:30,850
whatever Bob is, maybe a website.

49
00:02:30,850 --> 00:02:33,190
And so if
the attacker controls both of these links,

50
00:02:33,190 --> 00:02:36,180
then just by observing the correlation
in those timing patterns,

51
00:02:36,180 --> 00:02:39,650
he might be able to determine that these
two nodes are in communication with each

52
00:02:39,650 --> 00:02:46,200
other even if he knows nothing about the
route that the message took between them.

53
00:02:46,200 --> 00:02:49,400
So one key point here is how do
you hide routing information?

54
00:02:49,400 --> 00:02:50,790
What do I mean by that?

55
00:02:50,790 --> 00:02:54,120
When a message goes from
Alice to the first router,

56
00:02:54,120 --> 00:02:59,190
it has to have the IP address of Bob's
computer somewhere in that message.

57
00:02:59,190 --> 00:03:02,630
Otherwise there is no way that this router

58
00:03:02,630 --> 00:03:05,520
can appropriately forward that on
to reach the right destination.

59
00:03:06,880 --> 00:03:11,422
However, we don't want this router
to actually learn that IP address

60
00:03:11,422 --> 00:03:15,680
because if the router does learn that IP
address, then it knows both Alice's IP,

61
00:03:15,680 --> 00:03:17,590
because the message came from her, and

62
00:03:17,590 --> 00:03:20,970
Bob's IP, because that's where
the message is eventually going.

63
00:03:20,970 --> 00:03:25,270
And now this router has the link between
the two ends of the communication, and

64
00:03:25,270 --> 00:03:27,049
this would be a problem if
this router were malicious.

65
00:03:28,780 --> 00:03:31,200
So as you might guess,
the answer involves encryption.

66
00:03:31,200 --> 00:03:33,940
And as you can see in this picture,
these links that are in green,

67
00:03:33,940 --> 00:03:35,070
they're encrypted connections.

68
00:03:35,070 --> 00:03:38,790
And this one is an unencrypted connection.

69
00:03:38,790 --> 00:03:41,720
Let's look in more detail to
see how this encryption works.

70
00:03:41,720 --> 00:03:44,180
It's a specific way in
which encryption is used.

71
00:03:44,180 --> 00:03:46,940
It's called layered encryption.

72
00:03:46,940 --> 00:03:52,320
It resembles an onion, so that's why
onion routing is a related concept here.

73
00:03:52,320 --> 00:03:53,580
So what is going on here?

74
00:03:53,580 --> 00:03:57,770
Alice and router 1 share a symmetric
key that's represented in purple.

75
00:03:59,300 --> 00:04:02,330
Alice and router 2 share this
key that's represented in blue.

76
00:04:02,330 --> 00:04:05,169
And Alice and R3 share the key
that's represented in gold.

77
00:04:06,520 --> 00:04:10,260
Now these symmetric keys are not stored
long-term by any of these nodes.

78
00:04:10,260 --> 00:04:14,150
They're established as
necessary using key exchange,

79
00:04:14,150 --> 00:04:18,050
the only persistent keys are the long-term
public keys of these routers.

80
00:04:18,050 --> 00:04:20,540
And these routers do in fact
have long-lived identities and

81
00:04:20,540 --> 00:04:21,850
public keys and so on.

82
00:04:21,850 --> 00:04:25,420
Alice, of course, does not need
to have any long-term public key.

83
00:04:25,420 --> 00:04:28,730
When she picks a path of these
routers she finds their public keys,

84
00:04:28,730 --> 00:04:35,120
executes key exchange protocols, and
obtains these shared symmetric keys.

85
00:04:35,120 --> 00:04:38,340
And what she's going to do is
when she sends the message to R1,

86
00:04:38,340 --> 00:04:39,699
it's going to be triply encrypted.

87
00:04:40,840 --> 00:04:45,250
The outermost layer of encryption is a
symmetric encryption between Alice and R1.

88
00:04:45,250 --> 00:04:47,790
And so what this allows R1 to do is

89
00:04:47,790 --> 00:04:50,510
peel off that layer of encryption
like peeling off an onion.

90
00:04:52,576 --> 00:04:56,690
And when router 1 peels off that layer
of encryption, inside it's going to find

91
00:04:56,690 --> 00:05:01,055
the IP address of router 2, and
an encrypted message to send to router 2.

92
00:05:02,350 --> 00:05:05,960
And it's going to forward that, router 2
peels off a further layer of encryption,

93
00:05:05,960 --> 00:05:09,220
and then to router 3, further layer
of encryption, now the message is

94
00:05:09,220 --> 00:05:13,910
unencrypted, consisting of the plain text
message, as well as Bob's IP address.

95
00:05:13,910 --> 00:05:17,130
And so router 3 now sends that
message in plain text to Bob.

96
00:05:18,770 --> 00:05:22,340
Of course what you probably
want to do is further layer

97
00:05:22,340 --> 00:05:27,020
a protocol like HTTPS or
secure web browsing on top of Tor so

98
00:05:27,020 --> 00:05:30,470
that even this message from
router 3 to Bob is encrypted.

99
00:05:30,470 --> 00:05:32,950
But the Tor protocol itself
doesn't guarantee that,

100
00:05:32,950 --> 00:05:34,520
has no way of guaranteeing that,

101
00:05:34,520 --> 00:05:38,870
because Bob might be a regular web server
that doesn't even speak the Tor protocol.

102
00:05:38,870 --> 00:05:43,263
And so there is no way that Tor can be
responsible for the encryption between R3,

103
00:05:43,263 --> 00:05:47,237
which is called the exit node, and
the ultimate recipient of the message.

104
00:05:48,931 --> 00:05:52,535
I'll leave you to think about why this
wouldn't quite work if there were only one

105
00:05:52,535 --> 00:05:53,550
layer of encryption.

106
00:05:53,550 --> 00:05:57,270
For example, if Alice tried to encrypt
the message all the way from her to R3,

107
00:05:57,270 --> 00:05:58,990
it wouldn't quite work.

108
00:05:58,990 --> 00:06:01,820
The routing would not quite work out.

109
00:06:01,820 --> 00:06:06,414
But as it is, the very neat property that
you have is that R1 only knows Alice's IP

110
00:06:06,414 --> 00:06:08,520
address and R2's address.

111
00:06:08,520 --> 00:06:11,170
It does not know R3's or Bob's address.

112
00:06:11,170 --> 00:06:15,480
And similarly every node knows only
the addresses of the nodes that are

113
00:06:15,480 --> 00:06:17,440
one hop before it and one hop after it.

114
00:06:18,530 --> 00:06:22,930
And in fact, when the message gets to
this point, the IP address of Alice

115
00:06:22,930 --> 00:06:26,320
is not even present anymore,
whether in encrypted form or not.

116
00:06:27,470 --> 00:06:29,620
So that's really how
you get anonymity here.

117
00:06:29,620 --> 00:06:33,780
If any one of these, if R2 for
example, were compromised,

118
00:06:33,780 --> 00:06:36,020
then it would learn R1's and
R3's IP addresses.

119
00:06:36,020 --> 00:06:37,300
But not Alice's or Bob's.

120
00:06:39,390 --> 00:06:41,410
So that's how Tor works.

121
00:06:41,410 --> 00:06:44,010
And now let's talk about Silk Road and

122
00:06:44,010 --> 00:06:49,060
in particular, the problem that a site
like Silk Road has to overcome is this.

123
00:06:50,492 --> 00:06:52,480
Silk Road is what is known
as a hidden service,

124
00:06:52,480 --> 00:06:57,000
in other words the Silk Road server wants
to hide its address for obvious reasons.

125
00:06:58,930 --> 00:07:01,640
If you haven't heard about Silk Road, let
me just say a sentence about it briefly,

126
00:07:01,640 --> 00:07:04,020
you're going to see more
detail about it next lecture.

127
00:07:04,020 --> 00:07:07,080
Silk Road was a website that operated for
a couple of years.

128
00:07:07,080 --> 00:07:08,500
It was an anonymous marketplace.

129
00:07:08,500 --> 00:07:10,030
It sold a variety of goods, but

130
00:07:10,030 --> 00:07:12,590
the thing that it was most known for
is selling drugs.

131
00:07:12,590 --> 00:07:16,340
And because of the pervasive anonymity or
at least the pseudonymity of the system,

132
00:07:16,340 --> 00:07:20,270
the idea was that it was very hard for
law enforcement to go after it.

133
00:07:20,270 --> 00:07:23,720
And the story of what happened next,
I will leave to the next lecture.

134
00:07:23,720 --> 00:07:29,608
But let's look at the technology that made
something like Silk Road possible and

135
00:07:29,608 --> 00:07:31,635
the implications of that.

136
00:07:31,635 --> 00:07:36,770
So here is a simplified algorithm by which
a server can keep its identity hidden and

137
00:07:36,770 --> 00:07:38,240
yet provide services through Tor.

138
00:07:39,470 --> 00:07:42,820
What it does is, it connects to
what is called a rendezvous point,

139
00:07:42,820 --> 00:07:46,630
which is one of the Tor routers,
through Tor.

140
00:07:46,630 --> 00:07:49,560
And then what it's going to do is it's
going to publish the mapping between

141
00:07:49,560 --> 00:07:54,380
its name, its domain name, and
the address of the rendezvous point

142
00:07:54,380 --> 00:07:58,179
through directory services
that the Tor system offers.

143
00:07:59,710 --> 00:08:02,650
And these domain names are not
your regular DNS domain names.

144
00:08:03,850 --> 00:08:07,090
That wouldn't work because of this
whole parallel system of routing.

145
00:08:07,090 --> 00:08:09,690
And so,
these are called onion addresses and

146
00:08:09,690 --> 00:08:12,490
they're going to look like
this long string dot onion.

147
00:08:12,490 --> 00:08:16,028
And notice that it looks a lot like
Bitcoin public keys, and it's for

148
00:08:16,028 --> 00:08:19,887
sort of the same reasons, it's because
anyone can generate one of these.

149
00:08:22,708 --> 00:08:25,850
And now the client will have to learn

150
00:08:25,850 --> 00:08:30,710
the onion address of the site
that it wants to visit.

151
00:08:30,710 --> 00:08:32,230
When Silk Road existed,

152
00:08:32,230 --> 00:08:35,310
if you wanted to go to Silk Road,
you couldn't type in silkroad.com.

153
00:08:35,310 --> 00:08:36,270
That wouldn't make any sense,

154
00:08:36,270 --> 00:08:39,330
because Silk Road is not even
available over the regular Web.

155
00:08:39,330 --> 00:08:42,658
Instead you would have to, through some
manner, and this was a widely known

156
00:08:42,658 --> 00:08:46,090
address, you would have to find, this
is not Silk Road's address by the way,

157
00:08:46,090 --> 00:08:49,626
this is the onion address of DuckDuckGo,
a search engine that offers privacy and

158
00:08:49,626 --> 00:08:50,235
anonymity.

159
00:08:50,235 --> 00:08:54,220
But you would find a similar address
that belonged to Silk Road and

160
00:08:54,220 --> 00:08:56,720
put that into your Tor enabled browser.

161
00:08:58,150 --> 00:09:01,220
And what your client would automatically
do is look up the mapping for

162
00:09:01,220 --> 00:09:04,860
the address of the rendezvous point,
connect to that rendezvous point, and

163
00:09:04,860 --> 00:09:08,970
through that rendezvous point have
an anonymous and encrypted connection

164
00:09:08,970 --> 00:09:13,419
to the ultimate server without the server
having to publish its actual IP address.

165
00:09:15,450 --> 00:09:19,180
So that covers some of the technology
behind Silk Road, in particular anonymous

166
00:09:19,180 --> 00:09:22,870
communication, and how you do anonymous
payments, which is of course with Bitcoin.

167
00:09:23,990 --> 00:09:27,630
But still you need more technology in
order to make this whole system work.

168
00:09:28,640 --> 00:09:29,450
You need security.

169
00:09:29,450 --> 00:09:32,810
In other words, how can you be sure
that when you pay someone on Silk Road

170
00:09:32,810 --> 00:09:34,920
they're going to actually
sell you the goods?

171
00:09:34,920 --> 00:09:38,760
Silk Road had a reputation system for
that. And how do you do anonymous shipping?

172
00:09:38,760 --> 00:09:43,470
The site pretty much left this to the
participants, and advised buyers to provide

173
00:09:43,470 --> 00:09:48,260
an anonymous PO box, for
example, to ship goods to.

174
00:09:48,260 --> 00:09:49,750
So let's take a step back.

175
00:09:49,750 --> 00:09:52,430
We've covered a lot of
technology in this lecture.

176
00:09:52,430 --> 00:09:55,910
Hopefully you've understood that Bitcoin
anonymity is a very powerful thing.

177
00:09:55,910 --> 00:09:58,670
And it gains in power when
combined with other technologies.

178
00:09:58,670 --> 00:10:02,100
In particular,
anonymous communication technologies.

179
00:10:02,100 --> 00:10:05,330
And also anonymity is a deeply,
morally ambiguous thing.

180
00:10:06,620 --> 00:10:09,010
There are many moral distinctions
that we would like to make

181
00:10:09,010 --> 00:10:12,130
that we're not able to adequately
express at the technological level.

182
00:10:12,130 --> 00:10:15,469
And so some of this moral
ambiguity appears to be inherent.

183
00:10:17,140 --> 00:10:20,060
Hopefully, it's also been clear
that anonymity is very fragile.

184
00:10:20,060 --> 00:10:23,930
One mistake can create a link
that you're trying to hide.

185
00:10:24,980 --> 00:10:26,840
But also, anonymity is
an important thing to protect.

186
00:10:26,840 --> 00:10:28,065
It's well worth protecting.

187
00:10:28,065 --> 00:10:30,635
It has a lot of good uses
in addition to bad uses.

188
00:10:31,985 --> 00:10:36,205
So most of the things that we've talked
about today are either at the forefront of

189
00:10:36,205 --> 00:10:41,065
research, technologically, or they're
topics of serious ethical debate.

190
00:10:41,065 --> 00:10:44,655
None of this is really settled, and
so this is an ongoing conversation,

191
00:10:44,655 --> 00:10:46,220
and an area of ongoing research.

192
00:10:46,220 --> 00:10:49,070
We don't know which anonymity system for
Bitcoin, if any,

193
00:10:49,070 --> 00:10:50,950
is going to become prominent or
mainstream.

194
00:10:52,420 --> 00:10:56,720
And so this is a great opportunity for
you, either as a developer, or in thinking

195
00:10:56,720 --> 00:11:00,085
through the ethical implications,
to get involved in some of these issues.

196
00:11:00,085 --> 00:11:00,696
And hopefully,

197
00:11:00,696 --> 00:11:03,718
what you've learned in this lecture has
given you the right background for that.


