1
00:00:00,150 --> 00:00:03,540
So we've talked a lot about
Bitcoin's anonymity in this lecture.

2
00:00:03,540 --> 00:00:07,190
But Bitcoin's anonymity becomes even
more powerful when combined with other

3
00:00:07,190 --> 00:00:11,410
technologies, in particular,
anonymous communication technologies.

4
00:00:11,410 --> 00:00:12,910
We've talked about Tor a little bit.

5
00:00:12,910 --> 00:00:14,720
We've alluded to it several times.

6
00:00:14,720 --> 00:00:15,970
But now let's go into more detail.

7
00:00:17,160 --> 00:00:20,380
Let's first set up the problem of
anonymous communication, though.

8
00:00:20,380 --> 00:00:21,874
So, this is what the system looks like.

9
00:00:21,874 --> 00:00:23,641
There are a bunch of senders.

10
00:00:23,641 --> 00:00:25,170
There are a bunch of recipients.

11
00:00:25,170 --> 00:00:28,430
And messages are routed from
senders to recipients

12
00:00:28,430 --> 00:00:29,600
through this network over here.

13
00:00:30,880 --> 00:00:32,800
And of course there is
gonna be an attacker.

14
00:00:32,800 --> 00:00:34,670
This attacker, and
this is called a threat model,

15
00:00:34,670 --> 00:00:36,620
the attacker controls several things.

16
00:00:36,620 --> 00:00:39,073
Some of these nodes in red
are compromised by the attacker.

17
00:00:39,073 --> 00:00:43,760
Some of these edges, some of these links
from these nodes to the network

18
00:00:43,760 --> 00:00:47,570
are also controlled by the attacker,
even if the nodes themselves are not.

19
00:00:48,710 --> 00:00:52,740
Similarly some of the recipient nodes
over here and some of these links

20
00:00:52,740 --> 00:00:56,670
from the network to the recipient node,
are also controlled by the attacker and

21
00:00:56,670 --> 00:01:00,770
finally some of the internal nodes of
the anonymous communication network.

22
00:01:00,770 --> 00:01:02,480
All under the control of the attacker, but

23
00:01:02,480 --> 00:01:06,070
crucially not all of the communication
network is controlled by the attacker.

24
00:01:07,290 --> 00:01:10,250
And we want to achieve anonymity
in this hostile environment.

25
00:01:10,250 --> 00:01:13,849
And as before anonymity refers to
unlinkability between the sender and

26
00:01:13,849 --> 00:01:14,610
the receiver.

27
00:01:16,000 --> 00:01:17,740
So how does Tor accomplish this?

28
00:01:17,740 --> 00:01:22,160
It's the same old pattern of picking
a chain of intermediaries to

29
00:01:22,160 --> 00:01:26,100
route your messages through, and
here it is in a nice visual form.

30
00:01:26,100 --> 00:01:30,070
I have to thank the Electronic
Frontier Foundation for this slide.

31
00:01:30,070 --> 00:01:31,193
So what's going on?

32
00:01:31,193 --> 00:01:33,962
Alice over here wants to
talk to Bob over here. So

33
00:01:33,962 --> 00:01:37,360
she pre-selects a path
through this set of routers.

34
00:01:37,360 --> 00:01:40,330
And the path length is fixed in the Tor
protocol, it's always three.

35
00:01:40,330 --> 00:01:43,920
But conceptually you can imagine that
it would be any number you want.

36
00:01:43,920 --> 00:01:46,720
And the more nodes you run through,
the more anonymity you get.

37
00:01:46,720 --> 00:01:49,990
Or the harder it is,
I should say, to breach anonymity.

38
00:01:51,100 --> 00:01:53,530
So these nodes denoted with
a plus are all the Tor nodes.

39
00:01:53,530 --> 00:01:58,330
And she picks some subset of three nodes
randomly in order to route her message.

40
00:01:59,560 --> 00:02:03,950
And the security property that we get is
that as long as at least one of these

41
00:02:03,950 --> 00:02:06,770
three nodes that she picks
is not compromised or

42
00:02:06,770 --> 00:02:10,930
colluding with the attacker,
then she is sort of safe here.

43
00:02:10,930 --> 00:02:14,840
In that Alice cannot be linked
to Bob by somebody who's

44
00:02:14,840 --> 00:02:16,450
observing some of
the nodes in the network.

45
00:02:16,450 --> 00:02:20,500
I should say that there are many
attacks possible on Tor.

46
00:02:20,500 --> 00:02:24,130
One of them, for example, is called
an end-to-end traffic correlation attack.

47
00:02:24,130 --> 00:02:29,080
So there are gonna be timing patterns in
the flow of traffic between Alice and

48
00:02:29,080 --> 00:02:30,440
whatever Bob is, maybe a website.

49
00:02:30,440 --> 00:02:33,140
And so if the attacker
controls both of these links,

50
00:02:33,140 --> 00:02:36,880
then just by observing the correlation in
those timing patterns he might be able to

51
00:02:36,880 --> 00:02:40,170
determine that these two nodes are in
communication with each other,

52
00:02:40,170 --> 00:02:43,930
even if he knows nothing about the route
that the message took between them.
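
As a rough illustration of the timing correlation just described, here is a minimal sketch. It assumes the attacker has counted packets per time bin on Alice's link and on Bob's link. The function names, the lag search, and the sample numbers are all hypothetical and are not taken from any real attack tool.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def best_lag_correlation(entry_counts, exit_counts, max_lag):
    """Try several network delays (in bins) and keep the strongest correlation."""
    n = len(entry_counts)
    best = -1.0
    for lag in range(max_lag + 1):
        shifted_entry = entry_counts[: n - lag]
        shifted_exit = exit_counts[lag:]
        best = max(best, pearson(shifted_entry, shifted_exit))
    return best

# Made-up packet counts per time bin on three observed links.
alice_link = [5, 0, 0, 9, 1, 0, 7, 0, 3, 0]
other_link = [0, 4, 0, 0, 0, 4, 0, 0, 0, 4]
bob_link = [0, 0, 5, 0, 0, 9, 1, 0, 7, 0]  # Alice's pattern, two bins later

alice_score = best_lag_correlation(alice_link, bob_link, max_lag=3)
other_score = best_lag_correlation(other_link, bob_link, max_lag=3)
```

The point of the sketch is that the score depends only on what is seen at the two ends, so the route in between never enters the computation.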

53
00:02:45,920 --> 00:02:49,340
So one key point here is how do
you hide routing information?

54
00:02:49,340 --> 00:02:50,750
What do I mean by that?

55
00:02:50,750 --> 00:02:54,070
When a message is going from
Alice to the first router,

56
00:02:54,070 --> 00:02:59,080
it has to have the IP address of Bob's
computer somewhere in that message.

57
00:02:59,080 --> 00:03:02,580
Otherwise, there is no
way that this router

58
00:03:02,580 --> 00:03:05,460
can appropriately forward that on
to reach the right destination.

59
00:03:06,840 --> 00:03:11,740
However, we don't want this router
to actually learn that IP address.

60
00:03:11,740 --> 00:03:14,080
Because if the router
does learn that IP address,

61
00:03:14,080 --> 00:03:17,530
then it knows both Alice's IP,
because the message came from her, and

62
00:03:17,530 --> 00:03:20,920
Bob's IP, because that's where
the message is eventually going.

63
00:03:20,920 --> 00:03:25,120
And now this router has the link between
the two ends of the communication.

64
00:03:25,120 --> 00:03:26,999
Now this would be a problem if
this router were malicious.

65
00:03:28,720 --> 00:03:31,150
So, as you might guess,
the answer involves encryption.

66
00:03:31,150 --> 00:03:33,870
And as you can see in this picture,
these links here in green,

67
00:03:33,870 --> 00:03:38,730
they're encrypted connections and
this one is an unencrypted connection.

68
00:03:38,730 --> 00:03:41,660
Let's look in more detail to
see how this encryption works.

69
00:03:41,660 --> 00:03:46,890
It's a specific way in which encryption is
used, it's called layered encryption.

70
00:03:46,890 --> 00:03:52,270
It resembles an onion so that's why
onion routing is a related concept here.

71
00:03:52,270 --> 00:03:53,520
So what is going on here?

72
00:03:53,520 --> 00:03:56,250
Alice and
router one share a symmetric key.

73
00:03:56,250 --> 00:03:57,720
That's represented in purple.

74
00:03:59,210 --> 00:04:02,440
Alice and router two share this
key that's represented in blue and

75
00:04:02,440 --> 00:04:05,119
Alice and router three share
the key that's represented in gold.

76
00:04:06,460 --> 00:04:10,200
Now these symmetric keys are not stored
long-term by any of these nodes.

77
00:04:10,200 --> 00:04:14,070
They're established as
necessary using key exchange.

78
00:04:14,070 --> 00:04:17,990
The only persistent keys are the long-term
public keys of these routers.

79
00:04:17,990 --> 00:04:21,100
And these routers do in fact have
long-lived identities and public keys and

80
00:04:21,100 --> 00:04:21,800
so on.

81
00:04:21,800 --> 00:04:25,350
Alice, of course, does not need
to have any long-term public key.

82
00:04:25,350 --> 00:04:29,290
When she picks a path of these routers,
she finds their public keys, executes key

83
00:04:29,290 --> 00:04:35,070
exchange protocols, and
obtains these shared symmetric keys.

84
00:04:35,070 --> 00:04:36,110
And what she's gonna do is,

85
00:04:36,110 --> 00:04:39,639
when she sends the message to R1,
it's going to be triply encrypted.

86
00:04:40,790 --> 00:04:45,070
The outermost layer of encryption is a
symmetric encryption between Alice and R1,

87
00:04:45,070 --> 00:04:47,730
and so what this allows R1 to do is

88
00:04:47,730 --> 00:04:50,460
peel off that layer of encryption
like peeling off an onion.

89
00:04:52,510 --> 00:04:56,320
And when router one peels off that layer
of encryption, inside it's going to

90
00:04:56,320 --> 00:05:01,150
find the IP address of router two and an
encrypted message to send to router two.

91
00:05:01,150 --> 00:05:03,740
And it's going to forward that.

92
00:05:03,740 --> 00:05:07,180
Router two peels off a further layer of
encryption, and then router three removes

93
00:05:07,180 --> 00:05:08,420
another layer of encryption.

94
00:05:08,420 --> 00:05:12,050
Now, the message is unencrypted,
consisting of the plain text message,

95
00:05:12,050 --> 00:05:14,020
as well as Bob's IP address.

96
00:05:14,020 --> 00:05:17,090
And so router three now sends that
message in plain text to Bob.
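
The layering just described can be sketched in a few lines. This is only an illustration under loud assumptions: the XOR "cipher" below is a toy stand-in for real symmetric encryption and is not secure, the JSON layer format is invented for the example, and none of the names come from the actual Tor protocol.

```python
import hashlib
import json

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 over a counter. Illustration only, NOT secure."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def wrap(key: bytes, next_hop: str, payload: bytes) -> bytes:
    """Add one layer: encrypt the next hop's address together with the inner payload."""
    layer = json.dumps({"next": next_hop, "payload": payload.hex()}).encode()
    return xor_cipher(key, layer)

def peel(key: bytes, blob: bytes):
    """Remove one layer: recover the next hop's address and the inner payload."""
    layer = json.loads(xor_cipher(key, blob).decode())
    return layer["next"], bytes.fromhex(layer["payload"])

def build_onion(message: bytes, destination: str, path):
    """path is a list of (router_address, shared_key) pairs in order R1, R2, R3."""
    next_hops = [addr for addr, _ in path[1:]] + [destination]
    blob = message
    # Wrap from the inside out, so the first router's layer ends up outermost.
    for (_, key), next_hop in reversed(list(zip(path, next_hops))):
        blob = wrap(key, next_hop, blob)
    return blob

# Hypothetical addresses and keys standing in for the purple, blue, and gold keys.
path = [("R1", b"purple"), ("R2", b"blue"), ("R3", b"gold")]
onion = build_onion(b"hello Bob", "bob.example", path)

hop_after_r1, inner_1 = peel(b"purple", onion)    # what router one learns
hop_after_r2, inner_2 = peel(b"blue", inner_1)    # what router two learns
hop_after_r3, plaintext = peel(b"gold", inner_2)  # what router three learns
```

Each call to `peel` reveals only the next hop's address, which is why no single router sees both ends of the path.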

97
00:05:18,710 --> 00:05:22,290
Of course, what you probably
want to do is further layer

98
00:05:22,290 --> 00:05:26,960
a protocol like HTTPS or
secure web browsing on top of Tor so

99
00:05:26,960 --> 00:05:30,420
that even this message from
router three to Bob is encrypted.

100
00:05:30,420 --> 00:05:32,690
But the Tor protocol itself
doesn't guarantee that.

101
00:05:32,690 --> 00:05:37,000
There's no way of guaranteeing that
because Bob might be a regular web server

102
00:05:37,000 --> 00:05:38,990
that doesn't even speak
the Tor protocol and so

103
00:05:38,990 --> 00:05:41,610
there's no way that Tor
can be responsible

104
00:05:41,610 --> 00:05:43,130
for the encryption between R3,

105
00:05:43,130 --> 00:05:47,060
which is called the exit node and
the ultimate recipient of the message.

106
00:05:49,170 --> 00:05:51,830
I'll leave you to think about
why this wouldn't quite work if

107
00:05:51,830 --> 00:05:53,520
there were only one layer of encryption.

108
00:05:53,520 --> 00:05:57,210
For example, if Alice tried to encrypt
the message all the way from her to R3,

109
00:05:57,210 --> 00:05:58,930
it wouldn't quite work.

110
00:05:58,930 --> 00:06:01,760
The routing would not quite work out.

111
00:06:01,760 --> 00:06:06,140
But as it is, the very neat property
that you have is that R1 only knows

112
00:06:06,140 --> 00:06:11,120
Alice's IP address and R2's address,
and does not know R3's or Bob's address.

113
00:06:11,120 --> 00:06:16,098
And similarly every node knows only
the addresses of the node that was one

114
00:06:16,098 --> 00:06:18,480
hop before it and one hop after it.

115
00:06:18,480 --> 00:06:21,340
And in fact when the message
gets to this point,

116
00:06:21,340 --> 00:06:25,370
the IP address of Alice is not
even present anymore whether or

117
00:06:25,370 --> 00:06:26,300
not in encrypted form.

118
00:06:27,420 --> 00:06:31,160
So, that's really how you get anonymity
here and if any one of these, if R2 for

119
00:06:31,160 --> 00:06:36,120
example were compromised then it would
learn R1's and R3's addresses but

120
00:06:36,120 --> 00:06:37,250
not Alice's or Bob's.

121
00:06:39,340 --> 00:06:41,360
So, that's how Tor works.

122
00:06:41,360 --> 00:06:47,200
And now let's talk about Silk Road, and
in particular the problem that a site like

123
00:06:47,200 --> 00:06:52,420
Silk Road has to overcome is this, Silk
Road is what is known as a hidden service.

124
00:06:52,420 --> 00:06:56,950
In other words, the Silk Road server wants
to hide its address, for obvious reasons.

125
00:06:58,870 --> 00:07:01,590
If you haven't heard about Silk Road, let
me just say a sentence about it briefly.

126
00:07:01,590 --> 00:07:03,918
You're gonna see it in more
detail in the next lecture.

127
00:07:03,918 --> 00:07:07,020
Silk Road was a website that
operated for a couple of years.

128
00:07:07,020 --> 00:07:09,980
It was an anonymous marketplace
that sold a variety of goods but

129
00:07:09,980 --> 00:07:12,660
the thing that it was most known for
is selling drugs and

130
00:07:12,660 --> 00:07:16,280
because of the pervasive anonymity or
at least pseudonymity in the system,

131
00:07:16,280 --> 00:07:20,220
the idea was that it was very hard for
law enforcement to go after.

132
00:07:20,220 --> 00:07:23,670
And the story of what happened next
I will leave to the next lecture.

133
00:07:23,670 --> 00:07:28,410
But let's look at the technology that made
something like Silk Road possible and

134
00:07:28,410 --> 00:07:29,410
the implications of that.

135
00:07:31,680 --> 00:07:36,720
So here is a simplified algorithm by which
a server can keep its identity hidden and

136
00:07:36,720 --> 00:07:38,210
yet provide services through Tor.

137
00:07:39,420 --> 00:07:42,770
What it does is it connects to what
is called the rendezvous point

138
00:07:42,770 --> 00:07:46,580
which is one of the Tor
routers through Tor.

139
00:07:46,580 --> 00:07:49,710
And then what it's going to do is it's
going to publish the mapping between its

140
00:07:49,710 --> 00:07:54,320
name, that is, its domain name, and
the address of the rendezvous point

141
00:07:54,320 --> 00:07:58,129
through directory services
that the Tor system offers.

142
00:07:59,660 --> 00:08:02,600
And these domain names are not
your regular DNS domain names.

143
00:08:03,810 --> 00:08:07,480
That wouldn't work because it's this
whole parallel system of routing.

144
00:08:07,480 --> 00:08:09,640
So these are called onion addresses, and

145
00:08:09,640 --> 00:08:12,490
they're gonna look like
this long string.onion.

146
00:08:12,490 --> 00:08:16,082
Notice that it looks a lot like
Bitcoin public keys, and it's for

147
00:08:16,082 --> 00:08:17,530
sort of the same reasons.

148
00:08:17,530 --> 00:08:19,769
It's because anyone can
generate one of these.

149
00:08:22,858 --> 00:08:27,652
And now the client will have
to learn the onion address of

150
00:08:27,652 --> 00:08:30,650
the site that it wants to visit.

151
00:08:30,650 --> 00:08:33,540
When the Silk Road existed,
if you wanted to go to Silk Road,

152
00:08:33,540 --> 00:08:36,830
you couldn't type in silkroad.com, that
wouldn't make any sense because Silk Road

153
00:08:36,830 --> 00:08:39,270
is not even available
over the regular web.

154
00:08:39,270 --> 00:08:41,470
Instead you would have to, through some means, and

155
00:08:41,470 --> 00:08:43,700
this was a widely known address,
find its onion address.

156
00:08:43,700 --> 00:08:47,290
This is not Silk Road's address, by the way.
This is the onion address of DuckDuckGo,

157
00:08:47,290 --> 00:08:51,090
a search engine that offers privacy and
anonymity.

158
00:08:51,090 --> 00:08:54,170
But you would find a similar address
that belonged to Silk Road and

159
00:08:54,170 --> 00:08:56,670
put that into your Tor enabled browser.

160
00:08:58,090 --> 00:09:01,010
And then what your client would
automatically do is look up the mapping

161
00:09:01,010 --> 00:09:04,820
for the address of the rendezvous point,
connect to that rendezvous point, and

162
00:09:04,820 --> 00:09:08,910
through that rendezvous point,
have an anonymous and encrypted connection

163
00:09:08,910 --> 00:09:13,369
to the ultimate server, without the server
having to publish its actual IP address.
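
The simplified lookup flow described here can be sketched as follows. Everything in it is a hypothetical stand-in: the directory is a plain dictionary, the name is derived by hashing random bytes that play the role of a public key (this is not Tor's actual address derivation), and the rendezvous address is made up.

```python
import base64
import hashlib
import os

def onion_name(public_key: bytes) -> str:
    """Derive a name from key material, so anyone can generate one (toy derivation)."""
    digest = hashlib.sha256(public_key).digest()[:10]
    return base64.b32encode(digest).decode().lower() + ".onion"

class Directory:
    """Stand-in for the directory service: maps onion names to rendezvous points."""

    def __init__(self):
        self._mapping = {}

    def publish(self, name: str, rendezvous_address: str) -> None:
        self._mapping[name] = rendezvous_address

    def lookup(self, name: str) -> str:
        return self._mapping[name]

directory = Directory()

# Server side: pick a rendezvous point and publish only that, never the server's own IP.
server_key = os.urandom(32)  # stand-in for the service's public key
service_name = onion_name(server_key)
directory.publish(service_name, "rendezvous-router.example")

# Client side: learn the onion name out of band, then look up where to connect.
where_to_connect = directory.lookup(service_name)
```

In this sketch the directory holds the name and the rendezvous point but never the server's address, which is the property the hidden service relies on.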

164
00:09:15,400 --> 00:09:17,760
So that covers some of
the technology behind Silk Road.

165
00:09:17,760 --> 00:09:19,880
In particular anonymous communication and

166
00:09:19,880 --> 00:09:22,820
how you do anonymous payments,
which is of course with Bitcoin.

167
00:09:23,930 --> 00:09:27,580
But still you need more technology in
order to make this whole system work.

168
00:09:28,590 --> 00:09:31,500
You need security, in other words
how can you be sure that when you

169
00:09:31,500 --> 00:09:34,840
pay someone on Silk Road they're
going to actually sell you the goods?

170
00:09:34,840 --> 00:09:36,800
Silk Road had a reputation system for
that.

171
00:09:36,800 --> 00:09:38,710
And how do you do anonymous shipping?

172
00:09:38,710 --> 00:09:42,160
The site pretty much left this
to the participants, and advised

173
00:09:42,160 --> 00:09:46,175
buyers to provide an anonymous PO Box for
example, to ship goods to.

174
00:09:47,653 --> 00:09:49,720
So let's take a step back.

175
00:09:49,720 --> 00:09:52,380
We've covered a lot of
technology in this lecture.

176
00:09:52,380 --> 00:09:55,860
Hopefully you've understood that Bitcoin
anonymity is a very powerful thing.

177
00:09:55,860 --> 00:09:58,620
And it gains in power when
combined with other technologies,

178
00:09:58,620 --> 00:10:02,050
in particular anonymous
communication technologies.

179
00:10:02,050 --> 00:10:05,290
And also anonymity is a deeply,
morally ambiguous thing.

180
00:10:06,560 --> 00:10:09,760
There are many moral distinctions that we
would like to make that we're not able to

181
00:10:09,760 --> 00:10:12,080
adequately express at
the technological level.

182
00:10:12,080 --> 00:10:15,449
And so, some of that moral
ambiguity appears to be inherent.

183
00:10:17,090 --> 00:10:20,000
Hopefully it's also been clear
that anonymity is very fragile.

184
00:10:20,000 --> 00:10:23,880
One mistake can create a link
that you're trying to hide.

185
00:10:24,930 --> 00:10:27,280
But also anonymity is
an important thing to protect.

186
00:10:27,280 --> 00:10:31,930
It's worthwhile protecting. It has a lot
of good uses in addition to bad uses.

187
00:10:31,930 --> 00:10:34,500
So most of the things
that we've talked about today

188
00:10:34,500 --> 00:10:37,730
are either at the forefront
of research technologically,

189
00:10:37,730 --> 00:10:41,000
or they're a topic of
serious ethical debates.

190
00:10:41,000 --> 00:10:42,990
None of this is really settled, and so

191
00:10:42,990 --> 00:10:46,160
this is an ongoing conversation and
an area of ongoing research.

192
00:10:46,160 --> 00:10:49,030
We don't know which anonymity system for
Bitcoin, if any,

193
00:10:49,030 --> 00:10:50,910
is going to become prominent or
mainstream.

194
00:10:52,370 --> 00:10:56,660
And so this is a great opportunity for
you, either as a developer or in thinking

195
00:10:56,660 --> 00:11:00,030
through the ethical implications to
get involved in some of these issues.

196
00:11:00,030 --> 00:11:02,827
And hopefully what you've learned in
this lecture has given you the right

197
00:11:02,827 --> 00:11:03,695
background for that.
