1
00:00:00,190 --> 00:00:05,620
So, we've spent a bunch of time learning
about this MMLU benchmark and really what

2
00:00:05,630 --> 00:00:10,500
I've been trying to do is demystify it
for you because I want you to understand

3
00:00:10,510 --> 00:00:17,540
that nothing is as complex as it seems
and you are able to do anything yourself

4
00:00:17,550 --> 00:00:21,100
if you just dedicate a little time to
understand it.

5
00:00:22,170 --> 00:00:26,900
All these benchmarks and advanced
prompting techniques, all of this is

6
00:00:26,910 --> 00:00:31,560
really just testing and evaluating
different prompts and different models

7
00:00:31,570 --> 00:00:38,760
and now it's time to take it to the next
level and introduce you to MMLU Pro.

8
00:00:39,610 --> 00:00:43,020
This is the next level of the MMLU benchmark.

9
00:00:44,230 --> 00:00:47,240
Why do we need a pro version of this?

10
00:00:48,250 --> 00:00:50,300
Isn't the MMLU benchmark good enough?

11
00:00:51,250 --> 00:00:54,100
That's a good question, and in a lot of

12
00:00:54,110 --> 00:00:57,440
cases, the answer is yes, MMLU is good enough.

13
00:00:58,670 --> 00:01:00,700
But we're not just here to do good

14
00:01:00,710 --> 00:01:01,360
enough, right?

15
00:01:02,230 --> 00:01:05,000
We want to push the boundaries of what

16
00:01:05,010 --> 00:01:09,040
these models and what prompting can do.

17
00:01:09,910 --> 00:01:13,100
So I'm going to teach you about this MMLU

18
00:01:13,110 --> 00:01:16,500
Pro benchmark
because I'm confident that once you

19
00:01:16,510 --> 00:01:21,480
understand the improvements this
benchmark makes over the original MMLU,

20
00:01:22,070 --> 00:01:27,800
you'll be able to critique other
benchmarks and also think of your own

21
00:01:27,810 --> 00:01:31,180
ways to improve other benchmarks.

22
00:01:31,310 --> 00:01:35,640
Essentially, this is going to future-proof
your knowledge and skills when it

23
00:01:35,650 --> 00:01:40,540
comes to using benchmarks for testing and evaluations.

24
00:01:40,750 --> 00:01:43,660
Again, as we just learned, benchmarks are

25
00:01:43,670 --> 00:01:49,220
generally used as an evaluation method
for models, not for prompting.

26
00:01:49,230 --> 00:01:55,520
But they obviously involve prompts and so
they're a great dataset, a great source

27
00:01:55,530 --> 00:02:00,320
of information and data that we can use
for our own prompting tests.

28
00:02:00,330 --> 00:02:04,520
And also sometimes when you're prompt
engineering, you're going to want to

29
00:02:04,530 --> 00:02:08,680
compare different models; you need to
understand which model is best for your

30
00:02:08,690 --> 00:02:10,400
specific use case.

31
00:02:10,410 --> 00:02:15,100
So let's dive into MMLU Pro.

32
00:02:15,110 --> 00:02:19,360
Here's the paper that introduced this new benchmark.

33
00:02:19,370 --> 00:02:21,300
It's from researchers at the University

34
00:02:21,310 --> 00:02:25,800
of Waterloo, University of Toronto and
Carnegie Mellon University.

35
00:02:25,810 --> 00:02:28,380
But again, why did they introduce it?

36
00:02:28,610 --> 00:02:32,280
Well, they talk about that in the paper.

37
00:02:32,830 --> 00:02:38,200
Essentially, they found that there are
three problems with the MMLU benchmark.

38
00:02:38,210 --> 00:02:42,940
And before I get into these, this isn't
necessarily something unique that these

39
00:02:42,950 --> 00:02:44,360
researchers found.

40
00:02:44,370 --> 00:02:48,540
There are always criticisms of benchmarks

41
00:02:48,550 --> 00:02:52,320
and that's one of the reasons that you're
learning about them, so that you can

42
00:02:52,330 --> 00:02:54,660
critique them and analyze them yourself.

43
00:02:54,670 --> 00:02:56,680
And these criticisms were actually quite

44
00:02:56,690 --> 00:02:59,260
well known by the community.

45
00:02:59,270 --> 00:03:02,160
It's just that these researchers actually

46
00:03:02,170 --> 00:03:04,200
decided to solve those issues.

47
00:03:04,370 --> 00:03:06,100
Okay, what are those issues now?

48
00:03:06,870 --> 00:03:12,360
First, the questions in the MMLU
benchmark only have three distractor options.

49
00:03:12,370 --> 00:03:15,380
So they're normal multiple choice
questions, right?

50
00:03:15,390 --> 00:03:21,460
They have four options to choose from, one
of which is correct: the ground

51
00:03:21,590 --> 00:03:24,000
truth, the golden answer.

52
00:03:24,010 --> 00:03:26,640
And then they had three distractor

53
00:03:27,450 --> 00:03:29,660
options, three options that were incorrect.

54
00:03:29,790 --> 00:03:32,340
Now, that means that a model basically

55
00:03:32,350 --> 00:03:36,940
can score 25% simply by guessing.

56
00:03:36,950 --> 00:03:38,640
I don't know about you, but this is why I

57
00:03:38,650 --> 00:03:42,220
always loved multiple choice questions.

58
00:03:42,230 --> 00:03:44,520
The answer is given to you and you just

59
00:03:44,530 --> 00:03:51,020
have to find the one that seems the most
right, even if you don't know that it's right.

60
00:03:51,030 --> 00:03:53,620
So how did they solve that issue?

61
00:03:53,790 --> 00:03:57,120
Well, instead of just four options, they

62
00:03:57,130 --> 00:04:00,500
created 10 options for each question.

63
00:04:00,510 --> 00:04:02,480
So it's still multiple choice questions

64
00:04:02,490 --> 00:04:10,620
in MMLU Pro, but instead of just A, B, C,
D, they have 10 different options.

65
00:04:10,630 --> 00:04:14,640
If I was taking a multiple choice test
with 10 different options, that would

66
00:04:14,650 --> 00:04:16,820
definitely prove a lot harder.

67
00:04:16,830 --> 00:04:18,700
It's going to make it much more difficult

68
00:04:18,710 --> 00:04:22,680
for me to just guess and find the right answer.
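To put numbers on that: a model guessing uniformly at random gets 1 in k questions right with k options, so the guessing floor drops from 25% to 10%. Here's a quick simulation you could run to convince yourself (my own sketch, not from the paper):

```python
import random

def random_guess_accuracy(num_options: int, trials: int = 100_000) -> float:
    """Simulate guessing uniformly at random on multiple choice questions."""
    hits = sum(random.randrange(num_options) == 0 for _ in range(trials))
    return hits / trials  # option 0 plays the role of the ground truth

print(f"4 options:  ~{random_guess_accuracy(4):.0%}")   # ~25%
print(f"10 options: ~{random_guess_accuracy(10):.0%}")  # ~10%
```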

69
00:04:22,790 --> 00:04:25,060
Okay, that seems like a good solution, right?

70
00:04:25,070 --> 00:04:28,120
And we're going to talk about that one a
bit more in a second.

71
00:04:28,130 --> 00:04:30,520
But let's go back to the issues here.

72
00:04:30,530 --> 00:04:32,420
The second issue that the researchers

73
00:04:32,430 --> 00:04:39,420
identified was that the questions in MMLU
didn't really require much thinking.

74
00:04:39,430 --> 00:04:46,380
They didn't really require chain of thought
or reasoning.

75
00:04:46,490 --> 00:04:53,520
Now, that's fine; it just means that the
questions aren't that hard, especially

76
00:04:53,530 --> 00:05:00,340
for something like a large language model
that is really good at picking pieces of

77
00:05:00,350 --> 00:05:02,240
information out, right?

78
00:05:02,250 --> 00:05:04,960
It can pull out facts and information

79
00:05:04,970 --> 00:05:10,320
much better than a human, but it's still
not great at really thinking and

80
00:05:10,330 --> 00:05:11,740
reasoning through things.

81
00:05:11,750 --> 00:05:13,660
That's what a lot of these prompting

82
00:05:13,670 --> 00:05:18,640
techniques we've learned about help it
do, to think and reason better.

83
00:05:18,650 --> 00:05:23,760
So as a result, the MMLU questions, since
they are mostly knowledge-driven rather

84
00:05:23,770 --> 00:05:30,120
than requiring reasoning, are relatively
easy for these models, especially the

85
00:05:30,130 --> 00:05:31,980
leading ones, the frontier ones.

86
00:05:31,990 --> 00:05:33,480
So how did they solve that?

87
00:05:33,490 --> 00:05:37,300
With MMLU Pro, they upped the ante.

88
00:05:37,310 --> 00:05:39,620
They made these questions require

89
00:05:39,630 --> 00:05:42,980
deliberate reasoning in order to answer.

90
00:05:42,990 --> 00:05:44,900
So the model has to really think through

91
00:05:45,650 --> 00:05:48,600
them, and we'll talk about that one a bit
more in a second too.

92
00:05:48,610 --> 00:05:54,360
But now let's turn to the third issue
with the MMLU benchmark, and that is,

93
00:05:54,370 --> 00:05:56,540
there are mistakes.

94
00:05:56,550 --> 00:06:00,000
This one is actually my favorite because,

95
00:06:00,270 --> 00:06:03,880
again, it just sort of demystifies all
this stuff.

96
00:06:03,890 --> 00:06:10,500
Everyone thinks, oh, there's this MMLU
benchmark, and it's cited in the OpenAI

97
00:06:10,510 --> 00:06:15,900
and Anthropic papers and Google papers
when they release new models.

98
00:06:15,910 --> 00:06:18,340
It must be so perfect.

99
00:06:18,670 --> 00:06:20,420
Well, no.

100
00:06:20,430 --> 00:06:25,760
When people actually started looking
closely at the MMLU benchmark, they found

101
00:06:25,770 --> 00:06:31,020
that some of the questions didn't have a
correct answer to them.

102
00:06:31,030 --> 00:06:37,160
The four options given didn't include a
correct one, or there were mistakes in

103
00:06:37,170 --> 00:06:39,500
the actual questions themselves.

104
00:06:39,510 --> 00:06:41,000
So there you go.

105
00:06:41,010 --> 00:06:42,060
Everybody makes mistakes.

106
00:06:42,070 --> 00:06:44,160
Next time you make a mistake and think,

107
00:06:44,330 --> 00:06:49,340
oh, gosh, I'm just such an idiot, well,
hey, AI researchers do it all the time

108
00:06:49,870 --> 00:06:52,040
too, so don't be down on yourself.

109
00:06:52,050 --> 00:06:56,480
So how did they solve this in MMLU Pro?

110
00:06:56,590 --> 00:07:03,180
Well, they had two rounds of expert
reviews to reduce the noise, that is, the

111
00:07:03,190 --> 00:07:06,100
incorrect answers in the dataset.

112
00:07:06,110 --> 00:07:08,340
So that's pretty straightforward.

113
00:07:08,350 --> 00:07:09,320
And so there you go.

114
00:07:09,330 --> 00:07:11,500
There are the three main issues that the

115
00:07:11,510 --> 00:07:16,700
researchers identified with MMLU, and
there are the three main solutions that

116
00:07:16,710 --> 00:07:19,440
they implemented in MMLU Pro.

117
00:07:19,450 --> 00:07:21,500
Now let's dive into the details a bit

118
00:07:21,550 --> 00:07:23,900
here, because that's where all the fun
stuff is, right?

119
00:07:23,910 --> 00:07:27,900
That's where we get into the nitty-gritty
and the real application of

120
00:07:27,910 --> 00:07:31,220
prompt engineering to the real world.

121
00:07:31,230 --> 00:07:34,500
So the first thing to understand is that

122
00:07:34,510 --> 00:07:40,360
MMLU Pro actually reuses a lot of the
questions in MMLU.

123
00:07:40,370 --> 00:07:45,880
So you can see here, this is the
discipline, so math, physics, chemistry, etc.

124
00:07:45,890 --> 00:07:54,480
And the number of questions that MMLU Pro
has, so MMLU Pro has 1,351 math

125
00:07:54,490 --> 00:07:59,720
questions in it, and 1,299 physics
questions, and so on and so forth.

126
00:07:59,730 --> 00:08:04,080
And in total, it has 12,032 questions.

127
00:08:04,090 --> 00:08:07,700
That's actually fewer than MMLU, which had

128
00:08:07,710 --> 00:08:09,880
about 15,000 questions.

129
00:08:09,890 --> 00:08:12,500
But the key point to understand here is

130
00:08:12,510 --> 00:08:22,240
that of those 12,000 questions in MMLU
Pro, 6,810 of them came from the

131
00:08:22,250 --> 00:08:24,120
original MMLU benchmark.

132
00:08:24,130 --> 00:08:28,040
You can see that in this column here.

133
00:08:28,050 --> 00:08:37,120
And then they added 5,222 questions that
are brand new to the MMLU Pro benchmark.

134
00:08:37,130 --> 00:08:45,040
And now you might be asking, hold on a
second, going from 4 options per multiple

135
00:08:45,050 --> 00:08:50,080
choice question to 10 is actually a
pretty big task.

136
00:08:50,090 --> 00:08:56,720
Can you imagine taking thousands of
questions and adding 6 options that

137
00:08:56,730 --> 00:09:03,740
should sound right, like they can't be
completely unrelated to the question, but

138
00:09:03,750 --> 00:09:05,680
they also can't be correct.

139
00:09:05,690 --> 00:09:08,160
They have to have some sort of flaw in them.

140
00:09:08,170 --> 00:09:13,880
So how would you go about solving that?

141
00:09:13,890 --> 00:09:17,940
Well, if your answer is use an LLM, then

142
00:09:17,950 --> 00:09:21,620
that's exactly what the researchers did too.

143
00:09:21,630 --> 00:09:23,640
They took all the multiple choice

144
00:09:23,650 --> 00:09:31,720
questions and then used GPT-4 Turbo to
add 6 additional options.

145
00:09:31,730 --> 00:09:34,140
So there you go, again, demystifying this.

146
00:09:34,150 --> 00:09:35,980
There's no magic to it.

147
00:09:35,990 --> 00:09:37,920
You could do this yourself.

148
00:09:37,930 --> 00:09:40,880
You could take this MMLU Pro benchmark

149
00:09:40,890 --> 00:09:46,340
and make it 20 options for every
question, and therefore make it even

150
00:09:46,350 --> 00:09:48,300
harder for the models.

151
00:09:48,310 --> 00:09:49,560
And you know what?

152
00:09:49,570 --> 00:09:55,000
If you wanted to do that, here's the
prompt that they used to create those

153
00:09:55,010 --> 00:09:57,960
6 additional options.

154
00:09:57,970 --> 00:10:00,280
So you can see it starts up here.

155
00:10:00,290 --> 00:10:03,500
I have a multiple choice question with 4
options, 1 of which is correct.

156
00:10:03,510 --> 00:10:07,620
I need to expand it to 10 options.

157
00:10:07,630 --> 00:10:10,400
Please generate 6 additional plausible

158
00:10:10,410 --> 00:10:13,040
but incorrect options.

159
00:10:13,050 --> 00:10:15,700
And then you can see here it's labeled 1-shot.

160
00:10:15,710 --> 00:10:16,800
So this is the shot.

161
00:10:16,810 --> 00:10:18,600
It has the input.

162
00:10:18,610 --> 00:10:21,280
This is the question and the 4 options.

163
00:10:21,290 --> 00:10:26,000
And it says what the answer is.

164
00:10:26,010 --> 00:10:30,040
And then it's giving an exemplar, a shot,
the output.

165
00:10:30,050 --> 00:10:31,020
This is what it wants.

166
00:10:31,030 --> 00:10:38,780
It wants 6 additional generated options: E, F, G, H, I, J.

167
00:10:38,790 --> 00:10:39,940
There you go.

168
00:10:39,950 --> 00:10:42,040
Easy peasy, right?
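In fact, if you wanted to try that option-expansion step yourself, here's a minimal sketch of what it could look like using the OpenAI Python SDK. The instruction text is the one from the paper's prompt above; the model string, the function shape, and the omission of the paper's one-shot exemplar are my own simplifications:

```python
# A minimal sketch of the option-expansion step, assuming the OpenAI
# Python SDK. The instruction text comes from the paper's prompt; the
# rest is my own simplification (the paper's actual prompt was 1-shot,
# i.e. it also included the worked input/output exemplar shown above).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

INSTRUCTION = (
    "I have a multiple choice question with 4 options, 1 of which is "
    "correct. I need to expand it to 10 options. Please generate 6 "
    "additional plausible but incorrect options."
)

def expand_options(question: str, options: list[str], answer: str) -> str:
    """Ask the model for 6 extra distractors (E-J) for one question."""
    formatted = "\n".join(f"{letter}. {text}"
                          for letter, text in zip("ABCD", options))
    prompt = f"{INSTRUCTION}\n\nQuestion: {question}\n{formatted}\nAnswer: {answer}"
    response = client.chat.completions.create(
        model="gpt-4-turbo",  # the paper used GPT-4 Turbo; exact ID assumed
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```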

169
00:10:42,090 --> 00:10:44,680
Okay, let's keep going through this
because I really love breaking this stuff

170
00:10:44,690 --> 00:10:49,140
down for you so that we can really
sort of understand it at a much deeper

171
00:10:49,150 --> 00:10:55,020
level than, of course, the average person
who uses ChatGPT or some other model.

172
00:10:55,030 --> 00:10:59,860
But frankly, you're starting to
understand these benchmarks more than a

173
00:10:59,870 --> 00:11:04,220
lot of people working in the AI space.

174
00:11:04,230 --> 00:11:08,140
So nice work and let's keep going.

175
00:11:08,150 --> 00:11:13,980
So if you recall, MMLU was 5-shot.

176
00:11:14,330 --> 00:11:16,820
Remember, it included 5 shots and then

177
00:11:16,830 --> 00:11:17,960
the question.

178
00:11:18,190 --> 00:11:20,740
Well, since we want to compare apples to

179
00:11:20,750 --> 00:11:27,520
apples in a lot of cases, right, MMLU Pro
also uses 5-shot.

180
00:11:27,530 --> 00:11:32,900
But interestingly, it also uses chain of thought.

181
00:11:32,910 --> 00:11:34,400
And that makes sense because remember,

182
00:11:34,410 --> 00:11:41,200
one of the main things about MMLU Pro was
that the questions were more reasoning-based.

183
00:11:41,210 --> 00:11:46,080
They require the model to think through
the question and the answer more so than

184
00:11:46,090 --> 00:11:49,240
was the case in MMLU.

185
00:11:49,250 --> 00:11:51,820
Let's see what that actually looks like here.

186
00:11:51,830 --> 00:11:59,140
So this is an example of one question in
the MMLU Pro benchmark.

187
00:11:59,150 --> 00:12:01,480
In fact, the question isn't actually even here.

188
00:12:01,490 --> 00:12:03,780
There's just a variable saving its spot

189
00:12:03,790 --> 00:12:04,880
for the question.

190
00:12:04,890 --> 00:12:06,920
Let me show you what I mean here.

191
00:12:06,930 --> 00:12:12,180
So we've got the instruction here at the
top in yellow.

192
00:12:12,190 --> 00:12:15,760
The following are multiple choice
questions with answers about physics.

193
00:12:15,770 --> 00:12:19,740
Think step by step and then finish your
answer with "the answer is X," where X is

194
00:12:19,750 --> 00:12:22,020
the correct answer.

195
00:12:22,030 --> 00:12:23,980
Simple enough, right?
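By the way, that fixed closing phrase is what makes automated scoring easy: the grader just has to pull the letter back out of the model's completion. Here's a minimal sketch of that extraction step (my own illustration, not the benchmark's actual harness):

```python
import re

def extract_answer(completion: str) -> str | None:
    """Pull the letter out of an '... the answer is (H)' style ending."""
    match = re.search(r"answer is \(?([A-J])\)?", completion)
    return match.group(1) if match else None

print(extract_answer("... so the answer is (H)."))  # prints: H
```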

196
00:12:23,990 --> 00:12:26,580
And then we've got 5 shots.

197
00:12:26,590 --> 00:12:28,000
Let's count them.

198
00:12:28,330 --> 00:12:35,350
1, 2, 3, 4, 5.

199
00:12:35,360 --> 00:12:38,550
And let's look at this first one here a
little bit closer.

200
00:12:38,560 --> 00:12:43,590
You can see, okay, it's a question about
a refracting telescope, and then it has the

201
00:12:43,600 --> 00:12:43,890
options here.

202
00:12:43,900 --> 00:12:46,370
And you can see there are 10 different

203
00:12:46,380 --> 00:12:48,630
options, 10 different answers you can
choose from.

204
00:12:48,800 --> 00:12:52,510
A, B, C, D, E, F, G, H, I, J.

205
00:12:52,520 --> 00:12:56,190
And then it has an answer that says let's

206
00:12:56,200 --> 00:12:59,510
think step by step and then goes through
the thinking process.

207
00:12:59,520 --> 00:13:03,150
In a refracting telescope, if both lenses
are converging, blah, blah, blah.

208
00:13:03,160 --> 00:13:06,490
And then it ends with "the answer is (H)."

209
00:13:06,560 --> 00:13:08,250
So again, this is a shot.

210
00:13:08,260 --> 00:13:14,090
So it's showing how to think through the question.

211
00:13:14,160 --> 00:13:16,310
All right, so we have those 5 shots and

212
00:13:16,320 --> 00:13:19,090
then we have the actual question.

213
00:13:19,100 --> 00:13:23,270
Like I said, this is just a variable

214
00:13:23,280 --> 00:13:28,530
waiting for the question and the options
to be inserted.

215
00:13:28,540 --> 00:13:34,610
But that is where the actual MMLU Pro
question would go.

216
00:13:34,620 --> 00:13:38,390
And now I have a quick little exercise
here I want you to do.

217
00:13:38,400 --> 00:13:46,010
I want you to pause the video in a moment
and just think about what type of chain

218
00:13:46,020 --> 00:13:49,370
of thought is being used here.

219
00:13:49,380 --> 00:13:53,690
Pause the video, look at it, think it

220
00:13:53,700 --> 00:13:56,850
through, and then come back.

221
00:13:56,920 --> 00:13:58,310
All right, welcome back.

222
00:13:58,320 --> 00:14:01,570
Did you think step by step?

223
00:14:01,580 --> 00:14:04,590
Okay, that's a pretty good joke.

224
00:14:04,600 --> 00:14:07,350
We were talking about chain of thought,
think step by step, you get it?

225
00:14:07,360 --> 00:14:09,490
Okay, perfect.

226
00:14:09,500 --> 00:14:12,290
Well, in fact, I wanted you to do that

227
00:14:12,300 --> 00:14:17,570
because, interestingly, the researchers
here have basically thrown all the

228
00:14:17,580 --> 00:14:21,770
chain of thought that they possibly can
into this prompt.

229
00:14:21,780 --> 00:14:26,290
You can see in the actual instruction,
they say, think step by step.

230
00:14:26,300 --> 00:14:29,150
That's zero-shot chain of thought, right?

231
00:14:29,160 --> 00:14:35,430
But then in each of the shots, they also

232
00:14:35,440 --> 00:14:39,770
give what I'll call normal chain of
thought, where they actually walk through

233
00:14:39,780 --> 00:14:48,580
the thinking process that they want the
model to follow, right?

234
00:14:48,590 --> 00:14:54,860
That's the answer field in each of these shots.

235
00:14:54,870 --> 00:14:59,540
And even more so, in the actual question

236
00:14:59,550 --> 00:15:05,860
that they want the model to answer,
they've pre-populated the answer with,

237
00:15:05,870 --> 00:15:09,360
again, zero-shot chain of thought.

238
00:15:09,370 --> 00:15:11,040
So, that's kind of interesting.

239
00:15:11,050 --> 00:15:20,240
They've really thrown everything at this
to get the model to use chain of thought.
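To pull that all together, here's roughly how you could assemble this 5-shot chain-of-thought template yourself in Python. It's a sketch with my own names and approximate wording, not the authors' actual code, but it shows all three chain-of-thought placements: the instruction, the worked shots, and the pre-populated answer opener.

```python
# A rough sketch of the 5-shot chain-of-thought prompt assembly; the
# function and field names are my own, not the authors' code.
LETTERS = "ABCDEFGHIJ"

INSTRUCTION = (
    "The following are multiple choice questions with answers about "
    "{subject}. Think step by step and then finish your answer with "
    '"the answer is (X)" where X is the correct answer.'
)

def format_question(question: str, options: list[str]) -> str:
    opts = "\n".join(f"{lab}. {opt}" for lab, opt in zip(LETTERS, options))
    return f"Question: {question}\n{opts}"

def build_prompt(subject, shots, question, options):
    """shots: a list of (question, options, reasoning, answer_letter) tuples."""
    parts = [INSTRUCTION.format(subject=subject)]
    for q, opts, reasoning, letter in shots:  # the 5 worked exemplars
        parts.append(
            format_question(q, opts)
            + f"\nAnswer: Let's think step by step. {reasoning} "
            f"The answer is ({letter})."
        )
    # the real test item goes where the template's variable sat, with the
    # answer pre-populated with zero-shot chain of thought
    parts.append(format_question(question, options)
                 + "\nAnswer: Let's think step by step.")
    return "\n\n".join(parts)
```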

240
00:15:20,250 --> 00:15:23,220
Now, maybe this is cheating a bit, right?

241
00:15:23,230 --> 00:15:25,060
Maybe it's cheating to use chain of

242
00:15:25,070 --> 00:15:27,300
thought inside the prompt.

243
00:15:27,310 --> 00:15:29,740
And if you're trying to make this

244
00:15:29,750 --> 00:15:36,180
benchmark harder than MMLU by introducing
more reasoning tasks, well, maybe you

245
00:15:36,190 --> 00:15:42,500
shouldn't have actually introduced chain
of thought prompting into the benchmark.

246
00:15:42,510 --> 00:15:45,520
That's just something for you to think about.

247
00:15:45,530 --> 00:15:50,120
But here is the key aspect of MMLU Pro,

248
00:15:50,130 --> 00:15:53,960
and that is the results.

249
00:15:53,970 --> 00:15:57,200
The key is that it's harder for these

250
00:15:57,210 --> 00:16:01,040
models to answer these questions than MMLU's.

251
00:16:01,050 --> 00:16:03,460
So, you can see here along the x-axis,

252
00:16:03,470 --> 00:16:08,680
we've got three different models, and on
the y-axis, we have the accuracy score.

253
00:16:08,690 --> 00:16:15,960
All these orange bars are what these
models scored on the MMLU benchmark, and

254
00:16:15,970 --> 00:16:22,020
these blue bars are what those same models
scored on the MMLU Pro benchmark.

255
00:16:22,030 --> 00:16:31,920
So, you can see with GPT-4o: it's harder
for that model to deal with the MMLU Pro questions.

256
00:16:31,930 --> 00:16:38,760
Same with Llama-3 70B and
Gemma 7B.

257
00:16:38,770 --> 00:16:44,760
So, in that sense, the benchmark is
successful because researchers were just

258
00:16:44,770 --> 00:16:46,720
finding these models were getting too
good at it.

259
00:16:46,730 --> 00:16:55,300
You can see, GPT-4o was almost at 90%
accuracy on MMLU.

260
00:16:55,310 --> 00:17:03,460
The models are getting a little too smart
at these sorts of multiple choice question benchmarks.

261
00:17:03,470 --> 00:17:06,780
So, we're going to have to continually
make them harder.

262
00:17:06,850 --> 00:17:13,660
Right now, that's MMLU Pro, but no doubt
in the future, we're going to see another

263
00:17:13,670 --> 00:17:17,740
benchmark that makes things even harder.

264
00:17:17,750 --> 00:17:20,700
Like I said, it could possibly be you

265
00:17:20,710 --> 00:17:27,680
creating that next benchmark by going
from 10 options for every question to 20

266
00:17:27,690 --> 00:17:31,460
options for every question, or maybe
there are other ways you can make these

267
00:17:31,770 --> 00:17:37,240
questions harder, like removing the
zero-shot chain of thought from the prompt.

268
00:17:37,250 --> 00:17:42,180
Again, this knowledge is meant to
future-proof you so that you understand both

269
00:17:42,190 --> 00:17:48,200
how these benchmarks work and where
they're going in the future.

270
00:17:48,210 --> 00:17:53,800
Alright, and with that, you are now a
benchmark pro.

271
00:17:54,030 --> 00:17:55,040
Get it?

272
00:17:55,530 --> 00:17:56,320
MMLU Pro.

273
00:17:57,050 --> 00:17:57,900
Benchmark Pro.

274
00:17:57,910 --> 00:18:00,720
Okay, that was another really, really

275
00:18:00,990 --> 00:18:01,540
good joke.

276
00:18:01,610 --> 00:18:02,040
Thank you.

277
00:18:02,050 --> 00:18:02,960
You're welcome very much.

278
00:18:03,090 --> 00:18:03,140
Thank you, Scott.



