1
00:00:08,541 --> 00:00:09,843
KARIN GICHUHI:
Good afternoon.

2
00:00:09,843 --> 00:00:11,511
My name is Karin Gichuhi.

3
00:00:11,511 --> 00:00:14,347
I am the team lead for
Strategic Information in

4
00:00:14,347 --> 00:00:17,550
the Office of
HIV/AIDS at USAID.

5
00:00:17,550 --> 00:00:19,686
I wanted to start
this presentation by

6
00:00:19,686 --> 00:00:22,589
acknowledging three other
co-presenters who couldn't

7
00:00:22,589 --> 00:00:25,458
join me today, and their
names are listed below.

8
00:00:25,458 --> 00:00:28,428
They're all members of the
division in which I work:

9
00:00:28,428 --> 00:00:30,764
Aaron Chafetz, who is one
of my Data Visualization

10
00:00:30,764 --> 00:00:35,869
Specialists; Ramona
Godbole, who is one of our

11
00:00:35,869 --> 00:00:38,071
Expenditure Analysis
advisors; and, Catherine

12
00:00:38,071 --> 00:00:41,441
Nichols, who is one of the
data analysts on my team.

13
00:00:41,441 --> 00:00:43,343
So I didn't put all of
these slides together

14
00:00:43,343 --> 00:00:45,378
myself and I wanted to
acknowledge them before I

15
00:00:45,378 --> 00:00:47,447
get started.

16
00:00:47,447 --> 00:00:51,017
Today I want to talk a
little bit about being a

17
00:00:51,017 --> 00:00:54,487
critical consumer
of information.

18
00:00:54,487 --> 00:00:57,223
And I titled this
presentation: Evidence: It

19
00:00:57,223 --> 00:00:59,592
is all about
interpretation?

20
00:00:59,592 --> 00:01:00,960
And I want you to think
about that as we go

21
00:01:00,960 --> 00:01:02,295
through today's
presentation.

22
00:01:02,295 --> 00:01:08,435
When we talk about being a
consumer of information,

23
00:01:08,435 --> 00:01:09,969
what do I mean by that?

24
00:01:09,969 --> 00:01:11,104
Many of you in the
audience today are in

25
00:01:11,104 --> 00:01:14,140
school, and you may be
working with data sets or

26
00:01:14,140 --> 00:01:16,609
tasked with data analysis;
others of you are in the

27
00:01:16,609 --> 00:01:18,244
workplace and you
might be having to make

28
00:01:18,244 --> 00:01:20,480
multimillion dollar
decisions based off of

29
00:01:20,480 --> 00:01:23,616
information you have ten
minutes to review, and so

30
00:01:23,616 --> 00:01:25,785
I wanted to spend a little
bit of time talking about

31
00:01:25,785 --> 00:01:29,255
the responsibility we have
when reviewing data or

32
00:01:29,255 --> 00:01:31,090
putting data together.

33
00:01:31,090 --> 00:01:34,594
Increasingly, we have
access to more data, or

34
00:01:34,594 --> 00:01:37,630
what we call strategic
information, and even

35
00:01:37,630 --> 00:01:40,200
reading the newspaper
or your Facebook account

36
00:01:40,200 --> 00:01:42,035
presents you with a
challenge on whether or

37
00:01:42,035 --> 00:01:44,904
not to reference the
information shared with

38
00:01:44,904 --> 00:01:47,106
you as you go
about your day.

39
00:01:47,106 --> 00:01:50,510
It becomes increasingly
difficult to discern if

40
00:01:50,510 --> 00:01:53,880
you are in the presence
of useful or useable or

41
00:01:53,880 --> 00:01:57,784
validated data, or what I
like to call data vomit.

42
00:01:57,784 --> 00:02:01,988
I put my information in
two categories of data vomit.

43
00:02:01,988 --> 00:02:04,324
There's the situation
where you have more

44
00:02:04,324 --> 00:02:06,359
information than you even
know what to do with,

45
00:02:06,359 --> 00:02:08,961
that's one form of data
vomit, and then the other

46
00:02:08,961 --> 00:02:11,731
group of information that
I like to call data vomit

47
00:02:11,731 --> 00:02:13,800
is where people have
pieced pieces of

48
00:02:13,800 --> 00:02:16,669
information together to
tell the story that they

49
00:02:16,669 --> 00:02:20,974
want to be told.

50
00:02:20,974 --> 00:02:23,676
I boiled down a critical
consumer of information

51
00:02:23,676 --> 00:02:25,211
into three pieces.

52
00:02:25,211 --> 00:02:29,215
It's about being curious,
it's about being clear

53
00:02:29,215 --> 00:02:31,651
with the question that
you want to answer, and

54
00:02:31,651 --> 00:02:33,586
there's also a
responsibility about

55
00:02:33,586 --> 00:02:35,889
maintaining
data integrity.

56
00:02:35,889 --> 00:02:40,793
So how I apply these
concepts to work, and what

57
00:02:40,793 --> 00:02:43,796
I encourage everyone on my
team and the people I work

58
00:02:43,796 --> 00:02:46,666
with to do, really
revolves around the fact

59
00:02:46,666 --> 00:02:49,802
that when reviewing
information it's important

60
00:02:49,802 --> 00:02:53,540
to ask questions about the
data source, about the

61
00:02:53,540 --> 00:02:56,476
methodology, and
understand how the

62
00:02:56,476 --> 00:03:00,613
summarizing statement
was determined.

63
00:03:00,613 --> 00:03:02,849
Not all data sets or
summary information is

64
00:03:02,849 --> 00:03:06,219
even appropriate to talk
about related to the

65
00:03:06,219 --> 00:03:09,355
question somebody wants
answered, and maintaining

66
00:03:09,355 --> 00:03:11,691
data integrity means
don't cherry pick the

67
00:03:11,691 --> 00:03:15,995
information to
tell your story.

68
00:03:15,995 --> 00:03:20,266
I wanted to share a few
recent headlines that I

69
00:03:20,266 --> 00:03:23,970
reviewed last September
and they all have to do

70
00:03:23,970 --> 00:03:26,739
with real life
examples of this.

71
00:03:26,739 --> 00:03:29,509
In the first slide, you'll
notice that I mentioned a

72
00:03:29,509 --> 00:03:31,978
number of individual
decisions we make

73
00:03:31,978 --> 00:03:33,313
throughout the day.

74
00:03:33,313 --> 00:03:35,782
Whether you're grocery
shopping or determining

75
00:03:35,782 --> 00:03:38,484
which doctor to go to, a
lot of information is used

76
00:03:38,484 --> 00:03:40,687
to make financial based
decisions: what should I

77
00:03:40,687 --> 00:03:43,990
fund, what should I buy,
but we also need to

78
00:03:43,990 --> 00:03:47,827
consider the
responsibility of

79
00:03:47,827 --> 00:03:51,864
distilling down the
results of a study to a

80
00:03:51,864 --> 00:03:52,999
single headline.

81
00:03:52,999 --> 00:03:55,234
And I wanted to highlight
a few things that came up

82
00:03:55,234 --> 00:03:56,402
recently.

83
00:03:56,402 --> 00:03:59,005
The first, in September
of last year, the Census

84
00:03:59,005 --> 00:04:01,274
Bureau released their
latest report indicating

85
00:04:01,274 --> 00:04:03,943
that household median
income grew in urban

86
00:04:03,943 --> 00:04:06,946
areas, but declined
in rural areas.

87
00:04:06,946 --> 00:04:09,649
This was in five of the
major newspapers, yet the

88
00:04:09,649 --> 00:04:12,251
data set, released
separately and in a

89
00:04:12,251 --> 00:04:15,221
separate report days
later, showed that this

90
00:04:15,221 --> 00:04:16,154
was not the case.

91
00:04:16,154 --> 00:04:20,559
We hear over and over
that baby boomers and

92
00:04:20,560 --> 00:04:22,929
millennials ruin
everything.

93
00:04:22,929 --> 00:04:25,565
In a VOX article they
quoted that "baby boomers

94
00:04:25,565 --> 00:04:27,834
and millennials also ruin
the census bureau data

95
00:04:27,834 --> 00:04:30,570
set," because when you
looked at the inputs into

96
00:04:30,570 --> 00:04:33,906
this data set and who was
monitored over time and

97
00:04:33,906 --> 00:04:37,810
the impact on the
median income as it was

98
00:04:37,810 --> 00:04:40,913
calculated, baby boomers
and millennials played a

99
00:04:40,913 --> 00:04:44,417
big role but that wasn't
able to be captured here.

100
00:04:44,417 --> 00:04:48,054
How we defined urban and
rural also had work to be

101
00:04:48,054 --> 00:04:49,555
done.

102
00:04:49,555 --> 00:04:51,691
But you wouldn't know this
because in many of the

103
00:04:51,691 --> 00:04:54,694
larger released studies
they don't associate the

104
00:04:54,694 --> 00:04:59,732
data set with what's
being reported.

105
00:04:59,732 --> 00:05:01,801
The sugar industry.

106
00:05:01,801 --> 00:05:04,037
As universities and
research organizations

107
00:05:04,037 --> 00:05:06,806
rely on funding to
initiate and continue

108
00:05:06,806 --> 00:05:09,742
research, there are also
responsibilities within

109
00:05:09,742 --> 00:05:12,145
research organizations to
ensure that conflicts of

110
00:05:12,145 --> 00:05:15,715
interest are not present
even prior to research

111
00:05:15,715 --> 00:05:17,050
being funded.

112
00:05:17,050 --> 00:05:21,187
So, in this issue, we have
the sugar industry who

113
00:05:21,187 --> 00:05:23,956
hired two Harvard
nutritionists back in the

114
00:05:23,956 --> 00:05:27,927
day and explained the
outcome that they wanted.

115
00:05:27,927 --> 00:05:30,430
So they went and did a
literature review and

116
00:05:30,430 --> 00:05:33,666
downplayed the role that
sugar played on coronary

117
00:05:33,666 --> 00:05:36,302
and heart disease and came
up with the answer that it

118
00:05:36,302 --> 00:05:40,273
was all about fat and
cholesterol, and you know

119
00:05:40,273 --> 00:05:42,008
the rest of that story.

120
00:05:42,008 --> 00:05:45,244
But, more recently, HHS
issued new rules to open

121
00:05:45,244 --> 00:05:47,080
up data from
clinical trials.

122
00:05:47,080 --> 00:05:50,016
This has become such a
glaring issue, we're

123
00:05:50,016 --> 00:05:53,286
making a lot of monetary
decisions about what to

124
00:05:53,286 --> 00:05:55,688
fund and in which
direction to go to based

125
00:05:55,688 --> 00:05:58,491
on the results of studies
without access

126
00:05:58,491 --> 00:05:59,158
to the data set.

127
00:05:59,158 --> 00:06:01,828
So while the rule has been
in place, it hasn't been

128
00:06:01,828 --> 00:06:04,497
enforced and so it's
encouraging to see that

129
00:06:04,497 --> 00:06:06,999
that's moving forward.

130
00:06:06,999 --> 00:06:09,435
I list these examples
because I really want to

131
00:06:09,435 --> 00:06:13,239
highlight that it's not
mission impossible to

132
00:06:13,239 --> 00:06:16,642
understand what you're
looking at, but we have a

133
00:06:16,642 --> 00:06:19,245
responsibility, if we're
working with information,

134
00:06:19,245 --> 00:06:22,115
to select the appropriate
data visualization when

135
00:06:22,115 --> 00:06:24,517
presenting our analysis,
and I'm going to talk

136
00:06:24,517 --> 00:06:27,620
today about the same data,
different messages when

137
00:06:27,620 --> 00:06:31,090
using visualization and,
when presenting data, be

138
00:06:31,090 --> 00:06:34,494
clear about what the data
does and does not tell you.

139
00:06:34,494 --> 00:06:35,161
Be honest.

140
00:06:35,161 --> 00:06:38,131
A lot of times, when we
look at information, it's

141
00:06:38,131 --> 00:06:40,166
not going to answer the
questions that we have.

142
00:06:40,166 --> 00:06:44,904
It may encourage us to ask
additional questions and

143
00:06:44,904 --> 00:06:52,945
that's okay and that's
what we want to see happen.

144
00:06:52,945 --> 00:06:56,382
So here I have pulled a
graphic from The Sun, and

145
00:06:56,382 --> 00:06:58,151
this is a recent article.

146
00:06:58,151 --> 00:07:02,054
And what I want to draw
your attention to here is

147
00:07:02,054 --> 00:07:06,793
this data visualization
piece in the middle.

148
00:07:06,793 --> 00:07:09,162
There's a benefit in
making it easier to look

149
00:07:09,162 --> 00:07:12,165
at the results of an
election poll, but if you

150
00:07:12,165 --> 00:07:16,102
can see, from the front
page here, you'll notice

151
00:07:16,102 --> 00:07:19,939
that UKIP, the purple
that's got a red ring

152
00:07:19,939 --> 00:07:24,377
around it, received zero
seats, which is greater in

153
00:07:24,377 --> 00:07:28,447
the donut chart than the other
section which received 22

154
00:07:28,447 --> 00:07:37,223
seats and throws off the
rest of the proportions.

155
00:07:37,223 --> 00:07:39,025
So what?

156
00:07:39,025 --> 00:07:40,960
That was just a
silly mistake.

157
00:07:40,960 --> 00:07:41,961
Maybe.

158
00:07:41,961 --> 00:07:44,997
But I would argue that
oftentimes there are

159
00:07:44,997 --> 00:07:47,733
people and businesses who
are using visualization

160
00:07:47,733 --> 00:07:50,937
principles to influence
you and sell you on things

161
00:07:50,937 --> 00:07:52,772
every day.

162
00:07:52,772 --> 00:07:55,241
So we can look at this
wall of sauces here in the

163
00:07:55,241 --> 00:07:58,144
grocery store which may
look innocent and someone

164
00:07:58,144 --> 00:08:00,179
in the store simply
stocked these in random

165
00:08:00,179 --> 00:08:02,481
order, but we know that
companies pay store

166
00:08:02,481 --> 00:08:05,484
premiums to place their
products in the middle of

167
00:08:05,484 --> 00:08:07,887
the aisle so that you'll
see them first, and as

168
00:08:07,887 --> 00:08:10,022
you're going down the
aisle, you're more likely

169
00:08:10,022 --> 00:08:13,759
to pick those items.

170
00:08:13,759 --> 00:08:15,728
So we're being manipulated
into what we see and

171
00:08:15,728 --> 00:08:17,663
purchase without
recognizing the forces at

172
00:08:17,663 --> 00:08:20,900
play here, but that's why
it's important for all of

173
00:08:20,900 --> 00:08:29,208
us to be critical
consumers of data.

174
00:08:29,208 --> 00:08:32,345
So what's the value
in visualizing data?

175
00:08:32,345 --> 00:08:34,847
First, it helps us to
explore and understand the

176
00:08:34,847 --> 00:08:37,015
data that we're working
with, and secondly, it

177
00:08:37,015 --> 00:08:40,151
helps communicate our
message and our idea to a

178
00:08:40,152 --> 00:08:41,888
broader audience.

179
00:08:41,888 --> 00:08:44,824
So this is where this
interplay of using your

180
00:08:44,824 --> 00:08:47,126
visualization for good
versus evil comes into

181
00:08:47,126 --> 00:08:50,029
play because you can take
the same information and

182
00:08:50,029 --> 00:08:52,498
display it in multiple
different ways to ensure

183
00:08:52,498 --> 00:08:57,503
that the decision you want
to have happen, happens.

184
00:08:57,503 --> 00:09:01,841
I don't know how many
of you have heard of

185
00:09:01,841 --> 00:09:03,676
Anscombe's quartet.

186
00:09:03,676 --> 00:09:06,545
This is what I'm showing
on the board right here.

187
00:09:06,545 --> 00:09:12,952
In 1973, Francis
Anscombe showed that while

188
00:09:12,952 --> 00:09:17,390
the data sets here appear
to be pretty similar, when

189
00:09:17,390 --> 00:09:20,526
we plot the data sets on
an XY coordinate plane we

190
00:09:20,526 --> 00:09:23,095
get the following results.

191
00:09:23,095 --> 00:09:26,232
So, looking at summary
statistics, this looks the

192
00:09:26,232 --> 00:09:28,968
same in the chart, but
now, in section 1, I've

193
00:09:28,968 --> 00:09:31,404
got a rough linear
relationship with

194
00:09:31,404 --> 00:09:35,574
variance, and section 2,
it fits a neat curve, but

195
00:09:35,574 --> 00:09:36,475
it's not linear.

196
00:09:36,475 --> 00:09:40,313
In section 3 it's tight
linear with one outlier,

197
00:09:40,313 --> 00:09:43,883
and in the fourth example
here, X remains a constant

198
00:09:43,883 --> 00:09:45,685
but there's one outlier.

199
00:09:45,685 --> 00:09:48,154
And if I just looked at
the summary statistics - I

200
00:09:48,154 --> 00:09:52,058
don't know if they display
easily enough, but they're

201
00:09:52,058 --> 00:09:54,627
very similar values, and
I wouldn't be able to

202
00:09:54,627 --> 00:09:56,729
understand the
relationship of these

203
00:09:56,729 --> 00:10:00,833
values without
plotting them.

204
00:10:00,833 --> 00:10:02,768
Visualizing data can
reveal trends and

205
00:10:02,768 --> 00:10:05,371
relationships, so if I
give a decision maker the

206
00:10:05,371 --> 00:10:07,840
raw data on the left, that
will take them a lot of

207
00:10:07,840 --> 00:10:11,577
time to go through and
also be very frustrating,

208
00:10:11,577 --> 00:10:14,981
but now I can communicate
a message to my audience

209
00:10:14,981 --> 00:10:17,049
by plotting the
information in a way

210
00:10:17,049 --> 00:10:18,718
that's more digestible.

211
00:10:18,718 --> 00:10:20,953
These are all common sense
pieces, but I wanted to

212
00:10:20,953 --> 00:10:24,657
highlight that right now
many of us use Excel, and we

213
00:10:24,657 --> 00:10:27,526
have macro buttons that we
can pick and choose the

214
00:10:27,526 --> 00:10:29,729
different ways you want to
visualize data, but there

215
00:10:29,729 --> 00:10:33,165
are also lots of resources
out there called "visual

216
00:10:33,165 --> 00:10:35,668
vocabulary tabs" or lists.

217
00:10:35,668 --> 00:10:38,237
This is one of my favorites
from the Financial Times.

218
00:10:38,237 --> 00:10:42,808
Ann Emery, Stephanie
Evergreen, a lot of people

219
00:10:42,808 --> 00:10:45,878
give you options and
little tutorials on how to

220
00:10:45,878 --> 00:10:48,748
use the best graph or
chart with the data that

221
00:10:48,748 --> 00:10:57,790
you have.

222
00:10:57,790 --> 00:10:59,325
As I mentioned in
the opening of the

223
00:10:59,325 --> 00:11:02,695
presentation, data and
statistics, in particular,

224
00:11:02,695 --> 00:11:04,964
are getting a bad rap
these days, and I would

225
00:11:04,964 --> 00:11:07,266
argue that, yes,
methodology can be flawed,

226
00:11:07,266 --> 00:11:09,535
but decisions are being
made by individuals who

227
00:11:09,535 --> 00:11:13,472
are not critical
consumers of information.

228
00:11:13,472 --> 00:11:16,108
What if the methodology
was sound and the data was

229
00:11:16,108 --> 00:11:18,878
just presented in a way
that made the decision

230
00:11:18,878 --> 00:11:23,649
maker, who had ten minutes
of free time, enabled them

231
00:11:23,649 --> 00:11:26,886
to make a quick decision
and a decision that was

232
00:11:26,886 --> 00:11:33,125
made easy based upon
stellar visualization?

233
00:11:33,125 --> 00:11:35,261
As consumers of data, we
should try to understand

234
00:11:35,261 --> 00:11:37,063
the purpose and
biases of the data.

235
00:11:37,063 --> 00:11:40,066
You should be thinking
about accuracy, data

236
00:11:40,066 --> 00:11:45,438
sources, uses, data
quality and message, but I

237
00:11:45,438 --> 00:11:47,673
do want to highlight
some of the common data

238
00:11:47,673 --> 00:11:50,709
visualization lies that
we often come across.

239
00:11:50,709 --> 00:11:53,746
Some are well intentioned
and some are done on

240
00:11:53,746 --> 00:11:55,247
purpose.

241
00:11:55,247 --> 00:11:57,683
So in the abstract, these
questions seem difficult

242
00:11:57,683 --> 00:12:00,252
for us to answer but let's
start simple and look at a

243
00:12:00,252 --> 00:12:04,690
handful of
common examples.

244
00:12:04,690 --> 00:12:07,893
Right here I have a slide
with the title Impressive

245
00:12:07,893 --> 00:12:10,563
growth in testing volume
over the past four

246
00:12:10,563 --> 00:12:11,530
quarters.

247
00:12:11,530 --> 00:12:15,134
So on the X axis, I'm looking
at time periods of

248
00:12:15,134 --> 00:12:18,971
quarters, three months at
a time, and on the Y axis,

249
00:12:18,971 --> 00:12:20,773
I'm looking at the
absolute number of

250
00:12:20,773 --> 00:12:22,942
individuals tested.

251
00:12:22,942 --> 00:12:26,846
And the summary statement
here is saying: we've done

252
00:12:26,846 --> 00:12:27,513
really well.

253
00:12:27,513 --> 00:12:30,549
We've improved the number
of people we've tested

254
00:12:30,549 --> 00:12:33,786
over the quarters.

255
00:12:33,786 --> 00:12:36,789
But, in actuality, there's
one thing wrong with the

256
00:12:36,789 --> 00:12:39,358
way this graph was set up.

257
00:12:39,358 --> 00:12:42,761
Does anyone have an idea
what would improve this

258
00:12:42,761 --> 00:12:43,529
graph?

259
00:12:43,529 --> 00:12:46,932
Yeah.

260
00:12:46,932 --> 00:12:49,568
AUDIENCE MEMBER: The Y
axis needs to start at zero.

261
00:12:49,568 --> 00:12:50,402
KARIN GICHUHI:
That's correct.

262
00:12:50,402 --> 00:12:51,570
She said the Y axis
needs to start at zero.

263
00:12:51,570 --> 00:12:54,540
Because what you're really
seeing in absolute numbers

264
00:12:54,540 --> 00:12:58,844
here is a six percent
change over four quarters.

265
00:12:58,844 --> 00:13:02,982
If I take the same graph
and start my Y axis at

266
00:13:02,982 --> 00:13:06,385
zero, now the percent
change is still six

267
00:13:06,385 --> 00:13:12,958
percent, but the visual
is not as dramatic.

268
00:13:12,958 --> 00:13:14,527
Pie charts.

269
00:13:14,527 --> 00:13:17,129
So, pie charts, there's
two pieces here I want to

270
00:13:17,129 --> 00:13:18,264
highlight.

271
00:13:18,264 --> 00:13:21,567
One is a little more about
us as humans and the

272
00:13:21,567 --> 00:13:24,837
other about our dear
friends at Excel.

273
00:13:24,837 --> 00:13:27,940
The default in Excel
placed a color legend next

274
00:13:27,940 --> 00:13:29,708
to the graph and the
reader has to go back and

275
00:13:29,708 --> 00:13:31,443
forth to figure out what
they're actually looking

276
00:13:31,443 --> 00:13:34,513
at, but another issue with
pie charts is they often

277
00:13:34,513 --> 00:13:37,049
contain too many slices,
depending on your data

278
00:13:37,049 --> 00:13:39,618
set, and it's too
cluttered to make sense of.

279
00:13:39,618 --> 00:13:41,954
But on the more
fundamental level, pie

280
00:13:41,954 --> 00:13:45,391
charts are difficult for
readers due to the way our

281
00:13:45,391 --> 00:13:47,526
eyes and brains
are structured.

282
00:13:47,526 --> 00:13:49,428
We're good at judging
length and have a

283
00:13:49,428 --> 00:13:52,364
difficult time
interpreting areas, so how

284
00:13:52,364 --> 00:13:57,136
large is this piece of
pie number 2 over time?

285
00:13:57,136 --> 00:13:59,572
When pictured next to each
other I have a hard time

286
00:13:59,572 --> 00:14:03,142
determining if there's
an illusion here or is

287
00:14:03,142 --> 00:14:09,114
FY17Q2, pie slice number
2, actually larger than

288
00:14:09,114 --> 00:14:15,020
any of the others?

289
00:14:15,020 --> 00:14:18,224
When I change this to a
bar chart, our brains are

290
00:14:18,224 --> 00:14:19,825
better able to compare the
length and it's easier to

291
00:14:19,825 --> 00:14:25,564
use this to convey
the same information.

292
00:14:25,564 --> 00:14:27,299
There are a number of
issues to be wary of when

293
00:14:27,299 --> 00:14:28,334
looking at maps.

294
00:14:28,334 --> 00:14:30,135
One big issue that can
occur, as seen in this

295
00:14:30,135 --> 00:14:33,239
comic, is that not scaling
variables can lead to

296
00:14:33,239 --> 00:14:35,674
wrong conclusions.

297
00:14:35,674 --> 00:14:40,312
So, in this scenario, does
the business owner adjust

298
00:14:40,312 --> 00:14:43,148
their content or
advertising to Martha

299
00:14:43,148 --> 00:14:47,586
Stewart and Furry Porn
fans, or is the reason

300
00:14:47,586 --> 00:14:51,457
that these maps look the
same just happen to match

301
00:14:51,457 --> 00:14:53,525
the population
concentration, and the

302
00:14:53,525 --> 00:14:55,194
answer is, they happen
to match the population

303
00:14:55,194 --> 00:14:56,695
concentration.

304
00:14:56,695 --> 00:14:59,698
There's no significant
relationship between

305
00:14:59,698 --> 00:15:03,636
geolocation and the
mentioned subgroups.

306
00:15:03,636 --> 00:15:10,376
And lastly, this map shows
color coding associated

307
00:15:10,376 --> 00:15:12,978
with percent of
individuals below 185

308
00:15:12,978 --> 00:15:15,614
percent of federal
poverty threshold.

309
00:15:15,614 --> 00:15:18,884
This is another census
map, but the color moves

310
00:15:18,884 --> 00:15:21,720
from a lighter to a darker
red as poverty rates

311
00:15:21,720 --> 00:15:24,223
increase and, as a result,
the map makes poverty look

312
00:15:24,223 --> 00:15:29,028
like a gaping wound.

313
00:15:29,028 --> 00:15:30,696
So with those few
examples, I did just want

314
00:15:30,696 --> 00:15:32,998
to walk through the
practical application of

315
00:15:32,998 --> 00:15:36,935
how we use information
when informing a policy maker.

316
00:15:36,935 --> 00:15:39,705
And just like every super
hero has an origin

317
00:15:39,705 --> 00:15:42,775
story, our graphs also
need to have an origin story.

318
00:15:42,775 --> 00:15:45,577
We need to start looking
at data with a purpose and

319
00:15:45,577 --> 00:15:48,447
have a specific question
in mind that we're trying

320
00:15:48,447 --> 00:15:51,016
to answer with the
information we have.

321
00:15:51,016 --> 00:15:53,252
One of my biggest pet
peeves right now is I get

322
00:15:53,252 --> 00:15:55,721
people all the time coming
and saying, "Karin, can

323
00:15:55,721 --> 00:15:58,424
you or your team tell
us how we're doing?

324
00:15:58,424 --> 00:16:00,426
Can you be more specific?

325
00:16:00,426 --> 00:16:02,394
How are you
doing with what?

326
00:16:02,394 --> 00:16:05,597
With identifying
individuals for testing?

327
00:16:05,597 --> 00:16:08,400
With capturing certain age
groups among people living

328
00:16:08,400 --> 00:16:12,004
with HIV who need to be
enrolled into care?"

329
00:16:12,004 --> 00:16:13,672
There are many different
ways to answer the

330
00:16:13,672 --> 00:16:14,840
question: how
are we doing?

331
00:16:14,840 --> 00:16:17,042
And one of the biggest
things you can do in any

332
00:16:17,042 --> 00:16:19,812
organization or
implementing any project

333
00:16:19,812 --> 00:16:23,816
is have an agreed upon set
of goals so that you are

334
00:16:23,816 --> 00:16:26,885
all on the same page about
what you're hoping to do

335
00:16:26,885 --> 00:16:29,655
and then you can agree on
what you want to monitor

336
00:16:29,655 --> 00:16:34,159
over time.

337
00:16:34,159 --> 00:16:37,329
So once I have my
question, I want to review

338
00:16:37,329 --> 00:16:41,300
some of the information I
have that can answer that

339
00:16:41,300 --> 00:16:43,736
question and I like to
go about collecting and

340
00:16:43,736 --> 00:16:48,040
tidying up the data, plot
it on a graph that's most

341
00:16:48,040 --> 00:16:51,844
appropriate, and simplify
it using font, color, and

342
00:16:51,844 --> 00:16:53,746
size to my advantage.

343
00:16:53,746 --> 00:16:55,914
So it's critical that
everything in our graph is

344
00:16:55,914 --> 00:16:59,218
done with a purpose, even
the font, even the color

345
00:16:59,218 --> 00:17:02,287
and the size to
strategically convey our

346
00:17:02,287 --> 00:17:02,955
message.

347
00:17:02,955 --> 00:17:06,625
So this is when that power
of data visualization

348
00:17:06,625 --> 00:17:09,928
comes into play because
how you incorporate your

349
00:17:09,928 --> 00:17:12,865
colors and your fonts and
your size are how you're

350
00:17:12,865 --> 00:17:14,333
going to influence
the viewer.

351
00:17:14,333 --> 00:17:20,172
It's also really important
to go ahead and spell out

352
00:17:20,172 --> 00:17:22,540
what your message
is in the graph.

353
00:17:22,540 --> 00:17:25,643
So here, I started out
with data points, and

354
00:17:25,644 --> 00:17:32,251
these points represent the
volume of testing and the

355
00:17:32,251 --> 00:17:36,155
percent of individuals
tested who are positive,

356
00:17:36,155 --> 00:17:38,390
and I've divided up each
of these dots into a

357
00:17:38,390 --> 00:17:42,060
quadrant and essentially,
the lower quadrant where

358
00:17:42,060 --> 00:17:46,665
you see C1, C2, and C3,
are those countries that

359
00:17:46,665 --> 00:17:49,168
have high testing volume
and low positivity.

360
00:17:49,168 --> 00:17:51,770
So how might I use
this information?

361
00:17:51,770 --> 00:17:55,174
We actually make funding
and program implementation

362
00:17:55,174 --> 00:17:58,444
decisions off of this type
of information to see

363
00:17:58,444 --> 00:18:00,679
where we're doing more
effective testing.

364
00:18:00,679 --> 00:18:04,783
Effective defined as,
are you finding the most

365
00:18:04,783 --> 00:18:05,484
positives.

366
00:18:05,484 --> 00:18:10,956
So when we go back to
this idea of a critical

367
00:18:10,956 --> 00:18:15,060
consumer, I really do like
to emphasize that only you

368
00:18:15,060 --> 00:18:17,095
can prevent the
perpetuation of data

369
00:18:17,095 --> 00:18:19,198
visualization lies.

370
00:18:19,198 --> 00:18:22,234
You have the ability to
ask questions, and so ask

371
00:18:22,234 --> 00:18:24,837
questions in
any situation.

372
00:18:24,837 --> 00:18:29,608
Ask for the data set.

373
00:18:29,608 --> 00:18:31,944
Thinking critically about
global health metrics is

374
00:18:31,944 --> 00:18:35,414
also becoming more
important than ever.

375
00:18:35,414 --> 00:18:37,716
We have access to a
lot of health data.

376
00:18:37,716 --> 00:18:41,153
Here's an example of
sources of global health

377
00:18:41,153 --> 00:18:43,188
data on the left and, on
the right, we've listed a

378
00:18:43,188 --> 00:18:44,723
number of topics.

379
00:18:44,723 --> 00:18:48,093
And so this is because
it's important to have a

380
00:18:48,093 --> 00:18:51,196
specific research question
in mind, or general

381
00:18:51,196 --> 00:18:54,333
question about program
implementation, prior to

382
00:18:54,333 --> 00:18:57,870
sifting through the global
health data that we have

383
00:18:57,870 --> 00:19:02,441
access to in order to have
an appropriate approach.

384
00:19:02,441 --> 00:19:05,577
What we're going to focus
on, for some examples in

385
00:19:05,577 --> 00:19:11,183
PEPFAR is HIV/AIDS as the
global health topic, and

386
00:19:11,183 --> 00:19:14,386
the open source of global
health data are the PEPFAR

387
00:19:14,386 --> 00:19:15,921
Dashboards.

388
00:19:15,921 --> 00:19:18,524
So first, I just wanted to
see who in the room has

389
00:19:18,524 --> 00:19:21,226
heard of PEPFAR?

390
00:19:21,226 --> 00:19:22,895
Yay, okay!

391
00:19:22,895 --> 00:19:24,897
So I won't bore you with
the information that's on

392
00:19:24,897 --> 00:19:26,932
the slide, but what's been
interesting for me as a

393
00:19:26,932 --> 00:19:28,967
data person is when we
started out we had about

394
00:19:28,967 --> 00:19:33,939
50,000 people on ART and
now we talk not just about

395
00:19:33,939 --> 00:19:37,609
the 11.5 million people we
have on ART, but we also

396
00:19:37,609 --> 00:19:40,979
talk more specifically
about reduced new

397
00:19:40,979 --> 00:19:44,116
pediatric infections which
have reduced by 70 percent

398
00:19:44,116 --> 00:19:46,985
since 2000 and, for the
first time, we've been

399
00:19:46,985 --> 00:19:49,888
able to show validated
declines in adult HIV

400
00:19:49,888 --> 00:19:52,724
incidence in Malawi,
Zambia, and Zimbabwe to

401
00:19:52,724 --> 00:19:55,994
the tune of 51 to
76 percent decline.

402
00:19:55,994 --> 00:19:57,763
So we're getting more
sophisticated with the

403
00:19:57,763 --> 00:19:59,431
surveys we're using and
the type of information

404
00:19:59,431 --> 00:20:02,734
we're collecting to be
able to talk about new

405
00:20:02,734 --> 00:20:06,238
outcome reports related to
incidence or viral load

406
00:20:06,238 --> 00:20:09,141
suppression, so
that's exciting.

407
00:20:09,141 --> 00:20:12,044
Monitoring is what we
really focused on a lot

408
00:20:12,044 --> 00:20:15,080
for the general audience,
for all of our partners,

409
00:20:15,080 --> 00:20:17,583
for Congress, and
for general reports.

410
00:20:17,583 --> 00:20:19,518
You're all familiar with
this definition, but it's

411
00:20:19,518 --> 00:20:22,120
really that routine
process of collecting

412
00:20:22,120 --> 00:20:24,056
performance based
indicators that we've all

413
00:20:24,056 --> 00:20:27,993
agreed upon in the global
arena to monitor how we're

414
00:20:27,993 --> 00:20:30,696
doing.

415
00:20:30,696 --> 00:20:33,999
We consider this important
in PEPFAR because it, we

416
00:20:33,999 --> 00:20:36,602
think, and we know, that
it drives greater impact,

417
00:20:36,602 --> 00:20:39,204
transparency and
accountability and, more

418
00:20:39,204 --> 00:20:42,107
importantly, we are able
to improve our partner

419
00:20:42,107 --> 00:20:45,077
performance and increase
program efficiency and

420
00:20:45,077 --> 00:20:45,777
effectiveness.

421
00:20:45,777 --> 00:20:47,679
And I'll talk a little
bit later about how we're

422
00:20:47,679 --> 00:20:50,315
defining efficiency
and effectiveness.

423
00:20:50,315 --> 00:20:52,184
You can see these
Dashboards online at

424
00:20:52,184 --> 00:20:57,723
data.pepfar.net/global,
and you can review lots of

425
00:20:57,723 --> 00:21:00,258
information there
including the results and

426
00:21:00,258 --> 00:21:02,527
targets for PEPFAR's
monitoring.

427
00:21:02,527 --> 00:21:05,330
But I wanted to use a very
specific example, and one

428
00:21:05,330 --> 00:21:09,267
of the most generic ones,
about how we critically

429
00:21:09,267 --> 00:21:14,272
consume information as it
relates to results towards

430
00:21:14,272 --> 00:21:15,173
targets.

431
00:21:15,173 --> 00:21:18,276
So one of the things we do
with many of our global

432
00:21:18,276 --> 00:21:21,880
health implementation
programs, is we set targets.

433
00:21:21,880 --> 00:21:24,216
We say we're giving you
this amount of money, we

434
00:21:24,216 --> 00:21:27,152
expect you to reach this
many individuals with this

435
00:21:27,152 --> 00:21:28,987
intervention, and then
we're going to see how you

436
00:21:28,987 --> 00:21:32,524
did at each quarter, at
each semiannual time

437
00:21:32,524 --> 00:21:35,394
point, or at the
end of the year.

438
00:21:35,394 --> 00:21:39,097
One PEPFAR indicator used
to monitor treatment

439
00:21:39,097 --> 00:21:40,499
coverage is Current on
Treatment, so it's the

440
00:21:40,499 --> 00:21:42,267
number of adults and
children currently

441
00:21:42,267 --> 00:21:43,769
receiving ART.

442
00:21:43,769 --> 00:21:47,005
And one way that PEPFAR
monitors performance is we

443
00:21:47,005 --> 00:21:49,441
say percent achievement.

444
00:21:49,441 --> 00:21:52,277
You set your target at
this, and you reached how

445
00:21:52,277 --> 00:21:53,645
many?

446
00:21:53,645 --> 00:21:57,349
So a specific example in
Tanzania, and this is a

447
00:21:57,349 --> 00:22:00,385
made-up partner name,
Superior Health Services

448
00:22:00,385 --> 00:22:04,423
provides direct service
delivery of ART for adults

449
00:22:04,423 --> 00:22:06,758
and children
in 1500 sites.

450
00:22:06,758 --> 00:22:11,196
So, at the end of the
fiscal year, they put over

451
00:22:11,196 --> 00:22:15,767
78,000 individuals on ART
which means they achieved

452
00:22:15,767 --> 00:22:17,936
93 percent of
their target.

453
00:22:17,936 --> 00:22:21,039
Now, if I want to
determine if I should

454
00:22:21,039 --> 00:22:24,209
continue to fund this
partner or if they did a

455
00:22:24,209 --> 00:22:27,279
good job, would this be
enough to answer that

456
00:22:27,279 --> 00:22:29,381
question?

457
00:22:29,381 --> 00:22:32,851
I ask you.

458
00:22:32,851 --> 00:22:35,654
I am looking for no.

459
00:22:35,654 --> 00:22:40,826
But what other information
might I want to know here?

460
00:22:40,826 --> 00:22:41,927
AUDIENCE MEMBER: Is
the target right?

461
00:22:41,927 --> 00:22:42,961
KARIN GICHUHI: Is
the target right.

462
00:22:42,961 --> 00:22:44,830
What do you mean by "is
the target right?"

463
00:22:44,830 --> 00:22:46,798
AUDIENCE MEMBER: Are
the people there?

464
00:22:46,798 --> 00:22:48,433
KARIN GICHUHI: Are
the people there.

465
00:22:48,433 --> 00:22:50,068
So, she's bringing up a
really good point as we

466
00:22:50,068 --> 00:22:53,305
set targets based on Epi
data, people living with

467
00:22:53,305 --> 00:22:54,072
HIV.

468
00:22:54,072 --> 00:22:57,576
Do we have recent survey
data that suggests we know

469
00:22:57,576 --> 00:23:00,712
that denominator,
so that's one.

470
00:23:00,712 --> 00:23:04,015
What was their
target last year?

471
00:23:04,015 --> 00:23:06,585
Sometimes I see partners
who didn't achieve very

472
00:23:06,585 --> 00:23:12,524
much but their target was
five times more than it

473
00:23:12,524 --> 00:23:14,292
was the year before.

474
00:23:14,292 --> 00:23:16,628
And so looking at
percent is not enough.

475
00:23:16,628 --> 00:23:19,498
I often want to go back
and look at the trajectory

476
00:23:19,498 --> 00:23:22,434
over time and see that
they're improving, not

477
00:23:22,434 --> 00:23:25,270
necessarily that their
percent achievement rose

478
00:23:25,270 --> 00:23:25,971
dramatically.

479
00:23:25,971 --> 00:23:32,911
Just little things, but
the context matters.

480
00:23:32,911 --> 00:23:35,413
This is an example of
looking at treatment

481
00:23:35,413 --> 00:23:36,815
current within a Cascade.

482
00:23:36,815 --> 00:23:40,852
It's another way we
monitor the PEPFAR program

483
00:23:40,852 --> 00:23:43,655
on the pepfar.net
Dashboards, and one of the

484
00:23:43,655 --> 00:23:46,625
things we've done to make
it easily digestible for

485
00:23:46,625 --> 00:23:49,327
the consumer is we've
color coded it.

486
00:23:49,327 --> 00:23:53,298
So I can focus in on any
one of these indicators

487
00:23:53,298 --> 00:23:55,500
and you'll see that the
CURR on treatment is the

488
00:23:55,500 --> 00:23:59,704
second from the bottom,
and I see green, green,

489
00:23:59,704 --> 00:24:01,740
yellow.

490
00:24:01,740 --> 00:24:05,210
So I achieved my target at
91 percent at the end of

491
00:24:05,210 --> 00:24:06,812
last year.

492
00:24:06,812 --> 00:24:10,282
At the end of this quarter
I'm still doing well, but

493
00:24:10,282 --> 00:24:13,652
I think projected by the
end of the year I'm going

494
00:24:13,652 --> 00:24:15,720
to fall into the yellow.

495
00:24:15,720 --> 00:24:19,825
So this is a really easy
way to digest information

496
00:24:19,825 --> 00:24:22,260
and I can just focus
in on these colors.

497
00:24:22,260 --> 00:24:24,830
But, if I am being a
critical consumer of the

498
00:24:24,830 --> 00:24:27,699
information, is it all
about the color and the

499
00:24:27,699 --> 00:24:28,767
percent?

500
00:24:28,767 --> 00:24:29,501
No.

501
00:24:29,501 --> 00:24:31,136
I want to go back and see
some of this additional

502
00:24:31,136 --> 00:24:32,270
information.

503
00:24:32,270 --> 00:24:34,606
And so you can see, to the
right of it, I can also

504
00:24:34,606 --> 00:24:39,344
see the actual number of
results and I would want

505
00:24:39,344 --> 00:24:41,913
to go back to the country
and ask about other issues

506
00:24:41,913 --> 00:24:43,081
for underperformance.

507
00:24:43,081 --> 00:24:44,850
Was there election
violence?

508
00:24:44,850 --> 00:24:46,184
Were the nurses on strike?

509
00:24:46,184 --> 00:24:49,387
Was there a drought?

510
00:24:49,387 --> 00:24:51,590
Did the partner just start
or have they been there

511
00:24:51,590 --> 00:24:52,858
for 15 years?

512
00:24:52,858 --> 00:24:55,260
All of the context is
really important before I

513
00:24:55,260 --> 00:25:00,065
make any type of
funding decision.

514
00:25:00,065 --> 00:25:02,868
So Context and
Methodology Matter.

515
00:25:02,868 --> 00:25:05,337
I think probably one of
the hottest topics right

516
00:25:05,337 --> 00:25:08,506
now, and I had a couple of
sessions of this earlier

517
00:25:08,506 --> 00:25:12,878
today, is about costing,
expenditure analysis, and

518
00:25:12,878 --> 00:25:15,380
what we do and don't know
when we look at that type

519
00:25:15,380 --> 00:25:17,315
of data.

520
00:25:17,315 --> 00:25:20,185
So health economic data,
as you know, presents a

521
00:25:20,185 --> 00:25:23,688
unique set of
interpretation challenges.

522
00:25:23,688 --> 00:25:26,758
It's more about context
and methodology than many

523
00:25:26,758 --> 00:25:31,229
of the other data
sets we've looked at.

524
00:25:31,229 --> 00:25:33,531
There's an old joke: You
ask a mathematician, an

525
00:25:33,531 --> 00:25:35,467
accountant, and an
economist what's two plus

526
00:25:35,467 --> 00:25:36,868
two.

527
00:25:36,868 --> 00:25:39,404
And the mathematician
says, "Four."

528
00:25:39,404 --> 00:25:42,340
And the accountant says,
"About four, more or less

529
00:25:42,340 --> 00:25:44,309
on average," and the
economist smiles and says,

530
00:25:44,309 --> 00:25:46,678
"What do you
want it to be?"

531
00:25:46,678 --> 00:25:50,215
And this is really easy to
do when there are so many

532
00:25:50,215 --> 00:25:53,818
unknowns with the inputs
and when you can also lay

533
00:25:53,818 --> 00:25:55,921
on data visualization.

534
00:25:55,921 --> 00:25:58,423
So context and methodology
matter tremendously no

535
00:25:58,423 --> 00:26:00,959
matter what data you look
at, but this is, perhaps,

536
00:26:00,959 --> 00:26:03,161
exacerbated when you're
talking about health

537
00:26:03,161 --> 00:26:09,034
economic data
specifically.

538
00:26:09,034 --> 00:26:12,070
So why is it so important?

539
00:26:12,070 --> 00:26:14,205
The decisions you're
making aren't just about

540
00:26:14,205 --> 00:26:17,309
who or how many people
you'll target, but it's

541
00:26:17,309 --> 00:26:21,279
about how much money is
allocated to everything.

542
00:26:21,279 --> 00:26:25,317
So not just related to the
economy itself right now,

543
00:26:25,317 --> 00:26:28,119
but more so than ever,
global stakeholders,

544
00:26:28,119 --> 00:26:30,588
Global Fund, PEPFAR, WHO,
UNAIDS, we're all looking

545
00:26:30,588 --> 00:26:35,660
at the same information
together to make decisions

546
00:26:35,660 --> 00:26:38,096
about how we fund moving
forward, and one of the

547
00:26:38,096 --> 00:26:41,366
biggest questions we have
to be able to answer is,

548
00:26:41,366 --> 00:26:45,170
hey, country A, when we
leave, this is how much

549
00:26:45,170 --> 00:26:48,306
it's going to cost for you
to maintain and sustain

550
00:26:48,306 --> 00:26:50,508
these programs.

551
00:26:50,508 --> 00:26:52,877
We don't know how much
some of these things cost

552
00:26:52,877 --> 00:26:54,846
and it's something we're
taking a look at, and it's

553
00:26:54,846 --> 00:26:57,048
something that we're
monitoring very closely,

554
00:26:57,048 --> 00:26:59,484
but things get real when
you're talking about

555
00:26:59,484 --> 00:27:02,620
allocating billions of
dollars from one health

556
00:27:02,620 --> 00:27:06,124
program to another and
deciding whether to budget

557
00:27:06,124 --> 00:27:10,528
for certain things or how
funds are distributed.

558
00:27:10,528 --> 00:27:12,831
So without context and
methodology, we run the

559
00:27:12,831 --> 00:27:15,633
risk of misinterpretation
and misuse, so we rarely

560
00:27:15,633 --> 00:27:16,868
have the perfect data.

561
00:27:16,868 --> 00:27:20,171
Everyone admits that, but
particularly economic data

562
00:27:20,171 --> 00:27:22,340
available for decision
making in health right

563
00:27:22,340 --> 00:27:23,641
when it's needed.

564
00:27:23,641 --> 00:27:25,777
Nevertheless, with
the limited resources

565
00:27:25,777 --> 00:27:28,346
available, it's imperative
that we ensure we're

566
00:27:28,346 --> 00:27:31,850
maximizing outputs at a
given cost, ensuring that

567
00:27:31,850 --> 00:27:34,786
resources are allocated
optimally, and budgeting

568
00:27:34,786 --> 00:27:40,525
and planning prudently.

569
00:27:40,525 --> 00:27:43,294
So whenever we design a
health cost analysis,

570
00:27:43,294 --> 00:27:45,363
there's quite a few things
we need to think about.

571
00:27:45,363 --> 00:27:47,632
The type, the perspective,
the purpose, the data

572
00:27:47,632 --> 00:27:48,500
collection.

573
00:27:48,500 --> 00:27:51,403
The figure here summarizes
just a few of these

574
00:27:51,403 --> 00:27:54,272
things, and when you look
at any health economic

575
00:27:54,272 --> 00:27:56,674
data, these are exactly
the types of things that

576
00:27:56,674 --> 00:27:58,877
you should consider in
order to interpret it

577
00:27:58,877 --> 00:27:59,844
correctly.

578
00:27:59,844 --> 00:28:02,747
But the kicker: almost
certainly, none of these

579
00:28:02,747 --> 00:28:05,917
things are going to be on
a graph, a table, or in

580
00:28:05,917 --> 00:28:08,219
the data that you see.

581
00:28:08,219 --> 00:28:10,855
You have to be a critical
consumer of this data,

582
00:28:10,855 --> 00:28:13,324
especially if you want to
be a responsible decision

583
00:28:13,324 --> 00:28:15,794
maker.

584
00:28:15,794 --> 00:28:17,662
So what is expenditure
analysis in the way that

585
00:28:17,662 --> 00:28:19,097
we use it?

586
00:28:19,097 --> 00:28:21,900
It's collected annually,
we look retrospectively at

587
00:28:21,900 --> 00:28:24,469
the money spent by all
the organizations we give

588
00:28:24,469 --> 00:28:27,872
money to and these
organizations report, from

589
00:28:27,872 --> 00:28:31,943
their perspective, all the
expenditures they incurred

590
00:28:31,943 --> 00:28:35,146
in the past fiscal year,
even if they bought

591
00:28:35,146 --> 00:28:39,684
investments that will
last for 10 years.

592
00:28:39,684 --> 00:28:42,554
When we've got related
results data, we link

593
00:28:42,554 --> 00:28:47,292
these expenditures to the
PEPFAR beneficiaries to

594
00:28:47,292 --> 00:28:50,495
calculate a spend
per person or unit

595
00:28:50,495 --> 00:28:54,165
expenditure.

596
00:28:54,165 --> 00:28:56,234
This data can be
tremendously helpful from

597
00:28:56,234 --> 00:28:58,803
a resource tracking
perspective and it helps

598
00:28:58,803 --> 00:29:01,172
us better understand the
expenses that the United

599
00:29:01,172 --> 00:29:03,842
States government incurs
to provide a range of HIV

600
00:29:03,842 --> 00:29:05,210
services.

601
00:29:05,210 --> 00:29:07,745
To improve accountability
and oversight of PEPFAR

602
00:29:07,745 --> 00:29:10,648
efforts by tracking
the spending and

603
00:29:10,648 --> 00:29:14,285
accomplishments over time,
we have a sense of getting

604
00:29:14,285 --> 00:29:17,288
to that question: how much
will it cost to continue

605
00:29:17,288 --> 00:29:21,426
to offer the services and
get these programs in a

606
00:29:21,426 --> 00:29:24,529
position where they can
be sustained by the host

607
00:29:24,529 --> 00:29:33,938
country government.

608
00:29:33,938 --> 00:29:36,441
Before we get into
PEPFAR-specific examples,

609
00:29:36,441 --> 00:29:41,279
I had Ramona put together
an explanation of how she

610
00:29:41,279 --> 00:29:43,515
would like to demonstrate
EA-like data.

611
00:29:43,515 --> 00:29:47,352
This graphic here shows
her personal expenditures

612
00:29:47,352 --> 00:29:48,453
for the last month.

613
00:29:48,453 --> 00:29:51,356
She's grouped it into some
broad categories and she's

614
00:29:51,356 --> 00:29:54,025
noted a couple additional
contextual details like

615
00:29:54,025 --> 00:29:56,361
her travel expenditures
might have been high

616
00:29:56,361 --> 00:29:58,696
because she was going to a
friend's wedding.

617
00:29:58,696 --> 00:30:02,100
Based on this data, would
you feel comfortable

618
00:30:02,100 --> 00:30:04,769
making the following
decisions?

619
00:30:04,769 --> 00:30:11,776
Your budget for
the next month?

620
00:30:11,776 --> 00:30:15,580
Does this reflect the full
cost of the support of her

621
00:30:15,580 --> 00:30:19,317
family when there are two
income earners and only

622
00:30:19,317 --> 00:30:21,786
one budget here
being shown?

623
00:30:21,786 --> 00:30:23,955
This comes up a lot in
the countries we work in

624
00:30:23,955 --> 00:30:26,991
because I can look at what
PEPFAR is doing, but I

625
00:30:26,991 --> 00:30:29,761
don't know what Global
Fund is contributing or

626
00:30:29,761 --> 00:30:31,296
other stakeholders.

627
00:30:31,296 --> 00:30:34,732
So when we look at the
overall budget as an

628
00:30:34,732 --> 00:30:39,571
input, those are
things to consider.

629
00:30:39,571 --> 00:30:41,973
Would you expect other
families to have similar

630
00:30:41,973 --> 00:30:43,775
spending patterns?

631
00:30:43,775 --> 00:30:51,816
Should you benchmark your
spending against others?

632
00:30:51,816 --> 00:30:54,152
This is an illustrative
example of how we might

633
00:30:54,152 --> 00:30:57,088
budget using historical
unit expenditure by

634
00:30:57,088 --> 00:31:01,259
multiplying the spend per
person times the target

635
00:31:01,259 --> 00:31:02,260
for the upcoming year.

636
00:31:02,260 --> 00:31:06,764
So you can see, for
instance, in prevention it

637
00:31:06,764 --> 00:31:10,235
says: My unit expenditure
for the last year was $25

638
00:31:10,235 --> 00:31:13,204
per person and if I wanted
to reach one million

639
00:31:13,204 --> 00:31:18,142
people then it's going to
cost 25 million dollars.

640
00:31:18,142 --> 00:31:21,112
Are you prepared to say
to the Ministry of Health

641
00:31:21,112 --> 00:31:26,150
that $25 per person is
the cost of the program?

642
00:31:26,150 --> 00:31:29,787
What would you say?

643
00:31:29,787 --> 00:31:37,462
Is there any information
here that's missing?

644
00:31:37,462 --> 00:31:39,764
AUDIENCE MEMBER: What is
the Ministry contributing?

645
00:31:39,764 --> 00:31:43,334
KARIN GICHUHI: What is the
ministry contributing.

646
00:31:43,334 --> 00:31:45,069
What is anyone
else contributing?

647
00:31:45,069 --> 00:31:50,541
What are the inputs here,
are those agreed upon?

648
00:31:50,541 --> 00:31:53,911
So this is at the highest,
highest level, a very

649
00:31:53,911 --> 00:31:58,116
rudimentary way of
approaching cost per

650
00:31:58,116 --> 00:32:02,086
person.

651
00:32:02,086 --> 00:32:04,656
This is one of my favorite
graphs to talk about

652
00:32:04,656 --> 00:32:07,025
because it
drives me crazy.

653
00:32:07,025 --> 00:32:12,130
This graph shows the
allocation of resources

654
00:32:12,130 --> 00:32:15,333
based on expenditure
analysis and other data

655
00:32:15,333 --> 00:32:19,570
and the explanation
of this graph says:

656
00:32:19,570 --> 00:32:24,042
expenditure per PLHIV, so
person living with HIV and

657
00:32:24,042 --> 00:32:29,113
percent of PLHIV by SNU.

658
00:32:29,113 --> 00:32:31,449
SNU is sub national unit.

659
00:32:31,449 --> 00:32:35,720
So what I'm looking at
here is the provinces are

660
00:32:35,720 --> 00:32:39,023
listed across my X axis,
the dollar amount is

661
00:32:39,023 --> 00:32:44,329
listed on the Y axis, and
you can get a sense of the

662
00:32:44,329 --> 00:32:47,865
total expenditures
per PLHIV.

663
00:32:47,865 --> 00:32:52,937
PLHIV is that
aqua colored dash.

664
00:32:52,937 --> 00:32:56,541
And so one of the
conclusions that you might

665
00:32:56,541 --> 00:33:00,345
come to in looking at
this, is that Province G

666
00:33:00,345 --> 00:33:05,049
is underfunded where you
see a very low amount of

667
00:33:05,049 --> 00:33:09,520
dollar spend, and a very
high number of PLHIV being

668
00:33:09,520 --> 00:33:14,525
reached, while Province
E and F are overfunded

669
00:33:14,525 --> 00:33:16,594
relatively.

670
00:33:16,594 --> 00:33:18,830
But having more data
or context here could

671
00:33:18,830 --> 00:33:21,366
significantly change the
way in which I interpret

672
00:33:21,366 --> 00:33:23,701
this data.

673
00:33:23,701 --> 00:33:28,940
So what else would I want
to know in this instance?

674
00:33:28,940 --> 00:33:31,542
Yeah.

675
00:33:31,542 --> 00:33:33,811
AUDIENCE MEMBER: Prior
year funding and PLHIV

676
00:33:33,811 --> 00:33:36,948
rates.

677
00:33:36,948 --> 00:33:39,083
KARIN GICHUHI: Prior year
funding and PLHIV rates?

678
00:33:39,083 --> 00:33:40,585
AUDIENCE MEMBER: And
how they have changed.

679
00:33:40,585 --> 00:33:41,586
KARIN GICHUHI: And
how they've changed.

680
00:33:41,586 --> 00:33:43,388
Yeah.

681
00:33:43,388 --> 00:33:46,224
The other thing is, when I
see Province G and I see

682
00:33:46,224 --> 00:33:48,493
that they spent so little
money but reached so many

683
00:33:48,493 --> 00:33:53,131
people, I ask myself who
else is there providing

684
00:33:53,131 --> 00:33:56,601
funding and what other
support is occurring in

685
00:33:56,601 --> 00:33:58,136
that Province?

686
00:33:58,136 --> 00:34:01,806
Or, when I see the cost
is very high and I'm not

687
00:34:01,806 --> 00:34:04,642
reaching very many people,
is it the case that my

688
00:34:04,642 --> 00:34:08,446
partner there just started
and they're opening up a

689
00:34:08,446 --> 00:34:12,750
new project in a place
we've never worked before?

690
00:34:12,750 --> 00:34:15,553
The title alone says
PLHIV, but what if I'm

691
00:34:15,553 --> 00:34:20,056
looking at the
cost of testing?

692
00:34:20,056 --> 00:34:22,759
When I test people
they're not PLHIV.

693
00:34:22,760 --> 00:34:24,829
Some of them are and
some of them aren't.

694
00:34:24,829 --> 00:34:27,063
So we have to be really
careful with not only

695
00:34:27,063 --> 00:34:32,569
explaining the inputs into
this analysis, but also

696
00:34:32,570 --> 00:34:36,774
how we want to explain
what this graph says as we

697
00:34:36,774 --> 00:34:40,344
look across the bars, and
also what title we should

698
00:34:40,344 --> 00:34:43,981
be using if the people
being touched by these

699
00:34:43,981 --> 00:34:47,351
activities are
and are not PLHIV?

700
00:34:47,351 --> 00:34:55,493
Efficiency is a very
hot topic right now.

701
00:34:55,493 --> 00:34:58,062
This graph shows the
estimated per person spend

702
00:34:58,062 --> 00:34:59,330
for each partner.

703
00:34:59,330 --> 00:35:00,998
Partners are
the blue bars.

704
00:35:00,998 --> 00:35:03,134
And the number of
beneficiaries reached,

705
00:35:03,134 --> 00:35:07,638
which are these, I guess,
they're yellow circles.

706
00:35:07,638 --> 00:35:11,909
At first glance, you'll
see two different partners

707
00:35:11,909 --> 00:35:14,579
here with very different
expenditure per person to

708
00:35:14,579 --> 00:35:19,150
reach approximately the
same number of results.

709
00:35:19,150 --> 00:35:21,486
You might assume that the
one on the left is more

710
00:35:21,486 --> 00:35:24,255
efficient, reaching the
same number of people at a

711
00:35:24,255 --> 00:35:27,692
much lower cost to
PEPFAR, but knowing the

712
00:35:27,692 --> 00:35:30,428
methodology of EA and
context should show that

713
00:35:30,428 --> 00:35:32,263
this is a false
assumption.

714
00:35:32,263 --> 00:35:36,534
For example, in this case,
one is primarily funded by

715
00:35:36,534 --> 00:35:39,604
Ministry of Health while
the other is primarily

716
00:35:39,604 --> 00:35:43,474
funded by PEPFAR and only
appears more expensive.

717
00:35:43,474 --> 00:35:49,080
So what we're showing
here, HTC is how we - an

718
00:35:49,080 --> 00:35:53,284
acronym we use for
testing, and so when we

719
00:35:53,284 --> 00:35:56,354
look at how much it costs
to test in a hospital,

720
00:35:56,354 --> 00:35:59,490
which you see on the left
compared to home-based

721
00:35:59,490 --> 00:36:03,361
testing intervention, at
the hospital, all of the

722
00:36:03,361 --> 00:36:06,030
staff are paid for, the
electricity is paid for,

723
00:36:06,030 --> 00:36:08,966
the facility is paid for
and you're not traveling.

724
00:36:08,966 --> 00:36:11,836
In the community, you're
having to travel, you're

725
00:36:11,836 --> 00:36:14,305
having to hire workers,
you're having to buy the

726
00:36:14,305 --> 00:36:16,574
stock, and there are a
number of different costs

727
00:36:16,574 --> 00:36:19,677
associated with
community-based testing

728
00:36:19,677 --> 00:36:22,647
that are not associated
with facility-based

729
00:36:22,647 --> 00:36:23,781
testing.

730
00:36:23,781 --> 00:36:27,051
So, if I'm trying to
determine the best way to

731
00:36:27,051 --> 00:36:31,322
spend my dollars and I see
that this graph shows

732
00:36:31,322 --> 00:36:37,161
me home-based testing is
more expensive, how should

733
00:36:37,161 --> 00:36:41,299
I choose to
spend my dollars?

734
00:36:41,299 --> 00:36:45,803
Is this enough to
make a decision on?

735
00:36:45,803 --> 00:36:49,106
So this is showing I'm
reaching the same number

736
00:36:49,106 --> 00:36:55,479
of individuals
with both services.

737
00:36:55,479 --> 00:37:03,754
Do I only go to
facility-based testing?

738
00:37:03,754 --> 00:37:09,827
I'm seeing some
head shakes.

739
00:37:09,827 --> 00:37:11,329
AUDIENCE MEMBER: Who
are you reaching?

740
00:37:11,329 --> 00:37:12,830
KARIN GICHUHI: Who are you
reaching at the facility?

741
00:37:12,830 --> 00:37:15,333
The people who are going
to the facility, and so

742
00:37:15,333 --> 00:37:17,468
one of the things we're
doing right now at this

743
00:37:17,468 --> 00:37:21,472
stage in the epidemic is,
we can still see that a

744
00:37:21,472 --> 00:37:24,041
percentage of the
population is infected and

745
00:37:24,041 --> 00:37:26,043
we're not reaching them,
and that percentage of the

746
00:37:26,043 --> 00:37:28,913
population is not showing
up to the clinic and

747
00:37:28,913 --> 00:37:31,716
asking to be tested.

748
00:37:31,716 --> 00:37:35,219
So context matters, and
context matters a lot when

749
00:37:35,219 --> 00:37:37,722
you're talking about
funding decisions because

750
00:37:37,722 --> 00:37:40,391
epidemics continue and
they change and they

751
00:37:40,391 --> 00:37:44,428
evolve, and if you decide
in a 10-minute discussion

752
00:37:44,428 --> 00:37:46,464
that you're no longer
going to fund one

753
00:37:46,464 --> 00:37:52,303
intervention based on a
look at a stellar graph,

754
00:37:52,303 --> 00:37:55,506
then you've lost a lot
of footwork that you've

755
00:37:55,506 --> 00:38:05,783
already made in trying
to reduce the epidemic.

756
00:38:05,783 --> 00:38:07,652
I wanted to go back
to this comment about

757
00:38:07,652 --> 00:38:11,188
critical consumer
of information.

758
00:38:11,188 --> 00:38:13,391
And I love this cartoon,
it's one of my favorites,

759
00:38:13,391 --> 00:38:16,394
and I like to ask people
to not be a Dogbert

760
00:38:16,394 --> 00:38:18,462
because there are a lot
of discussions right now

761
00:38:18,462 --> 00:38:20,965
about the importance of
Dashboard, so we have all

762
00:38:20,965 --> 00:38:23,134
this information available
to us, and if you could

763
00:38:23,134 --> 00:38:25,636
just compile it in one
place and put it on a

764
00:38:25,636 --> 00:38:28,039
Dashboard that would be
really easy for people to

765
00:38:28,039 --> 00:38:31,409
read, then we could just
go to that one place and

766
00:38:31,409 --> 00:38:34,645
make a lot of decisions.

767
00:38:34,645 --> 00:38:37,515
I won't read this for you,
but it really highlights

768
00:38:37,515 --> 00:38:42,153
the fact that sometimes we
ignore the data that's in

769
00:38:42,153 --> 00:38:44,388
front of us and make
decisions based on

770
00:38:44,388 --> 00:38:45,122
politics.

771
00:38:45,122 --> 00:38:49,894
And I hope that, as more
of you are fed information

772
00:38:49,894 --> 00:38:53,030
in the forms of graphs and
charts and data sets, and

773
00:38:53,030 --> 00:38:55,533
as more of you work with
them yourselves, that you

774
00:38:55,533 --> 00:38:58,235
will keep in mind the
importance of what we mean

775
00:38:58,235 --> 00:39:01,672
when we say: be a critical
consumer of information.

776
00:39:01,672 --> 00:39:02,773
Be curious.

777
00:39:02,773 --> 00:39:05,943
Ask lots of questions, and
it's okay to ask questions

778
00:39:05,943 --> 00:39:09,180
because if the data can
be defended people will

779
00:39:09,180 --> 00:39:11,916
answer those questions.

780
00:39:11,916 --> 00:39:14,919
When you're asking for
information, please be

781
00:39:14,919 --> 00:39:17,354
clear about the question
that you want answered.

782
00:39:17,354 --> 00:39:20,024
It's really frustrating
for those people who are

783
00:39:20,024 --> 00:39:23,194
responsible for providing
the information to figure

784
00:39:23,194 --> 00:39:26,230
out what you
actually want.

785
00:39:26,230 --> 00:39:29,567
And lastly, maintain
data integrity.

786
00:39:29,567 --> 00:39:32,169
Don't cherry
pick your data.

787
00:39:32,169 --> 00:39:34,939
You may be really good at
data visualization; use it

788
00:39:34,939 --> 00:39:39,543
for good and not evil.

789
00:39:39,543 --> 00:39:41,412
And lastly, I'll go back
to the title of the

790
00:39:41,412 --> 00:39:45,583
presentation: Evidence:
Is it all about

791
00:39:45,583 --> 00:39:48,052
interpretation?

792
00:39:48,052 --> 00:39:51,655
And I would posit
that sometimes it is.

793
00:39:51,655 --> 00:39:54,458
So, depending on who's
reviewing the information

794
00:39:54,458 --> 00:39:57,261
and who's putting the
information together,

795
00:39:57,261 --> 00:39:59,597
everyone has a
responsibility here to be

796
00:39:59,597 --> 00:40:02,233
a critical consumer
and producer of that

797
00:40:02,233 --> 00:40:04,235
information.

798
00:40:04,235 --> 00:40:06,370
On this last slide, I've
just listed notes and

799
00:40:06,370 --> 00:40:10,808
attribution about where
some of these pictures and

800
00:40:10,808 --> 00:40:13,711
references came from.

801
00:40:13,711 --> 00:40:17,414
If you want additional
information, I'm happy to

802
00:40:17,414 --> 00:40:20,918
share links to other
resources for you, but at

803
00:40:20,918 --> 00:40:23,187
this time I wanted to open
it up to see if there are

804
00:40:23,187 --> 00:40:30,427
any questions.

805
00:40:30,427 --> 00:40:32,263
And they've asked if you
have a question, that you

806
00:40:32,263 --> 00:40:34,665
physically walk down
the stairs in front of

807
00:40:34,665 --> 00:40:36,300
everyone and stand in
front of the microphone

808
00:40:36,300 --> 00:40:37,101
and use the microphone.

809
00:40:37,101 --> 00:40:46,343
Okay, I hope you're all
inspired to ask questions

810
00:40:46,343 --> 00:40:49,446
the next time someone
hands you a graph.

811
00:40:49,446 --> 00:40:50,481
Did that happen?

812
00:40:50,481 --> 00:40:51,882
Yes.

813
00:40:51,882 --> 00:40:52,550
Okay.

814
00:40:52,550 --> 00:40:53,350
Wonderful.

815
00:40:53,350 --> 00:40:54,318
Thank you so much
for your time.

816
00:40:54,318 --> 00:00:00,000
[Applause]


