1
00:00:00,930 --> 00:00:01,150
Hi.

2
00:00:01,150 --> 00:00:04,210
In the previous module, we started talking
about functional connectivity.

3
00:00:04,210 --> 00:00:09,010
So we talked about methods that use
correlation and partial correlation.

4
00:00:09,010 --> 00:00:11,370
In this module, we'll talk about a
different class of ways

5
00:00:11,370 --> 00:00:13,520
of measuring functional connectivity that
are

6
00:00:13,520 --> 00:00:16,220
based on using multivariate decomposition
methods.

7
00:00:17,690 --> 00:00:22,450
So we often use multivariate decomposition
methods to study functional connectivity.

8
00:00:22,450 --> 00:00:23,880
The reason is that they provide a

9
00:00:23,880 --> 00:00:26,500
decomposition of the data into separate
components.

10
00:00:26,500 --> 00:00:30,380
And this can be used to define coherent
brain networks and also

11
00:00:30,380 --> 00:00:32,300
provide information about how these
different

12
00:00:32,300 --> 00:00:34,130
brain regions interact with one another.

13
00:00:35,700 --> 00:00:39,804
Now, the two most common decomposition
methods that are used in fMRI data

14
00:00:39,804 --> 00:00:42,656
analysis are principal components
analysis, or

15
00:00:42,656 --> 00:00:46,380
PCA, and independent components analysis,
or ICA.

16
00:00:46,380 --> 00:00:47,480
There are other methods that are also

17
00:00:47,480 --> 00:00:49,950
used, such as the partial least squares method,
but

18
00:00:49,950 --> 00:00:53,080
these are the two most commonly used, and
we'll focus on them in this module.

19
00:00:55,710 --> 00:00:58,020
Throughout, we're going to organize the
data in a slightly different

20
00:00:58,020 --> 00:01:00,800
manner than we did when we were doing GLM
analysis.

21
00:01:00,800 --> 00:01:04,050
Here we're going to organize the fMRI data
over all voxels

22
00:01:04,050 --> 00:01:07,710
in the brain, as an M by N matrix X.

23
00:01:07,710 --> 00:01:09,730
So here, the row dimension is the number
of time

24
00:01:09,730 --> 00:01:12,550
points, and the column dimension
represents the number of voxels.

25
00:01:13,580 --> 00:01:16,580
So pictorially, we can represent X as
follows:

26
00:01:16,580 --> 00:01:21,030
where we have a time-by-voxels matrix.

27
00:01:21,030 --> 00:01:25,740
So, to differentiate this from the GLM:
in the GLM analysis, we would

28
00:01:25,740 --> 00:01:30,650
just look at one column of X at a time
when we were building these GLM models.

29
00:01:30,650 --> 00:01:32,980
Here we're looking at all the voxels at
the same time.

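(A minimal Python/NumPy sketch of this data organization; the array sizes, the toy
volume, and the file name are illustrative only, not from the lecture:)

    import numpy as np
    # import nibabel as nib   # one common way to load fMRI data

    # Toy stand-in for a 4-D fMRI volume of shape (x, y, z, time);
    # in practice something like: img = nib.load("func.nii.gz").get_fdata()
    rng = np.random.default_rng(0)
    img = rng.standard_normal((4, 4, 3, 100))

    T = img.shape[-1]            # number of time points
    X = img.reshape(-1, T).T     # flatten voxels, then transpose
    print(X.shape)               # (100, 48): a time-by-voxels matrix, as in the lecture
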
30
00:01:35,580 --> 00:01:40,020
Now principal components analysis is a
multivariate procedure which is concerned

31
00:01:40,020 --> 00:01:42,450
with explaining the variance-covariance
structure

32
00:01:42,450 --> 00:01:44,550
of a high dimensional random vector.

33
00:01:44,550 --> 00:01:47,240
So we have a high dimensional random
vector, and we want to find the

34
00:01:47,240 --> 00:01:49,720
direction in that vector that

35
00:01:49,720 --> 00:01:53,180
explains most of the variance-covariance
structure.

36
00:01:53,180 --> 00:01:57,780
In PCA, a set of correlated variables is
transformed, or rotated, to a set of

37
00:01:57,780 --> 00:02:00,620
uncorrelated variables that are ordered by
the amount

38
00:02:00,620 --> 00:02:02,670
of variability in the data that they
explain.

39
00:02:04,140 --> 00:02:08,090
And so in fMRI, principal components
analysis involves finding

40
00:02:08,090 --> 00:02:11,550
what they call spatial modes, or
eigenimages, in the data.

41
00:02:13,480 --> 00:02:15,505
These are patterns that
account for

42
00:02:15,505 --> 00:02:18,240
most of the variance-covariance structure
in the data.

43
00:02:18,240 --> 00:02:20,830
So basically, the first principal
component

44
00:02:20,830 --> 00:02:23,150
corresponds to the first eigenimage, which
is

45
00:02:23,150 --> 00:02:27,330
the pattern that accounts for most of the
variance-covariance structure in the data.

46
00:02:27,330 --> 00:02:31,290
The second eigenimage corresponds to the pattern
that explains the second

47
00:02:31,290 --> 00:02:34,720
most variance-covariance structure, conditional
on it being

48
00:02:34,720 --> 00:02:37,250
uncorrelated with the first eigenimage,
etcetera.

49
00:02:40,040 --> 00:02:42,860
Now, the nice thing about these
eigenimages is that they're ranked

50
00:02:42,860 --> 00:02:46,310
in order of the amount of
variation that they explain.

51
00:02:46,310 --> 00:02:47,930
So that tells us a little bit about

52
00:02:47,930 --> 00:02:50,510
which eigenimages are the most important
and which are

53
00:02:53,360 --> 00:02:53,560
not.

54
00:02:53,560 --> 00:02:55,450
The nice thing is that these eigenimages
can

55
00:02:55,450 --> 00:02:59,680
be obtained using singular value
decomposition, or SVD, which

56
00:02:59,680 --> 00:03:03,240
decomposes the data into two sets of
orthogonal vectors

57
00:03:03,240 --> 00:03:06,390
that ultimately correspond to patterns in
space and time.

58
00:03:07,460 --> 00:03:10,240
Principal component analysis is very
commonly

59
00:03:10,240 --> 00:03:11,750
used in statistics, but singular value

60
00:03:11,750 --> 00:03:13,470
decomposition is commonly used in linear

61
00:03:13,470 --> 00:03:15,850
algebra and lots of other
disciplines.

62
00:03:15,850 --> 00:03:20,120
And there's a nice duality between PCA
and SVD, which allows

63
00:03:20,120 --> 00:03:22,244
us to basically perform SVD analysis

64
00:03:22,244 --> 00:03:24,350
to get these principal components or
eigenimages.

65
00:03:27,680 --> 00:03:32,630
In general, singular value decomposition
is an operation that takes a matrix

66
00:03:32,630 --> 00:03:38,400
X such as the one that we have, and
decomposes that into three new matrices.

67
00:03:38,400 --> 00:03:41,740
One called U, one called S and one called
V.

68
00:03:43,050 --> 00:03:45,210
U and V have certain properties.

69
00:03:45,210 --> 00:03:47,380
One is that V transpose V is equal to the

70
00:03:47,380 --> 00:03:50,850
identity, and U transpose U is equal to
the identity.

71
00:03:50,850 --> 00:03:53,370
So the columns of both are orthonormal.

72
00:03:53,370 --> 00:03:59,180
Also, S is a diagonal matrix whose
elements are called singular values, so

73
00:03:59,180 --> 00:04:03,550
S is 0 everywhere except on the diagonal
where it takes these singular values.
where it takes these singular values.

74
00:04:04,910 --> 00:04:09,810
And these singular values are often ranked
from largest to smallest.

75
00:04:09,810 --> 00:04:14,870
So in the first row, first column,
we have the largest singular value; in the

76
00:04:14,870 --> 00:04:16,920
second row, second column, we have the next

77
00:04:16,920 --> 00:04:20,049
largest, etc., as we go down the
diagonal.

78
00:04:22,320 --> 00:04:27,010
So basically, by performing a singular
value decomposition of the data X,

79
00:04:27,010 --> 00:04:31,270
we can decompose X into three matrices: U,
S, and V.

80
00:04:32,350 --> 00:04:37,040
And so we can write this as X equal to U
S V transpose,

81
00:04:37,040 --> 00:04:42,420
or alternatively, we can break it down
because we have this diagonal format on S

82
00:04:42,420 --> 00:04:49,720
as the first singular value s1 times
the first column of U, which we're

83
00:04:49,720 --> 00:04:54,739
going to call u1, times the first column of
V, which we're going to call v1 transpose, and so on.

84
00:04:56,930 --> 00:04:57,760
Okay?

85
00:04:57,760 --> 00:05:03,490
And so basically, each of these gives us
some information about X.

86
00:05:03,490 --> 00:05:08,670
Interestingly, the columns of V, from v1 up
to vN,

87
00:05:08,670 --> 00:05:13,580
correspond to these eigenimages that we're
saying explain most of the

88
00:05:13,580 --> 00:05:17,810
variance-covariance structure, in order,
and the columns

89
00:05:17,810 --> 00:05:21,680
of U correspond to the time courses
associated with those eigenimages.

90
00:05:24,130 --> 00:05:30,240
So basically what the last equation shows
is that X can be decomposed into a number

91
00:05:30,240 --> 00:05:35,640
of different matrices, which all contain a
bit of information about the data.

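(A minimal NumPy sketch of this SVD step, using a toy time-by-voxels matrix; the
sizes and variable names are illustrative, not from the lecture:)

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 48))    # toy time-by-voxels data matrix

    # Economy-size SVD: X = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    s1 = s[0]         # largest singular value
    u1 = U[:, 0]      # time course of the first component
    v1 = Vt[0, :]     # first eigenimage, as a flattened vector over voxels

    # Rank-1 contribution of the first component: s1 * u1 * v1'
    X1 = s1 * np.outer(u1, v1)

    # Summing all such rank-1 terms reconstructs X exactly
    assert np.allclose(X, (U * s) @ Vt)
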
92
00:05:36,830 --> 00:05:40,150
So here we see this in a different format,
and this

93
00:05:40,150 --> 00:05:45,990
real data is equal to a series of matrices,
as shown in these green blocks.

94
00:05:45,990 --> 00:05:48,590
And so, each of these matrices consists of
three components.

95
00:05:48,590 --> 00:05:51,860
One is s1, which is a singular value.

96
00:05:51,860 --> 00:05:55,120
The other is u1, which is a time course.

97
00:05:55,120 --> 00:05:59,400
And then finally, we have v1 transpose,
which is the eigenimage.

98
00:05:59,400 --> 00:06:02,397
The second matrix consists of s2, which is

99
00:06:02,397 --> 00:06:05,799
a scalar (a singular value), times u2,
which is

100
00:06:05,799 --> 00:06:08,958
a time course, times v2 transpose, which
is

101
00:06:08,958 --> 00:06:13,108
the second eigenimage, et cetera, et
cetera, et cetera.

102
00:06:13,108 --> 00:06:16,496
So what we can do is take v1, which is
now a

103
00:06:16,496 --> 00:06:18,806
vector, and reshape it into an

104
00:06:18,806 --> 00:06:22,443
image format, and basically, we get the
eigenimage.

105
00:06:22,443 --> 00:06:26,485
And so this is the pattern of
activation that

106
00:06:26,485 --> 00:06:31,662
explains most of the variance-covariance
structure in the data set X.

107
00:06:31,662 --> 00:06:36,360
Now we see that u1 corresponds to the time
course that we see for this

108
00:06:36,360 --> 00:06:41,686
particular eigenimage, and it looks
like a drift of some sort.

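(A small sketch of that reshaping step, assuming X was built by flattening a volume
of known dimensions; the sizes here are toy values:)

    import numpy as np

    nx, ny, nz, T = 4, 4, 3, 100
    rng = np.random.default_rng(0)
    X = rng.standard_normal((T, nx * ny * nz))   # toy time-by-voxels matrix

    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    v1 = Vt[0, :]                                # first eigenimage as a vector

    # Undo the flattening that built X to view v1 as a 3-D map
    eigenimage_1 = v1.reshape(nx, ny, nz)

    # If a brain mask was used when flattening, put the values back inside it instead:
    # vol = np.zeros((nx, ny, nz)); vol[mask] = v1
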
109
00:06:41,686 --> 00:06:44,474
Similarly, we can reshape v2 to get the

110
00:06:44,474 --> 00:06:49,160
second eigenimage, and that shows a
different pattern.

111
00:06:49,160 --> 00:06:52,850
And that pattern has a time course
that corresponds to u2.

112
00:06:54,010 --> 00:06:57,120
So it's pretty neat here that we've
taken this data,

113
00:06:57,120 --> 00:07:01,710
X, and now we've decomposed it into a
series of

114
00:07:01,710 --> 00:07:06,100
spatial patterns and corresponding time
courses, which we can use to interpret

115
00:07:06,100 --> 00:07:09,200
and try to figure out what's going on in
this data set.

116
00:07:09,200 --> 00:07:11,960
And we did this without even knowing what
the data set was all about.

117
00:07:11,960 --> 00:07:14,590
So this is a very data driven way of
analyzing the data.

118
00:07:15,990 --> 00:07:20,470
Here's an example, and this is taken
from the web page of the late Keith Worsley.

119
00:07:20,470 --> 00:07:22,510
Here we see, for this data set,

120
00:07:22,510 --> 00:07:27,650
the corresponding first four eigenimages.

121
00:07:27,650 --> 00:07:30,640
We'll see it's really more of a three
dimensional entity displayed that way.

122
00:07:30,640 --> 00:07:32,204
So we have a bunch of slices here.

123
00:07:32,204 --> 00:07:36,413
And then we have the time course
corresponding to that, and also,

124
00:07:36,413 --> 00:07:38,897
using the singular values, we can compute

125
00:07:38,897 --> 00:07:42,172
the percent of variance explained by each
eigenimage.

126
00:07:42,172 --> 00:07:46,685
So the first eigenimage explains about 15%
of the variation.

127
00:07:46,685 --> 00:07:50,267
The second, 8% of the variation, etcetera,
etcetera.

128
00:07:50,267 --> 00:07:53,030
So they're ranked in order of
relative importance.

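(A sketch of that variance-explained computation from the singular values; the data
here are random, so the percentages are only illustrative:)

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 48))
    _, s, _ = np.linalg.svd(X, full_matrices=False)

    # The variance explained by each eigenimage is proportional to
    # its squared singular value.
    pct_var = 100 * s**2 / np.sum(s**2)
    print(pct_var[:4])   # percent of variance explained by the first four eigenimages
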
129
00:07:54,510 --> 00:07:57,050
So, this is a nice way of just decomposing

130
00:07:57,050 --> 00:08:00,000
an unknown data set into a bunch of
spatial and

131
00:08:00,000 --> 00:08:02,260
temporal components, which we can later
analyze and try

132
00:08:02,260 --> 00:08:04,190
to figure out what's going on in the data
set.

133
00:08:04,190 --> 00:08:08,696
So, that's basically what we try to do in
these multivariate decomposition methods.

134
00:08:09,820 --> 00:08:12,010
The second type of method that we often
use

135
00:08:12,010 --> 00:08:16,850
in this class is independent
component analysis, or ICA.

136
00:08:16,850 --> 00:08:20,930
So, ICA is a family of techniques used to
extract independent signals from

137
00:08:20,930 --> 00:08:25,690
a mixed source signal, and so it's often used
in blind source separation problems.

138
00:08:25,690 --> 00:08:28,110
So, ICA provides a method to blindly

139
00:08:28,110 --> 00:08:30,890
separate the data into spatially
independent components.

140
00:08:32,640 --> 00:08:35,680
So the key assumption here is that the
data set consists

141
00:08:35,680 --> 00:08:40,890
of p spatially independent components, which
are linearly mixed and spatially fixed.

142
00:08:40,890 --> 00:08:44,160
So this differentiates a little bit from
PCA, where we said

143
00:08:44,160 --> 00:08:48,300
that the spatial components had to be
uncorrelated with each other.

144
00:08:48,300 --> 00:08:51,329
Here they need to be independent which is
a slightly stronger condition.

145
00:08:53,400 --> 00:08:58,450
So in general, the ICA model decomposes X
into two different matrices, A and S.

146
00:08:59,940 --> 00:09:05,890
Here, A is often referred to as the mixing
matrix and S as the source matrix.

147
00:09:05,890 --> 00:09:09,790
And the goal here is, you want to find
these sources.

148
00:09:09,790 --> 00:09:13,650
So our goal is to find an unmixing matrix
W such that

149
00:09:13,650 --> 00:09:17,600
Y is equal to W X provides a good
approximation to S.

150
00:09:18,670 --> 00:09:20,740
So basically, our

151
00:09:20,740 --> 00:09:24,760
goal is to find these independent sources
S, but

152
00:09:24,760 --> 00:09:29,590
it's kind of a tricky problem because here
we have X, which is equal to A times S.

153
00:09:30,910 --> 00:09:33,050
Now if A were known, if the mixing

154
00:09:33,050 --> 00:09:35,680
matrix were known, this problem would be really
straightforward.

155
00:09:35,680 --> 00:09:39,330
We would just take the inverse
of A of some sort, or do a

156
00:09:39,330 --> 00:09:41,570
least squares solution, or something
like

157
00:09:41,570 --> 00:09:43,700
that, depending on the rank of the mixing
matrix.

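(To make that concrete, here is a toy sketch of recovering S by least squares when
the mixing matrix A really is known; the sizes and names are made up:)

    import numpy as np

    rng = np.random.default_rng(0)
    T, p, V = 100, 3, 48
    A = rng.standard_normal((T, p))   # known mixing matrix (time x components)
    S = rng.standard_normal((p, V))   # true sources (components x voxels)
    X = A @ S                         # observed data

    # With A known and of full column rank, least squares recovers S
    S_hat, *_ = np.linalg.lstsq(A, X, rcond=None)
    assert np.allclose(S_hat, S)
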
158
00:09:44,830 --> 00:09:47,150
However, in ICA we try to solve

159
00:09:47,150 --> 00:09:50,480
this problem without knowing the mixing
parameters.

160
00:09:50,480 --> 00:09:55,570
So we're basically assuming that both A
and S are unknown.

161
00:09:55,570 --> 00:09:58,170
So this is a very difficult problem, and
we

162
00:09:58,170 --> 00:10:01,270
can't solve this without making additional
assumptions, that is, some

163
00:10:01,270 --> 00:10:04,420
constraints on the problem, and in ICA,
the key

164
00:10:04,420 --> 00:10:09,160
assumptions are, first, that the mixing of
sources is linear.

165
00:10:09,160 --> 00:10:14,710
Second is that the components s_i, so
basically the rows of S, are

166
00:10:14,710 --> 00:10:17,630
going to be statistically independent of
one another, and

167
00:10:19,170 --> 00:10:22,000
then finally, the components s_i are
non-Gaussian.

168
00:10:22,000 --> 00:10:25,070
So this can be relaxed so that only one
component is

169
00:10:25,070 --> 00:10:28,250
allowed to be Gaussian, but all others have
to be non-Gaussian.

170
00:10:28,250 --> 00:10:31,700
So using those assumptions we're actually
able to solve

171
00:10:31,700 --> 00:10:33,149
this problem in a nice manner.

172
00:10:34,780 --> 00:10:37,680
So how do we perform ICA for fMRI?

173
00:10:37,680 --> 00:10:42,940
Well, let's assume that the fMRI data can be
modeled by identifying sets of voxels

174
00:10:42,940 --> 00:10:45,820
whose activity both varies together over
time and

175
00:10:45,820 --> 00:10:47,770
is different from the activity in other
sets.

176
00:10:49,270 --> 00:10:52,430
So what we do here is we typically try to
decompose the data set

177
00:10:52,430 --> 00:10:54,870
into a set of spatially independent
component

178
00:10:54,870 --> 00:10:57,590
maps with a set of corresponding time
courses.

179
00:10:58,740 --> 00:11:00,600
So here we have the cartoon again.

180
00:11:00,600 --> 00:11:04,440
Here we have our data set X, which is time
by voxels,

181
00:11:04,440 --> 00:11:06,800
and so what we want to do is we

182
00:11:06,800 --> 00:11:11,040
want to split this into two matrices, A
and S, where A

183
00:11:11,040 --> 00:11:14,560
is what we call the mixing matrix, and S
is the source matrix.

184
00:11:14,560 --> 00:11:16,170
And what's going to happen here is much
like

185
00:11:16,170 --> 00:11:19,710
PCA, the columns of A are going to
correspond

186
00:11:19,710 --> 00:11:23,010
to time courses, while the rows of S

187
00:11:23,010 --> 00:11:26,590
are going to correspond to spatially
independent components.

188
00:11:26,590 --> 00:11:29,130
And so what we do is we use an ICA

189
00:11:29,130 --> 00:11:33,710
algorithm to find both A and S using
simply the data.

190
00:11:33,710 --> 00:11:36,670
So, X is the data that we observe.

191
00:11:36,670 --> 00:11:40,390
We make assumptions about S,

192
00:11:40,390 --> 00:11:43,510
and then we use this to estimate both A and
S.

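(A minimal sketch of that spatial ICA step using scikit-learn's FastICA rather than
the MATLAB GIFT toolbox used for the example that follows; the data here are a
random stand-in, and voxels are treated as the samples so that the recovered
components are spatial maps:)

    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)
    T, V, p = 100, 500, 5
    X = rng.standard_normal((T, V))   # toy time-by-voxels data (real fMRI data would be non-Gaussian)

    # Spatial ICA: pass X transposed so that voxels play the role of samples
    ica = FastICA(n_components=p, random_state=0, max_iter=1000)
    S = ica.fit_transform(X.T).T      # rows of S: spatially independent maps (p x voxels)
    A = ica.mixing_                   # columns of A: component time courses (time x p)

    print(A.shape, S.shape)           # (100, 5) (5, 500); X is approximately A @ S plus the removed mean
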
193
00:11:45,250 --> 00:11:47,720
Here's an example of the results.

194
00:11:47,720 --> 00:11:54,500
And so here is one row of S that's been
reshaped to give us a spatial map.

195
00:11:54,500 --> 00:11:56,950
And here's the corresponding time course.

196
00:11:56,950 --> 00:11:58,940
And this is obtained using the MATLAB
toolbox GIFT.

197
00:11:58,940 --> 00:12:04,150
And as we see here, there seems to be
some sort of periodic signal,

198
00:12:04,150 --> 00:12:05,770
if you look at the time course, and

199
00:12:05,770 --> 00:12:08,280
there's activation in the visual and
motor regions.

200
00:12:08,280 --> 00:12:12,500
So this was a visual-motor task where they
repeatedly

201
00:12:12,500 --> 00:12:15,830
did some finger tapping and had a
light flashed in their eyes.

202
00:12:17,340 --> 00:12:21,580
Here's another independent component,
which in this case looks

203
00:12:21,580 --> 00:12:24,610
like it's around the boundaries of
the skull.

204
00:12:24,610 --> 00:12:27,710
So this might be movement related or
something like

205
00:12:27,710 --> 00:12:30,240
that, and so it's probably some sort of
nuisance signal.

206
00:12:30,240 --> 00:12:33,100
So this is something that we can pick up
using ICA analysis.

207
00:12:35,440 --> 00:12:40,200
So, unlike PCA, which assumes an orthonormality
constraint, ICA

208
00:12:40,200 --> 00:12:43,920
assumes statistical independence
among the collection of spatial patterns.

209
00:12:44,930 --> 00:12:48,860
So, independence is a stronger statistical
requirement than orthonormality.

210
00:12:50,060 --> 00:12:54,200
However, in ICA the spatially independent
components are not ranked

211
00:12:54,200 --> 00:12:57,230
in order of importance as they were when
performing the PCA.

212
00:12:57,230 --> 00:13:00,830
So in the PCA analysis, the first
eigenimage is

213
00:13:00,830 --> 00:13:03,910
the one that explains most of the
variance-covariance structure.

214
00:13:03,910 --> 00:13:07,670
In ICA, the components are only defined up
to a permutation, so

215
00:13:07,670 --> 00:13:10,550
they're not ranked by any real
rhyme or reason.

216
00:13:10,550 --> 00:13:12,810
So we have to sift through or find some

217
00:13:12,810 --> 00:13:15,679
alternative way to classify them in order
of importance.

218
00:13:17,630 --> 00:13:20,320
So in general, these multivariate
decomposition methods,

219
00:13:20,320 --> 00:13:22,500
which are PCA and ICA, are extremely

220
00:13:22,500 --> 00:13:24,000
useful when we don't have so much

221
00:13:24,000 --> 00:13:28,170
information about what's going on in the
experiment.

222
00:13:28,170 --> 00:13:32,200
For example, when we're doing task-free
fMRI or resting-state fMRI.

223
00:13:32,200 --> 00:13:35,510
They can also be very useful when
unexpected things are happening in

224
00:13:35,510 --> 00:13:40,750
the signal that we can't
model using a GLM analysis.

225
00:13:40,750 --> 00:13:42,800
So this allows us to kind of look through

226
00:13:42,800 --> 00:13:44,910
the data and try to figure out what's
going on.

227
00:13:44,910 --> 00:13:48,660
And so they're actually very useful
in that regard, and they're also

228
00:13:48,660 --> 00:13:52,200
very good for finding artifacts
in the data and things like that.

229
00:13:55,530 --> 00:13:57,310
Okay, so that's the end of this module.

230
00:13:57,310 --> 00:13:59,320
It was a very brief introduction to

231
00:13:59,320 --> 00:14:01,830
multivariate decomposition methods such as
PCA and

232
00:14:01,830 --> 00:14:04,262
ICA, and they're very useful ways of

233
00:14:04,262 --> 00:14:08,690
detecting functional connectivity,
and things like that.

234
00:14:08,690 --> 00:14:12,280
In the next module we'll start talking a
little bit about effective connectivity.

235
00:14:12,280 --> 00:14:13,750
Okay, I'll see you then.

236
00:14:13,750 --> 00:14:13,850
Bye.


