1
00:00:14,058 --> 00:00:18,149
welcome to this fourth lecture in the

2
00:00:18,149 --> 00:00:19,500
series of lectures on numerical

3
00:00:19,500 --> 00:00:24,599
optimization so so far in the last two

4
00:00:24,599 --> 00:00:26,879
lectures we saw some mathematical

5
00:00:26,879 --> 00:00:30,079
background needed for this course and

6
00:00:30,079 --> 00:00:34,530
some of those concepts will be useful

7
00:00:34,530 --> 00:00:38,039
for understanding from the theory that

8
00:00:38,039 --> 00:00:41,399
we are going to see today especially the

9
00:00:41,399 --> 00:00:43,878
concepts on differential calculus and

10
00:00:43,878 --> 00:00:49,308
some analysis so in today's lecture we

11
00:00:49,308 --> 00:00:53,090
will look at unconstrained optimization

12
00:00:53,090 --> 00:00:57,539
problems how how to solve those problems

13
00:00:57,539 --> 00:01:00,149
what are the necessary and sufficient

14
00:01:00,149 --> 00:01:01,920
conditions for the solutions of those

15
00:01:01,920 --> 00:01:07,439
problems so to begin with let us look at

16
00:01:07,439 --> 00:01:11,790
a constrained optimization problem which

17
00:01:11,790 --> 00:01:17,280
is given here so you have a function f

18
00:01:17,280 --> 00:01:22,078
from the domain X to R and X is a subset

19
00:01:22,078 --> 00:01:26,280
of RN and our aim is to minimize the

20
00:01:26,280 --> 00:01:30,540
function f where X belongs to the set X

21
00:01:30,540 --> 00:01:32,040
so this is a constrained optimization

22
00:01:32,040 --> 00:01:35,368
problem but any unconstrained

23
00:01:35,368 --> 00:01:37,618
optimization problem is a special case

24
00:01:37,618 --> 00:01:39,840
of this constrained optimization problem

25
00:01:39,840 --> 00:01:42,840
in the sense that if I replace this X by

26
00:01:42,840 --> 00:01:45,868
RN then it becomes a unconstrained

27
00:01:45,868 --> 00:01:47,909
optimization problem so our aim is to

28
00:01:47,909 --> 00:01:50,250
solve this problem and for that we need

29
00:01:50,250 --> 00:01:55,769
the definition of a minimum ok so X star

30
00:01:55,769 --> 00:01:59,280
and belonging to the set X is said to be

31
00:01:59,280 --> 00:02:03,060
a global minimum of F over X if the

32
00:02:03,060 --> 00:02:05,938
value of x star the value of the

33
00:02:05,938 --> 00:02:09,899
function at X star is less than or equal

34
00:02:09,899 --> 00:02:11,639
to the value of the function at any

35
00:02:11,639 --> 00:02:14,490
other point in the set X so such a point

36
00:02:14,490 --> 00:02:17,400
X star is said to be a global minimum

37
00:02:17,400 --> 00:02:22,310
for example let us consider a problem

38
00:02:22,310 --> 00:02:26,489
where you have a function on the Left

39
00:02:26,489 --> 00:02:28,259
panel which is shown in the black line

40
00:02:28,259 --> 00:02:30,840
and let us assume that the constraint

41
00:02:30,840 --> 00:02:35,519
set is only this interval then you will

42
00:02:35,519 --> 00:02:38,519
see that the global minimum of this

43
00:02:38,519 --> 00:02:42,840
problem is it occurs at this X and the

44
00:02:42,840 --> 00:02:44,579
corresponding minimum value of the

45
00:02:44,579 --> 00:02:47,189
objective function is shown here now on

46
00:02:47,189 --> 00:02:48,479
the right panel you will see another

47
00:02:48,479 --> 00:02:51,598
function which is not as nice as the

48
00:02:51,598 --> 00:02:53,759
function on the Left panel so you will

49
00:02:53,759 --> 00:02:55,680
see that there are lots of peaks and

50
00:02:55,680 --> 00:02:58,079
lots of valleys and suppose we restrict

51
00:02:58,079 --> 00:03:02,098
ourselves to this interval and assuming

52
00:03:02,098 --> 00:03:06,750
that the function goes to plus infinity

53
00:03:06,750 --> 00:03:09,209
beyond these points although it is not

54
00:03:09,209 --> 00:03:11,280
of interest to us we are only interested

55
00:03:11,280 --> 00:03:14,400
in this small interval and you will see

56
00:03:14,400 --> 00:03:17,068
that in this interval this is the point

57
00:03:17,068 --> 00:03:19,789
where the function achieves the least

58
00:03:19,789 --> 00:03:23,549
value and the corresponding X star turns

59
00:03:23,549 --> 00:03:25,530
out to be this so among all possible

60
00:03:25,530 --> 00:03:29,639
points in the domain of that function on

61
00:03:29,639 --> 00:03:31,699
which we want to minimize the function

62
00:03:31,699 --> 00:03:34,560
this is the point where the function ID

63
00:03:34,560 --> 00:03:35,098
achieves

64
00:03:35,098 --> 00:03:38,299
the least value so this is called a

65
00:03:38,299 --> 00:03:44,400
global minimum similarly one can define

66
00:03:44,400 --> 00:03:47,789
what is called a global maximum so in

67
00:03:47,789 --> 00:03:49,318
the definition of global maximum this

68
00:03:49,318 --> 00:03:51,810
inequality will will be reversed so f of

69
00:03:51,810 --> 00:03:54,120
X star greater than or equal to f of X

70
00:03:54,120 --> 00:03:57,120
for all X belongs to X for X star to be

71
00:03:57,120 --> 00:04:01,439
a global maximum now in this course many

72
00:04:01,439 --> 00:04:03,530
times I will be talking about

73
00:04:03,530 --> 00:04:06,090
minimization problems of this type but

74
00:04:06,090 --> 00:04:09,180
the ideas that we discuss here can be

75
00:04:09,180 --> 00:04:12,120
easily extended by writing to the

76
00:04:12,120 --> 00:04:13,919
maximization problem by writing a

77
00:04:13,919 --> 00:04:15,750
maximization problem as a minimization

78
00:04:15,750 --> 00:04:20,430
problem which we saw how to do that was

79
00:04:20,430 --> 00:04:24,990
seen in the first lecture now the

80
00:04:24,990 --> 00:04:26,759
question is that are there any

81
00:04:26,759 --> 00:04:31,139
conditions on F or on X or both

82
00:04:31,139 --> 00:04:34,560
so that the function f does attain its

83
00:04:34,560 --> 00:04:38,040
maximum or minimum in the set X so can

84
00:04:38,040 --> 00:04:40,889
we have some conditions on this F as

85
00:04:40,889 --> 00:04:45,810
well as on X so that the the minimum or

86
00:04:45,810 --> 00:04:48,050
maximum is guaranteed in the set X now

87
00:04:48,050 --> 00:04:51,240
for that let us look at some examples so

88
00:04:51,240 --> 00:04:55,050
let us take a function FX equal to X

89
00:04:55,050 --> 00:04:59,550
cube where the domain of the function is

90
00:04:59,550 --> 00:05:01,800
a set of real numbers and the function

91
00:05:01,800 --> 00:05:04,920
is defined from R to R okay now the

92
00:05:04,920 --> 00:05:09,149
function is shown here in this plot so

93
00:05:09,149 --> 00:05:11,360
on the right side the function goes to

94
00:05:11,360 --> 00:05:14,639
plus infinity as X goes to plus infinity

95
00:05:14,639 --> 00:05:16,410
and on the left side the function goes

96
00:05:16,410 --> 00:05:18,600
to when X goes to minus infinity the

97
00:05:18,600 --> 00:05:21,509
function goes to minus infinity now you

98
00:05:21,509 --> 00:05:25,170
will see that over the domain X equal to

99
00:05:25,170 --> 00:05:27,839
R this function has neither a minimum

100
00:05:27,839 --> 00:05:29,639
nor a maximum because the function

101
00:05:29,639 --> 00:05:31,410
extends to plus infinity and minus

102
00:05:31,410 --> 00:05:37,379
infinity so it has no minimum or no

103
00:05:37,379 --> 00:05:41,550
maximum now in this case if you look at

104
00:05:41,550 --> 00:05:44,189
the domain set which is X that's a

105
00:05:44,189 --> 00:05:47,579
closed set the set of real numbers R

106
00:05:47,579 --> 00:05:49,740
that's a closed set but it's not a

107
00:05:49,740 --> 00:05:54,839
bounded set so so that means naturally X

108
00:05:54,839 --> 00:05:58,589
is not a compact set now do we want X to

109
00:05:58,589 --> 00:06:03,029
be bounded and not necessarily closed we

110
00:06:03,029 --> 00:06:04,769
do not know as of now let us see another

111
00:06:04,769 --> 00:06:13,470
example now here is an example where the

112
00:06:13,470 --> 00:06:18,379
domain X is a open interval A to B and

113
00:06:18,379 --> 00:06:21,149
the function f is from X to R and

114
00:06:21,149 --> 00:06:24,209
defined as f of X equal to X so the

115
00:06:24,209 --> 00:06:25,439
function is shown in this figure

116
00:06:25,439 --> 00:06:29,040
and so you will see that these endpoints

117
00:06:29,040 --> 00:06:31,620
are not included in the domain so that's

118
00:06:31,620 --> 00:06:36,000
why they are shown using some special

119
00:06:36,000 --> 00:06:41,360
symbols now if you look at this domain

120
00:06:41,360 --> 00:06:44,639
this domain is now bounded so

121
00:06:44,639 --> 00:06:47,459
like the previous case where the domain

122
00:06:47,459 --> 00:06:50,149
was not bounded here the domain is

123
00:06:50,149 --> 00:06:54,810
bounded but it is not closed so I again

124
00:06:54,810 --> 00:06:57,778
X is not a compact set now you will see

125
00:06:57,778 --> 00:07:04,110
that if attends neither a minimum nor a

126
00:07:04,110 --> 00:07:08,550
maximum on X because F of a and F of B

127
00:07:08,550 --> 00:07:12,589
are not defined so we cannot have a

128
00:07:12,589 --> 00:07:17,490
minimum value of the function in this

129
00:07:17,490 --> 00:07:21,538
open interval A to B but in this case

130
00:07:21,538 --> 00:07:25,468
you can get a bound on this minimum and

131
00:07:25,468 --> 00:07:28,550
maximum values and those are called

132
00:07:28,550 --> 00:07:32,519
infimum and supremum so the infimum of

133
00:07:32,519 --> 00:07:37,408
this function is f of a which is

134
00:07:37,408 --> 00:07:40,139
attained at X equal to a and the

135
00:07:40,139 --> 00:07:42,209
supremum of this function is f of B

136
00:07:42,209 --> 00:07:47,430
which is attained at B but remember that

137
00:07:47,430 --> 00:07:50,728
a and B do not belong to the domain so

138
00:07:50,728 --> 00:07:52,918
we cannot say that the minimum or

139
00:07:52,918 --> 00:07:55,649
maximum is attained but in this case the

140
00:07:55,649 --> 00:08:02,668
infimum and supremum are possible so so

141
00:08:02,668 --> 00:08:04,468
far we have considered two examples

142
00:08:04,468 --> 00:08:09,778
where the domain X in one case was not

143
00:08:09,778 --> 00:08:12,750
bounded but closed in the other case it

144
00:08:12,750 --> 00:08:16,439
was bounded but not closed so that that

145
00:08:16,439 --> 00:08:18,240
essentially means that the domain was

146
00:08:18,240 --> 00:08:24,930
not a compact set now so does that does

147
00:08:24,930 --> 00:08:28,019
that mean that do we need the X to be

148
00:08:28,019 --> 00:08:32,219
compact to attain a minimum okay so let

149
00:08:32,219 --> 00:08:34,948
us look at one example so let us

150
00:08:34,948 --> 00:08:38,190
consider domain X to be the closed

151
00:08:38,190 --> 00:08:41,190
interval minus 1 to 1 now it is a closed

152
00:08:41,190 --> 00:08:44,759
set as well as a bounded set so hence it

153
00:08:44,759 --> 00:08:47,220
is a compact set and let us define a

154
00:08:47,220 --> 00:08:50,909
function f from X to R as f of X equal

155
00:08:50,909 --> 00:08:53,818
to X if X is in the open interval minus

156
00:08:53,818 --> 00:08:57,269
1 to 1 and 0 otherwise so in the open

157
00:08:57,269 --> 00:08:58,590
interval minus 1/2

158
00:08:58,590 --> 00:09:02,009
the function is a straight line and at

159
00:09:02,009 --> 00:09:04,700
the endpoints the function has a value

160
00:09:04,700 --> 00:09:10,379
zero so clearly you will see that the

161
00:09:10,379 --> 00:09:12,870
function is a discontinuous function on

162
00:09:12,870 --> 00:09:17,610
the site X now again we can say that the

163
00:09:17,610 --> 00:09:23,028
F does not attain a minimum or a maximum

164
00:09:23,028 --> 00:09:29,879
in this set X so remember that here X is

165
00:09:29,879 --> 00:09:33,570
a compact set and F is not continuous

166
00:09:33,570 --> 00:09:39,289
so compactness of X alone was not enough

167
00:09:39,289 --> 00:09:41,669
probably we need something some more

168
00:09:41,669 --> 00:09:44,600
conditions on F and that condition is

169
00:09:44,600 --> 00:09:47,879
the continuity of function f on X so

170
00:09:47,879 --> 00:09:51,149
suppose if we have X to be a compact set

171
00:09:51,149 --> 00:09:53,940
and F to be a continuous function on the

172
00:09:53,940 --> 00:10:00,240
set X then can we say something so these

173
00:10:00,240 --> 00:10:04,589
leads us to the wise of stress the the

174
00:10:04,589 --> 00:10:08,490
Weierstrass theorem which states that if

175
00:10:08,490 --> 00:10:11,549
X which is a subset of RN is a nonempty

176
00:10:11,549 --> 00:10:16,409
compact set and F is from X to R is a

177
00:10:16,409 --> 00:10:19,710
continuous function on X then F does

178
00:10:19,710 --> 00:10:23,639
attain a maximum and a minimum of X that

179
00:10:23,639 --> 00:10:26,730
is there exists some x1 and some x2

180
00:10:26,730 --> 00:10:31,139
belonging to X such that f of X is less

181
00:10:31,139 --> 00:10:32,970
than or equal to f of X 1 for all X

182
00:10:32,970 --> 00:10:35,429
belongs to X and f of X greater than or

183
00:10:35,429 --> 00:10:37,409
equal to f of X 2 for all X belonging to

184
00:10:37,409 --> 00:10:41,580
X so so this theorem was proposed by

185
00:10:41,580 --> 00:10:43,830
Weierstrass will not go into the proof

186
00:10:43,830 --> 00:10:48,509
of this theorem but it gives us

187
00:10:48,509 --> 00:10:50,730
sufficient conditions for the existence

188
00:10:50,730 --> 00:10:55,409
of optima that is the if f is a function

189
00:10:55,409 --> 00:10:57,870
defined from X to R then we need X to be

190
00:10:57,870 --> 00:11:00,929
compact and F to be a continuous

191
00:11:00,929 --> 00:11:04,200
function on X then it is guaranteed that

192
00:11:04,200 --> 00:11:07,320
the there exists two points X 1 and X 2

193
00:11:07,320 --> 00:11:10,200
such that f of X 1 greater than or equal

194
00:11:10,200 --> 00:11:12,120
to f of X greater than or equal to f of

195
00:11:12,120 --> 00:11:12,480
X 2

196
00:11:12,480 --> 00:11:16,679
for all X belongs to X so as I said that

197
00:11:16,679 --> 00:11:18,870
this Weierstrass theorem just provides a

198
00:11:18,870 --> 00:11:20,909
sufficient consistent exist conditions

199
00:11:20,909 --> 00:11:26,250
for existence of Optima but it does not

200
00:11:26,250 --> 00:11:28,320
give you the necessary conditions for

201
00:11:28,320 --> 00:11:34,100
the existence of Optima for example

202
00:11:34,100 --> 00:11:38,789
suppose we take a function f which is I

203
00:11:38,789 --> 00:11:40,860
haven't written the explicit form of

204
00:11:40,860 --> 00:11:42,720
this function here but assume that the

205
00:11:42,720 --> 00:11:45,299
function has a form like this now the

206
00:11:45,299 --> 00:11:48,120
function is defined on a closed interval

207
00:11:48,120 --> 00:11:51,110
A to B the closed interval A to B is a

208
00:11:51,110 --> 00:11:54,000
closed and bounded set and hence it is

209
00:11:54,000 --> 00:11:56,820
compact but the function f is not

210
00:11:56,820 --> 00:11:59,519
continuous here you will see that there

211
00:11:59,519 --> 00:12:02,490
is a break in the function and this at

212
00:12:02,490 --> 00:12:06,208
this point now although if it is not

213
00:12:06,208 --> 00:12:09,059
continuous you will see that the

214
00:12:09,059 --> 00:12:12,929
function does attain a minimum at this

215
00:12:12,929 --> 00:12:14,370
point and the minimum value of this

216
00:12:14,370 --> 00:12:18,870
function is somewhere here so although F

217
00:12:18,870 --> 00:12:22,289
X is not continuous F does attain a

218
00:12:22,289 --> 00:12:24,659
minimum in the closed interval a B for

219
00:12:24,659 --> 00:12:31,259
this case so this example illustrates

220
00:12:31,259 --> 00:12:34,948
that and the conditions of vyse Trust

221
00:12:34,948 --> 00:12:38,429
theorem are not made but the function

222
00:12:38,429 --> 00:12:42,360
does have a minimum right so why stress

223
00:12:42,360 --> 00:12:45,240
theorem does not talk about necessary

224
00:12:45,240 --> 00:12:50,458
conditions for existence of Optima it

225
00:12:50,458 --> 00:12:51,958
only talks about the sufficient

226
00:12:51,958 --> 00:12:54,659
conditions for Optima now here is a

227
00:12:54,659 --> 00:12:59,818
another example let us define a function

228
00:12:59,818 --> 00:13:04,919
FX as X minus 2 square and defined from

229
00:13:04,919 --> 00:13:06,899
R to R and the function is shown here

230
00:13:06,899 --> 00:13:08,789
now you will see that the function is

231
00:13:08,789 --> 00:13:12,089
continuous what X is not compact the set

232
00:13:12,089 --> 00:13:15,500
R of real numbers is not a compact set

233
00:13:15,500 --> 00:13:18,958
so again the conditions required by

234
00:13:18,958 --> 00:13:21,539
Weierstrass theorem are not met because

235
00:13:21,539 --> 00:13:24,059
X is not compact but despite this fact

236
00:13:24,059 --> 00:13:25,720
the function

237
00:13:25,720 --> 00:13:28,539
does attained a minimum at this point

238
00:13:28,539 --> 00:13:30,089
and the corresponding X star is

239
00:13:30,089 --> 00:13:34,389
somewhere here so we'll see so you will

240
00:13:34,389 --> 00:13:38,350
see that Weierstrass theorem just gives

241
00:13:38,350 --> 00:13:40,389
sufficient conditions for existence of

242
00:13:40,389 --> 00:13:44,139
optima or up many optimum points but

243
00:13:44,139 --> 00:13:48,339
does not give the necessary conditions

244
00:13:48,339 --> 00:13:51,429
for the existence of Optima but anyway

245
00:13:51,429 --> 00:13:53,339
it is a useful theorem in the sense that

246
00:13:53,339 --> 00:13:56,769
under if those conditions are made like

247
00:13:56,769 --> 00:13:59,589
if F is a function from X to R and X is

248
00:13:59,589 --> 00:14:03,629
compact and F is continuous on X then

249
00:14:03,629 --> 00:14:07,958
the minimum and the maximum are of the

250
00:14:07,958 --> 00:14:14,370
function f are guaranteed in the set X

251
00:14:14,370 --> 00:14:21,149
so now let us look at a general problem

252
00:14:21,149 --> 00:14:24,789
so you will see that on the Left panel

253
00:14:24,789 --> 00:14:26,470
we will see that figure which we saw

254
00:14:26,470 --> 00:14:29,318
earlier in this lecture that the

255
00:14:29,318 --> 00:14:31,889
function is very nice and attains

256
00:14:31,889 --> 00:14:35,679
minimum at a single point and on the

257
00:14:35,679 --> 00:14:37,208
right panel you will see a function

258
00:14:37,208 --> 00:14:39,159
where there are lots of peaks and

259
00:14:39,159 --> 00:14:42,399
valleys and if you look at this point

260
00:14:42,399 --> 00:14:46,419
for example so in the neighborhood you

261
00:14:46,419 --> 00:14:48,159
will see that the function is increasing

262
00:14:48,159 --> 00:14:52,659
and the function has the least value in

263
00:14:52,659 --> 00:14:54,159
this if you consider a neighborhood

264
00:14:54,159 --> 00:14:57,429
around this point but this value is

265
00:14:57,429 --> 00:15:00,549
still higher than the value which is

266
00:15:00,549 --> 00:15:03,669
called a global minima right now it

267
00:15:03,669 --> 00:15:05,799
turns out that finding this global

268
00:15:05,799 --> 00:15:14,259
minima is difficult for example it is

269
00:15:14,259 --> 00:15:18,549
very difficult to characterize or find

270
00:15:18,549 --> 00:15:20,500
the global minimum for a general

271
00:15:20,500 --> 00:15:22,948
nonlinear function so if you have a

272
00:15:22,948 --> 00:15:26,889
function which is nice as on the left

273
00:15:26,889 --> 00:15:29,379
panel it is easy to find the global

274
00:15:29,379 --> 00:15:33,339
minimum while on the song on the right

275
00:15:33,339 --> 00:15:36,720
panel you will see a function whose

276
00:15:36,720 --> 00:15:39,448
global minimum is very difficult to find

277
00:15:39,448 --> 00:15:41,698
and suppose we find this point which is

278
00:15:41,698 --> 00:15:47,188
a global minimum then can we say that it

279
00:15:47,188 --> 00:15:49,470
is a global minimum so for that purpose

280
00:15:49,470 --> 00:15:51,418
we need to look at the value of the

281
00:15:51,418 --> 00:15:54,389
function at all the points and to

282
00:15:54,389 --> 00:15:58,078
ascertain that f of X star is less than

283
00:15:58,078 --> 00:16:00,328
or equal to f of X for all X belong to X

284
00:16:00,328 --> 00:16:02,668
and that's going to be a very difficult

285
00:16:02,668 --> 00:16:07,909
task so identifying a global minimum or

286
00:16:07,909 --> 00:16:10,828
characterizing it is very difficult for

287
00:16:10,828 --> 00:16:12,688
a general nonlinear function only in

288
00:16:12,688 --> 00:16:15,539
some special cases one can characterize

289
00:16:15,539 --> 00:16:18,149
them very easily we will see those cases

290
00:16:18,149 --> 00:16:20,068
sometime later in this course but as of

291
00:16:20,068 --> 00:16:23,308
now we know that it is difficult to

292
00:16:23,308 --> 00:16:26,399
characterize this one this global minima

293
00:16:26,399 --> 00:16:30,928
so it resulted in the concept of a local

294
00:16:30,928 --> 00:16:34,230
minimum so again let us consider X to be

295
00:16:34,230 --> 00:16:38,009
a subset of RN and F to be a function

296
00:16:38,009 --> 00:16:39,749
from X to R and let us consider a

297
00:16:39,749 --> 00:16:42,958
constrained optimization problem and we

298
00:16:42,958 --> 00:16:44,548
are trying to minimize f of X subject to

299
00:16:44,548 --> 00:16:47,818
X belongs to X now X star belongs to X

300
00:16:47,818 --> 00:16:53,879
is said to be a local minimum F if there

301
00:16:53,879 --> 00:16:58,528
is a delta greater than 0 such that in

302
00:16:58,528 --> 00:17:01,668
the Delta neighborhood of x star the

303
00:17:01,668 --> 00:17:04,828
value of the function is less than or

304
00:17:04,828 --> 00:17:07,648
equal to the value of the function at

305
00:17:07,648 --> 00:17:15,568
all points in the neighborhood so so if

306
00:17:15,568 --> 00:17:20,038
you look at this this point in the

307
00:17:20,038 --> 00:17:22,048
neighborhood you will see that the value

308
00:17:22,048 --> 00:17:25,949
of the function is less than or equal to

309
00:17:25,949 --> 00:17:27,388
all the points so if you take a small

310
00:17:27,388 --> 00:17:30,659
neighborhood remember that Delta in the

311
00:17:30,659 --> 00:17:32,369
definition of local minimum can be very

312
00:17:32,369 --> 00:17:33,509
small so if you take a small

313
00:17:33,509 --> 00:17:36,359
neighborhood around this point and find

314
00:17:36,359 --> 00:17:37,589
out the function values at all the

315
00:17:37,589 --> 00:17:40,798
points you will see that the value of

316
00:17:40,798 --> 00:17:46,288
the function is at least the value of

317
00:17:46,288 --> 00:17:48,599
all the value of the function at all the

318
00:17:48,599 --> 00:17:50,538
points in the neighborhood so similar

319
00:17:50,538 --> 00:17:52,779
similarly in this case it

320
00:17:52,779 --> 00:17:56,829
true that the value of the function is

321
00:17:56,829 --> 00:17:58,299
at least the value of the function in

322
00:17:58,299 --> 00:18:00,519
the neighborhood however small

323
00:18:00,519 --> 00:18:02,400
neighborhood that you would take and

324
00:18:02,400 --> 00:18:07,299
same is true in this case also but here

325
00:18:07,299 --> 00:18:09,369
interestingly the function is almost

326
00:18:09,369 --> 00:18:12,700
flat in some range and then in one

327
00:18:12,700 --> 00:18:14,289
direction it starts increasing in the

328
00:18:14,289 --> 00:18:17,769
other direction it starts decreasing so

329
00:18:17,769 --> 00:18:23,859
such points in which the function

330
00:18:23,859 --> 00:18:25,900
increases in one direction decreases in

331
00:18:25,900 --> 00:18:28,299
the other direction and they are they

332
00:18:28,299 --> 00:18:32,009
are called the saddle points so and

333
00:18:32,009 --> 00:18:35,470
similarly in this case of course this is

334
00:18:35,470 --> 00:18:38,079
a global minimum so it also is a local

335
00:18:38,079 --> 00:18:40,029
minimum because in the neighborhood the

336
00:18:40,029 --> 00:18:48,849
function value increases so from this

337
00:18:48,849 --> 00:18:52,869
definition one can extend this

338
00:18:52,869 --> 00:18:56,650
definition to the definition of local

339
00:18:56,650 --> 00:18:59,200
maximum so again as I said earlier that

340
00:18:59,200 --> 00:19:02,859
the inequality reverses in this

341
00:19:02,859 --> 00:19:04,599
definition so again one has to take a

342
00:19:04,599 --> 00:19:08,319
local neighborhood can be sufficiently

343
00:19:08,319 --> 00:19:10,599
small and find out the value of the

344
00:19:10,599 --> 00:19:12,549
function in the local neighborhood and

345
00:19:12,549 --> 00:19:16,960
you will see that these points are local

346
00:19:16,960 --> 00:19:23,589
Maxima ok so the definitions of local

347
00:19:23,589 --> 00:19:26,680
minimum and local maximum they are

348
00:19:26,680 --> 00:19:30,220
centered around the neighborhood of a

349
00:19:30,220 --> 00:19:35,589
given point now let us consider one

350
00:19:35,589 --> 00:19:39,970
example so here the function is shown

351
00:19:39,970 --> 00:19:44,319
here now let us take a point X star now

352
00:19:44,319 --> 00:19:47,319
X star by the definition of local

353
00:19:47,319 --> 00:19:48,940
minimum that there exists a neighborhood

354
00:19:48,940 --> 00:19:51,609
around X star so that the value of the

355
00:19:51,609 --> 00:19:54,369
function is f of X is greater than or

356
00:19:54,369 --> 00:19:58,210
equal to f of X star so because of that

357
00:19:58,210 --> 00:20:01,480
definition there is a problem in this

358
00:20:01,480 --> 00:20:04,150
case mainly that since we are allowing f

359
00:20:04,150 --> 00:20:06,220
of X star greater than or equal to F

360
00:20:06,220 --> 00:20:08,470
X for all X in the neighborhood of X

361
00:20:08,470 --> 00:20:11,288
star so you will see that X star turns

362
00:20:11,288 --> 00:20:13,869
out to be a local minimum because of the

363
00:20:13,869 --> 00:20:15,730
definition but by looking at the

364
00:20:15,730 --> 00:20:18,220
function assuming that the function goes

365
00:20:18,220 --> 00:20:21,429
to minus infinity in both the directions

366
00:20:21,429 --> 00:20:23,048
when X goes to plus infinity and minus

367
00:20:23,048 --> 00:20:26,079
infinity so you will see that X star

368
00:20:26,079 --> 00:20:31,390
indeed is a global maximum so you have X

369
00:20:31,390 --> 00:20:34,119
star is a to be a local mean and as well

370
00:20:34,119 --> 00:20:38,859
as a global maximum now to avoid such

371
00:20:38,859 --> 00:20:42,640
cases one can think of a definition of a

372
00:20:42,640 --> 00:20:45,490
strict local minimum now X star

373
00:20:45,490 --> 00:20:48,009
belonging to X is said to be a strict

374
00:20:48,009 --> 00:20:51,700
local minimum of F if f of X star is

375
00:20:51,700 --> 00:20:54,669
less than strictly less than f of X for

376
00:20:54,669 --> 00:20:58,750
all X in the neighborhood of X Delta

377
00:20:58,750 --> 00:21:01,240
neighborhood of X star and X not equal

378
00:21:01,240 --> 00:21:05,079
to X star so you will see that then if

379
00:21:05,079 --> 00:21:11,190
we use this definition then X star Iza

380
00:21:11,190 --> 00:21:14,470
is a local mean but not a strict local

381
00:21:14,470 --> 00:21:20,429
mean and also it is a global max because

382
00:21:20,429 --> 00:21:23,679
the function value is highest at X star

383
00:21:23,679 --> 00:21:26,500
of course at the other point in the at

384
00:21:26,500 --> 00:21:27,909
the other neighboring points also the

385
00:21:27,909 --> 00:21:29,980
function value is highest so remember

386
00:21:29,980 --> 00:21:33,700
that the global minimum or global

387
00:21:33,700 --> 00:21:36,069
maximum they need not be unique so this

388
00:21:36,069 --> 00:21:38,529
is one example where you have a multiple

389
00:21:38,529 --> 00:21:45,009
global Maxima and X star is not a strict

390
00:21:45,009 --> 00:21:47,079
local mean so all these points which are

391
00:21:47,079 --> 00:21:49,029
in the neighborhood they are not strict

392
00:21:49,029 --> 00:21:54,730
local minima so now let us look at the

393
00:21:54,730 --> 00:21:58,509
function again the fungus function we

394
00:21:58,509 --> 00:22:00,909
saw earlier so these are the points

395
00:22:00,909 --> 00:22:06,490
which are local minima and not only

396
00:22:06,490 --> 00:22:07,839
local minima they are strict local

397
00:22:07,839 --> 00:22:11,558
minima now this is a point which is a

398
00:22:11,558 --> 00:22:15,159
weak local minimum so you will see that

399
00:22:15,159 --> 00:22:18,250
f of X star is less than or equal to f

400
00:22:18,250 --> 00:22:20,140
of X in the neighborhood of

401
00:22:20,140 --> 00:22:23,140
extra so such points are weak local

402
00:22:23,140 --> 00:22:27,759
minima and this point is a global

403
00:22:27,759 --> 00:22:32,019
minimum now earlier I said that this

404
00:22:32,019 --> 00:22:34,179
global minima are difficult to

405
00:22:34,179 --> 00:22:37,480
characterize so one may wonder that how

406
00:22:37,480 --> 00:22:40,769
about finding different local minima

407
00:22:40,769 --> 00:22:45,089
strict local minima or big local minima

408
00:22:45,089 --> 00:22:48,759
collecting them together and then trying

409
00:22:48,759 --> 00:22:53,048
to find out the least among those so

410
00:22:53,048 --> 00:22:56,589
this idea may work in some cases but may

411
00:22:56,589 --> 00:23:00,579
not work always for example suppose I

412
00:23:00,579 --> 00:23:02,109
have a function which is shown here and

413
00:23:02,109 --> 00:23:05,619
it goes to minus infinity so the domain

414
00:23:05,619 --> 00:23:08,140
of the function is a set of real numbers

415
00:23:08,140 --> 00:23:11,558
and you will see that this X star is a

416
00:23:11,558 --> 00:23:13,269
local minimum in fact it is a strict

417
00:23:13,269 --> 00:23:15,279
local meaning because the value of the

418
00:23:15,279 --> 00:23:17,380
function in the neighborhood of x star

419
00:23:17,380 --> 00:23:25,419
that does not reach f of X star at any

420
00:23:25,419 --> 00:23:27,308
point other than X star so it's always a

421
00:23:27,308 --> 00:23:32,259
it always exceeds f of X star so now if

422
00:23:32,259 --> 00:23:34,599
you look at this function and if you

423
00:23:34,599 --> 00:23:36,730
collect all the local minima which is 1

424
00:23:36,730 --> 00:23:39,880
in this case that will not give you a

425
00:23:39,880 --> 00:23:41,558
global minima because the function does

426
00:23:41,558 --> 00:23:43,390
not have a global minimum the function

427
00:23:43,390 --> 00:23:47,650
goes to minus infinity in both positive

428
00:23:47,650 --> 00:23:50,230
x as well as then you get you x

429
00:23:50,230 --> 00:23:53,440
direction so by collecting all local

430
00:23:53,440 --> 00:23:55,859
minima it is not always guaranteed that

431
00:23:55,859 --> 00:23:58,869
we can find a global minimum of a

432
00:23:58,869 --> 00:24:01,269
function so one has to check whether the

433
00:24:01,269 --> 00:24:05,519
function has a proper form so that

434
00:24:05,519 --> 00:24:08,230
collecting all local minima can give us

435
00:24:08,230 --> 00:24:15,589
a

436
00:24:15,589 --> 00:24:19,190
so now let us look at the unconstrained

437
00:24:19,190 --> 00:24:22,369
optimization problems as I said earlier

438
00:24:22,369 --> 00:24:25,069
that if we are trying to solve a problem

439
00:24:25,069 --> 00:24:27,950
from to minimize the function f from

440
00:24:27,950 --> 00:24:32,028
where X belongs to capital X and F is a

441
00:24:32,028 --> 00:24:34,220
function from X to R and X is a subset

442
00:24:34,220 --> 00:24:36,798
of RN so this is a general constrained

443
00:24:36,798 --> 00:24:39,769
optimization problem and an

444
00:24:39,769 --> 00:24:41,690
unconstrained optimization problem is a

445
00:24:41,690 --> 00:24:44,269
special case of this now to solve a

446
00:24:44,269 --> 00:24:47,659
constrained optimization problem many

447
00:24:47,659 --> 00:24:50,210
times we need to solve an unconstrained

448
00:24:50,210 --> 00:24:54,109
optimization problem first I will so see

449
00:24:54,109 --> 00:24:57,740
those techniques later but solving an

450
00:24:57,740 --> 00:24:59,118
unconstrained property optimization

451
00:24:59,118 --> 00:25:03,278
problem is important to solve a

452
00:25:03,278 --> 00:25:06,798
constrained optimization problem now to

453
00:25:06,798 --> 00:25:07,788
solve an unconstrained optimization

454
00:25:07,788 --> 00:25:11,419
problem one has to solve many one

455
00:25:11,419 --> 00:25:13,009
dimensional unconstrained optimization

456
00:25:13,009 --> 00:25:18,470
problems which are of this type so in

457
00:25:18,470 --> 00:25:22,339
today's talk will focus on solving one

458
00:25:22,339 --> 00:25:23,569
dimensional unconstrained optimization

459
00:25:23,569 --> 00:25:26,409
problem they are very important to solve

460
00:25:26,409 --> 00:25:28,128
unconstrained multi dimensional

461
00:25:28,128 --> 00:25:30,950
optimization problem and these

462
00:25:30,950 --> 00:25:32,148
unconstrained multi dimensional

463
00:25:32,148 --> 00:25:34,099
optimization problems are important to

464
00:25:34,099 --> 00:25:36,470
solve the general constrained

465
00:25:36,470 --> 00:25:39,980
optimization problem so this being an

466
00:25:39,980 --> 00:25:41,509
important problem will spend some time

467
00:25:41,509 --> 00:25:45,769
studying about the solutions of these

468
00:25:45,769 --> 00:25:52,509
types of problems so here is a

469
00:25:52,509 --> 00:25:54,740
unconstrained one-dimensional

470
00:25:54,740 --> 00:25:57,589
optimization problem where F is a

471
00:25:57,589 --> 00:26:00,499
function from R to R so remember that

472
00:26:00,499 --> 00:26:05,288
now we are not worried about the set X

473
00:26:05,288 --> 00:26:08,269
in this case the domain of the function

474
00:26:08,269 --> 00:26:10,339
is the entire set of real numbers the

475
00:26:10,339 --> 00:26:12,048
range is also the entire set of real

476
00:26:12,048 --> 00:26:14,929
numbers now we want to solve this

477
00:26:14,929 --> 00:26:19,460
problem and as we saw earlier that it is

478
00:26:19,460 --> 00:26:22,069
very difficult to identify the global

479
00:26:22,069 --> 00:26:24,230
minima of a function so we will restrict

480
00:26:24,230 --> 00:26:27,378
ourselves to local minima now the ideas

481
00:26:27,378 --> 00:26:29,359
that we see here and

482
00:26:29,359 --> 00:26:31,880
easily extended to finding local maxima

483
00:26:31,880 --> 00:26:36,470
also so the natural question that one

484
00:26:36,470 --> 00:26:38,839
would like to ask is that what are the

485
00:26:38,839 --> 00:26:40,460
necessary and sufficient conditions for

486
00:26:40,460 --> 00:26:44,690
a local minimum now what what do we mean

487
00:26:44,690 --> 00:26:46,009
by the necessary and sufficient

488
00:26:46,009 --> 00:26:48,440
conditions so necessary conditions are

489
00:26:48,440 --> 00:26:50,000
the conditions which are satisfied by

490
00:26:50,000 --> 00:26:52,720
every local minimum for example if

491
00:26:52,720 --> 00:26:55,099
suppose X star is the local minimum of

492
00:26:55,099 --> 00:26:57,679
this problem then we can say that if X

493
00:26:57,679 --> 00:27:00,140
star is a local mean then certain

494
00:27:00,140 --> 00:27:02,000
conditions are satisfied and those

495
00:27:02,000 --> 00:27:03,619
conditions are called necessary

496
00:27:03,619 --> 00:27:07,000
conditions and by sufficient conditions

497
00:27:07,000 --> 00:27:10,160
we mean those conditions which guarantee

498
00:27:10,160 --> 00:27:15,819
a local minimum for example if at X star

499
00:27:15,819 --> 00:27:18,859
belong to are certain conditions are

500
00:27:18,859 --> 00:27:22,669
satisfied then we can say with guarantee

501
00:27:22,669 --> 00:27:28,609
that X star is a local minimum now as we

502
00:27:28,609 --> 00:27:32,589
saw earlier that there exists lots of

503
00:27:32,589 --> 00:27:35,000
there might exist lot of local minima

504
00:27:35,000 --> 00:27:38,269
for a general nonlinear function then

505
00:27:38,269 --> 00:27:40,009
how do we characterize such local minima

506
00:27:40,009 --> 00:27:41,539
that's the next question that we would

507
00:27:41,539 --> 00:27:44,750
like to answer because again we do not

508
00:27:44,750 --> 00:27:48,019
want to find a local neighborhood along

509
00:27:48,019 --> 00:27:50,808
around X star and check whether the

510
00:27:50,808 --> 00:27:55,609
value of the function exceeds the value

511
00:27:55,609 --> 00:27:59,869
of the function at X star because there

512
00:27:59,869 --> 00:28:01,308
could be infinitely many points again in

513
00:28:01,308 --> 00:28:04,990
the neighborhood and we do not want to

514
00:28:04,990 --> 00:28:07,910
ensure we do not want to have any

515
00:28:07,910 --> 00:28:10,460
conditions which will require finding

516
00:28:10,460 --> 00:28:12,230
different points in the neighborhood and

517
00:28:12,230 --> 00:28:13,519
checking the function matters

518
00:28:13,519 --> 00:28:16,069
so instead is there any algebraic

519
00:28:16,069 --> 00:28:20,808
approach to check whether a particular

520
00:28:20,808 --> 00:28:24,950
point X star is a local minimum now the

521
00:28:24,950 --> 00:28:27,019
ideas of differential calculus come

522
00:28:27,019 --> 00:28:30,740
become very useful in this case so if

523
00:28:30,740 --> 00:28:34,700
the function is sufficiently smooth that

524
00:28:34,700 --> 00:28:37,669
means it's a first order derivative

525
00:28:37,669 --> 00:28:40,250
second order derivative and so on they

526
00:28:40,250 --> 00:28:42,970
exist and they are continuous

527
00:28:42,970 --> 00:28:44,890
then one can arrive at different

528
00:28:44,890 --> 00:28:49,390
conditions for characterization of local

529
00:28:49,390 --> 00:28:53,710
minima so we are going to see those

530
00:28:53,710 --> 00:28:57,819
conditions now now let us consider a

531
00:28:57,819 --> 00:29:01,659
function from R to R and F belongs to C

532
00:29:01,659 --> 00:29:04,750
1 so this C 1 denotes the class of

533
00:29:04,750 --> 00:29:08,859
functions whose first derivatives are

534
00:29:08,859 --> 00:29:13,779
continuous ok so now we are restricting

535
00:29:13,779 --> 00:29:15,609
ourselves to functions whose first

536
00:29:15,609 --> 00:29:19,119
derivatives are continuous and let us

537
00:29:19,119 --> 00:29:20,919
consider the problem to minimize f of X

538
00:29:20,919 --> 00:29:24,250
where X belongs to R now here is the

539
00:29:24,250 --> 00:29:27,490
first set of conditions called first

540
00:29:27,490 --> 00:29:32,169
order necessary conditions the result is

541
00:29:32,169 --> 00:29:35,859
very simple if it says that if X star is

542
00:29:35,859 --> 00:29:39,190
a local minimum then f dash X star is

543
00:29:39,190 --> 00:29:42,130
equal to 0 so which means that the

544
00:29:42,130 --> 00:29:45,519
derivative of the function vanishes so

545
00:29:45,519 --> 00:29:47,230
if X star is a local minimum the

546
00:29:47,230 --> 00:29:49,710
derivative of the function at X star

547
00:29:49,710 --> 00:29:53,798
vanishes so these conditions are called

548
00:29:53,798 --> 00:29:55,150
the first order necessary conditions

549
00:29:55,150 --> 00:29:57,339
they are necessary conditions because if

550
00:29:57,339 --> 00:29:59,798
X star is a local minimum of F then

551
00:29:59,798 --> 00:30:01,659
these conditions are satisfied so that's

552
00:30:01,659 --> 00:30:03,419
why they are necessary conditions and

553
00:30:03,419 --> 00:30:05,470
they use the first order derivative

554
00:30:05,470 --> 00:30:08,650
information so that's why we call them

555
00:30:08,650 --> 00:30:12,329
as first order necessary conditions now

556
00:30:12,329 --> 00:30:15,599
let us look at the proof of this result

557
00:30:15,599 --> 00:30:21,009
the proof is based on the fact that let

558
00:30:21,009 --> 00:30:23,380
us so we have a statement which is of

559
00:30:23,380 --> 00:30:27,159
the type a implies B so what we do is

560
00:30:27,159 --> 00:30:30,220
that we assume that B is not true and

561
00:30:30,220 --> 00:30:35,409
then say that a is not true so we saw

562
00:30:35,409 --> 00:30:38,079
this method of proof in one of our

563
00:30:38,079 --> 00:30:43,630
earlier lectures so let us assume that F

564
00:30:43,630 --> 00:30:48,250
dash X star is greater than 0 so the

565
00:30:48,250 --> 00:30:50,349
result says that if X star is a local

566
00:30:50,349 --> 00:30:52,329
minimum then f dash X star equal to 0 so

567
00:30:52,329 --> 00:30:54,069
on the contrary let us assume that F

568
00:30:54,069 --> 00:30:56,500
dash X star is greater than 0

569
00:30:56,500 --> 00:30:59,500
and then we will prove that in such a

570
00:30:59,500 --> 00:31:03,269
case X star cannot be a local minimum

571
00:31:03,269 --> 00:31:08,349
okay now since F belongs to C 1 so it's

572
00:31:08,349 --> 00:31:10,960
derivative it's a continuous function so

573
00:31:10,960 --> 00:31:16,630
that means F dash belongs to C naught so

574
00:31:16,630 --> 00:31:23,500
let us take a interval a D which is of

575
00:31:23,500 --> 00:31:26,890
size 2 Delta around X star so X star

576
00:31:26,890 --> 00:31:28,960
minus Delta 2 X bar X star plus Delta is

577
00:31:28,960 --> 00:31:32,259
open interval so let this interval be

578
00:31:32,259 --> 00:31:34,420
chosen such that the derivative of the

579
00:31:34,420 --> 00:31:39,119
function at any point in this interval

580
00:31:39,119 --> 00:31:41,769
the derivative of the function at any

581
00:31:41,769 --> 00:31:43,180
point in this interval is greater than

582
00:31:43,180 --> 00:31:47,589
zero now that is possible a it is

583
00:31:47,589 --> 00:31:49,710
possible to get such an interval because

584
00:31:49,710 --> 00:31:53,529
we are assuming that F belongs to C 1 so

585
00:31:53,529 --> 00:31:55,779
at least in the neighborhood F Dash is

586
00:31:55,779 --> 00:31:58,630
continuous and we are assuming that F

587
00:31:58,630 --> 00:32:00,039
dash X star greater than 0 so at least

588
00:32:00,039 --> 00:32:03,160
in the neighborhood F dash X is greater

589
00:32:03,160 --> 00:32:06,279
than 0 now if you write the truncated

590
00:32:06,279 --> 00:32:10,480
Taylor series expansion of f of X around

591
00:32:10,480 --> 00:32:13,569
X star and truncate it to the first

592
00:32:13,569 --> 00:32:16,599
order so what we get is the f of X is

593
00:32:16,599 --> 00:32:20,140
nothing but f of X star plus F 2 dash X

594
00:32:20,140 --> 00:32:22,269
bar into X minus X star where X bar is a

595
00:32:22,269 --> 00:32:24,849
point on the line segment joining X star

596
00:32:24,849 --> 00:32:32,549
and X now remember that f dash X bar is

597
00:32:32,549 --> 00:32:36,160
greater than 0 because X bar is a point

598
00:32:36,160 --> 00:32:41,079
in the interval D so f dash X bar is

599
00:32:41,079 --> 00:32:43,779
greater than 0 now we are free to choose

600
00:32:43,779 --> 00:32:50,200
any X in the interval D so suppose we

601
00:32:50,200 --> 00:32:53,109
choose X to be in the open interval X

602
00:32:53,109 --> 00:32:59,559
star minus Delta to X star so X minus X

603
00:32:59,559 --> 00:33:04,809
bar X minus X star will become negative

604
00:33:04,809 --> 00:33:07,569
because X X lies in this interval so X

605
00:33:07,569 --> 00:33:09,940
minus X star becomes negative

606
00:33:09,940 --> 00:33:12,278
so X minus X star becomes negative f

607
00:33:12,278 --> 00:33:15,278
dash X bar is greater than zero so

608
00:33:15,278 --> 00:33:19,058
therefore f of X will become less than f

609
00:33:19,058 --> 00:33:23,079
of X star and that contradicts the fact

610
00:33:23,079 --> 00:33:27,730
that X star is a local minima now one

611
00:33:27,730 --> 00:33:34,329
can use similar ideas by assuming that f

612
00:33:34,329 --> 00:33:37,419
dash X star less than zero and then the

613
00:33:37,419 --> 00:33:40,179
rest of the things follow as they are

614
00:33:40,179 --> 00:33:42,308
except that you choose the point in the

615
00:33:42,308 --> 00:33:45,398
open interval X star to X star minus x

616
00:33:45,398 --> 00:33:46,329
star plus Delta

617
00:33:46,329 --> 00:33:49,119
so in that case f dash X bar will be

618
00:33:49,119 --> 00:33:51,099
less than zero and this quantity is

619
00:33:51,099 --> 00:33:52,419
greater than zero so that product will

620
00:33:52,419 --> 00:33:54,669
be less than zero and therefore f of X

621
00:33:54,669 --> 00:33:56,798
will be less than f of X star so in

622
00:33:56,798 --> 00:33:58,869
either case we come up with the

623
00:33:58,869 --> 00:34:01,720
contradiction that X star is not a local

624
00:34:01,720 --> 00:34:06,190
minimum so so if X is a local minimum

625
00:34:06,190 --> 00:34:08,199
then the first derivative of the

626
00:34:08,199 --> 00:34:14,190
function should vanish now here are some

627
00:34:14,190 --> 00:34:16,809
examples so on the Left panel you will

628
00:34:16,809 --> 00:34:19,898
see a function f of X which is X minus 2

629
00:34:19,898 --> 00:34:21,878
square and the derivative of the

630
00:34:21,878 --> 00:34:25,690
function at 2 vanishes so you will see

631
00:34:25,690 --> 00:34:29,260
that the function has a zero slope at X

632
00:34:29,260 --> 00:34:32,409
equal to 0 and then on either side the

633
00:34:32,409 --> 00:34:37,030
function is increasing now on the right

634
00:34:37,030 --> 00:34:38,818
panel you will see one function which is

635
00:34:38,818 --> 00:34:44,469
f of X equal to minus X square and at x

636
00:34:44,469 --> 00:34:47,039
equal to zero you will see that the

637
00:34:47,039 --> 00:34:51,699
slope of the function is zero but in

638
00:34:51,699 --> 00:34:54,159
this case X equal to zero turns out to

639
00:34:54,159 --> 00:34:57,460
be a global maximum and in this case it

640
00:34:57,460 --> 00:35:00,420
turns out to be a global minimum if

641
00:35:00,420 --> 00:35:02,588
assuming that the function extends to

642
00:35:02,588 --> 00:35:05,170
plus infinity here and function goes to

643
00:35:05,170 --> 00:35:09,219
minus infinity in this case so the slope

644
00:35:09,219 --> 00:35:11,380
of the function is zero at a local

645
00:35:11,380 --> 00:35:13,559
minimum as well as at the local Maxima

646
00:35:13,559 --> 00:35:16,088
so looking at the slope really does not

647
00:35:16,088 --> 00:35:21,400
tell us much about the minimum or

648
00:35:21,400 --> 00:35:23,969
maximum of a function at that point

649
00:35:23,969 --> 00:35:28,000
now let us consider one more case so on

650
00:35:28,000 --> 00:35:29,320
the right panel you will see a function

651
00:35:29,320 --> 00:35:34,630
f of X equal to X cube so this f dash X

652
00:35:34,630 --> 00:35:38,380
at X equal to 0 is 0 but you will see

653
00:35:38,380 --> 00:35:40,570
that the function increases in one

654
00:35:40,570 --> 00:35:43,210
direction and decreases in the other

655
00:35:43,210 --> 00:35:45,400
direction so if you move X to minus

656
00:35:45,400 --> 00:35:47,769
infinity the function goes to minus

657
00:35:47,769 --> 00:35:53,250
infinity so the slope of the function

658
00:35:53,250 --> 00:35:55,719
really does not tell us anything about

659
00:35:55,719 --> 00:35:58,840
the existence of a local minimum or

660
00:35:58,840 --> 00:36:02,139
local maximum at that point so the

661
00:36:02,139 --> 00:36:04,119
points at which the slope of the

662
00:36:04,119 --> 00:36:07,119
function is 0 the such points are called

663
00:36:07,119 --> 00:36:08,130
saddle points

664
00:36:08,130 --> 00:36:10,449
sometimes people also call it stationary

665
00:36:10,449 --> 00:36:15,760
points so the saddle points really do

666
00:36:15,760 --> 00:36:18,699
not give us any idea about the local

667
00:36:18,699 --> 00:36:21,519
minima or Maxima so we need some extra

668
00:36:21,519 --> 00:36:28,420
information and as I said that f dash X

669
00:36:28,420 --> 00:36:32,920
equal to 0 is just a necessary condition

670
00:36:32,920 --> 00:36:36,219
for a local minimum but not a sufficient

671
00:36:36,219 --> 00:36:39,000
condition so if you collect all points

672
00:36:39,000 --> 00:36:42,489
for which F dash X star equal to 0 such

673
00:36:42,489 --> 00:36:45,400
points are called stationary points now

674
00:36:45,400 --> 00:36:47,260
since we are interested in solving a

675
00:36:47,260 --> 00:36:49,570
problem of the type minimize f of X

676
00:36:49,570 --> 00:36:52,239
where X belongs to R the natural

677
00:36:52,239 --> 00:36:53,679
question that we would like to ask is

678
00:36:53,679 --> 00:36:56,469
that how do we ensure that a stationary

679
00:36:56,469 --> 00:37:00,099
point is indeed a local minimum and we

680
00:37:00,099 --> 00:37:03,789
have seen the case that checking the

681
00:37:03,789 --> 00:37:05,289
gradient of the function or taking the

682
00:37:05,289 --> 00:37:07,119
derivative of the function in the

683
00:37:07,119 --> 00:37:09,599
one-dimensional case does not really

684
00:37:09,599 --> 00:37:15,820
guarantee a local minimum right so let

685
00:37:15,820 --> 00:37:17,519
us look at the second order information

686
00:37:17,519 --> 00:37:19,989
now on the left panel you will see the

687
00:37:19,989 --> 00:37:22,780
same function f of X equal to X minus 2

688
00:37:22,780 --> 00:37:24,820
square now the derivative of the

689
00:37:24,820 --> 00:37:27,099
function vanishes at this point the

690
00:37:27,099 --> 00:37:30,489
second derivative of this function at X

691
00:37:30,489 --> 00:37:33,280
equal to 2 is 4 and that is greater than

692
00:37:33,280 --> 00:37:35,619
0 now on the right panel

693
00:37:35,619 --> 00:37:37,300
you will see the function f of X equal

694
00:37:37,300 --> 00:37:37,539
to

695
00:37:37,539 --> 00:37:40,389
six square the first derivative of the

696
00:37:40,389 --> 00:37:42,789
function at X equal to zero is zero and

697
00:37:42,789 --> 00:37:45,849
the second derivative of the function at

698
00:37:45,849 --> 00:37:48,460
X equal to zero is minus two so you will

699
00:37:48,460 --> 00:37:54,369
see that for this function the the

700
00:37:54,369 --> 00:37:55,750
second derivative of the function at

701
00:37:55,750 --> 00:37:57,969
this point is positive and for this

702
00:37:57,969 --> 00:38:00,489
function the second derivative of the

703
00:38:00,489 --> 00:38:03,969
function at this point is negative so

704
00:38:03,969 --> 00:38:05,739
the second derivative in some sense

705
00:38:05,739 --> 00:38:07,599
talks about the curvature of the

706
00:38:07,599 --> 00:38:10,690
function at that point so this function

707
00:38:10,690 --> 00:38:12,400
has a positive curvature at this point

708
00:38:12,400 --> 00:38:14,110
and this function has a negative

709
00:38:14,110 --> 00:38:16,239
curvature at this point so you will see

710
00:38:16,239 --> 00:38:20,460
that although the first derivatives of

711
00:38:20,460 --> 00:38:22,780
these two functions at these points are

712
00:38:22,780 --> 00:38:26,099
zero it is the second derivative which

713
00:38:26,099 --> 00:38:28,750
tells us about the curvature and that

714
00:38:28,750 --> 00:38:31,469
tells that this point it indeed is a

715
00:38:31,469 --> 00:38:34,500
local minimum and this point indeed is a

716
00:38:34,500 --> 00:38:38,769
local maximum so so let us look at the

717
00:38:38,769 --> 00:38:42,219
second order necessary conditions now

718
00:38:42,219 --> 00:38:43,329
for the second order necessary

719
00:38:43,329 --> 00:38:45,880
conditions we need the second derivative

720
00:38:45,880 --> 00:38:47,619
of the function so let us assume that

721
00:38:47,619 --> 00:38:50,920
the F belongs to C 2 that is the second

722
00:38:50,920 --> 00:38:52,900
derivative of the function is continuous

723
00:38:52,900 --> 00:38:55,539
and the function is from R to R and we

724
00:38:55,539 --> 00:38:57,400
are interested in solving in one

725
00:38:57,400 --> 00:38:58,630
dimensional unconstrained optimization

726
00:38:58,630 --> 00:39:02,289
problem now here is a result about the

727
00:39:02,289 --> 00:39:05,530
second order necessary conditions now if

728
00:39:05,530 --> 00:39:09,670
X star is a local minimum of F then the

729
00:39:09,670 --> 00:39:12,369
first derivative at X star vanishes and

730
00:39:12,369 --> 00:39:16,050
the second derivative is non-negative so

731
00:39:16,050 --> 00:39:21,699
this result gives us so the necessary

732
00:39:21,699 --> 00:39:24,250
conditions we choose the second order

733
00:39:24,250 --> 00:39:29,349
information of the function now how do

734
00:39:29,349 --> 00:39:31,570
we prove this so the proof is very easy

735
00:39:31,570 --> 00:39:33,489
now the by the first order necessary

736
00:39:33,489 --> 00:39:36,429
conditions we know that if X star is a

737
00:39:36,429 --> 00:39:38,619
local minimum then f dash X star equal

738
00:39:38,619 --> 00:39:41,500
to 0 we already proved that result now

739
00:39:41,500 --> 00:39:44,289
let us assume that the second derivative

740
00:39:44,289 --> 00:39:48,639
is less than 0 at X star and then we

741
00:39:48,639 --> 00:39:51,130
will come up with the contradiction now

742
00:39:51,130 --> 00:39:53,289
since F belongs to C 2 the second

743
00:39:53,289 --> 00:39:55,079
derivative of the function is a

744
00:39:55,079 --> 00:39:58,179
continuous function so again as in the

745
00:39:58,179 --> 00:39:59,980
previous case let us choose an interval

746
00:39:59,980 --> 00:40:04,929
around X star of size 2 Delta so this is

747
00:40:04,929 --> 00:40:08,679
the interval D X star minus Delta X star

748
00:40:08,679 --> 00:40:11,530
plus data now since F 2 Dash is

749
00:40:11,530 --> 00:40:13,539
continuous we can always choose this

750
00:40:13,539 --> 00:40:15,880
interval such that for any point in this

751
00:40:15,880 --> 00:40:18,760
interval D and the second derivative is

752
00:40:18,760 --> 00:40:21,030
less than 0 because we have assumed that

753
00:40:21,030 --> 00:40:26,079
F 2 dash X star is less than 0 so now

754
00:40:26,079 --> 00:40:28,570
let us write down the second-order

755
00:40:28,570 --> 00:40:32,320
truncated Taylor series so f of X is

756
00:40:32,320 --> 00:40:34,659
nothing but f of X star plus F dash X

757
00:40:34,659 --> 00:40:37,449
into X minus X star plus half F 2 dash X

758
00:40:37,449 --> 00:40:40,150
bar into X minus 6 X star square X where

759
00:40:40,150 --> 00:40:42,369
X bar is a point on the line segment

760
00:40:42,369 --> 00:40:47,039
joining X star and X now we know that

761
00:40:47,039 --> 00:40:48,550
from the first order necessary

762
00:40:48,550 --> 00:40:51,179
conditions that F dash X star equal to 0

763
00:40:51,179 --> 00:40:55,059
now we also have chosen this interval

764
00:40:55,059 --> 00:40:57,489
such that F 2 dash X is less than 0 for

765
00:40:57,489 --> 00:40:59,289
all X belongs to D and X bar belongs to

766
00:40:59,289 --> 00:41:01,389
this interval so if F 2 dash X bar is

767
00:41:01,389 --> 00:41:11,460
also less than 0 now now this quantity

768
00:41:11,460 --> 00:41:15,429
this should be X X star so this quantity

769
00:41:15,429 --> 00:41:21,250
is 0 and F 2 dash X bar is less than 0 X

770
00:41:21,250 --> 00:41:25,349
minus X star square is greater than 0 X

771
00:41:25,349 --> 00:41:28,329
minus X star square is greater than 0 so

772
00:41:28,329 --> 00:41:30,130
so you will see that this quantity

773
00:41:30,130 --> 00:41:33,670
vanishes and this quantity is less than

774
00:41:33,670 --> 00:41:36,369
0 so which means that f of X is less

775
00:41:36,369 --> 00:41:40,079
than f of X star and which is a

776
00:41:40,079 --> 00:41:45,969
contradiction so so we will see that we

777
00:41:45,969 --> 00:41:47,889
see that if X star is the local minimum

778
00:41:47,889 --> 00:41:50,559
then f dash X star equal to 0 and F 2

779
00:41:50,559 --> 00:41:55,650
dash X star greater than or equal to 0

780
00:41:55,650 --> 00:41:59,980
so remember that this is f dash X star

781
00:41:59,980 --> 00:42:02,829
and by first order necessary conditions

782
00:42:02,829 --> 00:42:06,358
F dash X star equal to 0

783
00:42:06,358 --> 00:42:08,619
now the question is are these

784
00:42:08,619 --> 00:42:13,300
second-order sufficient conditions are

785
00:42:13,300 --> 00:42:14,739
the second order necessary conditions

786
00:42:14,739 --> 00:42:17,369
also sufficient and the answer is no

787
00:42:17,369 --> 00:42:22,838
because if you consider a problem which

788
00:42:22,838 --> 00:42:25,358
is to minimize X cube subject to X

789
00:42:25,358 --> 00:42:26,980
belongs to R the function is drawn here

790
00:42:26,980 --> 00:42:30,400
now at this point the first derivative

791
00:42:30,400 --> 00:42:32,710
and the second derivative both are 0 but

792
00:42:32,710 --> 00:42:34,568
then extract turns out to be a saddle

793
00:42:34,568 --> 00:42:37,900
point so so these conditions are

794
00:42:37,900 --> 00:42:41,230
necessary for the existence of a local

795
00:42:41,230 --> 00:42:43,960
minimum but not sufficient so what are

796
00:42:43,960 --> 00:42:45,280
the sufficient conditions for the

797
00:42:45,280 --> 00:42:51,068
existence of a local minima so let F be

798
00:42:51,068 --> 00:42:53,940
a function from R to R a belongs to C to

799
00:42:53,940 --> 00:42:56,429
the second order sufficient conditions

800
00:42:56,429 --> 00:43:00,250
result states that if X star is a point

801
00:43:00,250 --> 00:43:03,909
on from the set R such that the first

802
00:43:03,909 --> 00:43:05,500
derivative of the function vanishes at

803
00:43:05,500 --> 00:43:08,260
that X star and the second derivative is

804
00:43:08,260 --> 00:43:11,650
greater than 0 then X star is a strict

805
00:43:11,650 --> 00:43:15,579
local minimum of of F over R so remember

806
00:43:15,579 --> 00:43:19,929
that this is a result which guarantees

807
00:43:19,929 --> 00:43:23,108
strict local minimum of F over R now the

808
00:43:23,108 --> 00:43:26,369
proof of this is again very easy that

809
00:43:26,369 --> 00:43:30,880
since I belongs to C - f - - is

810
00:43:30,880 --> 00:43:35,858
continuous so we'll choose the interval

811
00:43:35,858 --> 00:43:39,460
D such that in the in that interval for

812
00:43:39,460 --> 00:43:42,460
any X in that interval F 2 dash X is

813
00:43:42,460 --> 00:43:43,989
greater than 0 and that's possible

814
00:43:43,989 --> 00:43:47,230
because F 2 dash X star is greater than

815
00:43:47,230 --> 00:43:51,760
0 so let us write down the truncated

816
00:43:51,760 --> 00:43:54,429
Taylor series expansion of F around X

817
00:43:54,429 --> 00:43:57,219
star so f of X is nothing but f of X

818
00:43:57,219 --> 00:43:59,230
star plus F dash X star into X minus X

819
00:43:59,230 --> 00:44:01,929
star plus 1/2 F 2 dash X bar into X

820
00:44:01,929 --> 00:44:04,179
minus X star square where X bar is the

821
00:44:04,179 --> 00:44:06,880
point on the line segment X star - X now

822
00:44:06,880 --> 00:44:11,409
if F F dash X star equal to 0 then F F X

823
00:44:11,409 --> 00:44:14,619
minus f of X star is nothing but half F

824
00:44:14,619 --> 00:44:16,719
2 F 2 dash X bar into X minus X star

825
00:44:16,719 --> 00:44:18,760
square now X

826
00:44:18,760 --> 00:44:21,159
Bar is a point in the interval D so f 2

827
00:44:21,159 --> 00:44:22,630
- X bar is greater than 0

828
00:44:22,630 --> 00:44:24,789
this quantity is also greater than 0

829
00:44:24,789 --> 00:44:28,570
because it is square so the whole

830
00:44:28,570 --> 00:44:30,550
quantity is greater than 0 so which

831
00:44:30,550 --> 00:44:32,920
means that f of X is greater than f of X

832
00:44:32,920 --> 00:44:35,289
star for all X belongs to D so at least

833
00:44:35,289 --> 00:44:38,440
in the there exists some Delta greater

834
00:44:38,440 --> 00:44:41,219
than 0 such that in the neighborhood

835
00:44:41,219 --> 00:44:43,989
delta delta neighborhood of x star the

836
00:44:43,989 --> 00:44:49,690
value of the function does not attain f

837
00:44:49,690 --> 00:44:52,989
of X star in fact it is it exits f of X

838
00:44:52,989 --> 00:44:56,550
star so which implies that X star is a

839
00:44:56,550 --> 00:45:01,559
strict local minimum so remember that

840
00:45:01,559 --> 00:45:06,099
so these sufficient conditions guarantee

841
00:45:06,099 --> 00:45:11,099
a strict local minimum of F over R now

842
00:45:11,099 --> 00:45:18,300
are these conditions necessary do they

843
00:45:18,300 --> 00:45:21,639
though they guarantee that the local

844
00:45:21,639 --> 00:45:23,889
minimum is straight they turn out to be

845
00:45:23,889 --> 00:45:27,730
not necessary conditions because if we

846
00:45:27,730 --> 00:45:29,590
consider a case where f of X equal to X

847
00:45:29,590 --> 00:45:33,219
to the power 4 then X star equal to 0 is

848
00:45:33,219 --> 00:45:38,260
a strict local minimum but then f dash X

849
00:45:38,260 --> 00:45:39,610
star and F 2 dash X star

850
00:45:39,610 --> 00:45:42,789
are both 0 so this second order

851
00:45:42,789 --> 00:45:45,250
sufficient conditions they are not

852
00:45:45,250 --> 00:45:51,670
necessary in this case so so now we have

853
00:45:51,670 --> 00:45:54,969
a important result which will state

854
00:45:54,969 --> 00:45:57,699
without the proof so let us assume that

855
00:45:57,699 --> 00:46:01,030
f is a function from R to R and F

856
00:46:01,030 --> 00:46:06,010
belongs to C infinity that is all the

857
00:46:06,010 --> 00:46:07,900
derivatives of the function exists and

858
00:46:07,900 --> 00:46:11,739
they are continuous let us also assume

859
00:46:11,739 --> 00:46:14,170
that f is not a constant function so let

860
00:46:14,170 --> 00:46:16,559
us rule out that possibility also and

861
00:46:16,559 --> 00:46:19,059
let us denote the K the derivative of F

862
00:46:19,059 --> 00:46:24,340
at X as FK x and consider the problem to

863
00:46:24,340 --> 00:46:27,309
minimize f of X X belongs to PI now here

864
00:46:27,309 --> 00:46:28,929
is the result which say that X star is a

865
00:46:28,929 --> 00:46:32,829
local minimum of F if and only if

866
00:46:32,829 --> 00:46:35,260
the first non-zero element of the

867
00:46:35,260 --> 00:46:38,559
sequence fkx is positive and occurs at

868
00:46:38,559 --> 00:46:41,800
even positivity so this is a very

869
00:46:41,800 --> 00:46:43,960
important result which says that if you

870
00:46:43,960 --> 00:46:46,480
take a you start taking the derivative

871
00:46:46,480 --> 00:46:49,380
of the function from the first

872
00:46:49,380 --> 00:46:52,980
derivative second derivative and so on

873
00:46:52,980 --> 00:46:56,400
now if you consider that sequence of

874
00:46:56,400 --> 00:47:04,119
derivatives of function f at x so the

875
00:47:04,119 --> 00:47:06,760
the first nonzero element of the

876
00:47:06,760 --> 00:47:10,329
sequence fkx star is positive and occurs

877
00:47:10,329 --> 00:47:13,960
at positive even and positive K so

878
00:47:13,960 --> 00:47:16,989
that's very important so if you consider

879
00:47:16,989 --> 00:47:19,780
the example of f of X equal to X to the

880
00:47:19,780 --> 00:47:26,199
power 4 so you will see that at if X X

881
00:47:26,199 --> 00:47:28,659
star equal to 0 is a local minimum of

882
00:47:28,659 --> 00:47:33,309
that function and the first 3

883
00:47:33,309 --> 00:47:35,949
derivatives at X star are 0 and the

884
00:47:35,949 --> 00:47:40,750
fourth derivative is positive so the the

885
00:47:40,750 --> 00:47:44,139
fourth derivative that means that the K

886
00:47:44,139 --> 00:47:46,389
is equal to 4 which is a even and

887
00:47:46,389 --> 00:47:50,190
positive number and fourth derivative of

888
00:47:50,190 --> 00:47:53,619
the function X X to the power 4 is

889
00:47:53,619 --> 00:47:56,980
positive so that's why X star equal to 0

890
00:47:56,980 --> 00:47:59,829
is the local minimum of FX equal to X to

891
00:47:59,829 --> 00:48:02,590
the power 4 now one can have a similar

892
00:48:02,590 --> 00:48:05,920
result for the local maximum so the only

893
00:48:05,920 --> 00:48:08,559
change that one has to make is that the

894
00:48:08,559 --> 00:48:14,019
sequence at K X star is negative the

895
00:48:14,019 --> 00:48:15,760
first element of the first nonzero

896
00:48:15,760 --> 00:48:17,619
element of the sequence fkx star is

897
00:48:17,619 --> 00:48:19,449
negative and it occurs at even positive

898
00:48:19,449 --> 00:48:29,559
K now let us see some examples to find

899
00:48:29,559 --> 00:48:34,480
out the minima or Maxima minima of a

900
00:48:34,480 --> 00:48:38,860
problem now let us consider the problem

901
00:48:38,860 --> 00:48:42,760
minimize X square minus 1 square now the

902
00:48:42,760 --> 00:48:45,070
first step to solve such problems is

903
00:48:45,070 --> 00:48:46,599
always to identify the station

904
00:48:46,599 --> 00:48:49,088
points and the stationary points are

905
00:48:49,088 --> 00:48:51,579
found by equating the derivative to zero

906
00:48:51,579 --> 00:48:53,440
so if you take the first derivative of

907
00:48:53,440 --> 00:48:55,539
this function and that turns out to be

908
00:48:55,539 --> 00:48:58,420
for X into X square minus 1 and that

909
00:48:58,420 --> 00:49:01,630
equal to 0 implies that either X equal

910
00:49:01,630 --> 00:49:03,940
to 0 or X equal to plus 1 or X equal to

911
00:49:03,940 --> 00:49:06,130
minus 1 so it has three stationary

912
00:49:06,130 --> 00:49:12,728
points now we need to check which of

913
00:49:12,728 --> 00:49:15,670
these stationary points are local minima

914
00:49:15,670 --> 00:49:19,869
or local maxima so then we need to go

915
00:49:19,869 --> 00:49:21,478
for the second derivative information

916
00:49:21,478 --> 00:49:26,889
now F 2-1 in this case turns out to be 8

917
00:49:26,889 --> 00:49:29,079
and F 2 dash minus 1 also turns out to

918
00:49:29,079 --> 00:49:31,659
be 8 and that is the greater than 0 so

919
00:49:31,659 --> 00:49:33,518
you will see that the first derivative

920
00:49:33,518 --> 00:49:36,670
is 0 at 1 and minus 1 and the second

921
00:49:36,670 --> 00:49:38,380
derivative is positive so the first

922
00:49:38,380 --> 00:49:41,108
nonzero element of the sequence F 2 dash

923
00:49:41,108 --> 00:49:45,969
K if of the sequence fkx star is

924
00:49:45,969 --> 00:49:48,998
positive and occurs at e1 K which is 2

925
00:49:48,998 --> 00:49:51,728
in this case so 1 and minus 1 are strict

926
00:49:51,728 --> 00:49:55,179
local minima and along similar lines one

927
00:49:55,179 --> 00:49:57,880
can say that 2 - 0 which is minus 4 and

928
00:49:57,880 --> 00:50:01,719
less than 0 implies that 0 is a strict

929
00:50:01,719 --> 00:50:06,068
local maximum because this the second

930
00:50:06,068 --> 00:50:08,079
derivatives negative the first

931
00:50:08,079 --> 00:50:11,349
derivative 0 and 2 is a even positive

932
00:50:11,349 --> 00:50:17,099
integer so 0 is a strict local maximum

933
00:50:17,099 --> 00:50:21,960
now let us consider another case where

934
00:50:21,960 --> 00:50:25,329
minimize f of X where X f of X equal to

935
00:50:25,329 --> 00:50:27,670
X square minus 1 cube now if you look at

936
00:50:27,670 --> 00:50:30,248
the stationary points again we get 3

937
00:50:30,248 --> 00:50:32,829
stationary points 0 1 and minus 1 at

938
00:50:32,829 --> 00:50:34,539
which points the derivative of the

939
00:50:34,539 --> 00:50:37,389
function vanishes so so we look at the

940
00:50:37,389 --> 00:50:39,670
second derivative now second derivative

941
00:50:39,670 --> 00:50:43,659
information tells us that the second

942
00:50:43,659 --> 00:50:45,518
derivative of the function at 0 is 6

943
00:50:45,518 --> 00:50:47,588
which is positive which means that 0 is

944
00:50:47,588 --> 00:50:51,248
a strict local minimum but at the

945
00:50:51,248 --> 00:50:52,869
stationary points 1 and minus 1 the

946
00:50:52,869 --> 00:50:55,028
second derivative also vanishes so we

947
00:50:55,028 --> 00:50:56,559
have to look for the higher-order

948
00:50:56,559 --> 00:50:58,599
derivatives so let us look at the third

949
00:50:58,599 --> 00:51:00,469
derivative

950
00:51:00,469 --> 00:51:04,119
now if 3 - X is a function like this and

951
00:51:04,119 --> 00:51:09,500
you will see that f 3 - s F 3 - 1 is

952
00:51:09,500 --> 00:51:13,818
positive F 3 - minus 1 is negative so we

953
00:51:13,818 --> 00:51:15,409
really cannot conclude anything from our

954
00:51:15,409 --> 00:51:18,619
result about sufficient optimality

955
00:51:18,619 --> 00:51:21,019
conditions because that these

956
00:51:21,019 --> 00:51:24,409
derivatives are nonzero at K which is

957
00:51:24,409 --> 00:51:28,219
odd so we can just conclude that 1 and

958
00:51:28,219 --> 00:51:30,588
minus 1 are saddle points of this

959
00:51:30,588 --> 00:51:36,670
problem and 0 is a strict local minimum

960
00:51:36,670 --> 00:51:40,130
now if you look at example X to the

961
00:51:40,130 --> 00:51:43,849
power 4 and so you will see that F 3 F 4

962
00:51:43,849 --> 00:51:48,079
- X is 24 which is positive and all the

963
00:51:48,079 --> 00:51:50,000
earlier derivatives right from the first

964
00:51:50,000 --> 00:51:52,880
second and third all these derivatives

965
00:51:52,880 --> 00:51:57,170
are 0 so the first nonzero element of

966
00:51:57,170 --> 00:52:02,389
the sequence fkx star occurs at K which

967
00:52:02,389 --> 00:52:06,289
is even and positive and the element is

968
00:52:06,289 --> 00:52:09,130
positive so which means that zero is a

969
00:52:09,130 --> 00:52:15,440
strict local minima so so you will see

970
00:52:15,440 --> 00:52:18,170
that to solve such problems we need to

971
00:52:18,170 --> 00:52:21,199
find the derivative of the function and

972
00:52:21,199 --> 00:52:23,239
identify the stationary points by

973
00:52:23,239 --> 00:52:27,159
equating the derivative to zero and then

974
00:52:27,159 --> 00:52:30,050
go to the higher-order derivatives from

975
00:52:30,050 --> 00:52:32,210
the second order derivative onwards to

976
00:52:32,210 --> 00:52:34,400
see whether we can conclude anything

977
00:52:34,400 --> 00:52:38,539
about the stationary points whether they

978
00:52:38,539 --> 00:52:43,039
are local minima or local maxima now if

979
00:52:43,039 --> 00:52:45,068
the problem is such a simple problem

980
00:52:45,068 --> 00:52:49,000
then why do we need any algorithm to

981
00:52:49,000 --> 00:52:51,739
solve the problem so here is a simple

982
00:52:51,739 --> 00:52:55,250
example minimize X - 2 square and F dash

983
00:52:55,250 --> 00:52:57,739
X equal to 0 implies X star equal to 2

984
00:52:57,739 --> 00:53:00,949
so 2 is a stationary point and at that

985
00:53:00,949 --> 00:53:04,519
stationary point you will see that the F

986
00:53:04,519 --> 00:53:07,670
2 - 2 is greater than 0 which implies

987
00:53:07,670 --> 00:53:11,539
that X star is a strict local minimum so

988
00:53:11,539 --> 00:53:13,699
as I said that if it is such a

989
00:53:13,699 --> 00:53:18,349
easy thing to solve - to solve Lepidus X

990
00:53:18,349 --> 00:53:20,719
equal to zero and identify stationary

991
00:53:20,719 --> 00:53:23,150
points then where is the need of an

992
00:53:23,150 --> 00:53:26,420
algorithm so here is an example so let

993
00:53:26,420 --> 00:53:28,610
us consider the example to minimize f of

994
00:53:28,610 --> 00:53:30,500
X equal to X square plus C to the power

995
00:53:30,500 --> 00:53:33,980
X where f is a function from R to R now

996
00:53:33,980 --> 00:53:35,360
the derivative of the function which

997
00:53:35,360 --> 00:53:39,230
will denote in this course when using

998
00:53:39,230 --> 00:53:42,349
the function G of X that turns out to be

999
00:53:42,349 --> 00:53:46,159
2 X plus e to the power X and if we

1000
00:53:46,159 --> 00:53:51,340
equate it to 0 we really cannot find a

1001
00:53:51,340 --> 00:53:54,019
solution of this problem in a

1002
00:53:54,019 --> 00:53:55,460
straightforward way like what we did

1003
00:53:55,460 --> 00:54:00,739
here so we need some algorithm to find

1004
00:54:00,739 --> 00:54:05,599
out X which satisfies 2x plus e to the

1005
00:54:05,599 --> 00:54:08,179
power x equal to 0 and that sir we'll

1006
00:54:08,179 --> 00:54:13,719
need a numerical procedure or algorithm

1007
00:54:13,719 --> 00:54:20,030
to solve to find a stationary point of

1008
00:54:20,030 --> 00:54:23,260
these kinds of functions so in general

1009
00:54:23,260 --> 00:54:29,630
it is difficult to find the stationary

1010
00:54:29,630 --> 00:54:31,639
points using a simple way like this and

1011
00:54:31,639 --> 00:54:37,039
one has to go for some algorithm to

1012
00:54:37,039 --> 00:54:39,559
solve this now there exist different

1013
00:54:39,559 --> 00:54:44,110
kinds of algorithms to solve this

1014
00:54:44,110 --> 00:54:51,110
problem so some algorithms they do use

1015
00:54:51,110 --> 00:54:53,869
the derivative information some

1016
00:54:53,869 --> 00:54:55,610
algorithms do not use the derivative

1017
00:54:55,610 --> 00:54:58,400
information so but assume certain form

1018
00:54:58,400 --> 00:55:01,610
of the function called unimodal function

1019
00:55:01,610 --> 00:55:05,030
so we will see those things in the next

1020
00:55:05,030 --> 00:55:08,659
class so in the next class we look at

1021
00:55:08,659 --> 00:55:12,619
some algorithms to solve the problem of

1022
00:55:12,619 --> 00:55:14,900
the type f dash x equal to 0 or finding

1023
00:55:14,900 --> 00:55:16,900
the minimum of the function f of X

1024
00:55:16,900 --> 00:55:19,489
without resorting to any derivative

1025
00:55:19,489 --> 00:55:21,829
information so those are called the

1026
00:55:21,829 --> 00:55:24,530
derivative free methods and the methods

1027
00:55:24,530 --> 00:55:27,559
which use derivatives they are

1028
00:55:27,559 --> 00:55:29,778
derivative based methods so we'll see

1029
00:55:29,778 --> 00:55:34,608
some of them to solve the problems of

1030
00:55:34,608 --> 00:55:38,900
the type G of X equal to zero or the are

1031
00:55:38,900 --> 00:55:42,619
the problems where the the derivative of

1032
00:55:42,619 --> 00:55:44,568
the function vanishes or the function as

1033
00:55:44,568 --> 00:55:47,838
the minimum so we'll stop here and

1034
00:55:47,838 --> 00:55:49,119
continue in the next class

1035
00:55:49,119 --> 00:55:59,530
thank you

1036
00:55:59,530 --> 00:56:00,560
you