1
00:00:05,060 --> 00:00:06,170
Hi and welcome back.

2
00:00:06,320 --> 00:00:11,750
So in the previous videos, we have been building a recommendation system, and we succeeded

3
00:00:11,750 --> 00:00:17,990
in getting a movie name from the user. After getting that name, we hit the IMDb website and try to

4
00:00:17,990 --> 00:00:22,820
get the list of movie names and then look for our user's movie name in that list.

5
00:00:22,820 --> 00:00:26,330
And if that list contains the movie name, we go to that movie's page.

6
00:00:26,750 --> 00:00:32,860
And after getting to that movie page, we extracted the director's name and the director's URL.

7
00:00:33,140 --> 00:00:39,470
Now, in this session, we will try to extract the top four movies of that director

8
00:00:39,470 --> 00:00:41,230
and recommend them to our user.

9
00:00:41,570 --> 00:00:42,080
So.

10
00:00:44,230 --> 00:00:49,810
This was the movie page, and when we get to the director, we click on that, and this is the director page.

11
00:00:50,470 --> 00:00:54,500
Now, what we are looking for is these four movies, right?

12
00:00:54,520 --> 00:00:58,470
So let's just get back to the IDE again.

13
00:00:58,480 --> 00:01:00,220
We will be doing the same thing.

14
00:01:00,790 --> 00:01:01,950
Let's just cut this out.

15
00:01:02,260 --> 00:01:10,360
So this section is responsible for taking the movie name from the user, then extracting the IMDb movies, and then

16
00:01:10,360 --> 00:01:15,610
creating a soup and doing all that stuff, and checking whether the user-entered movie name is in the

17
00:01:15,610 --> 00:01:21,880
IMDb list or not. And if it is present, then we are just getting the movie URL. This chunk is responsible

18
00:01:21,880 --> 00:01:29,450
for hitting that movie URL and extracting the director's information: the director's

19
00:01:29,680 --> 00:01:31,030
URL and the director's name.

20
00:01:31,360 --> 00:01:38,310
Now we will try to replicate the same thing, but for extracting the top four movies of that director.

21
00:01:38,530 --> 00:01:40,390
So let's just take this URL.

22
00:01:42,970 --> 00:01:43,300
OK.

23
00:01:43,330 --> 00:01:46,710
And over here, let's just paste this for now.

24
00:01:47,140 --> 00:01:50,710
Now, what we have to do is we have to, again, replicate the same thing.

25
00:01:51,160 --> 00:01:58,510
requests.get, and I want to get that URL, and

26
00:01:59,690 --> 00:02:08,840
r3 equals this thing. Now I just need to extract the HTML from it: r3 dot html.

27
00:02:12,310 --> 00:02:18,310
Perfect. So now, after getting the HTML, what I will do is create a BeautifulSoup object, and then

28
00:02:18,310 --> 00:02:22,190
I will see what the path will be for getting these four movies.

29
00:02:22,540 --> 00:02:24,880
So if we go for soup3

30
00:02:27,530 --> 00:02:32,150
equals BeautifulSoup, and then the HTML,

31
00:02:34,100 --> 00:02:35,750
and then the parser.

32
00:02:39,220 --> 00:02:39,620
Perfect.

33
00:02:40,000 --> 00:02:46,270
So now, after getting this, what we have to do is look into this page to see

34
00:02:46,270 --> 00:02:48,960
from where we can extract these four movie names.

35
00:02:49,420 --> 00:02:51,680
So here you can see that this is the movie name.

36
00:02:51,700 --> 00:02:53,800
This is the actual movie page.

37
00:02:54,190 --> 00:02:58,120
And then if we scroll up a bit, scroll up a bit.

38
00:02:59,790 --> 00:03:02,940
Yeah, so this is the ID, if we just close that down.

39
00:03:05,560 --> 00:03:11,090
So these are the four movies: the first one, the second one, the third one, and the fourth one.

40
00:03:11,290 --> 00:03:17,490
So if I narrow down my scope to this div, I can then certainly look for these movie names.

41
00:03:18,040 --> 00:03:20,380
So let's just do this first.

42
00:03:20,410 --> 00:03:26,120
So, again, what we have to do: previously, while we were working on that, we encountered class attributes.

43
00:03:26,120 --> 00:03:29,500
So the same rules will be applicable for the ID.

44
00:03:29,980 --> 00:03:34,310
So again, we have to do that: soup3 dot find.

45
00:03:34,960 --> 00:03:40,290
So here there is no need to go for find_all, because the ID will be unique throughout the page.

46
00:03:40,350 --> 00:03:44,480
We can certainly go for that ID and use find.

47
00:03:44,500 --> 00:03:55,600
So now we need that div with the "known for" ID, and then let's just assign it to known_for.

48
00:03:58,040 --> 00:04:02,450
Equals. Now, simply for testing, let's just print the known_for stuff.

49
00:04:10,210 --> 00:04:12,630
OK, I think that this should be text, right?

50
00:04:13,250 --> 00:04:14,000
Sorry, my bad.

51
00:04:14,540 --> 00:04:15,950
So now if I hit it again.

52
00:04:19,900 --> 00:04:25,350
Perfect, I'm getting the response. Now, what I have to do is iterate on it; I have to

53
00:04:25,450 --> 00:04:28,330
iterate on the divs and then extract the movie name.

54
00:04:28,350 --> 00:04:34,160
So what I can do is a find_all with this thing on that div.

55
00:04:37,300 --> 00:04:43,300
So now what I can do is take known_for, and by applying it on known_for, it is only applicable

56
00:04:43,510 --> 00:04:48,610
inside this known_for div; it is only applicable to these four items.

57
00:04:49,090 --> 00:04:57,190
So now if we go for known_for, and then go for find_all, and then we want the divs with the

58
00:04:57,190 --> 00:04:57,730
class.

59
00:05:00,590 --> 00:05:03,800
This stuff, so it will provide us with the movie names.

60
00:05:08,380 --> 00:05:09,840
What we did is we have done what we did.

61
00:05:11,080 --> 00:05:17,770
Now, what we have to do is iterate on those four and then extract the movie names.

62
00:05:19,080 --> 00:05:28,530
So if I now iterate on this, for div in movies, then for the first iteration, it will provide me this div;

63
00:05:29,730 --> 00:05:34,080
for the next, it will provide me this; for the next, it will provide me this; and for the last, it will

64
00:05:34,080 --> 00:05:34,700
provide me this.

65
00:05:35,010 --> 00:05:38,040
So let's just head on for the first one and the rest will be the same.

66
00:05:38,430 --> 00:05:42,990
So over here, going into this div, we want to extract the movie name.

67
00:05:43,560 --> 00:05:45,450
So if we go for the.

68
00:05:48,760 --> 00:05:50,290
div dot

69
00:05:52,510 --> 00:05:53,440
With the movie name.

70
00:06:08,920 --> 00:06:17,740
Yeah, so here we can find this thing for the movie name. So let me just copy this and save.

71
00:06:22,770 --> 00:06:24,270
If we close this one.

72
00:06:26,620 --> 00:06:27,020
Perfect.

73
00:06:27,310 --> 00:06:38,620
So now we just want to extract this thing, like div dot find, and I want to find that div with the class

74
00:06:41,380 --> 00:06:46,560
of this thing, and then let's just say it is

75
00:06:48,710 --> 00:06:49,400
The.

76
00:06:51,710 --> 00:06:52,130
Movie.

77
00:06:53,780 --> 00:06:56,180
And then if we go for movie_div

78
00:06:59,610 --> 00:07:00,870
Dot string, hopefully.

79
00:07:02,470 --> 00:07:05,470
If we open this thing, if we go down a bit.

80
00:07:11,370 --> 00:07:17,310
Yep, we are getting to this div, and then actually we need to go to the anchor tag, and then we can go

81
00:07:17,310 --> 00:07:18,050
for that string.

82
00:07:18,090 --> 00:07:18,560
Perfect.

83
00:07:18,570 --> 00:07:21,290
So we go for the a tag, and then we go for the string.

84
00:07:22,350 --> 00:07:25,020
So let's just print this and see what we get.

85
00:07:36,790 --> 00:07:39,410
Perfect, now we are getting all the movie names from here.

86
00:07:39,850 --> 00:07:42,150
This one, this one, this one and this one.

87
00:07:42,670 --> 00:07:49,390
So up to here, we have seen how we can hit this director page and how we can narrow it down

88
00:07:49,390 --> 00:07:51,100
for extracting these movie names.

89
00:07:51,100 --> 00:07:55,470
In the next session, we will try to sync all three of these chunks together.

