1
00:00:00,510 --> 00:00:03,600
So let's move forward and visualize the data we have.

2
00:00:04,620 --> 00:00:07,890
So here I'm visualizing some of the traffic signs from the dataset.

3
00:00:09,300 --> 00:00:11,100
And let's see how it looks.

4
00:00:14,480 --> 00:00:16,820
So here are some of the images from the dataset.

5
00:00:17,840 --> 00:00:25,220
And as you can see here, the dimensions are not the same for all the images, so as the

6
00:00:25,220 --> 00:00:28,220
sizes of the images are different, we have to make them equal.

7
00:00:28,400 --> 00:00:31,580
So we will take the mean dimensions of these images here.

8
00:00:31,580 --> 00:00:33,470
I'm finding out the mean dimensions of these images.

9
00:00:33,980 --> 00:00:40,760
So as you can see, we already know that there are 43 classes in this traffic sign dataset, so I'm

10
00:00:40,760 --> 00:00:43,070
running a for loop from zero to 43.

11
00:00:43,070 --> 00:00:44,240
That is 43 classes.

12
00:00:45,020 --> 00:00:50,000
And I'll take each image and find out the dimensions of that image, dimension one and dimension two.

13
00:00:50,210 --> 00:00:51,080
And we store them.

14
00:00:52,860 --> 00:00:57,710
Let's run this and store the dimensions of these images to find out their mean.

15
00:00:58,160 --> 00:00:59,540
It's running right now.

16
00:01:00,020 --> 00:01:01,100
So we have stored this.

17
00:01:01,580 --> 00:01:05,630
Next, we will find out the mean of these dimensions and print them out.

18
00:01:07,820 --> 00:01:13,880
So here we can see that 50 by 50 is the average shape for all the images.

19
00:01:14,810 --> 00:01:20,710
Here we can see that the dimension one mean is 50.32, and the dimension two mean is also around 50.

20
00:01:20,990 --> 00:01:23,990
So we can conclude that 50 by 50 is the average shape.
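
The averaging step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the `images` list is made-up random data standing in for the loaded traffic sign images, and the names `dim1` and `dim2` are assumptions.

```python
import numpy as np

# Hypothetical stand-ins for loaded images: arrays of shape (height, width, 3).
rng = np.random.default_rng(0)
sizes = [(30, 29), (52, 56), (68, 66)]
images = [rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8) for h, w in sizes]

# Collect dimension one (height) and dimension two (width) of every image.
dim1 = [img.shape[0] for img in images]
dim2 = [img.shape[1] for img in images]

# The means suggest a common target shape for resizing.
mean_dim1 = np.mean(dim1)
mean_dim2 = np.mean(dim2)
print("Dimension 1 mean:", mean_dim1)
print("Dimension 2 mean:", mean_dim2)
```

With real data, the two means would be rounded to pick the target size.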

21
00:01:24,410 --> 00:01:28,300
Now we will convert all these images into this shape, 50 by 50.

22
00:01:30,290 --> 00:01:33,380
For that, I'm using images, an empty list.

23
00:01:33,650 --> 00:01:40,730
Also, label ID is an empty list, and we reshape all these images and also store their label IDs.

24
00:01:40,800 --> 00:01:49,670
And for this, we are running a for loop here over the 43 classes and storing all the images, resizing

25
00:01:49,670 --> 00:01:51,530
them and also their IDs.

26
00:01:51,530 --> 00:02:00,410
So we read them and convert these images, or rather resize these images, to the desired dimensions, that is

27
00:02:00,410 --> 00:02:01,340
50 by 50.

28
00:02:01,940 --> 00:02:06,600
And also storing the label IDs. Now it's done.
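
A rough sketch of the resize-and-collect loop described above. The video presumably uses an image library to resize; here a small NumPy nearest-neighbour function, `resize_nearest`, is swapped in as a hypothetical stand-in so the example is self-contained, and the `raw` dictionary of per-class images is made-up data.

```python
import numpy as np

def resize_nearest(img, out_h=50, out_w=50):
    """Nearest-neighbour resize of an (H, W, C) array (stand-in for a library resize)."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return img[rows[:, None], cols]

# Hypothetical data: class ID -> list of images with differing sizes.
rng = np.random.default_rng(0)
raw = {
    0: [rng.integers(0, 256, size=(30, 29, 3), dtype=np.uint8)],
    1: [rng.integers(0, 256, size=(68, 66, 3), dtype=np.uint8),
        rng.integers(0, 256, size=(52, 56, 3), dtype=np.uint8)],
}

images = []    # resized images
label_id = []  # class ID of each image, in the same order

for class_id, class_images in raw.items():
    for img in class_images:
        images.append(resize_nearest(img, 50, 50))
        label_id.append(class_id)

print(len(images), images[0].shape, label_id)
```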

29
00:02:07,160 --> 00:02:12,890
So the next step would be to convert these images into NumPy arrays and also to normalize them.

30
00:02:12,890 --> 00:02:19,430
So in order to normalize an image, we will divide it by 255 because the pixel values of each image range

31
00:02:19,430 --> 00:02:20,690
from zero to 255.

32
00:02:20,930 --> 00:02:26,120
So we'll divide each image by 255 to normalize them, and to convert them into arrays we'll simply

33
00:02:26,120 --> 00:02:26,950
use the function

34
00:02:26,960 --> 00:02:32,750
np.array. So let's run this and convert the images into NumPy arrays and normalize

35
00:02:32,750 --> 00:02:32,930
them.
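
The conversion and normalization described above amounts to something like the sketch below. The `images` list is made-up data; only `np.array` and the division by 255 come from the narration.

```python
import numpy as np

# Hypothetical list of already-resized 50x50 RGB images with 8-bit pixel values.
images = [np.full((50, 50, 3), v, dtype=np.uint8) for v in (0, 128, 255)]

# Stack the list into one array and scale pixel values from [0, 255] to [0, 1].
images = np.array(images) / 255

print(images.shape)
print(images.min(), images.max())
```

Dividing the `uint8` array by 255 produces floating-point values, which is what a neural network input usually expects.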

36
00:02:35,560 --> 00:02:36,190
So it's done.

37
00:02:38,070 --> 00:02:42,720
Similarly, we'll convert the label ID list into a NumPy array and we'll see the shape of it.

38
00:02:45,210 --> 00:02:49,750
So here we can observe that there are 39,209 label IDs.

39
00:02:50,700 --> 00:02:53,070
Similarly, we'll print the shape of the images.

40
00:02:55,350 --> 00:03:02,120
So here we can see that there are 39,209 images with a shape of 50 by 50 by

41
00:03:02,160 --> 00:03:02,460
three.

42
00:03:03,000 --> 00:03:07,890
So here three is the number of channels, which means that they are color images in RGB format.

43
00:03:07,920 --> 00:03:11,160
Let's move forward. Now we'll visualize the number of images in each class.

44
00:03:11,400 --> 00:03:13,870
To find out whether the data is imbalanced or not.

45
00:03:13,950 --> 00:03:20,940
So here we can see that it is around 2,200 in each class, so we can say that the data

46
00:03:20,940 --> 00:03:22,200
is not imbalanced.

47
00:03:22,560 --> 00:03:23,340
That is balanced.

48
00:03:23,370 --> 00:03:25,200
We do not need to balance the data.

49
00:03:25,210 --> 00:03:26,430
Data is already balanced.
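
A quick way to check class balance numerically, alongside the plot, is to count the labels per class. This is only a sketch with a made-up `label_id` list; the ratio printed at the end is an illustrative heuristic, not something shown in the video.

```python
from collections import Counter

# Hypothetical label IDs; in practice this would be the full label ID list.
label_id = [0, 0, 1, 1, 2, 2, 2]

# Count how many images belong to each class.
counts = Counter(label_id)
for class_id in sorted(counts):
    print(class_id, counts[class_id])

# A ratio close to 1 suggests balanced classes; a large ratio suggests imbalance.
ratio = max(counts.values()) / min(counts.values())
print("largest / smallest class:", ratio)
```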

50
00:03:27,510 --> 00:03:30,180
Now we'll split the data using train_test_split

51
00:03:33,080 --> 00:03:35,390
into a ratio of 80 percent

52
00:03:35,510 --> 00:03:38,000
training data and 20 percent validation data.
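
To make the 80/20 split concrete, here is a NumPy-only sketch of what a shuffled split does. It is a stand-in for the library function used in the video (presumably scikit-learn's `train_test_split`), and the arrays `X` and `y` are made-up data.

```python
import numpy as np

# Hypothetical data: 10 images of shape 50x50x3 and their class labels.
X = np.zeros((10, 50, 50, 3))
y = np.arange(10)

# Shuffle the indices, then hold out 20 percent for validation.
rng = np.random.default_rng(42)
idx = rng.permutation(len(X))
n_val = int(len(X) * 0.2)
val_idx, train_idx = idx[:n_val], idx[n_val:]

x_train, x_val = X[train_idx], X[val_idx]
y_train, y_val = y[train_idx], y[val_idx]

print(x_train.shape, x_val.shape)
```

Images and labels are indexed with the same shuffled indices, so each image stays paired with its label.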

53
00:03:40,960 --> 00:03:48,580
Similarly, we'll convert the training data, that is, we'll convert the label, that is, which class it belongs

54
00:03:48,580 --> 00:03:52,600
to, into one-hot encoding using the to_categorical function.

55
00:03:53,230 --> 00:03:58,780
So one-hot encoding is important because if we don't apply one-hot encoding, the machine will read the class numbers as an order and prioritize

56
00:03:58,780 --> 00:03:58,900
them.

57
00:03:58,910 --> 00:04:03,520
So to avoid prioritization, we apply one-hot encoding. So we'll apply

58
00:04:03,520 --> 00:04:05,770
one-hot encoding on the training and the validation data.

59
00:04:06,190 --> 00:04:12,130
The target variables, that is, y_train categorical and y_val categorical.
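
One-hot encoding turns each integer class label into a vector with a single 1 at the class position, so no class looks numerically larger than another. The sketch below shows the idea in plain NumPy rather than with the library helper from the video; the `one_hot` function, the sample labels, and the names `y_train_cat` and `y_val_cat` are assumptions for illustration.

```python
import numpy as np

num_classes = 43  # number of traffic sign classes mentioned in the video

# Hypothetical integer labels for a few training and validation samples.
y_train = np.array([0, 5, 42, 5])
y_val = np.array([1, 42])

def one_hot(labels, num_classes):
    """Return one row per label with a 1 at the label's index and 0 elsewhere."""
    return np.eye(num_classes, dtype=np.float32)[labels]

y_train_cat = one_hot(y_train, num_classes)
y_val_cat = one_hot(y_val, num_classes)

print(y_train_cat.shape, y_val_cat.shape)
```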

