1
00:00:01,380 --> 00:00:06,090
In this lesson, we are going to look at mounting the other containers that we have.

2
00:00:06,090 --> 00:00:09,180
In the last lesson, we mounted our demo container.

3
00:00:09,180 --> 00:00:14,910
But as you know, we have three more containers that are required for our project, which are raw,

4
00:00:14,940 --> 00:00:16,800
processed and presentation.

5
00:00:17,250 --> 00:00:19,140
Let's look at mounting them.

6
00:00:19,350 --> 00:00:26,520
We could simply copy the command here that we used to mount our demo container and replace demo with

7
00:00:26,520 --> 00:00:28,740
raw, processed and presentation.

8
00:00:28,740 --> 00:00:31,860
But we don't want to do that because that's a lot of duplication.

9
00:00:31,860 --> 00:00:37,380
Every time we mount a container, we would copy the same code and make the corresponding changes.

10
00:00:37,680 --> 00:00:41,940
In a large project we will have a lot of storage accounts and containers.

11
00:00:41,940 --> 00:00:48,540
So instead of hardcoding everything, we could simply create a Python function and call it with parameters

12
00:00:48,540 --> 00:00:51,090
such as storage account name and the container.

13
00:00:51,620 --> 00:00:58,640
If you're new to Python, functions in Python are just a block of reusable code which gets executed when

14
00:00:58,640 --> 00:00:59,900
you call that function.

15
00:01:00,050 --> 00:01:05,300
Also, you can send parameters to a function and it can also return results back to you.

16
00:01:05,330 --> 00:01:12,590
So let's create a function to which we can pass in the storage account name and the container name,

17
00:01:12,590 --> 00:01:14,420
and it mounts that for us.

18
00:01:14,450 --> 00:01:18,170
Instead of working on this notebook, I'm going to create a new notebook.

19
00:01:18,170 --> 00:01:20,870
So that's separate for you when you come to look at it.

20
00:01:21,050 --> 00:01:26,900
And I'm going to call this one eight, mount containers for the project.

21
00:01:29,850 --> 00:01:33,990
So let's start a new cell here.

22
00:01:34,020 --> 00:01:40,380
In Python, a function is defined using the def keyword, and then you need to provide a name for the function.

23
00:01:40,380 --> 00:01:43,070
I'm going to call this mount_adls.

24
00:01:43,110 --> 00:01:46,830
Then you provide the parameters inside brackets.

25
00:01:47,070 --> 00:01:53,940
I'm going to have two parameters, as I said, and I'm going to call them storage account name and container

26
00:01:53,940 --> 00:01:54,420
name.

27
00:01:55,580 --> 00:02:00,920
One more important thing is that we need to end the line with a colon, and then you hit Enter.

28
00:02:00,920 --> 00:02:04,790
And this is where you start to write the body of the function.

29
00:02:04,790 --> 00:02:08,750
And that's what gets executed every time you call that function.

30
00:02:08,750 --> 00:02:13,940
Let's start with the secrets from the key vault, which will be the client ID, tenant ID and the client

31
00:02:13,940 --> 00:02:17,030
secret. That will go as the first statement here.

32
00:02:17,840 --> 00:02:23,930
Please make sure that you've indented your statements here because otherwise that won't be treated as

33
00:02:23,930 --> 00:02:26,510
part of the function within Python.

34
00:02:26,510 --> 00:02:29,090
So I'm going to simply add a little comment as well.

35
00:02:30,140 --> 00:02:33,860
And the next thing we want to do is to set the configurations.

36
00:02:35,040 --> 00:02:39,330
Again, I'm going to simply copy what we got here and just paste it there.

37
00:02:41,220 --> 00:02:47,280
And the last thing we want here is to mount our storage, which is using the Databricks file system

38
00:02:47,280 --> 00:02:48,030
utility.

39
00:02:51,210 --> 00:02:52,980
I'm going to indent this again.

40
00:02:52,980 --> 00:02:59,760
And as you can see here, we've got the hardcoded values for the container and the storage account.

41
00:02:59,760 --> 00:03:05,070
So we'll have to change that because for this function we are going to send them as parameters so we

42
00:03:05,070 --> 00:03:06,600
can use those values here.

43
00:03:06,600 --> 00:03:10,080
So storage account name will contain the storage account name.

44
00:03:10,080 --> 00:03:16,110
So instead of having the hardcoded value, let's replace that with the name of the variable, which

45
00:03:16,110 --> 00:03:19,050
is the storage account name within curly brackets.

46
00:03:19,050 --> 00:03:22,260
And we're going to use f-string interpolation here.

47
00:03:22,260 --> 00:03:24,780
So prefix that with the letter f as well.

48
00:03:24,900 --> 00:03:31,020
And similarly for the container name, we're going to use the container name variable instead of demo.

49
00:03:31,020 --> 00:03:33,960
And again, I'm putting that within curly brackets.

50
00:03:33,960 --> 00:03:36,600
We have to do the same thing for the mount point.

51
00:03:36,600 --> 00:03:44,880
So let's prefix the value here with the letter f, and instead of Formula One, let's use the storage

52
00:03:44,880 --> 00:03:46,230
account name variable.

53
00:03:46,560 --> 00:03:53,920
And instead of demo, we are going to use the container name variable, and we enclose both of them within

54
00:03:54,460 --> 00:03:55,780
curly brackets as well.
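
As a rough sketch of the f-string substitution just described, the two values that change per container could be built as below. The actual notebook code is not shown in this transcript, so the abfss URL pattern, the parameter names storage_account_name and container_name, and the helper name adls_mount_paths are all assumptions, and "mystorageaccount" is only a placeholder.

```python
def adls_mount_paths(storage_account_name, container_name):
    """Build the source URL and mount point for one container (hypothetical helper)."""
    # Assumed ADLS Gen2 URL pattern: abfss://<container>@<account>.dfs.core.windows.net/
    source = f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/"
    # Mount point in the /mnt/<account>/<container> form used in the lesson.
    mount_point = f"/mnt/{storage_account_name}/{container_name}"
    return source, mount_point


# "mystorageaccount" is a placeholder, not the account used in the lesson.
source, mount_point = adls_mount_paths("mystorageaccount", "raw")
print(source)
print(mount_point)
```

Keeping the string building in one place means a new container only needs a new argument value, not a new copy of the mount command.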

55
00:03:56,080 --> 00:04:00,340
So that's everything we need to do in order to have the function.

56
00:04:00,340 --> 00:04:06,010
But you can also add other things within the function so that you don't have to repeat anything later

57
00:04:06,010 --> 00:04:06,400
on.

58
00:04:06,400 --> 00:04:13,720
So for example, if I wanted to also have the list of mounts so that I can see as soon as it's mounted,

59
00:04:13,720 --> 00:04:15,670
my mount is being listed as well.

60
00:04:15,670 --> 00:04:20,680
So let me quickly copy this statement here and put that into the function as well.

61
00:04:21,320 --> 00:04:22,280
So that's brilliant.

62
00:04:22,280 --> 00:04:23,120
That's done as well.

63
00:04:23,120 --> 00:04:28,580
So let's attach our notebook to the cluster and we can run that now.

64
00:04:28,700 --> 00:04:31,280
So let me quickly execute this function.

65
00:04:31,310 --> 00:04:34,550
When you execute a function definition, it just creates the function.

66
00:04:34,550 --> 00:04:35,990
It doesn't do anything else.

67
00:04:35,990 --> 00:04:41,660
You will have to invoke the function with the parameter values for it to do something for you.

68
00:04:41,720 --> 00:04:45,680
So let's say we want to mount our raw container first.

69
00:04:46,530 --> 00:04:51,870
In order to do that, you just have to call the function mount underscore adls.

70
00:04:53,430 --> 00:04:57,660
And you need to pass in the value of the storage account name.

71
00:04:57,660 --> 00:05:05,180
In my case, it is Formula One, and you need to pass in the value of the container name, which is

72
00:05:05,190 --> 00:05:06,270
raw in this case.

73
00:05:06,270 --> 00:05:07,830
So let me execute that.

74
00:05:08,910 --> 00:05:12,800
And as you can see, it's now mounted the raw container as well.

75
00:05:12,810 --> 00:05:19,380
So we've got the mount here, slash mnt slash Formula One slash raw, and it's being mounted to the storage

76
00:05:19,380 --> 00:05:21,780
account location here, which is brilliant.

77
00:05:21,780 --> 00:05:29,130
So when we want to do it for the presentation or the processed layer, you would simply call the function

78
00:05:29,130 --> 00:05:31,980
with the value as presentation or processed.

79
00:05:32,010 --> 00:05:36,690
You can even go a step further if you're keen on getting 100% production ready.

80
00:05:36,690 --> 00:05:43,020
The one problem at the moment with this function is that if you rerun this one, it will fail because

81
00:05:43,020 --> 00:05:44,820
the container has already been mounted.

82
00:05:44,820 --> 00:05:48,360
So we'll have to unmount it in order for it to be mounted again.

83
00:05:48,360 --> 00:05:55,200
In order to do that, you can check to see whether a mount exists and then unmount that mount before

84
00:05:55,200 --> 00:05:56,280
trying to mount it.

85
00:05:56,280 --> 00:05:59,090
So this statement here would do that for you.

86
00:05:59,100 --> 00:06:01,860
So let me just indent it properly first.

87
00:06:02,690 --> 00:06:09,500
So all it's doing is it's getting the list of mounts here into a mount variable and then it's going

88
00:06:09,500 --> 00:06:15,740
through the mount points and checking to see if one of them is the one you're trying to create.

89
00:06:15,770 --> 00:06:19,940
In our case, that will be slash mnt slash Formula One slash raw.

90
00:06:20,120 --> 00:06:26,420
And if that is the case, it will simply go and unmount it and then it will do the mount for you.

91
00:06:26,510 --> 00:06:32,150
If the mount doesn't exist, it will simply ignore that step here and just go and mount it.

92
00:06:32,150 --> 00:06:34,340
So that is a bit more robust.

93
00:06:34,340 --> 00:06:37,040
It doesn't fail when you already have a mount.

94
00:06:37,070 --> 00:06:40,610
Instead it will just unmount that and then mount it back again.
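
The check-then-unmount step described above might look roughly like the sketch below. The statement from the notebook is not shown in this transcript, so this is an assumed reconstruction: the function name remount and the fs parameter are hypothetical, and fs is assumed to behave like dbutils.fs, whose mounts() entries expose a mountPoint attribute.

```python
def remount(fs, source, mount_point, extra_configs):
    """Unmount mount_point if it already exists, then mount it again.

    fs is assumed to behave like dbutils.fs (mounts, unmount, mount).
    """
    # Look through the existing mounts for the one we are about to create.
    if any(m.mountPoint == mount_point for m in fs.mounts()):
        fs.unmount(mount_point)
    # Whether or not an unmount happened, mount the container.
    fs.mount(source=source, mount_point=mount_point, extra_configs=extra_configs)
```

In a Databricks notebook this could be called with dbutils.fs as the first argument, and with extra_configs holding the configurations built from the Key Vault secrets.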

95
00:06:40,610 --> 00:06:42,740
So let's execute this function now.

96
00:06:42,740 --> 00:06:49,400
And if you rerun this statement here to mount the raw container, it will simply unmount the mount,

97
00:06:49,400 --> 00:06:52,610
which is already there and then mount it back again for you.

98
00:06:54,740 --> 00:06:56,330
As you can see, it's done

99
00:06:56,330 --> 00:07:02,240
the unmount here, and then it's listed all the mounts here for you, and the raw mount is still there because

100
00:07:02,240 --> 00:07:03,310
it's been mounted.

101
00:07:03,320 --> 00:07:05,960
So let me quickly add a comment here as well.

102
00:07:06,860 --> 00:07:12,620
So let's now mount our presentation and processed containers as well here.

103
00:07:12,620 --> 00:07:20,690
So I'm simply going to copy and paste that statement and change the value from raw to processed.

104
00:07:20,690 --> 00:07:26,270
And also I'm going to have another cell and that's going to have the presentation container.

105
00:07:28,460 --> 00:07:32,830
As you can see, we've successfully mounted all three containers for the project.

106
00:07:32,840 --> 00:07:36,660
So this is exactly what you would do if you are in a production project.

107
00:07:36,680 --> 00:07:42,770
One person in the project will write the function with all the capabilities that we need and everybody

108
00:07:42,770 --> 00:07:44,750
else will just invoke the function.

109
00:07:44,750 --> 00:07:45,800
So that's brilliant.

110
00:07:45,800 --> 00:07:50,810
So let's quickly delete all the other statements that we don't need in this notebook.

111
00:07:52,770 --> 00:08:00,060
I hope you now have a good understanding of how to mount a Data Lake Storage container to Databricks

112
00:08:00,090 --> 00:08:02,480
and also get that production ready.

113
00:08:02,490 --> 00:08:03,900
That's the end of this lesson.

114
00:08:03,900 --> 00:08:05,130
I'll see you in the next one.


