0
00:00:04,256 --> 00:00:08,716
>> This is an introduction to the data and analytic cycle. Now, this will obviously vary

1
00:00:08,716 --> 00:00:13,466
in different contexts, including access to data within an organization and the kinds of tools

2
00:00:13,466 --> 00:00:16,776
that you have available and even just organizational processes.

3
00:00:16,836 --> 00:00:21,166
But this is roughly the types of things that you'll be aware of that you'll be engaging

4
00:00:21,166 --> 00:00:23,006
in when you're working with educational data.

5
00:00:23,306 --> 00:00:26,216
So first of all, when we're dealing with educational data,

6
00:00:26,216 --> 00:00:34,166
you have a range of data source options. Now, these sources are growing in quantity regularly

7
00:00:34,166 --> 00:00:39,686
as there are new approaches for data collection, as there are different types

8
00:00:39,686 --> 00:00:44,116
of integration opportunities between datasets, which means you can get some level

9
00:00:44,116 --> 00:00:48,576
of additional insight when you bring in multiple datasets but these sources can include anything

10
00:00:48,576 --> 00:00:52,206
from student information systems to the institution's LMS data,

11
00:00:52,206 --> 00:00:56,606
it might also be used through some mining of social media, use of swipe cards

12
00:00:56,606 --> 00:00:59,436
within a university or within a school system so there's a range of things

13
00:00:59,436 --> 00:01:01,176
that we use as our basic data sources.

14
00:01:01,726 --> 00:01:03,846
Now I haven't seen a lot of systems that have done a great job

15
00:01:03,846 --> 00:01:07,976
with putting together an integrated repository, which brings these different datasets together

16
00:01:07,976 --> 00:01:11,306
and partly because it's not a very simple process to do

17
00:01:11,416 --> 00:01:14,966
but the general idea is you have a range of datasets and then you have a range

18
00:01:14,966 --> 00:01:16,946
of questions that you're asking of that data.

19
00:01:16,946 --> 00:01:22,146
Now in some cases, you're looking at things such as how does being connected socially

20
00:01:22,146 --> 00:01:26,136
within a course impact students' success and are there ways

21
00:01:26,136 --> 00:01:31,076
that we can help foster better social connections if there's actually a relationship

22
00:01:31,076 --> 00:01:35,836
between social positioning in a network and the performance

23
00:01:35,886 --> 00:01:37,526
of that student in an academic setting.

24
00:01:37,656 --> 00:01:42,396
It may also be around trying to identify or sensitize models early on based

25
00:01:42,396 --> 00:01:45,956
on student profiles so if a student comes from a particular type of a background,

26
00:01:45,956 --> 00:01:50,256
first in family in terms of degree completion, there's a possibility

27
00:01:50,256 --> 00:01:54,726
that that information can be helpful for a university in terms of tailoring resources

28
00:01:54,726 --> 00:01:58,686
that the student might need or at least in sensitizing the analysis work that they do

29
00:01:58,686 --> 00:02:03,326
so that certain behaviors from a student who's deemed to be at risk due to a variety

30
00:02:03,326 --> 00:02:09,826
of factors are addressed in advance and the behavior of that individual would be different

31
00:02:09,826 --> 00:02:13,626
from a student that's perhaps not deemed to be at risk. So, to try that again:

32
00:02:13,626 --> 00:02:18,806
they could both be exhibiting similar behavior but the system would treat them differently

33
00:02:18,806 --> 00:02:22,466
in terms of how they're assessing or evaluating that student.

34
00:02:22,466 --> 00:02:26,316
Now this comes out in two perspectives and there's academic analytics,

35
00:02:26,316 --> 00:02:31,406
which is really using data and analytics to improve organizational performance.

36
00:02:31,796 --> 00:02:34,456
There's learning analytics, which we're obviously focused on in this course

37
00:02:34,456 --> 00:02:39,176
where we're targeting what the student is actually doing, what the faculty is doing

38
00:02:39,346 --> 00:02:43,086
and ways to improve both the teaching and the learning aspect of it.

39
00:02:43,366 --> 00:02:47,956
So the analytics model that we're looking at looks something like this, as I mentioned,

40
00:02:47,956 --> 00:02:52,156
it can change for a variety of contexts and a variety of reasons so I'll go through each

41
00:02:52,156 --> 00:02:55,266
of these elements individually, fairly quickly.

42
00:02:55,666 --> 00:03:00,656
So the first part obviously involves getting a hold of the data and this can come from a range

43
00:03:00,656 --> 00:03:04,926
of options, either institutional or outside the institution.

44
00:03:04,926 --> 00:03:09,936
There may be factors such as you're looking at getting data to --

45
00:03:09,936 --> 00:03:15,756
for the organization to improve its marketing or promotion to a particular type of student

46
00:03:15,756 --> 00:03:23,796
or student profile or it may be you're trying to build a particular model of pedagogy

47
00:03:23,796 --> 00:03:29,396
or assess a pedagogical model based on certain attributes of that student or based

48
00:03:29,396 --> 00:03:31,066
on certain practices of the educator.

49
00:03:31,186 --> 00:03:35,036
So once you've really sort of defined or looked at what is it that we're trying to do

50
00:03:35,146 --> 00:03:39,426
with this particular analytics activity, then you can define how you get a hold of

51
00:03:39,426 --> 00:03:41,726
and how you make sense of the data that you're collecting.

52
00:03:42,266 --> 00:03:45,176
From there, there can be questions that relate to storage.

53
00:03:45,566 --> 00:03:51,386
In many cases, the data is automatically going to be stored with the native application

54
00:03:51,386 --> 00:03:55,506
that we're going to use to do analytics work with, meaning that the storage

55
00:03:55,626 --> 00:03:59,786
of LMS data isn't an issue that you necessarily need to look at, it's getting a hold

56
00:03:59,786 --> 00:04:01,376
of that data that's more consequential.

57
00:04:01,376 --> 00:04:06,436
On the other hand, if you're going to use some data that is being generated

58
00:04:06,436 --> 00:04:09,246
through social media or that is being generated

59
00:04:09,246 --> 00:04:15,076
through other organizational data collection practices, then issues of storage and security

60
00:04:15,076 --> 00:04:21,586
of storage and moving it out of that initial database into a data format and a database

61
00:04:21,586 --> 00:04:24,856
that you can use for analysis work are important considerations.

62
00:04:25,856 --> 00:04:30,036
In some instances, you're going to have reasonably clean data or data at least that's

63
00:04:30,036 --> 00:04:36,436
in an analytics friendly state and that's typically if you have, as mentioned already,

64
00:04:36,436 --> 00:04:41,186
LMS data, which is institutionally configured against certain variables, often that's related

65
00:04:41,186 --> 00:04:43,726
at some level at least to student information system data

66
00:04:44,036 --> 00:04:45,526
so that's in a reasonably good shape.

67
00:04:45,526 --> 00:04:53,166
On the other hand, if you have data that comes from social media or that comes from a variety

68
00:04:53,166 --> 00:04:57,806
of different systems that aren't necessarily connected meaningfully, then you likely do have

69
00:04:57,806 --> 00:05:03,636
to go through a process of both cleaning and then ultimately integrating those datasets.

70
00:05:04,236 --> 00:05:06,266
From there, it's the type of analysis work that you're doing,

71
00:05:06,266 --> 00:05:09,506
now obviously this is a question actually that's decided early on.

72
00:05:09,676 --> 00:05:13,736
You don't start thinking about analysis at the point of having all

73
00:05:13,736 --> 00:05:17,916
of your data together; quite often, you're going to be looking at specific questions

74
00:05:17,916 --> 00:05:22,996
and the process of collection and acquisition of data is going to be related to the types

75
00:05:22,996 --> 00:05:25,476
of questions that you're asking institutionally or that you're asking

76
00:05:25,476 --> 00:05:29,106
at a particular class level, whether that's a specific analytics technique

77
00:05:29,106 --> 00:05:33,766
or whether you have broader goals as a system in terms of being able to get a sense

78
00:05:33,766 --> 00:05:37,326
of which students are at risk of dropping out or what are the best course sequences

79
00:05:37,326 --> 00:05:38,896
that a student should take through a program

80
00:05:39,296 --> 00:05:42,116
that produces the greatest possibility for success.

81
00:05:42,976 --> 00:05:48,086
Now as mentioned in the previous video, there are important considerations to be dealt

82
00:05:48,086 --> 00:05:49,896
with around how you present that data.

83
00:05:50,076 --> 00:05:53,396
It certainly isn't sufficient to just take a --

84
00:05:53,826 --> 00:05:58,406
the analytics output of work that you've been doing and just present it

85
00:05:58,406 --> 00:06:01,356
to someone in a CSV or in a similar format.

86
00:06:01,496 --> 00:06:07,026
You need to present a visualization that ideally is interactive, which means that the individual

87
00:06:07,026 --> 00:06:10,736
that is viewing the data or working with the data has the ability to change variables

88
00:06:10,736 --> 00:06:15,776
or to begin to look at well, what happens if we change this here or ask specific questions

89
00:06:16,026 --> 00:06:22,776
of that data; that's why the tool we're using in week 2, Tableau, enables a good explanation

90
00:06:22,776 --> 00:06:26,916
or a good overview of what that process is and how that actually works.

91
00:06:27,006 --> 00:06:30,636
And then finally, which we aren't going to look at in this course but it's something

92
00:06:30,636 --> 00:06:35,716
to be aware of, once you've gone through this experience or this entire cycle

93
00:06:35,716 --> 00:06:42,056
of collecting your data and cleaning it and analyzing it and representing it to learners,

94
00:06:42,186 --> 00:06:47,666
faculty, teachers or administrators, it's about what you do now and that's not something

95
00:06:47,666 --> 00:06:52,246
that we're looking at in this course but that's an important consideration that should feed

96
00:06:52,246 --> 00:06:55,696
into the subsequent process of additional data collection.


