DP3 Learning Analytics

“In God we trust. All others must bring data.”
- Deming 

Educational Data Analysis, or Learning Analytics, is a major initiative in Australian schools to make use of the large amounts of data collected on students, particularly through various testing programmes. Understanding how to unpack and use such data to inform teaching and learning is important to all educators. The alternative is to allow external bodies to interpret the data and apply their own perspectives on school and classroom performance. Schools managing their own data, teachers taking control of the analysis of their own students' results, and students exerting ownership of the data collected about them all have implications for future classroom practice. Information technologies are making it ever easier to collect vast quantities of data, and are also providing the tools that make interpreting that data easier. It will be the teachers and school leaders who master these technologies who gain advantage from such analysis, for their schools, their careers, and their students' learning.

Dr Jason Zagami



To teach effectively, you need to understand your students. While much of the richness of teaching lies in the interpersonal relationship between teacher and student, it is also necessary to understand your students academically. We do this primarily through assessment, usually the measurement of student achievement on set tasks, but this is not the only measure we can use to better understand our students and find ways to improve their learning.

One of the key criticisms of most student assessment is that it only reports after the fact, often well after, and beyond the time in which changes can be made to improve student learning. This diagram (Cromfrey, 2000) suggests the ideal timeframe for feedback and how specific this feedback should be to individual students.

Learning Analytics

Detailed feedback is, however, time consuming and, despite its acknowledged importance, generally beyond the capacity of even the most dedicated and experienced teachers to fully achieve. ICT can assist: as more student activities are conducted on computers and online, many of the time-consuming processes of testing and reporting can be automated.

Unfortunately, while a lot of data about student activity and their learning is often generated by their use of computers and online systems, this data is rarely in a format that is easily used by teachers and students to analyse and evaluate their learning and use the data to make improvements.

Learning Analytics (LA) and Data Mining (DM) are approaches to make data and the analysis of data more accessible. The key difference between the two is that LA looks at systems and contextual factors (such as classrooms or schools), while DM looks at specific variables and factors (such as demographics or test results).

While teachers and schools currently focus on using LA and DM to improve test scores, both can provide far more information than that. As the techniques, the methods of presentation, and educators' understanding improve, LA and DM will be used every day to provide the best possible learning experiences for students.

Learning Analytics at an academic/research level makes use of various software tools and approaches to draw understanding from sets of data:

Social network analysis (SNA) involves mapping and measuring relationships and flows between people, groups, organisations, computers, URLs, and other connected information/knowledge entities. The nodes in the network are the people and groups while the links show relationships or flows between the nodes.
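As a minimal sketch of the idea, the code below counts how many replies each participant in a discussion forum sends and receives, that is, the out- and in-degree of each node in the reply network. All names and data are invented for illustration.

```python
from collections import Counter

# Invented forum data: (author, replied_to) pairs.
replies = [
    ("Ava", "Ben"), ("Ben", "Ava"), ("Cal", "Ava"),
    ("Dee", "Ava"), ("Ben", "Cal"), ("Ava", "Dee"),
]

# Out-degree: replies a person sent; in-degree: replies they received.
out_degree = Counter(author for author, _ in replies)
in_degree = Counter(target for _, target in replies)

for person in sorted(set(out_degree) | set(in_degree)):
    print(person, "sent:", out_degree[person], "received:", in_degree[person])
```

In a real analysis the nodes could equally be resources or URLs, and the degree counts would feed a network diagram rather than a printout.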

Social Ecological Modelling (SEM) involves identifying and measuring the factors that influence behaviour.

Behavioural trust analysis uses instances of conversation and propagation (people communicating and using information to generate new information) as an indicator of trust.

Influence and passivity measures assess the influence of people and information by counting the number of times information is passed on, cited, or retweeted.

  • Content analysis
  • Discourse Analytics
  • Impact of interaction
  • Prediction
  • Personalisation & Adaptation
  • Intervention
  • Information visualisation

At an institutional (i.e. school or classroom) level, data is most often drawn from Learning Management System (LMS) usage and diagnostic tests such as the National Assessment Program — Literacy and Numeracy (NAPLAN). Such data is usually analysed using data tables, graphs and data visualisations.

Educational Data Mining

Educational Data Mining (EDM) uses data that comes from educational environments to better understand students and the settings in which they learn. Key areas of EDM involve mining computer logs of student performance and enrolment data to predict student performance and recommend improvements to current educational practice.

EDM methods

Baker (2010) classifies the areas of EDM as:
  • Prediction (in which data is used to predict future performance);
  • Clustering  (to identify similarities in students, tasks, etc.);
  • Relationship mining (between students, teachers, concepts, etc.);
  • Modeling (creating a model and using this for further prediction or analysis); and
  • Distillation of data for human judgment (the most common use by teachers using statistics and visualisations).

Examples from your course
  • Quiz results;
  • Frequency and duration of access to course websites;
  • Views of lecture presentations;
  • Participation in lecture discussions;
  • Participation in online tutorials; and
  • Participation in discussion groups.


An EDM taxonomy (Romero & Ventura) includes:
  • Analysis and visualization of data;
  • Providing feedback for supporting teachers;
  • Recommendations for students;
  • Predicting student performance;
  • Student modeling;
  • Detecting undesirable student behaviors;
  • Grouping students;
  • Social network analysis;
  • Developing concept maps;
  • Constructing courseware;
  • Planning and scheduling.
Introducing learning analytics into school and classroom use can occur in stages (LAK12, 2012):
  • Extracting and analysing data from learning management systems;
  • Building an analytics matrix that incorporates data from multiple sources (social media, LMS, student information systems, etc);
  • Profile or model development of individual learners (across the analytics matrix);
  • Predictive analytics: determining at-risk learners;
  • Automated intervention and adaptive analytics: i.e. the learner model should be updated rapidly to reflect near real-time learner success and activity so that decisions are not made on out-dated models;
  • Development of "intelligent curriculum" where learning content is semantically defined;
  • Personalisation and adaptation of learning based on intelligent curriculum where content, activities, and social connections can be presented to each learner based on their profile or existing knowledge; and
  • Advanced assessment: comparing learner profiles with architecture of knowledge in a domain for grading or assessment.

Open Data

The open data movement is encouraging governments, companies, institutions, researchers and individuals to make ‘raw’ data freely available to everyone to use, reuse and redistribute. The intent is to encourage innovation and make the organisations responsible for the data more accountable and efficient.

The Australian federal government, the Queensland Government and the Brisbane City Council are making some data open via various websites, but the coverage is patchy. Available Queensland school datasets include home education registrations, school disciplinary absences, attendance rates, and enrolments.

Open data is somewhat more problematic in schools, as privacy laws and policies regarding student data restrict open sharing of some data. Nevertheless, public data is available for education, particularly when aggregated and de-identified so that individual students cannot (easily) be identified. MySchool aggregates data from many sources to compare schools, and demographic data is available from the Australian Bureau of Statistics. Schools are sharing more data publicly via websites, but the bulk of educational data collected by teachers and schools remains closed and often unexamined.

While Google is providing tools to facilitate data sharing, and it is becoming common to share scientific data, standards are still being established and we have some way to go before the vision of an internet of connected raw data is realised to drive innovation, social improvement, and of course educational improvement.


NAPLAN

The main educational dataset on Australian students is drawn from a test taken by most Australian students in years 3, 5, 7 and 9: the National Assessment Program - Literacy and Numeracy (NAPLAN). These tests assess students' reading, writing, language (spelling, grammar and punctuation) and numeracy, and are common to all states and territories.

While ostensibly to determine if students are performing above, at or below national benchmarks, the data provided can be used for a range of purposes. Individual students receive a detailed report, but teachers and schools also receive reports on how their students have performed compared to other classes in their school, state, and nationally. This can then provide information on which to make changes to teaching programs and pedagogy.

Unfortunately, NAPLAN results can take a while to be processed and returned to schools, often beyond the time in which specific interventions can be made to address students' misconceptions and errors. This will hopefully change in the future, but the data is nevertheless useful in identifying the concepts students did not know, allowing individual teachers to reflect on their teaching of these concepts in the future.

Guttman space pattern analysis

While each state provides specific software to assist teachers in mining and analysing data from NAPLAN tests, a simple technique known as Guttman space pattern analysis (adapted from Griffin, 2012) can be used on such tests to determine what actions teachers can take to address individual student learning needs.

First, order the columns of student results from easy to hard items, then order the student rows from most capable to least capable.

To do this you only need basic spreadsheet skills: sort the class data according to the total number of questions each student answered correctly, and the total number of students who answered each item correctly. Sorting by these marginal totals forms what is known as a contingency table (or cross tabulation): in simple terms, a count on the right-hand side of how many questions each student answered correctly.

This sorted skills audit of test questions, both by student and by item, forms a pattern called a Guttman space pattern.

A Guttman space pattern is a grid of ones and zeroes: a one means the student answered correctly and was able to demonstrate a particular skill, a zero means they were not. If we sort in both directions, we get a pattern with a diagonal: above the diagonal, mostly ones; below it, mostly zeroes.

So imagine a grid of ones and zeroes split across the diagonal. At the top of the spreadsheet are the students who answered most questions correctly, and at the bottom the students who answered most questions incorrectly. On the left-hand side are the easy items that almost everybody answered correctly, and on the right-hand side the hard items that almost no-one answered correctly.

This splits the spreadsheet not down the middle but across the diagonal; above the diagonal will be mostly ones, and below it mostly zeroes, but it is rarely a perfect diagonal (as with James in the example). Moving from left to right across a student's row, they will typically answer the first questions all correctly, then get some right and some wrong, and then answer mostly incorrectly.

The interesting part of this analysis is the area where there are some right and some wrong answers. This is what Vygotsky called the zone of proximal development (ZPD) for each student: where the student is most ready to learn, and where teaching intervention is most likely to succeed.
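The sorting described above can be sketched in a few lines of Python (a spreadsheet works equally well). The results matrix, student names, and the way the mixed band is located are invented for illustration, not the NAPLAN format:

```python
# Invented 0/1 results: rows are students, columns are eight test items.
results = {
    "Mia":   [1, 1, 1, 0, 1, 0, 1, 0],
    "Jack":  [1, 1, 0, 1, 0, 0, 0, 0],
    "Sofia": [1, 1, 1, 1, 1, 1, 1, 0],
    "Leo":   [1, 0, 1, 0, 0, 0, 0, 0],
}
n_items = len(results["Mia"])

# Order items easy-to-hard (most correct answers first) and
# students most-to-least capable (highest total first).
item_totals = [sum(row[i] for row in results.values()) for i in range(n_items)]
order = sorted(range(n_items), key=lambda i: item_totals[i], reverse=True)
ranked = sorted(results, key=lambda s: sum(results[s]), reverse=True)

# For each student, the mixed band between their first wrong and last
# right answer approximates the zone of proximal development.
zpd = {}
for student in ranked:
    row = [results[student][i] for i in order]
    first_wrong = row.index(0) if 0 in row else len(row)
    last_right = len(row) - 1 - row[::-1].index(1) if 1 in row else -1
    zpd[student] = order[first_wrong:last_right + 1]
    print(student, row, "ZPD items:", zpd[student])
```

In this invented data, Sofia's empty band shows a near-perfect Guttman pattern (she needs harder items), while Jack's mixed band falls on items 2 and 3, where intervention is most likely to succeed.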

The problem with NAPLAN data is that it arrives four or five months after students sit the test, so the data is dated. You can, however, assume each student has only moved on a little, and still use the Guttman space analysis to identify where each student's zone of proximal development lies. Teachers can then intervene and teach to the construct, rather than going through every question on the test and drilling students on everything they answered incorrectly.

If a question lies a long way to the right of the diagonal for a student, it is beyond their current ability to learn that skill; if it lies a long way to the left, it is well below their ability and they will be bored.

With this simple technique for analysing test questions, teachers can interpret test data to identify where to intervene, locating the ZPD for individual students. The analysis does not point to a level of achievement the student has reached; instead it points to where the student is most likely to benefit from instruction.

For tests where data is delayed (such as NAPLAN results), teachers need only assume that each student will have moved a little to the right of the diagonal, check whether that is true (by checking student understanding), and teach the student more effectively.

This approach of using test data to identify areas for improvement is what testing programs such as NAPLAN are intended to achieve; problems arise only when teachers and schools focus on average scores rather than treating testing as part of a developmental approach.

“it’s not about fixing up problems, it’s about scaffolding the kids’ learning rather than going through a deficit model looking at what they get wrong and trying to fix that. That leads to teaching to the test.”
- Griffin


Types of educational data

Four categories of data are commonly used when analysing educational environments: demographics, student learning, school processes, and perceptions (Technology Alliance, 2005). Used in combination, these measures can help you better understand the effectiveness of the learning environment and make informed decisions and changes.


Demographics

Statistical characteristics of human populations; descriptive information about the school community. Examples include attendance, enrolment, grade level, ethnicity, gender, native language, crime rate, and socioeconomic status.

Student Learning

What students know and are able to do as a result of their schooling. Examples include standardised, norm-referenced, and criterion referenced tests; performance and standards-based assessments; teacher-made tests; grades and grade point averages.

School Processes

Educational and psychological events and practices occurring in the classroom (e.g., instructional strategies) and at the school level (e.g., academic programs); what educators do to produce results. Examples include instructional and learning strategies; instructional time and environment; organisation of instructional components; assessment practices; classroom management; relationships among students; relationship among educators; relationships among students and educators.


Perceptions

Individual views, values, and beliefs regarding the way the world operates. Examples include student views of teachers; teacher satisfaction with administration; and school safety.

Top 10 uses of data in schools

  1. Data can uncover problems that might otherwise remain invisible.
  2. Data can convince people of the need for change.
  3. Data can confirm or discredit assumptions about students and school practices.
  4. Data can get to the root cause of problems, pinpoint areas where change is most needed, and guide resource allocation.
  5. Data can prevent over-reliance on standardised tests.
  6. Data can help schools evaluate program effectiveness and keep the focus on student learning results.
  7. Data can provide feedback that teachers and administrators need to keep going and stay on course.
  8. Data can prevent one-size-fits-all and quick solutions.
  9. Data can give schools the ability to respond to accountability questions.
  10. Data can build a culture of inquiry and continuous improvement.
(Love, 2008)

Informing Practice

The following collaborative process is being used in Queensland state schools to set aspirational targets, work towards continuous improvement of teaching and learning, build a culture of data inquiry, and improve teacher pedagogy.

Goals and targets

Goals and targets are integral to setting strategic direction.

Begin the process by being explicit about the purpose of the collaborative inquiry. Examining school strategic documents guides the setting of goals and targets.

What do I do?

  • Align with systemic priorities & school vision.
  • Refer to systemic targets.
  • Consult the school annual operational plan.
  • Set the expectations.

What do I ask?

  • What are our current goals?
  • Where do we want to be?
Note: Return to goals and targets after interrogating the data. This ensures decisions about actions related to the data are still aligned to goals and targets before proceeding to planning. If changes to targets and goals are required, they are made after interrogating the data and verifying the decisions to change the goals and targets.


Gather data

Collect all the data sets related to the specific inquiry and the identified goals and targets.
Have a range and balance of data sets. It is important that patterns and trends identified in one data set are then matched to evidence across other data sets.

Depending on the goals and targets, data sets may include whole school information, specific learning area data, and data on student groups and student achievement.

What do I do?

  • Strategically gather data for interrogation from a variety of sources related to the identified goals and targets.
  • Use OneSchool to access student data or use hard copy data sets.
  • Access data analysis support available on OneSchool.

What do I ask?

  • Which data sets are relevant to the goals and targets?
  • What whole school and year level data is available?


Interrogate

Interrogate the data by asking a series of questions that delve deeper and deeper into it.

What do I do?

  • Interrogate by asking questions of the data.
  • Transform data into meaningful information.
  • Look for patterns, trends and changes in the data.
  • Arrange different data sets to provide comparative views of data.

What do I ask?

  • What is this data set telling me?
  • Does this align with what I already know from other data sources?
  • How does the cohort perform in relation to comparative groups?
  • What patterns and trends can I identify?
  • What factors contribute to these patterns and trends?


Infer

Draw an inference by concluding or judging from the evidence.
Use the information from the data interrogation and synthesise it into a statement that describes what is happening.

What do I do?

  • Use information derived from the data analysis to make inferences about student achievement.
  • Make statements about teaching and learning from these inferences.

What do I ask?

  • Which pattern in the data is most significant?
  • Which aspect of the curriculum does it relate to?
  • Which year level/s are most strategically affected?


Verify

Verify inferences and hypotheses by comparing them with other data.
Consult other relevant data to ensure that the information and resulting inferences made from the data analysis correlate with other sources.

What do I do?

  • Identify supporting data evidence to affirm initial inferences.
  • Refer back to data information from a variety of sources, e.g:
    • school assessment data A-E
    • Year 2 Diagnostic Net
    • QCATs
    • Assessment Bank
    • NAPLAN
    • P-9 Literacy and Numeracy indicators
    • Commercially produced assessments.

What do I ask?

  • Do the patterns identified correlate with those found in other data sets and support proposed inferences?
  • Is the evidence strong enough to warrant a modification of current teaching and learning?


Plan

Plan by designing a scheme of action to meet the needs identified in the data analysis.

What do I do?

  • Formulate an idea for improved student learning outcomes.
  • Determine an action plan for reaching goals.
  • Plan and organise all aspects of the proposed action plan.

What do I ask?

  • Which students will be involved in the planned action?
  • What will be the sequence of teaching and learning?
  • Are there organisational aspects to consider (such as personnel, timetabling, data collection and collation)?
  • What are the time frames for the planned action?


Implement

Implement by putting the planned action into practice.
Maintain the evidence-based focus of the teaching and learning sequences. This ensures that the intervention responds directly to the teaching and learning needs uncovered in the data analysis.

What do I do?

  • Implement the planned curriculum.
  • Embed assessment for learning, as learning and of learning throughout the planned action.
  • Implement the most appropriate sequence of teaching and learning by using a variety of teaching strategies including direct teaching, interactive teaching, indirect teaching and experiential teaching.
  • Modify teacher practice or school practice.

What do I ask?

  • Are there aspects of teaching and learning that require professional development?
  • Are modifications to teaching required?
  • What are the implications for daily classroom routines?


Assess

Assess by measuring and evaluating the impact of the implemented action plan.
The focus of assessment is to gather information on the impact of the planned whole-school or targeted intervention on student achievement. Ensure that effective assessments and assessment tools are planned and in place before, during and after the implementation of the planned action.

What do I do?

  • Test the impact and results of implementation.
  • Reassess the inferences made from the initial data inquiry.

What do I ask?

  • What assessment measures were used?
  • How well did the student sample perform?
  • Were the nominated targets achieved?


Reflect

Reflect on the results achieved by the implemented action and the implications for ongoing improvement of teaching and learning.

What do I do?

  • Reflect on the effectiveness of action taken.
  • Engage in professional dialogue.

What do I ask?

  • Was the intended purpose achieved? If not, why not?
  • Do outcomes require a new direction?
  • What is the next step to achieve major goals and targets?
  • Are there curriculum implications?


“Computers are good at swift, accurate computation and at storing great masses of information. The brain, on the other hand, is not as efficient a number cruncher and its memory is often highly fallible; a basic inexactness is built into its design. The brain's strong point is its flexibility. It is unsurpassed at making shrewd guesses and at grasping the total meaning of information presented to it.”
- Campbell

Data Visualisations

Presenting data as visualisations or Information Graphics (infographics) can make complex information easier to understand as our ability to see patterns and trends in visual information is much greater than for textual or numeric data.


Gapminder

Gapminder is a data visualisation tool that dynamically displays changes in data, usually over time. It provides a good example of how visualisation can be used to see patterns in data that may be difficult to detect with static tables and graphs.

Many Eyes

Many Eyes is set of visualisation tools and examples that can display data in various dynamic and static ways.

Creating Visualisations

There are many software tools that will aggregate data to automatically generate visualisations or assist in creating effective infographics.

Wolfram Alpha

Wolfram Alpha Personal Analytics for Facebook will produce a complex visualisation of the data contained in your Facebook account, including a cluster analysis.


Timelines

Timeline and Timeline.JS are tools that will create an interactive timeline of events that you can place on your own website, while tools such as dipity, tiki-toki, myHistro, and timetoast let you create and host timeline visualisations online.


Maps

Maps were one of the first visualisation tools, depicting geographic information. Google Maps and Google Earth are commonly used for geovisualisation.

Word Clouds

Tools such as Wordle, Tagedo, Tagul, ABCya! and Worditout take a text and, by changing the size and/or colour of words to indicate how often each occurs, produce word clouds (also called tag clouds or weighted lists) that provide a quick means of visualising the important terms in a text.
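Under the hood, word-cloud weighting is simply frequency counting. A minimal sketch in Python, using an invented snippet of text:

```python
import re
from collections import Counter

# Invented sample text.
text = """Data can uncover problems. Data can convince people of the
need for change, and data can build a culture of inquiry."""

# Tokenise to lowercase words and count occurrences; a word-cloud
# tool scales each word's font size by this count.
words = re.findall(r"[a-z']+", text.lower())
counts = Counter(words)
print(counts.most_common(3))
```

Real word-cloud tools add one more step: filtering out common "stop words" such as "can" and "the" so that only meaningful terms dominate the cloud.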


Infographics

Infographics can summarise complex ideas and present them in ways that highlight the important information through engaging graphs and images. A common example of an infographic is the presentation of weather data in newspapers. Examples of effective infographics can be found at datavisualization.ch and informationisbeautiful.

There are a range of online tools for creating infographics, such as Visual.ly, Easel.ly, and Infogr.am, as well as guides to creating effective infographics.

Visualisation Types

  • Time Series - showing changes over time;
  • Statistical - revealing trends;
  • Maps - representing geographical data;
  • Hierarchies - showing formal relationships; and
  • Networks - depicting complex relationships e.g. friendships.

Data Driven Visualisations

Increasingly, visualisations are being linked to live data sources, with the visualisation constantly updated. Google Charts can link to live data and display it on websites, and more complex tools exist to link data from remote devices.

LMS Discussion Analysis

Resources such as Griffith University's Learning@Griffith or Education Queensland's The Learning Place use a Learning Management System (LMS) to organise and provide the content and tools that assist students to learn and teachers to teach. There are a large number of LMSs available, and new ones are being developed all the time, but two are dominant: Blackboard and Moodle. Both provide similar environments, though Moodle is open source.

More detail on LMSs can be found in the Digital Pedagogies module: Learning Management Systems.

One advantage of using an LMS is that data on student interaction can be captured easily, and if the LMS is used for many aspects of learning (content, discussion, testing, etc.) then all of this data is available from one collection.

Of course, data is available not just on students but also on teachers (measuring usage of the system, access times and durations, discussions, etc.), and these may all contribute to management analytics, but let us remain focused on learning and the data we can draw from an LMS to aid it.

SNAPP - Social Networks Adapting Pedagogical Practice

SNAPP allows you to analyse the discussions occurring within LMS bulletin boards, and provides a very easy introduction to such analytics. Sign up at www.snappvis.org and place a bookmark into a Safari or Internet Explorer browser; then, while viewing your discussion threads in the LMS, click the bookmark to generate an analysis of the discussions. This will graphically show the conversational relationships between participants, along with a statistical analysis of their postings to the discussion forum.

Computer Games

Many computer games can be used to collect data on student activity and this can be applied to analysing their learning. Simulation based games in particular rely upon large data sets to manage their simulated environments and student interaction with this data can then be examined. SimCity is one example where data on student activity in the game environment can be explored. Many games record scores, achievements, discussions, and other data that can be used to better understand the learning occurring.

More detail on the use of games to enhance student learning can be found in the Educational Technologies module: Educational Gaming.


simSchool

simSchool is a classroom simulation for teachers to analyse student differences, adapt instruction to individual learner needs, gather data about the impacts of instruction, and see the results of their teaching.

You can explore a limited version, simSchoolLite, at simschool.org/lite, or you can register at www.simschool.org/register?type=demo and download a workbook of activities from www.scribd.com/doc/3024555/simManual.
Paid accounts give access to deeper levels of the simulation and are available at www.simschool.org/registration_select.

simSchool creates a simulation in which students respond to your actions as a teacher, creating an experimental inquiry process in which you are aware of some influencing variables but unaware of others. These you need to infer from student responses to build an understanding of what works well with particular students.

From a Learning Analytics perspective, simSchool models what may occur in a classroom environment over many weeks as you use various data collection instruments to build an understanding of your students, what works for each of them individually, and how you can structure individual and whole class activities to best suit the learning needs of your unique combination of students.
