Drawing on machine learning in the quest for effective teaching and learning

By Gabor Fulop (Gabor.FULOP@oecd.org) & Noémie Le Donné (Noemie.LEDONNE@oecd.org), Directorate for Education & Skills (OECD)

With most students having experienced remote learning over the past year due to COVID-19 related restrictions, the importance of teachers and schools has become only more evident. Temporary school closures underline the significant benefits students receive from being in school with their teachers and classmates. But what exactly do teachers do that helps students perform well academically, socially and emotionally?

Identifying the teacher and school factors that help younger generations to succeed and thrive later in life is a longstanding challenge for education policy. Past education research has shown that the way teachers, school leaders and schools shape the quality of instruction and students’ learning environment is closely related to student academic and social-emotional development (Darling-Hammond, 2017; OECD, 2018).

Past studies have found that teachers’ value-added accounts for significant variation in student achievement (Chetty, Friedman and Rockoff, 2014; Jackson, Rockoff and Staiger, 2014; Rivkin, Hanushek and Kain, 2005; Rockoff, 2004). There is also evidence that, as with test scores, teachers vary considerably in their ability to support students’ social and emotional development (Jackson, 2018; Kraft, 2019; Ladd and Sorensen, 2017).

The literature indicates that teachers and schools matter. However, the evidence is less conclusive as to the specific characteristics and actions of teachers and schools that matter the most for student achievement and social-emotional development. By applying a machine learning technique to a dataset that combines two large international surveys, the OECD report, Positive, High-achieving Students? What Schools and Teachers Can Do (OECD, 2021), pinpoints some of the most effective teacher and school practices.

The TALIS-PISA link data

The two surveys in question are the OECD Teaching and Learning International Survey (TALIS), which is the largest international and periodic survey asking teachers and school leaders about their working conditions and learning environments, and the OECD Programme for International Student Assessment (PISA), which provides the most comprehensive and rigorous international assessment of student learning outcomes to date, delivering insights into the cognitive and social-emotional skills of 15-year-old students. We call this the TALIS-PISA link.

The TALIS-PISA link 2018 data comprises thousands of variables from more than 30,000 students and more than 15,000 teachers of the same schools from nine countries and sub-national entities: Australia, Ciudad Autónoma de Buenos Aires (referred to as CABA [Argentina]), Colombia, the Czech Republic, Denmark, Georgia, Malta, Turkey and Viet Nam.

That being said, the specific survey design of the TALIS-PISA link data comes with its limitations. First, the data do not allow for matching a teacher and her or his students; rather, the data only permit matching a sample of teachers teaching 15-year-old students in a school and a sample of 15-year-old students at that same school. Information on teachers is therefore averaged at the school level and then analysed together with students’ outcomes. Given that teachers of the same school differ significantly in terms of their characteristics and practices, linking data by averaging teachers’ variables at the school level constitutes a considerable loss of information. Second, the cross-sectional design of the TALIS and PISA studies prevents causal interpretation of the analyses based on the TALIS-PISA link data.
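The school-level linking described above can be sketched in a few lines of Python. This is a minimal illustration on invented toy records, not the report's actual processing: the field names (school_id, marking_hours, reading_score) and the helper functions are hypothetical and do not correspond to TALIS or PISA variable names.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records; field names are illustrative only.
teachers = [
    {"school_id": "A", "marking_hours": 4.0},
    {"school_id": "A", "marking_hours": 8.0},
    {"school_id": "B", "marking_hours": 2.0},
]
students = [
    {"school_id": "A", "reading_score": 510.0},
    {"school_id": "A", "reading_score": 490.0},
    {"school_id": "B", "reading_score": 470.0},
]


def school_means(rows, key, value):
    """Average `value` over all rows that share the same `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {k: mean(v) for k, v in groups.items()}


def link_students_to_school_averages(students, teachers, value):
    """Attach the school-level teacher average of `value` to each student."""
    averages = school_means(teachers, "school_id", value)
    linked = []
    for student in students:
        school = student["school_id"]
        if school in averages:  # keep only schools with sampled teachers
            linked.append({**student, f"school_mean_{value}": averages[school]})
    return linked


linked = link_students_to_school_averages(students, teachers, "marking_hours")
```

Both students in school A receive the same value, the mean of their school's two sampled teachers, which makes the loss of information visible: the difference between those two teachers disappears from the linked data.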

Drawing on a machine learning technique to let the data speak

Applied education research has yet to tap into the rapidly expanding field of machine learning. Advanced data-driven methods are rarely applied in research looking at the nexus between teaching and learning. The latest OECD report analysing the TALIS-PISA link dataset (OECD, 2021) seeks to break new ground by extracting maximum relevant information from this complex dataset.

A machine learning technique called “lasso” (i.e. least absolute shrinkage and selection operator) is used as a compass to guide the identification of the teacher and school characteristics and practices that matter the most for student outcomes. Lasso is an attractive tool for analysing data patterns emerging from the many variables collected through the TALIS survey and the many student outcomes measured by PISA. In particular, lasso can select variables that are highly correlated with the outcome variable even when the number of potential variables is high relative to the number of observations.

Lasso selects variables that correlate well with the outcome in one dataset (training sample) and then tests whether the selected variables predict the outcome well in another dataset (validation sample) (Hastie, Tibshirani and Friedman, 2017). It proceeds with variable selection by minimising the prediction error of the regression model subject to the constraint that the model is not too complex. Lasso reduces model complexity by omitting certain variables given the underlying assumption that the number of coefficients that are non-zero (therefore signalling a positive or negative association with the outcome variable) in the true model is small relative to the sample size (known as the sparsity assumption).
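To make the training and validation logic concrete, here is a minimal sketch of lasso in plain Python. It is not the code used for the report. It assumes centred variables (so no intercept), simulated data in which only the first two of eight candidate variables matter, and a small hypothetical grid of penalty values; the fitting routine is a textbook coordinate descent that minimises the mean squared error divided by two plus the penalty times the sum of absolute coefficients.

```python
import random


def soft_threshold(value, threshold):
    """Shrink `value` towards zero; return exactly 0.0 inside the threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * squared error + lam * sum(|beta_j|)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # a constant-zero column carries no information
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                beta[j] = new
    return beta


def mse(X, y, beta):
    """Mean squared prediction error of the linear model `beta`."""
    errors = (yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y))
    return sum(e * e for e in errors) / len(y)


rng = random.Random(0)  # fixed seed so the simulation is reproducible


def simulate(n, p=8):
    """Simulated data (an assumption): only variables 0 and 1 drive the outcome."""
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y


X_train, y_train = simulate(80)   # training sample: variables are selected here
X_valid, y_valid = simulate(40)   # validation sample: predictions are checked here

candidates = [0.01, 0.05, 0.1, 0.2, 0.5]  # hypothetical penalty grid
fits = {lam: fit_lasso(X_train, y_train, lam) for lam in candidates}
best_lam = min(candidates, key=lambda lam: mse(X_valid, y_valid, fits[lam]))
selected = [j for j, b in enumerate(fits[best_lam]) if b != 0.0]
```

A larger penalty sets more coefficients exactly to zero, which is how the "not too complex" constraint turns into variable selection; the validation sample is used only to decide how strong that penalty should be.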

Theory and previous research findings are also carefully considered to inform the analyses and interpret, validate, or contextualise the findings. In addition, standard statistical methods complement the findings from lasso regressions. Variance decomposition techniques are applied to measure the share of variance in student outcomes explained by each of the high-level teacher and school dimensions considered in the analysis. Standard linear and logistic regressions are then used country by country to determine the statistical significance and direction of the relationships between teacher and school factors, and student outcomes.
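One building block of such decompositions is the split of the total variance in a student outcome into a between-school and a within-school part. Because teacher information is averaged at the school level, teacher and school factors can only account for the between-school part. The sketch below uses a simple sum-of-squares decomposition on invented scores; the function name and data are hypothetical, and the report itself may rely on a different estimator.

```python
from statistics import fmean


def between_school_share(scores_by_school):
    """Share of total variation in scores that lies between school means."""
    all_scores = [s for scores in scores_by_school.values() for s in scores]
    grand_mean = fmean(all_scores)
    # Total variation: squared distance of every student from the grand mean.
    total = sum((s - grand_mean) ** 2 for s in all_scores)
    # Between-school variation: squared distance of each school mean from
    # the grand mean, counted once per student in that school.
    between = sum(
        len(scores) * (fmean(scores) - grand_mean) ** 2
        for scores in scores_by_school.values()
    )
    return between / total


# Invented toy scores: two schools whose averages differ markedly.
toy_scores = {"A": [400.0, 420.0], "B": [580.0, 600.0]}
share = between_school_share(toy_scores)
```

In this toy example almost all of the variation lies between the two schools, so school-level averages would have a lot to explain; when students within a school differ more than school averages do, the share falls and school-level factors have less variation to work with.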

Figure 1: Teacher and school factors that matter for student academic success and social-emotional development

Notes: The relationships between teacher and school dimensions are often characterised by reciprocity and inter-connectedness. For example, professional development influences classroom practices, and in turn, those practices have an effect on the type of professional development provided to teachers. Certain factors can be both an input and an output of schooling. Indeed, the reciprocity also holds for the relationship between various teacher and school factors, and student achievement. For instance, student performance can have an impact on the choice of teaching strategies applied in the classroom, but it can also influence other factors such as school culture (e.g. teacher-student relations), the type of professional development provided to teachers or teacher well-being, job satisfaction and self-efficacy.

Source: OECD (2021), Positive, High-achieving Students?: What Schools and Teachers Can Do, TALIS, OECD Publishing, Paris, https://doi.org/10.1787/3b9551db-en.

Teacher and school factors that matter the most for student academic success and social-emotional development

The report tests the potential influence of many teacher and school characteristics and practices – almost 150 variables grouped into 18 high-level dimensions – from teachers’ initial teacher education, motivations to join the profession, opportunities for collaboration through to classroom composition and teaching practices (Figure 1). As a result, it highlights five key predictors of student academic achievement and social-emotional development: teachers’ classroom practices (in particular, the share of class time spent on actual teaching and learning), teachers’ use of working time (in particular, the amount of time teachers spend on marking and correcting and on extracurricular activities), teachers’ well-being and job satisfaction, classmates’ characteristics and school culture (in particular, the involvement of parents and community in school-related activities as well as teacher-student relations).

More specifically, two uses of teachers’ time are linked to how well students do academically and to how motivated and optimistic they are about their learning and prospects: the time teachers spend actually teaching in class, rather than disciplining or taking care of administrative work, and the hours they spend marking and correcting work and going over this feedback with their students. Indeed, students tend to perform better, on average, the more class time teachers spend on actual teaching and learning. Additionally, the more time teachers spend on marking and correcting student work, the better students perform academically (Figure 2) and the more likely they are to expect to complete at least a tertiary degree.

The report also suggests that positive teachers can help to form positive and high-achieving students. Indeed, students tend to find their teachers more interested in their teaching when teachers report lower levels of work-related stress. Further to this, the more satisfied teachers are with their work environment, the better students tend to perform in school. Moreover, positive teacher-student relations are found to matter for student achievement and social-emotional development. In short, the more teachers report nurturing good relationships with students, the more students perceive them as enjoying teaching, the better it is for classroom disciplinary climate, and the better students perform academically.

Results also show that spending quality time with students outside of the usual lessons supports student growth. The more time teachers spend on extracurricular activities, the better students behave in class (Figure 3), the more students report that the teacher is interested and motivated to teach, and that they expect to complete at least a tertiary degree.

Figure 2: Relationship between time spent by teachers on marking and correcting student work and student academic achievement

Notes: Teacher variables are averaged for all teachers within the school. PISA scores are scaled to fit approximately normal distributions, with the OECD’s means around 500 score points and standard deviations around 100 score points. Results of linear regression based on responses of 15-year-old students and teachers. Controlling for the following elements of teachers’ use of working time: total working hours, total teaching hours and teachers’ use of working time on tasks other than marking and correcting (such as individual planning or preparation of lessons either at school or out of school, or general administrative work); and for the following student characteristics: gender, immigrant background and index of economic, social and cultural status. TALIS-PISA link data are more likely to provide insights for the Czech Republic and Turkey, where differences in school average performances represent about half of the total variance in student achievements, as opposed to countries, including Australia, Denmark and Malta, where 25% or less of the total variation in student outcomes lies between schools. In addition, statistically significant results are less likely for countries and economies with smaller sample sizes (e.g. Ciudad Autónoma de Buenos Aires [Argentina] and Malta). The TALIS-PISA link average corresponds to the arithmetic mean of the estimates of participating countries and economies, excluding Viet Nam. Statistically significant coefficients are marked in a darker tone.

Source: Adapted from OECD (2021), Positive, High-achieving Students?: What Schools and Teachers Can Do, TALIS, OECD Publishing, Paris, https://doi.org/10.1787/3b9551db-en, Figure 2.7.

Figure 3: Relationship between teachers’ engagement in extracurricular activities and student perception of classroom disciplinary climate

Change in the index of student perception of the classroom disciplinary climate associated with teachers’ engagement in extracurricular activities

Notes: Teacher variables are averaged for all teachers within the school. Positive values on the index of classroom disciplinary climate mean that the student enjoys a better disciplinary climate in language-of-instruction lessons than the average student in OECD countries. The index was scaled with a mean of 0 and a standard deviation of 1 across senate-weighted OECD countries. Results of linear regression based on responses of 15-year-old students and teachers. Controlling for the following classroom characteristics: class size, share of students whose first language is different from the language(s) of instruction, low academic achievers, students with special needs, students with behavioural problems, students from socio-economically disadvantaged homes, academically gifted students, students who are immigrants or with a migrant background and students who are refugees; and for the following student characteristics: gender, immigrant background and index of economic, social and cultural status. TALIS-PISA link data are more likely to provide insights for the Czech Republic and Turkey, where differences in school average performances represent about half of the total variance in student achievements, as opposed to countries, including Australia, Denmark and Malta, where 25% or less of the total variation in student outcomes lies between schools. In addition, statistically significant results are less likely for countries and economies with smaller sample sizes (e.g. Ciudad Autónoma de Buenos Aires [Argentina] and Malta). Statistically significant coefficients are marked in a darker tone.

Source: Adapted from OECD (2021), Positive, High-achieving Students?: What Schools and Teachers Can Do, TALIS, OECD Publishing, Paris, https://doi.org/10.1787/3b9551db-en, Figure 3.8.

However, teachers and school leaders are not the sole stakeholders in student academic and social-emotional skills. Students do better when their parents and guardians, and local communities involve themselves in school activities. As expected, classmates seem to matter a great deal for student performance and student self-concept. As the average concentration of students from socio-economically disadvantaged homes in the classrooms increases, students tend to perform worse academically and be less likely to aspire to tertiary education studies. In addition, the greater the number of academically gifted classmates enrolled in the classroom, the more a student feels able to succeed in the PISA test and the better they perform on average. While these findings may signal the presence of academic segregation, as high achievers tend to be concentrated in certain schools in most education systems, they can also point to the presence of peer effects. Indeed, a student’s performance can be positively affected by classmates with higher innate ability through an increase in motivation, competition and career aspirations (Sacerdote, 2011). Yet, high-performing students still tend to be less affected than their low-achieving peers by the composition of their classes. This indicates that addressing socio-economic and academic segregation of schools may be beneficial both for increasing student performance at the country level and for improving equity.

With this report, Positive, High-achieving Students? What Schools and Teachers Can Do, the OECD has trialled lasso regressions to shed light on the key factors that can be activated to raise student cognitive and social-emotional outcomes. Similar methods could be applied to any other high-dimensional dataset, in synergy with other approaches, to support better diagnosis for better policies.

References