Activating Learning at Scale: A Review of Innovations in Online Learning Strategies
Dan Davis, Guanliang Chen, Claudia Hauff, Geert-Jan Houben
Delft University of Technology
{d.j.davis,guanliang.chen,c.hauff,g.j.houben}@tudelft.nl

Abstract

Taking advantage of the vast history of theoretical and empirical findings in the learning literature we have inherited, this research offers a synthesis of prior findings in the domain of empirically evaluated active learning strategies in digital learning environments. The primary concern of the present study is to evaluate these findings with an eye towards scalable learning. Massive Open Online Courses (MOOCs) have emerged as the new way to reach the masses with educational materials, but so far they have failed to maintain learners' attention over the long term. Even though we now understand how effective active learning principles are for learners, the current landscape of MOOC pedagogy too often allows for passivity — leading to the unsatisfactory performance experienced by many MOOC learners today. As a starting point to this research we took John Hattie's seminal work from 2008 on learning strategies used to facilitate active learning. We considered research published between 2009 and 2017 that presents empirical evaluations of these learning strategies. Through our systematic search we found 126 papers meeting our criteria and categorized them according to Hattie's learning strategies. We found large-scale experiments to be the most challenging environment for experimentation due to their size, heterogeneity of participants, and platform restrictions, and we identified the three most promising strategies for effectively leveraging learning at scale as Cooperative Learning, Simulations & Gaming, and Interactive Multimedia.

Keywords: teaching/learning strategies, adult learning, evaluation of CAL systems, interactive learning environments, multimedia/hypermedia systems

Preprint submitted to Computers and Education, June 7, 2018
1. Introduction

In the dense landscape of scalable learning technologies, consideration for sound pedagogy can often fall by the wayside as university courses are retrofitted from the classroom to the Web. Faced with the uncertainty of how best to rethink pedagogy at scale, we here synthesize previous findings and highlight the directions with the greatest potential for boosting learner achievement in large-scale digital learning environments.

Now that the initial hype around Massive Open Online Courses (MOOCs) has passed and the Web is populated with more than 4,000 of these free or low-cost educational resources, we take this opportunity to assess the state of the art in pedagogy at scale and to identify the practices that have been found to significantly increase learner achievement.

This study reviews the literature by specifically seeking innovations in scalable learning strategies (those requiring no physical presence, manual grading, or manual feedback) that aim to create a more active learning experience, defined in Freeman et al. (2014) as one that "engages students in the process of learning through activities and/or discussion in class, as opposed to passively listening to an expert. It emphasizes higher-order thinking and often involves group work." By limiting the selection criteria to empirical research that can be applied at scale, we aim for this survey to serve as a basis upon which future MOOC design innovations can be conceived, designed, and tested. We see this as an important perspective to take, as many learning design studies provide design ideas but do not contain a robust empirical evaluation. We certainly do not intend to discount the value of observational or qualitative studies in this domain; rather, for the following analyses we are primarily concerned with results backed by tests of statistical significance, because these offer a more objective, quantitative measure of effectiveness.

2. Method

The driving question underpinning this literature survey is: Which active learning strategies for digital learning environments have been empirically evaluated, and how effective are they?

To begin the literature search we used John Hattie's Visible Learning: A Synthesis of Over 800 Meta-Analyses Relating to Achievement (Hattie, 2008) as a basis. It provides a comprehensive overview of findings in the domain of empirically tested learning strategies in traditional classroom environments.
As Hattie's work was published in 2008, we used that year as a natural starting point for our review, working forward to July 2017. This creates a sufficiently narrow scope (nine years: 2009-2017) and a temporally relevant time frame for the review (MOOCs went mainstream in 2012). We manually scanned all publications released in our selected venues in this period and determined for each whether it met our criteria: (1) the learning strategy being analyzed must be scalable — it must not require manual coding, manual feedback, physical presence, etc.; (2) the evidence must come from empirical analyses of randomized controlled experiments with a combined sample size of at least ten across all conditions; and (3) the subjects of the studies must be adult learners, i.e. at least 18 years old. We included the age criterion based on the profile of the typical MOOC learner — aged 25-35 according to Tomkin and Charlevoix (2014), which aligns with our own institution's data as well.

From Hattie's synthesis of meta-analyses we identified the 10 core learning strategies that best apply to open online education, selecting only from those which Hattie found to be effective. With these learning strategies fixed, we systematically reviewed all publications in five journals and eight conferences (listed in Table 1) that have displayed a regular interest in publishing work testing these categories of innovative online learning strategies. These venues were identified through an exploratory search of the literature: we began with a sample of studies we were already familiar with that fit the scope of the present review and perused the references of each to identify further potential venues, repeating this process for each identified study. The lead author also reached out to experts in the field to ensure that this method did not overlook any potential venues. The thirteen venues used for the final review are those which showed the most consistent interest in publishing studies that meet our criteria. We employed this method rather than a search/query-term method because our criteria (namely that of being a randomized controlled trial among adult populations) are not reliably gleanable from standard search engine indexing. We acknowledge that other journals and conference proceedings may have been applicable for this survey, but given our search criteria, we found these thirteen venues to be the most appropriate based on our initial exploratory search.

Of the 7,706 papers included in our search, we found 126 (1.6%) to meet our criteria.
Table 1: Overview of included venues. The most recent included issue from each publication is indicated in parentheses. Unless otherwise indicated with a †, the full proceedings from 2017 have been included.

Computers & Education (Vol. 114)
Journal of Learning Analytics (Vol. 4, No. 2)
Journal of Educational Data Mining (Vol. 8, No. 2)
The Open Education Journal
eLearning Papers (Issue 43)
IEEE Transactions on Learning Technologies (Vol. 10, Issue 1)
ACM Learning @ Scale (L@S)
Learning Analytics & Knowledge (LAK)
European Conference on Technology-Enhanced Learning (EC-TEL) †
International Conference on Educational Data Mining (EDM)
ACM Conference on Computer-Supported Cooperative Work (CSCW)
European MOOCs Stakeholders Summit (EMOOCs)
European Conference on Computer-Supported Cooperative Work (ECSCW)
Human Factors in Computing Systems (CHI)

The criterion requiring randomized controlled trials proved to be a strong filter, with many studies not reporting randomization or a baseline condition to compare against. Overall, these 126 papers report on experiments with a total of 132,428 study participants. We then classified each work into one of the ten learning strategy categories (listed in Table 2).

Figure 1 illustrates the number of studies that met our selection criteria, organized by year of publication. It shows the increasing frequency of such experiments in recent years, with the most notable increase from 2014 to 2015. We could propose any number of explanations for the decrease in studies from 2015 to 2016, but that would be pure speculation. When examining the studies themselves, however, we do notice a prominent trend with some explanatory power. With MOOC research emerging around 2013 and 2014, the experiments carried out in that window can now be viewed, in hindsight, as foundational. Interventions in this era included sending emails to learners (Kizilcec et al., 2014a) or dividing the course discussion forum and controlling instructor activity (Tomkin and Charlevoix, 2014). In 2016 and 2017, however, we begin to see an elevated level of complexity in interventions, such as the adaptive and personalized quiz question delivery system (Rosen et al., 2017) implemented and evaluated at scale in a MOOC.
Table 2: Overview of considered learning categories. The number of selected papers per category is shown in parentheses. The numbers sum to 131 rather than 126, as five papers apply to two categories.

Mastery Learning (1)
Meta-Cognitive Strategies (24)
Questioning (9)
Spaced vs. Massed Practice (1)
Matching Learning Styles (3)
Feedback (21)
Cooperative Learning (17)
Simulations & Gaming (18)
Programmed Instruction (6)
Interactive Multimedia Methods (31)

It is also worth noting that a number of journal issues and conference proceedings from 2017 had not yet been released at the time of this writing (indicated in Table 1).

Figure 2a shows the proportion of results (positive, null, or negative) with respect to the experimental environment employed by the selected articles/studies. Comparing MOOCs with native environments (those designed and implemented specifically for the study), we see native environments yielding positive results at a much higher rate than MOOCs (59% vs. 42%, respectively). We see two main factors contributing to this difference: (i) native environments can be modeled specifically for the experiment and the concepts being tested, whereas experiments done in MOOCs must adapt to the existing platforms, and (ii) no MOOC studies provide participants any incentive to participate, whereas this is common in experiments in native environments.

Figure 2c further visualizes this discrepancy by illustrating the proportion of positive, negative, and null results across three subject pool sizes: small-scale studies with between 10 and 100 participants, medium-sized studies with 101-500 participants, and large-scale studies with more than 500 study participants. We find a statistically significant difference in the proportion of reported positive findings between large (42% in studies with 500+ participants) and small (60% in studies with 10-100 participants) studies using a χ2 test (p < 0.05). As the focus of this study is on large-scale learning, we ran this analysis specifically to evaluate the impact that scale and, in turn, sample size and heterogeneity can have on an experiment.
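This kind of comparison is straightforward to reproduce; a minimal sketch in Python is shown below. The counts in the contingency table are hypothetical placeholders rather than the exact counts underlying Figure 2c, and the sketch is not the project's actual analysis script (the real data and scripts are available via the OSF registration mentioned below).

    # Chi-squared test of independence between study size and reported outcome
    # (positive vs. not positive). Counts are ILLUSTRATIVE placeholders, not the
    # exact figures behind Figure 2c.
    from scipy.stats import chi2_contingency

    #                  positive  not positive
    counts = [[30, 20],   # small studies (10-100 participants), ~60% positive
              [21, 29]]   # large studies (500+ participants),   ~42% positive

    chi2, p_value, dof, expected = chi2_contingency(counts)
    # Interpret p_value against the chosen significance level (0.05 in this study).
    print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")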
Figure 1: The number of papers by year and learning environment meeting our selection criteria. Each environment is defined in detail in Section 3.1. Best viewed in color.

We registered this project with the Center for Open Science (https://cos.io); the registration, which includes all data gathered as well as the scripts used for analysis and visualization, is available at https://osf.io/jy9n6/.

3. Terminology

We now define the terminology used in the reporting of our results. Not only is this explicit elucidation of terminology important for the clarity of this review, it can also serve as a reference for future experiments in this area to ensure consistency in how results are reported and replicated. In discussing each study, we refer to "learners", "students", or "participants" as the authors do in the referenced work.

3.1. Environment

The first dimension by which we categorize the studies is the environment in which the experiment/intervention took place. We distinguish between the following:

• Intelligent Tutoring System (ITS): Digital learning systems that monitor and adapt to a learner's behavior and knowledge state.
Figure 2: Reported results (y-axis) from papers meeting our selection criteria, partitioned by (a) environment, (b) incentive, and (c) number of study participants. The red - indicates studies reporting a significant negative effect of the intervention; the green + indicates a significant positive effect of the intervention; and the blue o indicates findings without a statistically significant effect. Best viewed in color.
• Laboratory Setting (Lab): Controlled, physical setting in which participants complete the experimental tasks.

• Learning Management System (LMS): Software application used to host & organize course materials for students to access online at any time.

• Mobile Phone Application (Mobile): Participants must download and use an application on their mobile phone to participate in the experiment.

• Massive Open Online Course (MOOC): Online course which offers educational materials free of cost and with open access to all.

• Amazon Mechanical Turk (Mturk): Online marketplace used to host low-cost micro-payment tasks for crowdsourcing and human computation. Participants are recruited and paid through Mturk and often redirected to an external application.

• Native: A piece of software designed and implemented specifically for the study.

Figure 1 shows the breakdown of our studies with respect to the environment. Note that despite the widespread availability of MOOC and LMS environments in 2015, native environments still dominated that year. We speculate that this may be because researchers find it more efficient to build their own environment from scratch rather than adapt their study to the limitations of a pre-existing platform — which is the case with all MOOC experiments included in this study: each intervention had to be designed within the confines of either the edX (www.edx.org) or Coursera (www.coursera.org) platforms. We also note a sudden spike in the popularity of Mturk studies from 2015 to 2016. While it is more expensive to carry out research with Mturk than with MOOCs (which provide no incentive or compensation), Mturk ensures a certain level of compliance and engagement from the subjects in that they are rewarded for their time with money.
3.2. Incentive

The second dimension we distinguish is the incentive participants in each study received for their participation:

• Monetary Reward ($): Participants receive either a cash reward or a gift certificate.

• Required as part of an existing class (Class): An instructor conducts an experiment in her own course where all enrolled students are participants.

• Class Credit (Credit): By participating in the study, participants receive course credit which can be applied to their university degree.

• None: Participants were not provided any incentive or compensation.

• n/r: Not reported.

3.3. Outcome Variables

As experiments on learning strategies can evaluate a multitude of outcomes, here we provide an overview of all learning outcomes reported in the included studies.

• Final Grade: the cumulative score over the span of the entire course, which includes all graded assignments.

• Completion Rate: the proportion of participants who earn the required final passing grade in the course.

• Learning Gain: the observed difference in knowledge between pre-treatment and post-treatment exams.

• Exam Score: different from the final grade metric in that this only considers learner performance on one particular assessment (typically the final exam).

• Long-Term Retention: measured by assessing a learner's knowledge of course materials longitudinally, not just during/immediately after the experiment.
• Learning Transfer: measuring a learner's ability to apply new knowledge in novel contexts beyond the classroom/study.

• Ontrackness: the extent to which a learner adheres to the designed learning path as intended by the instructor.

• Engagement: a number of studies measure forms of learner activity/behavior and fall under this category. Specific forms of engagement include:

– Forum Participation: measured by the frequency with which learners post to the course discussion forum (including posts and responses to others' posts).

– Video Engagement: the number of actions (pause, play, seek, speed change, toggle subtitles) a learner takes on a video component.

– Revision: the act of changing a previously-submitted response.

– Persistence/Coverage: the proportion of the total course content accessed. For example, a learner accessing 75 out of the 100 components of a course has 75% persistence.

• Self-Efficacy: a learner's self-perceived ability to accomplish a given task.

• Efficiency: the rate at which a learner progresses through the course. This is most commonly operationalized as the amount of material learned relative to the total time spent.

4. Review

In the following review we synthesize the findings and highlight particularly interesting aspects of certain experiments. Unless otherwise indicated, all results presented below come from intention-to-treat (ITT) analyses, meaning all participants enrolled in each experimental condition are considered without exception. Each category has a corresponding table detailing the total sample size ("N"), experimental environment ("Env."), incentive for participation ("Incentive"), and reported results ("Result"). In the Result column, statistically significant positive outcome variables as a result of the experimental treatment are indicated with a +; null findings where no significant differences were observed are indicated with a ◦; and negative findings where the treatment resulted in an adverse effect on the outcome variable are indicated with a -.
4.1. Mastery Learning

Teaching for mastery learning places an emphasis on learners gaining a full understanding of one topic before advancing to the next (Bloom, 1968). Given that students' understanding of new topics often relies upon a solid understanding of prerequisite topics, teaching for mastery learning only exposes students to new material once they have mastered all the preceding material, very much in line with constructivist thinking as outlined by Cunningham and Duffy (1996). In the traditional classroom, teaching for mastery learning presents a major challenge for teachers in that they must constantly monitor each individual student's progress towards mastery of a given topic — a nearly impossible task in a typical classroom with 30 students, never mind 30,000. However, with the growing capabilities of education technologies, individualized mastery learning pedagogy can now be offered to students at scale.

Although mastery learning is frequently found to be an effective teaching strategy in terms of student achievement, it often comes at the cost of time. This issue of time may explain why there is only one paper in this category. Mostafavi et al. (2015) implemented a data-driven knowledge tracing system to measure student knowledge and release content according to each student's calculated knowledge state. Students using this system were far more engaged than those using the default problem set or the version with on-demand hints. A strict implementation of mastery learning — as in Mostafavi et al. (2015), where learners in an ITS are required to demonstrate concept mastery before advancing in the system — would be useful to understand its effect on the heterogeneous MOOC learner population.

Table 3: Mastery Learning (+ : 1)

Ref. | N | Env. | Incentive | Result
Mostafavi et al. (2015) | 302 | ITS | Class | +Engagement

+ = positive effects
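To make the idea of a "calculated knowledge state" concrete, the sketch below shows Bayesian Knowledge Tracing, one widely used family of knowledge-tracing models, driving a simple mastery gate. It is a generic illustration under assumed parameter values and attempt data, not necessarily the model used by Mostafavi et al. (2015).

    # Minimal Bayesian Knowledge Tracing (BKT) sketch for mastery gating.
    # Parameter values and the attempt history are hypothetical, not fitted
    # to any real course data.
    P_INIT, P_TRANSIT, P_SLIP, P_GUESS = 0.2, 0.15, 0.1, 0.25
    MASTERY_THRESHOLD = 0.95  # release the next topic once mastery exceeds this

    def update_mastery(p_mastery, correct):
        """Update the probability that the learner has mastered the skill
        after observing one graded attempt."""
        if correct:
            posterior = (p_mastery * (1 - P_SLIP)) / (
                p_mastery * (1 - P_SLIP) + (1 - p_mastery) * P_GUESS)
        else:
            posterior = (p_mastery * P_SLIP) / (
                p_mastery * P_SLIP + (1 - p_mastery) * (1 - P_GUESS))
        # Account for learning that may occur between attempts.
        return posterior + (1 - posterior) * P_TRANSIT

    p = P_INIT
    for outcome in [True, False, True, True, True]:  # hypothetical attempt history
        p = update_mastery(p, outcome)
        print(f"estimated mastery: {p:.2f}, unlock next topic: {p >= MASTERY_THRESHOLD}")

In a MOOC, the same update could run on every graded attempt, with the next unit unlocked once the estimate crosses the chosen threshold.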
4.2. Metacognitive Strategies

Metacognitive behavior is defined by Hattie (2008) as "higher-order thinking which involves active control over the cognitive processes engaged in learning." Metacognition is an invaluable skill in MOOCs, where learners cannot depend on the watchful eye of a teacher to monitor their progress at all times. Instead, they must be highly self-directed and regulate their own time management and learning strategies to succeed. The papers in this category explore novel course designs and interventions that are intended to make learners more self-aware, reflective, and deliberate in the planning of (and adherence to) their learning goals.

Davis et al. (2016) conducted two experiments: in study "A" they provided learners with retrieval cue prompts after each week's lecture, and in study "B" they provided study planning support to prompt learners to plan and reflect on their learning habits. Overall, neither intervention had any effect on the learners in the experimental conditions, likely because the learners could ignore the prompts without penalty. However, when narrowing down to the very small sample of learners who engaged with the study planning module, the authors found significant increases in desirable learner behavior. Maass and Pavlik Jr (2016) also ran an experiment testing support for retrieval practice. They found that (i) retrieval prompts increase learning gain and (ii) the complexity of the retrieval prompt had a significant impact on the prompt's effect, with deeper prompts leading to better learning gains. In contrast, the retrieval prompts used by Davis et al. (2016) assessed shallow, surface-level knowledge, which could be a reason for the lack of a significant effect.

Even though the education psychology literature suggests that boosting learners' metacognitive strategies is highly effective for increasing learning outcomes (Hattie, 2008), 23 of the 38 results (61%) in this category report null or negative findings. Furthermore, in one of the few reports of a negative impact, Kizilcec et al. (2014a) found a certain form of participation encouragement (collectivist-framed prompting) to actually decrease learners' participation in the course discussion forum.

Nam and Zellner (2011) conducted a study evaluating the effect of framing a group learning activity in different ways. Compared to groups given a "group processing" frame of mind (where group members are asked to assess the contribution of each group member), groups given a "positive interdependence" frame of mind (where group members are reminded that boosting one's individual performance can have a great impact on the overall group achievement) achieved higher post-assessment scores.
In lieu of an actual learning platform, crowdworker platforms are also beginning to be used for learning research. One example is the study by Gadiraju and Dietze (2017), who evaluated the effect of achievement priming in information retrieval microtasks. While completing a crowdworker task aimed at teaching effective information retrieval techniques, the participants were also assessed on their learning through a test at the end of the task. When providing achievement primers (in the form of inspirational quotes) to these crowdworkers, the authors observed no significant difference in persistence or assessed learning. Given the ease with which these experiments can be deployed, more work should go into exploring whether findings from a crowdworker context reproduce in actual learning environments.

In summary, the current body of work in supporting learners' metacognitive awareness indicates how difficult it is to affect such a complex cognitive process, as more than half of the results reported in this category were non-significant. While some studies do indeed report positive results, the overall trend in this category is an indication that we have not yet mastered the design and implementation of successful metacognitive support interventions that can effectively operate at scale. Setting this category apart from others is the difficulty of measuring metacognition; compared to other approaches such as questioning (where both the prompt and the response are easily measurable), eliciting and measuring responses to metacognitive prompts is far more challenging.

4.3. Questioning

Hattie (2008) found questioning to be one of the most effective teaching strategies in his meta-analysis. Questioning is characterized by the posing of thoughtful questions that elicit critical thought, introspection, and new ways of thinking. The studies in this category explore new methods of prompting learners to retrieve and activate their prior knowledge in formative assessment contexts. Yang et al. (2016) evaluated the effectiveness of a two-tier questioning technique, described as "...a set of two-level multiple choice questions [in which the] first tier assesses students' descriptive or factual knowledge...while the second tier investigates the reasons for their choices made in the first tier." They found this questioning technique to be highly effective in their experiment, with learners in the two-tier condition achieving learning gains 0.5 standard deviations higher than learners receiving standard one-tier questions.
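As an illustration of the format, a two-tier item can be represented and scored as sketched below; the question content, option wording, and partial-credit scoring rule are invented for illustration and are not taken from Yang et al. (2016).

    # A minimal representation of a two-tier multiple-choice item: tier 1 assesses
    # factual knowledge, tier 2 probes the reasoning behind the tier-1 choice.
    # All content and the scoring scheme are hypothetical.
    two_tier_item = {
        "tier1": {
            "question": "Which chamber of the heart pumps blood to the body?",
            "options": ["Left ventricle", "Right ventricle", "Left atrium", "Right atrium"],
            "answer": 0,
        },
        "tier2": {
            "question": "Why did you choose that chamber?",
            "options": [
                "It feeds the systemic circulation",   # correct reasoning
                "It receives oxygen-poor blood",
                "It pumps blood to the lungs",
            ],
            "answer": 0,
        },
    }

    def score(item, tier1_choice, tier2_choice):
        """Full credit only when both the answer and the reasoning are correct."""
        tier1_ok = tier1_choice == item["tier1"]["answer"]
        tier2_ok = tier2_choice == item["tier2"]["answer"]
        return 1.0 if (tier1_ok and tier2_ok) else 0.5 if tier1_ok else 0.0

    print(score(two_tier_item, 0, 0))  # -> 1.0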
Table 4: Metacognitive Strategies (+ : 15 / ◦ : 20 / - : 3)

Ref. | N | Env. | Incentive | Result
Kizilcec et al. (2016) | 653 | MOOC | None | ◦Persistence, ◦Final Grade
Lang et al. (2015) | 950 | ITS | None | ◦Learning Gain, ◦Engagement
Lamb et al. (2015) | 4,777 | MOOC | None | +Forum Participation
Sonnenberg and Bannert (2015) | 70 | Native | $ | +Learning Gain
Dodge et al. (2015) | 882 | LMS | Class | ◦Final Grade
Tabuenca et al. (2015) | 60 | Native | $ | ◦Exam Score
Kizilcec et al. (2014a) | 11,429 | MOOC | None | ◦Forum Participation, -Forum Participation
Margulieux and Catrambone (2014) | 120 | Native | Credit | +Exam Score
Xiong et al. (2015) | 2,052 | Native | None | +Learning Gain, +Completion Rate
Noroozi et al. (2012) | 56 | Native | n/r | +Learning Gain
Davis et al. (2016)A | 9,836 | MOOC | None | ◦Final Grade, ◦Engagement, ◦Persistence
Davis et al. (2016)B | 1,963 | MOOC | None | ◦Final Grade, ◦Engagement, ◦Persistence
Maass and Pavlik Jr (2016) | 178 | Mturk | $ | +Learning Gain
Kizilcec et al. (2017a) | 1,973 | MOOC | None | +Final Grade, +Persistence, +Completion Rate
Yeomans and Reich (2017)A | 293 | MOOC | None | -Completion Rate
Yeomans and Reich (2017)B | 3,520 | MOOC | None | -Completion Rate, ◦Engagement
Gadiraju and Dietze (2017) | 340 | Mturk* | $ | ◦Final Grade, ◦Persistence
Kim et al. (2017) | 378 | Mturk | $ | +Final Grade
Hwang and Mamykina (2017) | 225 | Native | n/r | +Learning Gain
De Grez et al. (2009) | 73 | Native | Class | ◦Learning Gain
Nam and Zellner (2011) | 144 | Native | Class | ◦Engagement, +Final Grade
Huang et al. (2012) | 60 | Mobile | None | +Final Grade
Poos et al. (2017) | 80 | Lab | None | ◦Final Grade, ◦Learning Transfer
Gamage et al. (2017) | 87 | MOOC | n/r | +Learning Gain

+ = positive effects, ◦ = null results, - = negative effects
Instructional questioning was explored in the Mturk setting by Williams et al. (2016a), who compared the effectiveness of different wordings of questioning prompts. They found that prompts which directly ask the learner to explain why an answer is correct lead learners to revise their answers (to the correct one) more often than prompts asking for a general explanation of the answer.

de Koning et al. (2010) conducted a study in which half of the learners were cued to generate their own inferences through self-explaining and half were provided pre-written instructional explanations. In the context of a course about the human cardiovascular system, learners prompted to self-explain performed better on the final test, but showed no difference in persistence or learning transfer compared to the group given explanations.

Given its effectiveness and relative simplicity to implement, two-tier questioning should be further investigated in the MOOC setting to stimulate learners' critical thought beyond surface-level factual knowledge.

Related to the tactic of questioning is the learning strategy known as retrieval practice, or the testing effect, which is characterized by the process of reinforcing prior knowledge by actively and repeatedly recalling relevant information (Adesope et al., 2017). Recent work has found retrieval practice to be highly effective in promoting long-term knowledge retention (Adesope et al., 2017; Clark and Mayer, 2016; Roediger and Butler, 2011; Henry L Roediger and Karpicke, 2016; Lindsey et al., 2014; Karpicke and Roediger, 2008; Karpicke and Blunt, 2011). Accordingly, we recommend that future research interested in questioning tactics be designed to stimulate learners to engage in retrieval practice.

4.4. Spaced vs. Massed Practice

Hattie (2008) describes the difference between spaced learning (sometimes referred to as distributed practice) and massed practice as "the frequency of different opportunities rather than merely spending more time on task." In other words, distributing one's study sessions over a long period of time (e.g., 20 minutes per day for two weeks) is characteristic of high spacing, whereas studying in intense, concentrated sessions (one four-hour session) is characteristic of massed practice (Wulf and Shea, 2002). Historically, studies have found that the desired effect of spaced learning (long-term knowledge retention) occurs most commonly in tasks of low difficulty, and the effect decreases as the difficulty increases (Rohrer and Taylor, 2006).
Table 5: Questioning (+ : 7 / ◦ : 7)

Ref. | N | Env. | Incentive | Result
Yang et al. (2016) | 43 | Native | n/r | +Learning Gain
Thompson et al. (2015) | 43 | Native | Class | +Learning Gain
Williams et al. (2016a) | 659 | Mturk | $ | +Revision
Şendağ and Ferhan Odabaşı (2009) | 40 | Native | Class | ◦Learning Gain, ◦Final Grade
Chen (2010) | 84 | Native | Class | +Learning Gain
de Koning et al. (2010) | 76 | Native | Credit | +Final Grade, ◦Persistence, ◦Learning Transfer
Yang et al. (2015) | 79 | Native | Class | +Final Grade, ◦Engagement
Attali (2015) | 804 | Mturk | $ | +Learning Gain
Attali and van der Kleij (2017) | 2,445 | Native | None | ◦Persistence, ◦Final Grade

+ = positive effects, ◦ = null results

Dearman and Truong (2012) developed a mobile phone "Vocabulary Wallpaper" which aimed to implicitly teach learners new second-language vocabulary (through the learner's mobile phone background) in highly spaced microlearning sessions. Their findings show that, compared to learners receiving the lessons at less distributed rates, learners with highly spaced exposure showed a significant increase in the amount of second-language vocabulary learned.

As evidenced by the lone study in this category, it is difficult to design and implement experiments that effectively get learners to commit to high spacing (ideally enacted as a learned self-regulation skill). Even so, given its proven effectiveness elsewhere in the learning literature (Hattie, 2008), practitioners and researchers should tackle this design challenge by creating and evaluating environments that encourage spaced practice.
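One simple way a platform could operationalize such spacing is an expanding-interval reminder schedule, sketched below. The interval values are invented for illustration and are not taken from Dearman and Truong (2012) or any other study reviewed here.

    # A minimal sketch of an expanding-interval review scheduler, one simple way
    # to operationalize spaced practice. The interval sequence is hypothetical.
    from datetime import date, timedelta

    REVIEW_INTERVALS_DAYS = [1, 3, 7, 14, 30]  # illustrative expanding spacing

    def review_schedule(first_study_day: date):
        """Return the dates on which a learner should be prompted to revisit a topic."""
        return [first_study_day + timedelta(days=d) for d in REVIEW_INTERVALS_DAYS]

    for when in review_schedule(date(2017, 7, 1)):
        print(f"send review reminder on {when.isoformat()}")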
Table 6: Spaced vs. Massed Practice (+ : 1)

Ref. | N | Env. | Incentive | Result
Dearman and Truong (2012) | 15 | Mobile | $ | +Learning Gain

+ = positive effects

4.5. Matching Learning Styles

Brown et al. (2009) conducted an experiment testing the efficacy of "learning style-adapted e-learning environments." The study, in which students' self-proclaimed learning styles were either matched or unmatched, yielded no significant differences in learner achievement between the conditions. Consistent with the current literature on the topic (Kirschner and van Merriënboer, 2013; Kirschner, 2017), the authors found that adapting the courses to students' learning styles did not result in any significant benefit.

Soflano et al. (2015) employed a game-based learning environment to evaluate the impact of adapting instruction to learning styles in a computer programming learning context. The authors report that, compared to the groups using a non-adaptive version of the SQL language tutor software, the adaptive system yielded no difference in final grades (Soflano et al., 2015).

However, there does still exist some evidence in favor of this learning strategy. Chen et al. (2011) created an online learning environment in which the teaching strategy was adapted to each learner's individual thinking style. With three teaching strategies (constructive, guiding, or inductive) either matched or unmatched to three thinking styles (legislative, executive, or judicial, respectively), the authors found that the group whose thinking style was matched accordingly outperformed the group whose was not.

Rather than adapting to the single modality a learner prefers (such as being a "visual learner"), the literature on learning styles emphasizes that while one modality may be preferred by the learner (and can lead to positive experimental results in certain contexts), providing instruction in a variety of modalities will provide the greatest benefit overall (Kirschner and van Merriënboer, 2013).

4.6. Feedback

Hattie (2008) defines feedback as "information provided by an agent (e.g., teacher, peer, book, parent, or one's own experience) about aspects of one's performance or understanding." Strategically providing students with feedback offers them the chance to reflect and reassess their approach to a given situation. Feedback can best be thought of as a mirror for learners; it serves to encourage them to stop and mindfully evaluate their own behavior or learning processes — which are otherwise unconscious or unconsidered — and make them readily visible.
Table 7: Matching Learning Styles (+ : 2 / ◦ : 2)

Ref. | N | Env. | Incentive | Result
Brown et al. (2009) | 221 | Native | Class | ◦Exam Score
Chen et al. (2011) | 223 | Native | n/r | +Final Grade
Soflano et al. (2015) | 120 | Native | None | ◦Final Grade, +Efficiency

+ = positive effects, ◦ = null results

However, this act of mindfully evaluating and altering one's behavior should not be taken for granted. Self-regulating one's own learning processes (especially in response to feedback) is a skill which is highly correlated with and caused by prior education (Winters et al., 2008). Especially in the MOOC context, where learners come from many diverse backgrounds, it is imperative that feedback offered to the learner is adaptive and aligned to their ability to process, understand, and act upon it.

While Hattie (2008) finds feedback to be the most effective teaching strategy in his entire meta-analysis, we find very mixed results on its effectiveness in our selected studies. Of the 38 results reported within the 21 papers of this category, only 14 (37%) are positive findings.

Zooming in on two of the MOOC studies in this category, Coetzee et al. (2014) and Tomkin and Charlevoix (2014) evaluated the effectiveness of feedback in the context of the discussion forum. Coetzee et al. (2014) tested the effectiveness of implementing a reputation system in a MOOC discussion forum — the more you post to the forum, the more points you accumulate (for this reason the paper also applies to the Simulations & Gaming category). The authors found that providing this positive feedback did indeed lead learners to post more frequently in the forum, but this did not have any impact on their final course grade. Tomkin and Charlevoix (2014) ran an experiment in which learners were divided into one of two course discussion forums — in one forum the instructor actively provided individualized feedback to learners and engaged in discussion, and in the other no instructor feedback was provided. The authors report no differences in either completion rate or course engagement between the two conditions.
To address the challenge of providing in-depth feedback on students' learning in a coding context, Wiese et al. (2017) tested the effectiveness of a code style tutor which offered adaptive, real-time feedback and hints to students learning to code. Compared to a control group receiving a simplified feedback system consisting of a single unified score assessing the code, students who used the adaptive feedback system did not show any difference in the extent to which they improved their coding style (Wiese et al., 2017).

Bian et al. (2016) developed and evaluated an animated pedagogical agent which was able to provide different types of emotional feedback to participants in a simulated environment. They found that positive emotional feedback (expressing happiness and encouragement in response to desirable behavior) led to significantly higher test scores than negative feedback (where the agent expressed anger and impatience in response to undesirable behavior). Also taking place in a simulated environment, the experiment carried out by Sedrakyan et al. (2014) evaluated the effectiveness of a feedback-enabled simulation learning environment. Students in the control condition, who did not have access to the simulation, performed more poorly on a final assessment than those in the intervention group who interacted with the feedback-enabled simulation environment.

While navigational feedback (support for learners in optimizing their learning path through a course), like that introduced by Borek et al. (2009), is common in ITSs to help learners through problems, the challenge now is to provide personalized feedback at scale on other factors such as learner behavior patterns. This way, feedback can be used as a mechanism to make learners more aware of their learning habits and tendencies and, in turn, better at self-regulating. However, with only 37% of the results reported in this category being positive, simply providing feedback appears insufficient for promoting positive learning outcomes. These results indicate that, even though we have now developed the technology to deliver feedback at scale, attention must shift towards understanding the nuance of what type of feedback (and with what frequency) will help the learner in a given context or state.
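As a hypothetical illustration of what such behavior-oriented feedback at scale could look like, the sketch below turns simple weekly clickstream counts into a short feedback message. The event names, thresholds, and wording are invented and do not describe any system evaluated above.

    # A minimal sketch of scalable behavioral feedback: summarize a learner's
    # weekly activity from clickstream counts and generate a short message.
    # Thresholds and message wording are hypothetical.
    def behavior_feedback(events_this_week: dict, planned_videos_per_week: int = 5) -> str:
        videos = events_this_week.get("video_views", 0)
        quizzes = events_this_week.get("quiz_submissions", 0)
        posts = events_this_week.get("forum_posts", 0)
        messages = []
        if videos < planned_videos_per_week:
            messages.append(f"You watched {videos} of this week's {planned_videos_per_week} lectures.")
        if quizzes == 0:
            messages.append("You have not attempted a practice quiz yet; short quizzes help retention.")
        if posts == 0:
            messages.append("Consider posting a question or answer in the forum this week.")
        return " ".join(messages) or "You are on track with this week's material."

    print(behavior_feedback({"video_views": 2, "quiz_submissions": 0, "forum_posts": 1}))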
Table 8: Feedback (+ : 14 / ◦ : 22 / - : 2)

Ref. | N | Env. | Incentive | Result
Kulkarni et al. (2015a) | 104 | Native | None | +Exam Score, ◦Revision
Williams et al. (2016b) | 524 | Mturk | $ | +Exam Score, ◦Self-Efficacy
Eagle and Barnes (2013) | 203 | Native | Class | +Persistence
Kardan and Conati (2015) | 38 | Native | n/r | +Learning Gain, ◦Exam Scores, ◦Efficiency
Fossati et al. (2009) | 120 | Native | Class | +Learning Gain
Borek et al. (2009) | 87 | Native | Class | ◦Exam Score, ◦Learning Transfer, +Learning Gain
Coetzee et al. (2014) | 1,101 | MOOC | None | ◦Final Grade, ◦Persistence, +Forum Participation
Tomkin and Charlevoix (2014) | 20,474 | MOOC | None | ◦Completion Rate, ◦Engagement
Beheshitha et al. (2016) | 169 | Class | None | ◦Forum Participation
Rafferty et al. (2016) | 200 | Mturk | $ | ◦Learning Gain
Bian et al. (2016) | 56 | Lab | None | +Self-Efficacy, +Exam Score
Nguyen et al. (2017) | 205 | Mturk | $ | +Final Grade, -Revision
Wiese et al. (2017) | 103 | Native | Class | ◦Learning Gain, ◦Revision
Davis et al. (2017) | 33,726 | MOOC | None | +Completion Rate, ◦Engagement
Mitrovic et al. (2013) | 41 | Native | Class | ◦Learning Gain, ◦Engagement, ◦Final Grade, +Efficiency
Corbalan et al. (2010) | 34 | Native | n/r | ◦Engagement, ◦Final Grade
González et al. (2010) | 121 | Native | Class | +Final Grade
van der Kleij et al. (2012) | 152 | Native | Class | ◦Final Grade
Erhel and Jamet (2013) | 41 | Lab | n/r | ◦Final Grade
Christy and Fox (2014) | 80 | Lab | Credit | -Final Grade
Sedrakyan et al. (2014) | 66 | Native | n/r | +Final Grade

+ = positive effects, ◦ = null results, - = negative effects
4.7. Cooperative Learning

Interventions targeting cooperative learning explore methods that enable learners to help and support each other in understanding the learning material. Cooperative learning is one of the major opportunity spaces in MOOCs, given their unprecedented scale and learner diversity, as evidenced by the prevalence of reported positive findings in this category (71%). The studies in this category develop and test solutions which try to find new ways to bring learners together, no matter where they are in the world, to complete a common goal.

One successful example of this is the study by Kulkarni et al. (2015b), where MOOC learners were divided into small groups (between 2 and 9 learners per group) and allowed to have discussions using real-time video calls over the internet. Each group was given prompts encouraging the learners both to discuss course materials and to share general reflections on the course experience. The authors found that learners in groups with a larger diversity of nationalities performed significantly better on the course final exam than learners in groups with low diversity. This result shows promise that the scale and diversity of the MOOC learner population can actually bring something novel to the table, given learners' apparent interest in cultural diversity.

On a similar note, Zheng et al. (2014) developed an algorithm which aimed to divide MOOC learners into small groups in a more effective fashion than randomization. This model took into consideration the following factors: collaboration preferences (local, email, Facebook, Google+, or Skype), gender, time zone, personality type, learning goal, and language. The authors found this algorithmic sorting of students into groups to have no effect on overall engagement, persistence, or final grade. Whereas this algorithm grouped largely for similarity (for example, grouping learners in the same time zone together), the study presented by Kulkarni et al. (2015b) suggests that diversity may be a better approach to automated group formation.
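A minimal sketch of a diversity-oriented alternative is shown below: learners are sorted by an attribute and dealt out round-robin so that each group mixes attribute values. This is an illustrative heuristic only; it is neither the algorithm of Zheng et al. (2014) nor the grouping procedure of Kulkarni et al. (2015b), and the roster data are invented.

    # A simple diversity-oriented group formation heuristic: sort learners by an
    # attribute, then deal them out round-robin so identical values are spread
    # across groups. Roster data are hypothetical.
    from itertools import cycle

    def form_groups(learners, attribute, group_count):
        """learners: list of dicts; attribute: key whose values should be spread."""
        groups = [[] for _ in range(group_count)]
        ordered = sorted(learners, key=lambda l: l[attribute])
        for group, learner in zip(cycle(groups), ordered):
            group.append(learner)
        return groups

    learners = [{"id": i, "nationality": nat}
                for i, nat in enumerate(["NL", "NL", "IN", "IN", "BR", "US", "US", "CN"])]
    for g in form_groups(learners, "nationality", group_count=2):
        print([l["nationality"] for l in g])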
There are also possibilities for cooperative learning in which the learners do not meet face-to-face. In this light, Bhatnagar et al. (2015) evaluated a cooperative learning system which crowd-sourced learner explanations. After answering an assessment question, learners were prompted to give an explanation/justification. These explanations were then accumulated and shared with their peers; the authors found that providing learners with the explanations of their peers increased the likelihood of a learner revising their answer to the correct one. In this scenario, the prompting for explanations not only serves as a reflective activity for the individual learner, it also leverages the social aspect by allowing him or her to contribute to the larger course community and potentially help a peer in need.

Cho and Lee (2013) investigated the effectiveness of co-explanation (where learners are instructed to collaboratively explain worked examples) as compared to self-explanation (where learners work alone) in the context of a Design Principles course. The authors found that learners in the co-explanation condition were not as engaged with the assessment questions (identifying a design's strengths and weaknesses) as their counterparts in the self-explanation condition.

In the domain of peer review, Papadopoulos et al. (2012) compared "free-selection" peer review (where students could freely choose which of their peers' work to review) against an "assigned-pair" design (where the peer review pairings are assigned by the instructor) in the context of a computer networking course. The authors found that students in the free-selection group achieved greater learning outcomes and provided better reviews than those in the assigned-pair group (Papadopoulos et al., 2012).

Labarthe et al. (2016) developed a recommender system within a MOOC to provide each learner with a list of peers in the same course with whom they would likely work well, based on profile similarity modeling. Compared to the control group with no recommendations, the experimental group (receiving the list of peer recommendations) displayed significantly improved persistence, completion rate, and engagement.

Given the consistency of positive results in this category (71% — the highest of any category), the above studies should be used as building blocks or inspiration for future work on finding new ways to bring learners together and increase their sense of community and belonging in the digital learning environment. Advances made in this vein would work towards harnessing the true power of large-scale open learning environments, where learners learn not only from the instructor but from each other as well, through meaningful interactions throughout the learning experience (Siemens, 2005).

4.8. Simulations & Gaming

Hattie (2008) categorizes simulations and games together and defines them as a simplified model of social or physical reality in which learners compete against either each other or themselves to attain certain outcomes. He also notes the subtle difference between simulations and gaming in that simulations are not always competitive. The studies in this category are carried out predominantly in native environments. While this is understandable, given that the games could have been developed for purposes other than experimentation, it raises potential issues with an eye towards reproducibility. However, considering that 19 of the 28 reported results (68%) in this category pertain to desirable benefits in learner achievement or behavior, this also indicates a very strong trend towards the generalizable effectiveness of using simulations and gamification to help learners.
Table 9: Cooperative Learning (+ : 17 / ◦ : 6 / - : 1)

Ref. | N | Env. | Incentive | Result
Bhatnagar et al. (2015) | 144 | Native | Class | +Revision
Lan et al. (2011) | 54 | Native | Class | +Learning Gain
Tritz et al. (2014) | 396 | Native | Class | +Final Grade
Ngoon et al. (2016) | 75 | Native | $ | +Learning Gain
Kulkarni et al. (2015a) | 104 | Native | None | +Exam Score, ◦Revision
Kulkarni et al. (2015b) | 2,422 | MOOC | None | +Exam Score
Cambre et al. (2014) | 2,474 | MOOC | None | +Exam Score
Culbertson et al. (2016) | 42 | Native | $/Credit | +Learning Gain
Coetzee et al. (2015) | 1,334 | Mturk | $ | +Exam Scores
Zheng et al. (2014) | 1,730 | MOOC | None | ◦Engagement, ◦Persistence, ◦Final Grade
Khandaker and Soh (2010) | 145 | Native | Class | ◦Exam Score
Konert et al. (2016) | 396 | Class | None | +Engagement, +Persistence
Labarthe et al. (2016) | 8,673 | MOOC | None | +Persistence, +Completion Rate, +Engagement
AbuSeileek (2012) | 216 | Lab | n/r | +Final Grade
Papadopoulos et al. (2012) | 54 | Native | Class | +Final Grade
Chang et al. (2013) | 27 | LMS | Class | +Final Grade
Cho and Lee (2013) | 120 | Class | None | -Engagement, ◦Learning Gain

+ = positive effects, ◦ = null results, - = negative effects

While each game or simulation is unique in its own right, the underpinning theme in all of these studies is as follows: the learner earns and accumulates rewards by exhibiting desirable behavior as defined by the instructor/designer. While creating native educational games or gamifying existing learning environments (especially MOOCs, as in Coetzee et al. (2014)) is a complex, time-consuming process, based on the predominantly positive findings in the literature we conclude that it is an area with high potential for boosting learning performance.
Cutumisu and Schwartz (2016) ran a study evaluating the effect of choosing versus receiving feedback in a game-based assignment. Compared to the group which passively received feedback, the group which was forced to actively retrieve feedback was more engaged with the environment, but showed no difference in learning or revision behavior. Note that this study is not in the Feedback category, as both cohorts received the same feedback; the element being tested in the experiment was the manner in which it was delivered within the simulated environment.

Cagiltay et al. (2015) employed a serious game design and placed students in either a competitive environment (showing a scoreboard and ranking of peer performance) or a non-competitive one. The authors found that the competitive environment led to significantly higher test scores and more time spent answering questions. On a similar note, Attali and Arieli-Attali (2015) evaluated the effect of implementing a points system within a computer-based mathematics learning environment. Although participants in the conditions with the points system answered questions faster, there was no effect on the accuracy of their responses.

Hooshyar et al. (2016) created a formative assessment game for a computer programming learning task. The authors found that participants in a traditional, non-computer-based environment performed worse on problem-solving tasks than those who received the computer-based formative assessment system. Barr (2017) also compared a game-based learning environment to a non-computer-based experience. In their experiment, the participants in the game-based learning condition displayed better scores on a post-test.

With 68% of the reported results being positive findings — the second highest among all categories — we see great potential for the effectiveness of learning experiences where learners are afforded the ability to interact with and explore simulated environments. Due to the substantial cost of developing such environments, future research is needed to evaluate whether this trend of positive findings continues, so that institutions can be confident in justifying their investment in these instructional strategies.

4.9. Programmed Instruction

According to Hattie (2008), programmed instruction is a method of presenting new subject matter to students in a graded sequence of controlled steps. Its main purposes are to (i) manage learning under controlled conditions and (ii) promote learning at the pace of the individual learner.
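One way to picture such a graded sequence of controlled steps is a prerequisite-gated recommender over per-topic mastery estimates, sketched below. The topic names, prerequisite structure, and mastery threshold are hypothetical and do not describe any specific system reviewed in this section.

    # A minimal sketch of adaptive, programmed-instruction-style sequencing:
    # recommend the unmastered topic whose prerequisites are already mastered.
    # Topics, prerequisites, and the threshold are hypothetical.
    MASTERY_THRESHOLD = 0.8

    course_graph = {               # topic -> prerequisites
        "variables": [],
        "loops": ["variables"],
        "functions": ["variables"],
        "recursion": ["functions"],
    }

    def next_activity(mastery):
        """Return the next topic to recommend, or None if everything is mastered."""
        candidates = [
            topic for topic, prereqs in course_graph.items()
            if mastery.get(topic, 0.0) < MASTERY_THRESHOLD
            and all(mastery.get(p, 0.0) >= MASTERY_THRESHOLD for p in prereqs)
        ]
        # Recommend the candidate the learner is currently weakest on.
        return min(candidates, key=lambda t: mastery.get(t, 0.0)) if candidates else None

    print(next_activity({"variables": 0.9, "loops": 0.4, "functions": 0.7}))  # -> "loops"

In practice, the mastery estimates would be updated in real time from the learner's logged behavior, which is precisely the kind of tracking infrastructure discussed below.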
Table 10: Simulations & Gaming (+ : 19 / ◦ : 9)

Ref. | N | Env. | Incentive | Result
Bumbacher et al. (2015) | 36 | Native | n/r | +Learning Gain
Cox et al. (2012) | 41 | Native | $ | +Engagement
Culbertson et al. (2016) | 42 | Native | $/Credit | +Engagement, +Exam Score
Ibanez et al. (2014) | 22 | Native | Class | +Exam Score
Krause et al. (2015) | 206 | Native | n/r | +Persistence, +Exam Score
Li et al. (2014) | 24 | Native | n/r | ◦Efficiency, +Learning Transfer
Schneider et al. (2011) | 82 | Native | Class | +Learning Gain, +Engagement
Cutumisu and Schwartz (2016) | 264 | Mturk | $ | ◦Engagement, ◦Learning Gain
Coetzee et al. (2014) | 1,101 | MOOC | None | ◦Final Grade, ◦Persistence, +Forum Participation
Cheng et al. (2017) | 68 | Native | n/r | ◦Learning Gain
Pozo-Barajas et al. (2013) | 194 | Native | n/r | +Final Grade
Smith et al. (2013) | 57 | Native | Class | +Learning Gain
Brom et al. (2014) | 75 | Lab | $ or Credit | ◦Final Grade, +Engagement
Attali and Arieli-Attali (2015) | 1,218 | Mturk | $ | ◦Final Grade, ◦Efficiency
Cagiltay et al. (2015) | 142 | Native | n/r | +Final Grade
Hooshyar et al. (2016) | 52 | Native | Class | +Learning Gain
Nebel et al. (2017) | 103 | Native | $ or Credit | +Final Grade, +Efficiency
Barr (2017) | 72 | Lab | n/r | +Final Grade

+ = positive effects, ◦ = null results

Programmed instruction is inherently adaptive: it presents material to the learner according to that learner's unique set of previous actions. As they stand now, MOOCs are simply online course content resources that remain static irrespective of a learner's behavior. Unlike the native and lab environments used in Brinton et al. (2015) and Karakostas and Demetriadis (2011), current MOOC technology does not yet account for a learner's past behavior when delivering personalized content. By developing and implementing these types of systems, MOOCs could become more adaptive and able to tailor instruction to the individual learner. Enabling this would require a real-time tracking system in which learner behavior could be collected, modeled and analyzed, and then acted upon (e.g., by delivering a personalized recommendation for a next activity or resource) in real time and at large scale.

Table 11: Programmed Instruction (+ : 4 / ◦ : 7)

Ref. | N | Env. | Incentive | Result
Brinton et al. (2015) | 43 | Native | None | +Engagement
Karakostas and Demetriadis (2011) | 76 | Lab | n/r | +Learning Gain
Rosen et al. (2017) | 562 | MOOC | None | +Learning Gain, +Persistence, ◦Final Grade
Zhou et al. (2017) | 153 | ITS | Class | ◦Final Grade, ◦Engagement
Arawjo et al. (2017) | 24 | Native | n/r | ◦Engagement, ◦Completion Rate, ◦Learning Transfer
van Gog (2011) | 32 | Lab | n/r | ◦Final Grade

+ = positive effects, ◦ = null results

4.10. Interactive Multimedia Methods

As lecture videos are currently the backbone of MOOC instructional content, it is imperative that they effectively impart knowledge to learners in an engaging, understandable fashion. Interactive multimedia methods, though not limited to video, test various methods of content delivery through multimedia application interfaces; with 64% of reported results being positive, this is also among the most effective strategy categories.

Kizilcec et al. (2014b), for example, compared lecture videos which included a small overlay of the instructor's face talking with the same lecture videos without the overlay. Results show that while learners preferred the videos showing the instructor's face and perceived them as more educational, there were no significant differences in the groups' exam scores.

A more interactive approach to lecture videos was explored by Monserrat et al. (2014), who integrated several interactive components (quiz, annotation, and discussion activities) directly within a video player.
Compared to a baseline interface, which separates the videos and assessments, the integrated interface was favored by learners and enabled them to learn more content in a shorter period of time.

Looking beyond video delivery methods, Kwon and Lee (2016) compared four delivery methods for a tutorial on the topic of data visualization. The four conditions were: (i) a text-only baseline, (ii) the baseline plus static images, (iii) a video tutorial, and (iv) an interactive tutorial in which learners worked with a Web interface to manipulate and create their own data visualizations. The authors found that learners with the interactive tutorial performed better on the exam and did so while spending less overall time in the platform — an indication of increased efficiency.

Aldera and Mohsen (2013) evaluated the effectiveness of captioned animation with keyword annotation (a note explaining the meaning of a given word) in multimedia listening activities for language learning. Compared to participants who received either just animations or animations with captions, participants in the captioned-animation-with-keyword-annotation condition performed significantly better on recognition and vocabulary tests. However, the participants receiving just the animations significantly outperformed the other conditions on listening comprehension and recall over time.

Martin and Ertzberger (2013) deployed a "here and now" learning strategy (where learners have 24/7 access to learning activities on their mobile phones) to compare its effectiveness against computer-based instruction. While participants in the "here and now" conditions expressed more positive attitudes towards the learning experience after the experiment, the computer-based learning cohort earned higher scores on a post-test.

Limperos et al. (2015) tested the impact of modality (text vs. audio+text) on learning outcomes. They found that the multimodal format (audio+text) led to better learning outcomes than receiving text alone. Pastore (2012) ran a study to examine the effect of compressing the time of instruction (decreasing the time to train on/learn the materials) on learning. They found that decreasing (accelerating) the time by 25% leads to similar learning outcomes, whereas decreasing it by 50% causes a decrease in learning.

Pursuing new research in this category is important going forward in trying to truly leverage the Web for all of its learning affordances. The possibilities for digital interfaces, sensors, and devices are expanding rapidly, and more immersive, interactive, and intelligent environments promise to make a significant impact on online learning environments in the future. Even before these exciting technologies have become widely explored, we still