Monday, September 15, 2014

Comments on "The Relative Effectiveness of Human Tutoring, Intelligent Tutoring Systems, and Other Tutoring Systems."

First of all, this article is exhilarating. It is clearly written and well defended.  Conceptually, it addresses the core assumption in the adaptive learning, personalization, and intelligent tutoring systems research areas: that human tutoring is the gold standard of tutoring. This assumption became codified in research in the late 70s and early 80s.  Educational research and the majority of educational technology innovation since then have been related to or influenced by this assumption.

Overall, I agree with the observation that the 2 sigma improvement in learning gains from tutoring found in Bloom's 1984 research cannot be generalized as much as originally thought.  VanLehn still classifies the research as exemplary and invites researchers to continue to seek improvements to learning, but suggests that Bloom's result was an outlier.  The major limiting factor (in the case of Anania's third experiment) was a 90% minimum achievement level for mastery learning in the tutored group, compared to an 80% minimum achievement level in the non-tutored group.
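
As an aside for readers unfamiliar with the "sigma" shorthand: the effect size being discussed is a standardized mean difference (this is the standard definition, not a detail drawn from my summary above). Bloom's claim amounts to

    effect size = (mean of tutored group − mean of conventionally taught group) / standard deviation of the conventionally taught group ≈ 2

In other words, the average tutored student scored about two standard deviations above the average student taught with conventional classroom instruction.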

Specifically, I am impressed with the categorization used to distinguish between substep-based tutoring, step-based tutoring, answer-based tutoring, and no tutoring.  This categorization correctly does not distinguish between different kinds of interaction media (text-based, voice, audio).  The hypothesis being tested was that a finer-grained level of interaction would yield greater learning gains, with human tutoring at the most fine-grained end of the spectrum (because there is no limit on how detailed a human tutor's feedback can be) and answer-based tutoring at the opposite end.  This meta-analysis found that step-based tutoring, substep-based tutoring, and human tutoring had about the same effect on learners, and all were better than answer-based tutoring, which in turn was better than no tutoring.  The first counterintuitive result, then, is that substep-based tutoring does not yield significantly better learning than step-based tutoring.  The second is that step-based (and substep-based) tutoring produced results very similar to human tutoring; in other words, human tutoring has not outperformed step-based or substep-based tutoring.

My Takeaways
What are the research steps moving forward?
Could learning gains from substep-based tutoring be explored in "ill-defined tasks (e.g., design tasks where the outcome variable is novelty or creativity)" (VanLehn, 2011)?
What other outcome variables could be used to compare tutoring approaches?

Before I read this article, I was going to write about the value of capturing inner-loop (or step-based) student data to evaluate students' understanding of knowledge components.  More specifically, I have believed that improving diagnosis would improve the intervention.  Let's look more carefully at this notion as described by VanLehn.  VanLehn observes that human tutors do not know how to assess student learning, and do not know what to do with assessments of students when they are given to them.  Tutors do, however, infer which problems the student gets right and can use that in their tutoring.  So he has described why human tutors are not as good at diagnosing as we usually think, but he has not built the case that an ITS cannot become better at diagnosing student misconceptions.
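
To make concrete what I mean by using inner-loop data for diagnosis, here is a minimal sketch (in Python) of Bayesian Knowledge Tracing, a model from Corbett and Anderson that VanLehn's article does not discuss; I include it only as an illustration, and the parameter values are made up for the example.

    def bkt_update(p_mastery, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
        """Update the estimated probability that a knowledge component is mastered
        after observing one step. Parameter values are illustrative, not fitted."""
        if correct:
            # How likely was a correct step to reflect real mastery rather than a guess?
            known = p_mastery * (1 - p_slip)
            posterior = known / (known + (1 - p_mastery) * p_guess)
        else:
            # How likely was an incorrect step to be a slip rather than missing knowledge?
            known = p_mastery * p_slip
            posterior = known / (known + (1 - p_mastery) * (1 - p_guess))
        # Account for learning from this practice opportunity.
        return posterior + (1 - posterior) * p_learn

    # Example: start at 30% estimated mastery, then observe right, right, wrong, right.
    p = 0.3
    for step_correct in [True, True, False, True]:
        p = bkt_update(p, step_correct)
        print(round(p, 3))

The point is not this particular model; it is that step-level (inner-loop) data gives a diagnosis something to update on, which answer-level data alone does not.
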
The article introduces a connection between student behavior and tutoring-system architecture that categorizes learning better than previous frameworks did.  This makes intuitive sense.  Learning is not solely dependent on the design of instruction, nor is it solely dependent on the behavior of the learner.  It makes sense that both inputs would affect learning outcomes.  So, I've learned that my architecture-based model for improving learning is at most half the battle.  Yet I'm hopeful, because I've found recent, valid research that I can extend using Christensen's theory-building approach.

What is it about my research data or circumstances that can identify anomalies in VanLehn's framework (in order to improve it)?


Another interesting question is: why do we continue to evaluate learning solely on learning gains?  If there is one thing I have learned from education, it is that there is more to education than grades.
