Tuesday, June 11, 2019

When AI gets used to "teach to the test"

Artificial intelligence (AI) is changing the field of technology-enhanced learning - slowly but surely. Whether it is for the better, we do not know yet.

Intelligent Tutoring Systems (ITS) have been studied since the late 1970s and 80s, but lately AI has brought some fresh air into the stuffiness of it all. (Disclaimer: I personally think that if ITS didn't take off in its first 30 years, it will not do so now.) With better machine learning algorithms and more computing power, people are asking: how could new technologies, including AI, be used to transform education systems to meet the requirements of the future, instead of just fixing problems created by aging education systems? In the latter scenario, technologies are deployed to fix existing problems, but what often happens is that they simultaneously consolidate and prop up the existing aging context and structure. We build an information system that serves the setup and problems of yesterday - and it becomes one of those legacy systems that keep the future as the present is.

So questions about AI helping kids, for example, to better manage their day-to-day emotional growth, to build complex competences such as citizenship (including resilience and conflict resolution), and to understand how they learn (competence: learning to learn) are "spot on" questions to ask. We are slowly starting to see people engage in discussion about such use cases for AI in education (see report).

However, the old problems still persist. What we are seeing is that the new goodies of AI technologies are successfully being applied to "teaching to the test". A prime example is the collaboration of Khan Academy with the SAT. The SAT, administered by the College Board, is one of the biggest American standardized tests used for university admission. The test literally puts people in a ranking order from best to worst so that the best universities can pick the students they want. A plethora of problems exist in such a setup (e.g. kids' socioeconomic background plays a huge role); one particularly problematic point is that universities do not see a correlation between SAT results and students' later success in their studies. So people say that the test does not even effectively test what it is supposed to test. There is a lot of writing about the topic on the internet; choose your side.

Anyway, back to the point. Bring in Khan Academy and AI. Kids can now take a practice SAT test in Math, for example, and bring their results to Khan Academy, so that the system can analyse (diagnose) where the mistakes were made and offer content to teach and revise those areas. Kids who do so do much better on the SAT. In other words, the system does just what the good old ITS ideology promotes: it essentially says that each kid needs to learn the same material (the content they are tested on) but can choose a personalised (I prefer customised or differentiated) way to go through it. The latter also comes with a lot of fluffy words about how each kid is an individual and how they should all be taught in a personalised way.
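
To make the diagnose-and-revise loop a bit more concrete, here is a minimal sketch of the general idea. It is not how Khan Academy actually does it (I have no idea about their internals); the function names, the 0.7 accuracy threshold, the skill labels and the little content catalogue are all made up for illustration.

```python
from collections import defaultdict


def diagnose(answers, min_accuracy=0.7):
    """Return skills whose accuracy is below min_accuracy, weakest first.

    answers: iterable of (skill, is_correct) pairs from a practice test.
    The 0.7 default is an arbitrary, hypothetical threshold.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for skill, is_correct in answers:
        totals[skill] += 1
        if is_correct:
            correct[skill] += 1
    accuracy = {skill: correct[skill] / totals[skill] for skill in totals}
    weak = [skill for skill in accuracy if accuracy[skill] < min_accuracy]
    return sorted(weak, key=lambda skill: accuracy[skill])


def recommend(weak_skills, catalogue):
    """Map each weak skill to the revision material listed for it, if any."""
    return {skill: catalogue.get(skill, []) for skill in weak_skills}


# Hypothetical practice-test results: (skill, answered correctly?)
practice_test = [
    ("linear equations", True), ("linear equations", True),
    ("geometry", False), ("geometry", True),
    ("statistics", False), ("statistics", False),
]

# Hypothetical catalogue of revision content per skill
catalogue = {
    "geometry": ["angles refresher"],
    "statistics": ["mean and median drills"],
}

weak = diagnose(practice_test)
plan = recommend(weak, catalogue)
print(plan)
```

Note what the sketch makes obvious: every kid gets a different path, but the set of skills is fixed by the test. That is the "customised" rather than "personalised" part.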

Basically, what is said is that "one size" teaching, as we know it from the classroom (= teacher-led, where most students advance at the same pace), is not OK, but it is OK to teach that very same "one size" content to kids if they can pick and choose their own learning pathway or if the system assists them in choosing it. So far I am totally down with it: it's a fine approach that can help some students and free up teachers' time for other tasks.

What I am not OK with is that all this is done with a test as the motivator and the major goal in mind. Especially when that test is not even that justified in the first place! (Some universities in the US apparently allow candidates to submit alternative test scores, for example from their past semester.) So the tutoring system that Khan Academy makes available to everyone free of charge (a noble aim) is actually used to consolidate and justify the existence of standardised tests such as the SAT as a determinant of university admission. This is very sad: AI is now really effectively used for "teaching to the test".

What prompted me to think about this is also that Khan Academy is not the only place where this is seen. Someone I know was just applying to a university in Finland, and the new university admission system apparently allows a similar kind of prepping for the content, using some kind of diagnostics to offer you study material to better succeed in the admission test. This as such is nothing new (paid prep courses have always existed, even in Finland!), so maybe technologies are used here to democratise the system. It is just a bit sad, though, as these kinds of tutoring systems seem to be the low-hanging fruit of AI in education.

I wish we would not stop here, but would push the use of AI even further to empower learners and their agency.

PS. An idea: judge or evaluate the purpose of an AI-based application in education using the scale of the "ethical continuum". The scale, from 1 (most ethical) to 7 (least ethical), is presented below.

Just a guess: Khan Academy and the SAT would fall somewhere around 5, so not yet clearly unethical, but at the far end of the boundary zone!! (This is really a guess: I've never taken the SAT, so I don't even know if the format is really the same, but you could easily think it is.)

"A 1989 study on teaching to the test evaluated the ethical "continuum" of the practice. It identified seven practice points, ranging from most to least ethical:
  1. General instruction on local objectives
  2. Instruction on general test-taking skills
  3. Instruction on objectives generally measured by standardized tests
  4. Instruction on objectives specific to the test used
  5. Instruction on objectives specific to the test used and using the same format
  6. Instruction using a released test or a "clone" test that replicates the format and content of the test used
  7. Instruction using the test to be used, either before or during test administration
The study concluded that the ethical boundary fell between points three and five, with points one and two being ethical and points six and seven being unethical." From the Wikipedia article Teaching to the test, citing Mehrens, W.A.; Kaminski, J. (1989). "Methods for Improving Standardized Test Scores: Fruitful, Fruitless or Fraudulent?". Educational Measurement: Issues and Practice 8 (1): 14–22.
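
If you wanted to use the continuum as a quick checklist for an AI-based application, the quoted conclusion could be encoded roughly like this. This is just my own sketch of the quote above: the labels "ethical", "boundary" and "unethical" and the function name are my assumptions, not terms defined by the study.

```python
# The seven practice points, as quoted from the 1989 study above.
PRACTICES = {
    1: "General instruction on local objectives",
    2: "Instruction on general test-taking skills",
    3: "Instruction on objectives generally measured by standardized tests",
    4: "Instruction on objectives specific to the test used",
    5: "Instruction on objectives specific to the test used and using the same format",
    6: "Instruction using a released test or a clone test",
    7: "Instruction using the test to be used",
}


def verdict(point):
    """Classify a practice point following the study's quoted conclusion.

    Points 1-2 are ethical, 6-7 are unethical, and 3-5 are where the
    ethical boundary falls (labelled "boundary" here, my own wording).
    """
    if point not in PRACTICES:
        raise ValueError(f"practice point must be 1-7, got {point!r}")
    if point <= 2:
        return "ethical"
    if point <= 5:
        return "boundary"
    return "unethical"


# My guess for Khan Academy + SAT was around point 5.
print(PRACTICES[5], "->", verdict(5))
```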

Some interesting (and some biased) reading:

Personalised learning and otherwise interesting stuff (e.g. neuroscience meets AI):