jimbob wrote: ↑Fri Apr 14, 2023 8:28 am
Sciolus wrote: ↑Fri Apr 14, 2023 7:41 am
jimbob wrote: ↑Thu Apr 13, 2023 8:25 pm
It really was badly designed and saying that it met the specification is no excuse when the specification was messed up
I don't think we are disagreeing. My point is that it's not just the algorithm but the system in which it sits that is messed up.
Ah, agreed
I'm not sure I would describe the algorithm modellers as 'statistically illiterate'. The full Research and Analysis report
https://assets.publishing.service.gov.u ... report.pdf
addresses some reasonably sophisticated statistics (multi-level modelling gets a mention, for instance). What I suspect is that they were naive about the performance of actual students: they may have been good statisticians, but had little experience of how classes of real students are likely to perform.
For instance, it does indeed appear that ties were not allowed in the rank order of students. Ties are a challenge in systems designed to rank (as I know only too well in my day job), and it is mathematically convenient, and neater, simply to 'forbid' them in the model. But this has negative consequences. If it is true that a successful appeal would result in students below the new level being downgraded by one rank position, that makes no real-world sense. Equally, matching the outcomes by school to previous years may make sense for the system as a whole, but it does not make sense for an individual school's cohort, which is unique in any given year.
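To make the appeal problem concrete, here is a minimal sketch (invented names and my own toy code, not Ofqual's actual model) of how a strict, tie-free rank order behaves when one student is moved up on appeal: the students they displace each mechanically drop a place, even though nothing about their own performance has changed.

```python
# Toy model of a strict, tie-free rank order (rank 1 is best).
ranks = ["Asha", "Ben", "Carol", "Dev", "Ema"]

def promote(order, student, places=1):
    """Move `student` up by `places`; each displaced student slides down one rank."""
    order = order.copy()
    i = order.index(student)
    order.pop(i)
    order.insert(max(0, i - places), student)
    return order

# Dev wins an appeal and rises two places...
after_appeal = promote(ranks, "Dev", places=2)
print(after_appeal)  # ['Asha', 'Dev', 'Ben', 'Carol', 'Ema']
# ...so Ben and Carol are each downgraded a rank through no fault of their own.
```

If grades are then read off from rank positions, those displaced students can lose a grade purely as a side effect of someone else's successful appeal.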
A real teacher with a real class, however, may well be unable to distinguish the number of levels of performance required to rank every student in that class uniquely. A forced unique ranking therefore carries less genuine information than a ranking that permits ties. And teachers also know that performance can vary quite markedly from year to year.

If instead each student were assigned a score, that would both allow ties and allow a teacher to identify unusually excellent performance. So if, in a class where students had performed consistently at ‘B’ level, a particular candidate was unusually gifted, the CAG for that candidate could also be unusually high. The information content would therefore be higher than in a ranking-based system. The algorithm could still ensure that outcomes at a national level were commensurate with previous years, but there would be a better match at the individual and school level. I suspect that what was needed in the algorithm design was not more expert statisticians, but more experienced and reflective teachers…
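A score-based sketch of the same idea (again with invented names and marks, purely to illustrate): the scores preserve both the ties the teacher intends and the standout candidate, whereas a tie-free ranking has to break the ties arbitrarily.

```python
# Hypothetical teacher-assigned scores: four solid 'B' students plus one outlier.
scores = {"Asha": 70, "Ben": 70, "Carol": 70, "Dev": 70, "Ema": 95}

# Grouping by score recovers the structure the teacher actually intends:
# one exceptional candidate, then a four-way tie.
levels = sorted(set(scores.values()), reverse=True)
by_level = {level: sorted(n for n, s in scores.items() if s == level)
            for level in levels}
print(by_level)  # {95: ['Ema'], 70: ['Asha', 'Ben', 'Carol', 'Dev']}

# A tie-free ranking of the same class must invent an order for the tied
# four (here broken alphabetically, which is entirely arbitrary).
forced = sorted(scores, key=lambda n: (-scores[n], n))
print(forced)  # ['Ema', 'Asha', 'Ben', 'Carol', 'Dev']
```

Any downstream process that treats the forced ordering of Asha, Ben, Carol and Dev as meaningful is reading information that was never there.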
It reminds me of a problem with admissions to St George’s Medical School in the 1980s. A programmer was tasked with developing a computer programme that would match the hand-scoring of applications (ironically, to reduce inconsistencies). He analysed several years of data and eventually came up with a computer model with a 90-95% match to historical data. St George’s were delighted and promptly employed it to select their students for interview.
You know where this is going…
A staff member with an interest in such things noticed that the programme applied a numerical penalty for being female and for having a ‘funny name’ (ethnicity was inferred from surnames). Students were ranked, and those with high rankings were given an interview. Because it was written into the code, there could be no doubt about the presence of discrimination. The programmer was perhaps indignant: “You told me to model the existing system, and that is what I did!”
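The mechanism is worth spelling out. Here is a sketch (entirely illustrative weights and field names, not the actual St George’s code) of how a model fitted to reproduce historical decisions ends up encoding whatever bias those decisions contained:

```python
# Illustrative only: a scorer fitted to reproduce historical shortlisting
# decisions will carry over any bias those decisions contained, but now
# explicitly and inspectably, as named penalties in the code.

def admission_score(academic_score, is_female, has_non_european_surname):
    """Score an applicant the way the historical decisions implicitly did."""
    score = academic_score
    if is_female:
        score -= 3   # penalty inferred from past decisions (invented value)
    if has_non_european_surname:
        score -= 5   # likewise an invented, illustrative value
    return score

# Two identical academic records diverge purely on the penalised attributes:
print(admission_score(80, False, False))  # 80
print(admission_score(80, True, True))    # 72
```

The one thing the code changed was legibility: the same discrimination that had been diffuse in human judgements became a constant anyone auditing the source could point to.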
Was Allo V Psycho, but when my laptop died, I lost all the info on it...