E-marking crux

Thursday 30 June 2011

Productivity has been down on this website for the last few weeks – I have been swamped by the demands of helping to introduce e-marking for the May 2011 English B papers. The experience has been, shall we say, on the demanding side of tricky. The new e-marking system, called SCORIS, for future reference, seems to have been designed for discrete marking – for tests where there is a single clear answer to each question, with perhaps a little bit of variation allowed in the case of short written answers. And this job it seems to do quite well.

Problems have arisen in subjects where impression marking is the dominant, or at least a significant, approach to marking components.

Impression marking depends on applying criteria, not precise mark schemes. I quote from some comments that I sent out to a concerned examiner:

“ ...1. Criteria are necessarily always too generalised. As it happens, I wrote the first draft for the current Criteria - which was then amended and finalised by the working party that wrote the current Subject Guide – so I know how the process works. The writing of Criteria is constrained by two main factors: (a) they have to be short enough to be conveniently usable (i.e. you can't have 300 words for each markband); and (b) they have to be translatable into Spanish and French + applicable to all 26 languages of Language B. So, they cannot be detailed or explicit or specific.
2. This necessarily means that interpretation of the brief phrasing is vital - key words need to have exegesis.
3. The IB assumes that examiners make the same interpretations, but this is evidently not so. This is not at all surprising, since examiners (take the 56 in English B HL2, for instance, about whom I know something) come from different backgrounds, live in different cultures, and - most significantly - the majority who are teachers will work in very different schools with very different students and very different expectations. That this disparate group should come together in May and achieve consensus of interpretation would be utterly astonishing.
4. These conditions have always applied - but the 'old' system applied post-moderation: i.e. examiners marked according to their personal values, and any divergence from the norm expected was corrected by computer after marking had finished. This system delivered marks which may not have been 'scientific' but were perceived as 'fair' (see the historically very low rate of appeals against final marks in English B). BUT - the system certainly did not encourage, or make possible, any serious debate that might achieve general consensus about marking standards.
5. We now have a system which does not permit post-adjustment, but requires examiners to mark accurately first time around. Consensus then becomes vital. But consensus does not currently exist. ...”

I won't go into the stresses and strains that are resulting from these changes (and there have been plenty, and very disagreeable for all concerned), but I would make two judgements about the whole process:

** The change, in itself, is no-one's 'fault'. The arguments in favour of adapting the IB's examining procedures to the information technology paradigm change are, in my view, irrefutable. If that is accepted, consider point 4 above – the two systems are fundamentally opposed, and so to change from one to the other will inevitably cause trauma. The change might have been better managed, in ideal terms, but I have clear evidence that the IB is committed to a steep learning curve of continuous improvement.

** The change may result in a significant improvement in the quality of IB marking. My remarks quoted above argue that 'consensus' is now the Holy Grail – both because it is the only means by which e-marking can really function effectively in impression-marked components, and also because it is the only way that 'impression' examiners can cope with the system. Self-interest applies both to how the IB organises its systems, and to how examiners adapt to the new system. If consensus can be achieved, in all components in all subjects, IB marking will be significantly more reliable … because at last there will be real detailed agreement about what we mean by 'good' and 'bad'.


Tags: marking, examiners, criteria, change