
How is a scaled score calculated?

A mathematical explanation of how we arrive at a scaled score for a script

A frequent question that we get from schools is how we arrive at a scaled score for a script. In this article we explain how we get to a scaled score, and in turn explain some other aspects of the statistical process involved in comparative judgement.

A good place to start is the raw data that you can download for the candidates once a task has been carried out (in the task, go into Check results and download the Candidates Results file).

The headings we will concentrate on to explain where scaled scores come from will be (in order):

Local Comparisons
Score
Prop Score
True Score and True Score SE
Scaled Score and Scaled Score SE

Local Comparisons

When we judge a set of scripts, each script is judged against other scripts a certain number of times. That is the number of Local Comparisons (if the task is moderated, e.g. in national tasks, there will be Mod Comparisons as well).

Score

In the given number of comparisons, a particular script will be chosen as best, or ‘win’, a certain number of times. This is its raw score. The raw scores for all the scripts are placed into a mathematical model, which calculates the resulting ‘quality’ of each script. The mathematical model is a theoretical model of how wins and losses should be distributed among scripts, given their varying quality. From the model, we obtain the theoretical number of wins that a script would have if the data fitted the model perfectly. This is the Score.
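To make the idea of a "theoretical number of wins" concrete, here is a minimal sketch. The article does not name the mathematical model, so the logistic (Bradley-Terry-style) form below is an assumption chosen for illustration, and the function name and example values are hypothetical.

```python
import math

def expected_wins(quality, opponent_qualities):
    """Theoretical number of wins for one script.

    ASSUMPTION: a logistic (Bradley-Terry-style) model, in which the
    chance of winning a comparison depends only on the difference in
    quality between the two scripts. The article does not state which
    model is actually used.
    """
    return sum(
        1.0 / (1.0 + math.exp(-(quality - other)))
        for other in opponent_qualities
    )

# Hypothetical example: one script compared against four others.
score = expected_wins(1.0, [-1.0, 0.0, 1.0, 2.0])
print(score)
```

Under this assumed model, a script compared only against scripts of equal quality would be expected to win half of its comparisons.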

Prop Score

The Prop Score is simply the theoretical proportion of wins we would expect (Score divided by Local Comparisons). Extreme values (where all wins or all losses are recorded) are adjusted to avoid infinite values.
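As a small illustration, the division can be sketched as below. The function name and example figures are hypothetical, and the adjustment for extreme values is left out because the article does not say how it is done.

```python
def prop_score(score, local_comparisons):
    """Theoretical proportion of wins: Score / Local Comparisons.

    The adjustment for extreme values (all wins or all losses) is
    deliberately omitted, as its details are not given in the article.
    """
    if local_comparisons <= 0:
        raise ValueError("local_comparisons must be positive")
    return score / local_comparisons

# Hypothetical example: a Score of 7.5 over 10 Local Comparisons.
print(prop_score(7.5, 10))  # 0.75
```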

True Score and True Score SE

As mentioned above, based on the data from all the judgements, the mathematical model produces a measure of quality of a script. This is the True Score. In estimating this measure of quality, there is a margin of error indicated by the standard error of the true score (the True Score SE).

For a given collection of scripts, the True Score is set so that the average True Score for all the scripts is 0. The True Score therefore can be positive or negative, and stretch as far in either direction as required. This has the advantage that if we have a collection of scripts that are extremely varied in terms of quality, this can be reflected in the range of True Scores, and not constrained by the ‘ceilings’ of maximum or minimum scores.

Scaled Score and Scaled Score SE

In practice, it is unusual and perhaps undesirable for assessments to have negative scores. Also, because the True Score has no set maximum, we can get different assessments with different maximum scores, which can be confusing.

Therefore, we convert the True Scores to Scaled Scores by (a) rescaling them so that they span the range of values that we want, and then (b) shifting all the resulting scores so that the lowest value is where we want the minimum of the scale to be.

Converting the True Score SE gives us a corresponding standard error for the Scaled Score (Scaled Score SE).
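The article does not spell out how the standard error is converted. A reasonable sketch, assuming the conversion is the linear one described here (multiply by a scale factor, then shift), is that only the multiplier affects the standard error, because adding a constant moves every score by the same amount and does not change the margin of error. The function name and values below are hypothetical.

```python
def scale_standard_error(true_score_se, scale_factor):
    """Convert a True Score SE to a Scaled Score SE.

    ASSUMPTION: the conversion is linear (multiply, then shift), so the
    standard error is multiplied by the size of the scale factor and is
    unaffected by the shift.
    """
    return true_score_se * abs(scale_factor)

# Hypothetical example: a True Score SE of 0.5 with a scale factor of 4.
print(scale_standard_error(0.5, 4))  # 2.0
```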

You can try out the conversion using this interactive version.

A worked example

Let’s show how Scaled Scores are calculated from True Scores with an example.

Suppose we have a set of True Scores that stretch from -2.8 to +3.2.

We want a minimum score of 6 and a maximum score of 30, so the desired Scaled Score range is 24.

The True Score range is 3.2 – (-2.8) = 6 in this case.

To get the desired Scaled Score range, we multiply all the True Scores by the Scale factor (desired Scaled Score range / True Score range), which is 24/6 = 4 in this case.

To get the correct minimum, we first multiply the True Score minimum by the scale factor, so that it becomes -2.8 x 4 = -11.2. But this is not where we want the minimum to be: it needs shifting by 6 – (-11.2) = 17.2 (that is, required Scaled Score minimum – True Score minimum x Scale factor).

If we apply the scale factor and the shift to the maximum of the True Score then:

Scaled value = 3.2 x 4 = 12.8

Shift required = 6 – (-2.8) x 4 = 6 + 11.2 = 17.2

Resulting Scaled Score maximum = 12.8 + 17.2 = 30

To summarise then, to get the Scaled Scores from the True Scores:

Scale factor = desired Scaled Score range / True Score range

Shift required = required Scaled Score minimum – (True score minimum x Scale factor)
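The two formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the worked example, not the code the task settings actually run, and the function and parameter names are hypothetical.

```python
def to_scaled_scores(true_scores, scaled_min, scaled_range):
    """Convert True Scores to Scaled Scores with a scale factor and a shift."""
    true_min = min(true_scores)
    true_range = max(true_scores) - true_min
    if true_range == 0:
        raise ValueError("all True Scores are equal, so they cannot be rescaled")

    # Scale factor = desired Scaled Score range / True Score range
    scale_factor = scaled_range / true_range
    # Shift required = required Scaled Score minimum - (True Score minimum x Scale factor)
    shift = scaled_min - true_min * scale_factor

    return [score * scale_factor + shift for score in true_scores]

# The worked example: True Scores from -2.8 to +3.2, minimum 6, range 24.
scaled = to_scaled_scores([-2.8, 0.0, 3.2], scaled_min=6, scaled_range=24)
print(scaled)  # approximately [6, 17.2, 30], up to floating-point rounding
```

A True Score of 0 (the average script) lands on the shift value itself, 17.2 in this example.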

Luckily, when carrying out a task, the settings of the task will carry out all these calculations for us! We just need to specify the desired Scaled Score Min and the required Scaled Score Range in the Settings of a task, press the 'Update' button, and then the Refresh Scores button.

Updated on: 07/11/2024
