
Results

Where do I get the results? What do all the numbers mean?



When the judging has been completed for your task, you will want to look at the scores for your students. First of all, to update your scores, go to the particular task and click the Refresh Scores button. Then click on Check results:



You should see a table similar to the one below:



If you click on the 'Scaled Score' column heading, you can sort your pupils by their score. This is useful if you want to look at the progression in scripts as you move down or up through the scores. You can view a pupil's script by clicking on their link in the Code column.

If you click the Candidate Results button you will download a CSV file of pupil-level results.

This results file will look something like the following:
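If you prefer to analyse the export outside the website, the CSV opens in any spreadsheet or analysis tool. Below is a minimal sketch in Python, assuming a file named candidate_results.csv whose column headings match the fields described under Key Results (the file name is illustrative; use whatever name your download has):

```python
# Minimal sketch: load the Candidate Results export and rank pupils.
# Assumes the file is called "candidate_results.csv" and that the column
# headings match the fields described below ("Scaled Score", "Infit", ...).
import pandas as pd

results = pd.read_csv("candidate_results.csv")

# Order pupils from highest to lowest Scaled Score, as you would by
# clicking the 'Scaled Score' column heading on the results page.
ranked = results.sort_values("Scaled Score", ascending=False)

# Flag scripts where judges disagreed noticeably (Infit of 1.3 or more).
high_infit = ranked[ranked["Infit"] >= 1.3]

print(ranked[["Scaled Score", "Infit"]].head())
print(f"{len(high_infit)} script(s) with Infit >= 1.3")
```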



Key Results



Scaled Score The scaled score for the candidate, a higher score being better. The scale limits are determined by the Scaled Score Min and the Scaled Score Range as defined in the Settings for the task (accessed via the 'cog' icon).
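The rescaling itself is handled by the platform, but conceptually it is a linear mapping of the underlying model scores onto the range you have chosen. The sketch below is illustrative only: it assumes a simple linear transformation, and the platform's exact formula may differ.

```python
# Illustrative only: one plausible linear mapping of model ("true") scores
# onto the scaled-score range. The platform's exact formula may differ.
def to_scaled(true_score, true_min, true_max, scaled_min, scaled_range):
    """Map a true score onto [scaled_min, scaled_min + scaled_range]."""
    proportion = (true_score - true_min) / (true_max - true_min)
    return scaled_min + proportion * scaled_range

# Example: with Scaled Score Min = 0 and Scaled Score Range = 100,
# the weakest script maps to 0 and the strongest to 100.
print(to_scaled(1.4, true_min=-2.0, true_max=3.0, scaled_min=0, scaled_range=100))
```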

Infit The infit statistic for the candidate is a reflection of the degree of disagreement between judges on the quality of the scripts. A low value means low disagreement and a higher value means more disagreement. A value of 1.3 or greater means there was significant disagreement regarding this candidate’s script, and it may be of interest to look in more detail at scripts with these higher values.
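If you are curious what sits behind this figure, a common formulation of infit is the information-weighted mean square used in Rasch-style analyses, shown here as a stand-in (the platform's exact calculation may differ). It compares each judgement's outcome with the win probability predicted by the model:

```python
# Hedged sketch of an information-weighted (infit) mean square for one script.
# 'observed' is 1 for a win and 0 for a loss in each comparison the script was
# involved in; 'expected' is the model's predicted probability of a win.
# This mirrors the usual Rasch-style infit; the platform's formula may differ.
def infit_mean_square(observed, expected):
    residual_sq = [(o - p) ** 2 for o, p in zip(observed, expected)]
    information = [p * (1 - p) for p in expected]
    return sum(residual_sq) / sum(information)

# A script that keeps beating scripts it was expected to lose to (and vice
# versa) produces larger residuals, pushing the infit towards 1.3 and above.
print(infit_mean_square([1, 0, 1, 0], [0.8, 0.7, 0.3, 0.6]))
```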

Level and Level Value You can apply Levels such as "Pass" / "Fail" via the Levels menu on the Reporting page.

Link If you want to look in detail at a candidate's script then you can copy and paste this link into your browser to take you to an image of that script.

Local Comparisons How many local comparisons were made for this script.

Mod Comparisons How many moderation comparisons were made for this script.

Anchor Flag indicating whether this script was used as an anchor script.

Moderated Flag indicating whether this script was moderated.

Task Percentile The percentage of candidates whose scores fall below this candidate's score, e.g. 100 is the best, 0 the worst.

Task Z Score The student's score expressed in standard deviation units, standardised to a mean of 100 and an SD of 15.

Task Stanine The scaled score divided into 9 bands to give an overview of performance (see the sketch below).
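To see how Task Percentile, Task Z Score and Task Stanine relate to the Scaled Score column, here is a hedged sketch. The percentile method and the conventional stanine cut points used here are assumptions, so the platform's figures may differ slightly:

```python
# Hedged sketch of how the three task-level statistics could be derived from
# the Scaled Score column. The percentile method and the stanine boundaries
# are assumptions; the platform's calculations may differ slightly.
import statistics

scaled_scores = [38, 45, 52, 52, 60, 63, 70, 74, 81, 90]  # illustrative data

mean = statistics.mean(scaled_scores)
sd = statistics.pstdev(scaled_scores)

def task_percentile(score):
    """Percentage of scores falling below this score (0 = worst, 100 = best)."""
    below = sum(s < score for s in scaled_scores)
    return 100 * below / len(scaled_scores)

def task_z_score(score):
    """Score re-expressed on a scale with mean 100 and SD 15."""
    return 100 + 15 * (score - mean) / sd

def task_stanine(score):
    """Band 1-9 using the conventional stanine percentile cut points."""
    cut_points = [4, 11, 23, 40, 60, 77, 89, 96]  # cumulative percentages
    pct = task_percentile(score)
    return 1 + sum(pct >= c for c in cut_points)

for s in (38, 63, 90):
    print(s, task_percentile(s), round(task_z_score(s), 1), task_stanine(s))
```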

Dot Plots



The Dot Plots tab is probably the quickest way to review your scores. Click on the Dot Plots tab to see a chart showing the frequencies of results for the candidates, grouped into Scaled Score 'bins':



You can move the 'Bin Width' slider on screen to adjust the width of these bins in order to view the most useful image of the data.
If you hover over an individual dot, you will see that candidate's name and their Scaled Score.
If you click on an individual dot, the corresponding script will be displayed on screen to the right of the chart.
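If you want to reproduce this binning offline, grouping candidates into Scaled Score bins of a chosen width is straightforward. The bin width and scores below are illustrative:

```python
# Minimal sketch: count candidates per Scaled Score bin, dot-plot style.
from collections import Counter

scaled_scores = [38, 45, 52, 52, 60, 63, 70, 74, 81, 90]  # illustrative data
bin_width = 10  # corresponds to the 'Bin Width' slider

# Group each score into the bin starting at the nearest multiple of bin_width.
bins = Counter((score // bin_width) * bin_width for score in scaled_scores)

for start in sorted(bins):
    print(f"{start}-{start + bin_width - 1}: {'*' * bins[start]}")
```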

Additional statistical fields for the researcher:



Scaled Score SE The standard error of the scaled score.

True Score The underlying score, before scaling, used by the statistical calculations involved in comparative judgement.

True Score SE The standard error of the estimate of the true score.

Score All the judgements, the 'wins' and 'losses' for each script against other scripts, are placed into a mathematical model and the resulting 'quality' of each script is calculated. The mathematical model is a theoretical description of how the wins and losses should fall, given the scripts' varying quality. From this model we obtain the theoretical number of wins a script should have if the data fitted the model perfectly; this is the Score (a worked sketch follows these definitions).

Prop Score The theoretical proportion of wins we would expect (Score divided by number of comparisons).

Observed Score The number of 'wins' for the script.

Losses The number of 'losses' for the script.
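To make the Score and Prop Score definitions concrete, here is a sketch using a Bradley-Terry-style model of win probabilities, the kind of model typically used in comparative judgement. The quality estimates are illustrative, and the platform's exact model and estimation details may differ:

```python
import math

# Hedged sketch: expected wins ("Score") and expected win proportion
# ("Prop Score") for one script under a Bradley-Terry-style model.
# The 'quality' values are illustrative model estimates, not real data.
def win_probability(quality_a, quality_b):
    """Modelled probability that script A beats script B."""
    return math.exp(quality_a) / (math.exp(quality_a) + math.exp(quality_b))

script_quality = 1.2                        # this script's estimated quality
opponent_qualities = [0.4, 1.5, -0.3, 0.9]  # scripts it was compared against

expected_wins = sum(win_probability(script_quality, q) for q in opponent_qualities)
prop_score = expected_wins / len(opponent_qualities)

print(round(expected_wins, 2))  # theoretical number of wins: the Score
print(round(prop_score, 2))     # Score / number of comparisons: the Prop Score
```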

Updated on: 10/09/2024
