Monitoring AI Agreement
In the past, we used a statistic called judge infit to monitor judging quality. However, when each teacher makes only a few judgements and the majority of judgements are made by AI, the infit statistic is no longer reliable. We suggest that you use the AI agreement report to monitor judging quality instead. You can see the agreement for each teacher on the judging dashboard.
The AI agreement column shows the percentage agreement for each judge.
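To make the percentage concrete, here is a minimal sketch of how a per-judge agreement rate could be worked out. It assumes that agreement means the share of a judge's decisions that match the AI's decision on the same pair of scripts; this article does not state the exact calculation behind the dashboard, so treat that as an assumption. The function name, record layout, and sample data are all hypothetical and are not part of the platform.

```python
from collections import defaultdict


def ai_agreement_by_judge(judgements):
    """Return {judge: percentage of that judge's decisions matching the AI}.

    Each record is a (judge, human_choice, ai_choice) tuple. This layout is
    an assumption made for illustration, not an export format.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for judge, human_choice, ai_choice in judgements:
        totals[judge] += 1
        if human_choice == ai_choice:
            matches[judge] += 1
    # Every judge in totals has at least one record, so no division by zero.
    return {judge: 100.0 * matches[judge] / totals[judge] for judge in totals}


# Hypothetical records: (judge, script the judge chose, script the AI chose)
sample = [
    ("Teacher A", "script_1", "script_1"),
    ("Teacher A", "script_4", "script_3"),
    ("Teacher A", "script_5", "script_5"),
    ("Teacher A", "script_8", "script_8"),
    ("Teacher B", "script_2", "script_1"),
    ("Teacher B", "script_3", "script_3"),
]

rates = ai_agreement_by_judge(sample)
for judge, rate in rates.items():
    print(f"{judge}: {rate:.0f}% agreement")
```

In this made-up data, Teacher A matches the AI on 3 of 4 judgements and Teacher B on 1 of 2. With so few judgements per teacher, a single disagreement moves the percentage a long way, so a low rate is a prompt to look closer rather than a verdict.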
You can investigate low rates of agreement through our disagreements report.
Updated on: 09/06/2025