AI-assisted marking: what it actually means for academics

· AEMS Team · 4 min read
ai-marking · higher-education · workflow

Every semester, the same pattern repeats. Exams finish, and a stack of papers lands on your desk. The marking begins, carefully at first, with close attention to the rubric. By paper fifty, the criteria feel blurry. By paper two hundred, you are checking the clock more than the rubric.

This is not a moral failing. It is a well-documented cognitive phenomenon: sustained evaluative tasks degrade consistency over time. The first paper and the last paper are not assessed the same way, no matter how diligent the examiner.

AI-assisted marking is designed to address exactly this problem, and only this problem. It is not a replacement for academic judgment. It is a tool that applies your rubric mechanically, so you can focus your expertise where it matters.

What AI does in the marking process

When you use AEMS, the workflow is straightforward:

  1. You define the rubric. Each check describes what to look for, how many marks it is worth, and what common errors to expect. The AI does not invent criteria.

  2. AI reads each submission. A vision-capable model reads the scanned pages (handwritten, typed, or printed) and extracts the student’s work.

  3. AI applies your checks. Each rubric check is evaluated independently against the extracted content. The model produces a proposed mark and a short explanation.

  4. You review everything. The result is an annotated PDF with colour-coded marks: green for correct, red for errors, amber for partial credit. You accept, adjust, or override any mark before it reaches the student.

The critical point: the AI proposes. You decide. Every grade passes through human review.
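
For readers who think in code, the four steps above can be sketched as a small data model. This is an illustration only, not how AEMS is implemented: the names `RubricCheck`, `ProposedMark`, `colour_for`, `review`, and `releasable` are hypothetical, and the colour rule is an assumption about how full, zero, and partial marks map onto green, red, and amber.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Optional


@dataclass(frozen=True)
class RubricCheck:
    """One examiner-defined check (hypothetical structure)."""
    check_id: str
    description: str
    max_marks: float


@dataclass(frozen=True)
class ProposedMark:
    """A mark proposed by the model; it is not final until reviewed."""
    check_id: str
    marks: float
    explanation: str
    reviewed: bool = False


def colour_for(marks: float, max_marks: float) -> str:
    """Assumed mapping: full marks green, zero red, anything between amber."""
    if marks >= max_marks:
        return "green"
    if marks <= 0:
        return "red"
    return "amber"


def review(proposal: ProposedMark, check: RubricCheck,
           override: Optional[float] = None) -> ProposedMark:
    """Examiner accepts the proposal (override=None) or replaces the mark."""
    marks = proposal.marks if override is None else override
    if not 0 <= marks <= check.max_marks:
        raise ValueError(f"mark {marks} is outside 0..{check.max_marks}")
    return replace(proposal, marks=marks, reviewed=True)


def releasable(proposals: Iterable[ProposedMark]) -> bool:
    """A paper goes back to the student only once every mark is reviewed."""
    items = list(proposals)
    return bool(items) and all(p.reviewed for p in items)


check = RubricCheck("q1-units", "Final answer includes correct units", 2.0)
proposal = ProposedMark("q1-units", 1.0, "Units missing on the final line")
final = review(proposal, check, override=2.0)  # examiner overrides the AI
```

The design choice worth noticing is that `reviewed` starts as `False` and only `review` can set it, so an unreviewed proposal can never pass the `releasable` gate.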

Where AI adds the most value

Consistency. The AI applies the same rubric to paper one hundred as it did to paper one. The interpretation of “partially correct” does not shift over the course of an evening. I have written more about what this stability does to the psychology of grading.

Speed. For structured questions with clear right/wrong criteria (calculations, formula application, factual recall), AI marking is fast and accurate. What takes 15 minutes per paper manually can be pre-processed in seconds, leaving you to verify rather than re-derive.

Feedback quality. Because the AI references specific rubric checks in its annotations, students receive targeted feedback linked to defined criteria. This is more useful than a single number or a vague comment like “needs more detail.”

Where AI falls short

Nuanced argumentation. Open-ended essay questions, qualitative analysis, and “discuss” prompts remain difficult for current models. The AI can flag relevant content and check for structural elements, but evaluating the quality of an argument still requires human judgment.

Ambiguous handwriting. Vision models have improved dramatically, but heavily stylised or faint handwriting can still cause misreads. AEMS flags low-confidence extractions for manual review rather than guessing silently.
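
A minimal sketch of what "flag rather than guess" can look like, assuming each extraction carries a confidence score between 0 and 1. The `Extraction` type, the `split_by_confidence` function, and the 0.8 threshold are all hypothetical. How AEMS actually scores confidence is not described here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Extraction:
    page: int
    text: str
    confidence: float  # assumed scale: 0.0 (unreadable) to 1.0 (certain)


def split_by_confidence(
    extractions: Iterable[Extraction], threshold: float = 0.8
) -> Tuple[List[Extraction], List[Extraction]]:
    """Separate extractions that can be marked from those a human must read."""
    usable: List[Extraction] = []
    flagged: List[Extraction] = []
    for item in extractions:
        if item.confidence >= threshold:
            usable.append(item)
        else:
            flagged.append(item)
    return usable, flagged


pages = [
    Extraction(page=1, text="F = ma = 12 N", confidence=0.97),
    Extraction(page=2, text="v = ?? m/s", confidence=0.41),  # faint handwriting
]
usable, flagged = split_by_confidence(pages)
```

With these sample values, page 1 proceeds to marking and page 2 is routed to the examiner instead of being marked on a guess.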

Context the rubric does not capture. If a student takes an unconventional but valid approach that your rubric did not anticipate, the AI will mark it as incorrect. The human review step catches these cases, and when you correct them, the system’s memory records the adjustment for future reference.
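
One way to picture "records the adjustment for future reference" is a simple log keyed by rubric check. This is a sketch under stated assumptions, not a description of AEMS internals: `Adjustment` and `AdjustmentLog` are hypothetical names, and what the real system stores or does with it afterwards is not specified in this post.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Adjustment:
    check_id: str
    proposed_marks: float
    final_marks: float
    reason: str  # the examiner's note, e.g. why an alternative method is valid


@dataclass
class AdjustmentLog:
    """Keeps examiner corrections grouped by the rubric check they apply to."""
    _by_check: Dict[str, List[Adjustment]] = field(default_factory=dict)

    def record(self, adjustment: Adjustment) -> None:
        self._by_check.setdefault(adjustment.check_id, []).append(adjustment)

    def for_check(self, check_id: str) -> List[Adjustment]:
        # Return a copy so callers cannot modify the stored history.
        return list(self._by_check.get(check_id, []))


log = AdjustmentLog()
log.record(Adjustment("q3-method", 0.0, 4.0,
                      "Valid alternative derivation not covered by the rubric"))
```

Looking up `log.for_check("q3-method")` when the next paper is reviewed surfaces the earlier decision, so the same unconventional approach can be treated the same way.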

The privacy question

When universities consider AI marking tools, the first question is usually about student data. Where do the exam papers go? Who can see them? Are they used to train AI models?

These are the right questions. The answer depends entirely on how the tool is deployed. AEMS supports three deployment models: local-first with a paired agent, EU-hosted with data processing agreements, and on-premises behind the university’s own firewall. Each has a different privacy profile, and the choice depends on your institution’s data governance requirements. We cover this in detail in posts on privacy-first grading and local-first architecture.

A tool, not a revolution

If you mark structured exams with clear rubrics, the time savings are immediate. If your assessment is primarily qualitative, the benefit is more modest but still real: consistent first-pass annotations that you refine rather than create from scratch.

The examiner’s expertise stays central. The AI handles the mechanical parts. The distinction is worth remembering, because the most common question I get from colleagues is whether this is the beginning of the end for human marking. It is not. It is the end of the beginning for making marking tolerable.