
The missing half of exam feedback

Artem Kulachenko · 7 min read
Tags: grading, higher-education, psychology, feedback, ai-marking

When you grade exams, you are looking for problems. That is the job. You read a student’s work, compare it to the rubric, and note where the work falls short. You circle errors, flag missing steps, write corrective comments. The red pen is a diagnostic tool, and you wield it with precision.

What you almost never do is write “good” on a correct derivation.

This is not laziness. It is a structural consequence of how grading works, and I have been thinking about the cost of this omission for some time now.

The economics of attention

Positive feedback feels like waste when you are grading. When a student’s answer is correct, there is nothing to fix. The rubric check passes. You move on. Writing “good approach” or “clear reasoning” costs time and mental energy that could be spent on the next submission.

There is also a less obvious cost. Error detection and encouragement are different mental postures. When you are scanning for discrepancies, pausing to formulate a positive comment breaks the rhythm. You have to shift from “what is wrong here” to “what is right here,” which is a genuinely different cognitive operation. So you optimise: you skip the positive comment, because the process rewards finding errors and penalises everything else.

The result is that students receive their exams back marked only where they made mistakes. Correct work is unmarked. The implicit message: silence means acceptable.

What silence communicates

A student who receives an exam with three annotated errors and no positive marks faces an interpretive problem. They know what they did wrong. They do not know what they did right.

Did the examiner actually read the correct parts, or just skim past them? Was the approach to question four genuinely good, or merely not wrong enough to flag? Students in technical subjects often lack confidence in their own reasoning even when it is sound. A student who solved a differential equation correctly but received no acknowledgement has no external confirmation that their method is valid. They might use the same approach next time, or they might abandon it for something from a textbook, unsure whether their original method was acceptable.

The distinction matters pedagogically. When students know which approaches worked, they repeat those approaches. When they only know which approaches failed, they know what to avoid but not what to pursue. Avoidance learning is slower and more fragile than reinforcement. A student who is told “your energy method was well applied” has a technique they can build on. A student who only sees what went wrong has a list of things to fear.

Preparing the student for what comes next

There is a subtlety in how feedback is received that most grading systems ignore.

When I meet with students to discuss their work, I signal the nature of the conversation before diving in. “I need to point out something that did not work here” is a different opening than “this part was well done, but let me show you where you lost marks.” The signal matters. It allows the student to prepare for criticism or open up to praise, and the content that follows lands more effectively either way.

Written annotations do not offer this. A comment appears next to a line of work, and the student reads it cold. If every annotation is a correction, the student learns to flinch at each one. Reviewing a marked exam becomes a sequence of small shocks, each one a failure. It is not the most productive state of mind for learning.

Colour-coded annotations help. Green signals positive feedback. Red signals errors. Yellow indicates partial credit or uncertainty. The colour appears before the student reads the text, acting as the same kind of preparatory signal that a face-to-face conversation provides. The student sees green and relaxes. They see red and prepare. They see yellow and pay closer attention. A traffic light for the ego, if you like.
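To make the idea concrete, here is a minimal sketch of how an annotation could carry its colour with it. Everything here is an assumption made for illustration: the names `Kind`, `Annotation` and `render` are hypothetical, not taken from any particular marking tool. The only thing borrowed from the text is the green, red and yellow scheme.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Hypothetical annotation kinds; values follow the colour scheme above."""
    POSITIVE = "green"
    ERROR = "red"
    PARTIAL = "yellow"  # partial credit or uncertainty


@dataclass(frozen=True)
class Annotation:
    kind: Kind
    text: str

    @property
    def colour(self) -> str:
        # The colour is derived from the kind, so the two cannot disagree.
        return self.kind.value


def render(annotation: Annotation) -> str:
    """Put the colour first, so the signal arrives before the words."""
    return f"[{annotation.colour.upper()}] {annotation.text}"


if __name__ == "__main__":
    print(render(Annotation(Kind.POSITIVE, "Elegant use of superposition")))
    print(render(Annotation(Kind.ERROR, "Sign error in the boundary condition")))
```

The design choice worth noting is that the colour is not a free-text field. It follows from the kind of comment, so a correction can never accidentally appear in green.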

Curiosity and the positive signal

When a student sees that a particular approach was positively marked, it can trigger curiosity. “Why was this considered good? What about it worked?” This is a different kind of engagement from the one error corrections produce, which tends to be defensive: “What did I do wrong? How do I avoid this next time?”

A student who wonders why their approach worked is starting to think about the subject at a deeper level. A student who only wonders what they did wrong is trying to survive the course. There is a meaningful distance between those two states.

A comment like “Elegant use of superposition” does more than validate the answer. It names the technique, affirms its application, and implicitly invites the student to explore it further. Error corrections close a gap. Positive feedback opens a door.

The cost of producing both

I have written elsewhere about how an AI grading companion changes the psychology of marking. But there is a specific aspect relevant here: the cost structure of annotation.

For a human examiner, every annotation competes with every other annotation for a fixed amount of time and attention. Positive comments lose that competition because they carry no corrective information. For an AI model, there is no competition. Noting a correct derivation costs exactly the same as noting an error. So positive annotations appear by default, not as something the examiner has to carve out time for.

The examiner reviews both kinds during calibration. Removing a positive comment that feels unearned is fast. The work shifts from producing all feedback to reviewing it, which takes less time and leaves fewer parts of the script without comment.

Annotations as metadata

There is a second cost to sparse annotations that has nothing to do with student psychology.

When an examiner writes “wrong” on an answer and assigns zero marks, the grade is recorded but the reasoning is lost. The mark goes into the spreadsheet. The annotation says nothing about what kind of error occurred: conceptual misunderstanding or arithmetic slip, common misconception or individual mistake. We know how many students failed the question. We do not know why.

This matters because annotations are not just feedback to the student. They are metadata about the course.

When annotations are detailed, they become a dataset you can actually interrogate. Which concepts did students handle well? Where did the cohort struggle? Did a particular question trip up 60 percent of the class, or just the bottom quartile? A column of numbers tells you who passed and who failed. Annotations tell you why. But only if they exist, and only if they say more than “wrong.”
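As a rough illustration of what interrogating such a dataset could look like, the sketch below assumes each annotation is stored as a small record with a student, a question number, a kind and an error category. The record layout, the category labels and the sample data are all hypothetical, invented only to show the shape of the queries.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationRecord:
    student: str
    question: int
    kind: str       # "positive", "error" or "partial" (assumed labels)
    category: str   # e.g. "conceptual", "arithmetic" (assumed labels)


def error_share(records, question):
    """Fraction of annotated students with at least one error on a question."""
    seen = {r.student for r in records if r.question == question}
    failed = {r.student for r in records
              if r.question == question and r.kind == "error"}
    return len(failed) / len(seen) if seen else 0.0


def error_categories(records, question):
    """Count the error annotations on a question by category."""
    return Counter(r.category for r in records
                   if r.question == question and r.kind == "error")


# Made-up sample data, for illustration only.
records = [
    AnnotationRecord("s1", 4, "error", "conceptual"),
    AnnotationRecord("s2", 4, "error", "arithmetic"),
    AnnotationRecord("s3", 4, "positive", "method"),
    AnnotationRecord("s4", 4, "error", "conceptual"),
    AnnotationRecord("s1", 5, "positive", "method"),
]

if __name__ == "__main__":
    print(error_share(records, 4))
    print(error_categories(records, 4))
```

Note that `error_share` only counts students who received some annotation on the question. That is exactly why silence on correct work is a problem: an unannotated student is invisible to the query.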

Sparse annotations from time-pressured grading leave you with almost no signal. A few terse corrections and silence everywhere else. Dense annotations, where every checkpoint receives a characterised comment, give you something you can compare across years and use to improve the course. I explore what becomes possible with this kind of persistent data in a separate post on the teacher’s context window.

Not guilt, but architecture

None of this is meant to suggest that examiners should feel guilty about skipping positive feedback. The grading process as it exists does not leave room for it. The constraint was never caring. It was time.

When producing positive feedback takes no additional effort, it happens. Students get exams that acknowledge what they did well alongside what they need to fix. The examiner did not suddenly become more thorough. The process stopped punishing thoroughness.