
The Confidently Wrong

Mark McCourt
16 June 2026

It is a dangerous habit to fall in love with one’s own ideas.

The longer we spend in a sector or profession, the easier it becomes to surround ourselves with people who think as we do, read the same authors, attend the same conferences and reinforce the same beliefs. We see this in real life and in the echo chambers that typify social media. Over time, theories that began life as tentative hypotheses can become articles of faith. We stop asking whether an idea is true and instead begin defending it as though it were part of our identity.

Being aware of this danger at least gives one a chance to resist the temptation.

One of the disciplines I have tried hard to adopt over the years is regularly revisiting things I have written in the past. Whenever new evidence emerges, I like to return to old claims and ask a simple question: would I still write this today?

Sometimes the answer is yes.

Sometimes the answer is no.

Often, the answer is somewhere in between. The general idea perhaps still holds up, but the details need adjusting. The theory becomes more nuanced, more precise and, hopefully, more useful.

Recently I have been asked several times about confidence-weighted multiple-choice questions (CWMCQs), which I wrote about in Teaching for Mastery (McCourt, 2019). This gave me an ideal opportunity to revisit the evidence and ask whether seven years of further research have strengthened, weakened or refined my original claims.

The answer, I think, is something like this.

The central argument survives. Indeed, I am more convinced than ever that confidence has an important role to play in assessment and learning. But the reasons for believing this have shifted somewhat, and recent evidence points us towards an even more interesting application than the one I originally emphasised.

The original work that attracted my attention was conducted by Erin Sparck, Elizabeth Bjork and Robert Bjork (2016). Their approach was elegant.

Rather than simply selecting an answer from a multiple-choice question, pupils were required to indicate the strength of their belief. The format allowed responses to be placed directly on an answer, between answers, or at an explicit “don’t know” position. Correct answers earned the highest scores. Confidently incorrect answers attracted substantial penalties.
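To make the shape of such a scoring rule concrete, here is a minimal sketch in Python. The point values and the way a response is represented are illustrative assumptions, not the values or format used by Sparck and colleagues. The only properties taken from the description above are that a fully committed correct answer scores highest and a fully committed incorrect answer is penalised most heavily.

```python
from typing import FrozenSet

# Hypothetical point values, chosen only to illustrate the shape of the rule.
FULL_CORRECT = 3    # response placed directly on the correct answer
SPLIT_CORRECT = 1   # response placed between two answers, one of them correct
DONT_KNOW = 0       # explicit "don't know" position
SPLIT_WRONG = -1    # response placed between two incorrect answers
FULL_WRONG = -3     # response placed directly on an incorrect answer


def score_response(selected: FrozenSet[str], correct: str) -> int:
    """Score one confidence-weighted response.

    `selected` holds the answer labels the pupil endorsed: one label for a
    response on an answer, two for a response between answers, and an
    empty set for "don't know".
    """
    if len(selected) == 0:
        return DONT_KNOW
    if len(selected) == 1:
        return FULL_CORRECT if correct in selected else FULL_WRONG
    if len(selected) == 2:
        return SPLIT_CORRECT if correct in selected else SPLIT_WRONG
    raise ValueError("a response may endorse at most two answers")
```

Under these assumed values, a pupil who hedges between the correct answer and a distractor scores 1, while a pupil who commits fully to a distractor scores -3.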

The underlying rationale was connected to the hypercorrection effect, the well-established finding that errors made with high confidence are often corrected more effectively than errors made with low confidence (Butterfield & Metcalfe, 2001; Metcalfe, 2017). When we discover that something we felt certain about is wrong, the surprise grabs our attention and creates a sort of cognitive shock. The correction becomes more important because it collides with an existing belief.

The results reported by Sparck and colleagues were impressive. Confidence-weighted multiple-choice questions produced greater long-term retention than conventional multiple-choice questions. More intriguingly, simply asking pupils to state their confidence after answering a standard multiple-choice question did not produce the same benefit. The gains appeared to arise from the confidence-weighted structure itself, not merely from adding a confidence rating.

At the time, I saw this primarily as a mechanism for improving learning. Looking back, I still believe that claim is justified. However, what has intrigued me during my recent review of the literature is where subsequent research has focused.

There has not been a substantial body of follow-up work using the original triangular confidence-weighted format developed by Sparck and colleagues. Instead, researchers have explored confidence judgements more broadly through certainty-based marking, confidence ratings, diagnostic assessment systems and metacognitive measures (Foster et al., 2021; Wu et al., 2022).

I find this quite disappointing and a missed opportunity. The original format seems particularly elegant. More importantly, it is ideally suited to modern technology.

A digital assessment platform could effortlessly collect millions of confidence-weighted responses. We could identify patterns of misconceptions at a scale previously unimaginable. We could distinguish between ignorance, uncertainty, partial understanding and deeply embedded misconceptions. We could map not only what pupils know, but how strongly they believe they know it. The data would be extraordinarily rich. Yet despite this opportunity, relatively little large-scale work appears to have been conducted using the original approach.

Nevertheless, the broader confidence literature has produced an important insight that causes me to adjust my earlier position.

The greatest value of confidence-weighted questions may not lie in improving learning directly. Their greatest value may be diagnostic.

Educational discourse currently contains a phrase that appears almost everywhere: “checking for understanding”. This makes sense, right? Of course we want to check if pupils have understood the ideas at hand. But, in reality, this is rarely what is happening.

In many classrooms, checking for understanding has become synonymous with a rapid scan of mini-whiteboards, a quick show of hands or a brief verbal response from a handful of pupils.

Before going on, I should say that there are schools, and certainly individuals, who use these techniques with such skill that they truly are checking every pupil’s understanding. I recall, for instance, several visits to Matt Swain’s classrooms that blew me away.

These techniques undoubtedly have value, but, save for those rare expert instances, they rarely tell us as much as we imagine.

Indeed, I would go further. Although we claim we are checking for understanding, we are really checking for performance. This is an important distinction, and we should not be shy about saying so.

Understanding is not directly observable. It exists inside the mind of the learner. What we can observe is performance. We can observe answers, explanations, procedures and behaviours. From these we infer understanding. Sometimes that inference is correct. Sometimes it is not.

A pupil who displays the correct response on a mini-whiteboard may possess secure knowledge. Equally, they may have guessed correctly. They may be copying a neighbour. They may have followed a procedure without understanding the underlying concept. They may even hold a misconception that simply was not exposed by the question we happened to ask.

Similarly, an incorrect answer does not tell us whether the pupil is uncertain, confused, partially correct or deeply committed to a misconception.

This is why confidence-weighted questions offer so much.

Instead of simply identifying whether an answer is right or wrong, they allow us to identify whether a pupil is right and confident, right but uncertain, wrong but uncertain, or wrong and highly confident.

Assessment becomes useful when it changes what we do next. These four categories imply four different teaching actions.

A pupil who is right and confident may be ready for extension.

A pupil who is right but uncertain may need reassurance and consolidation.

A pupil who is wrong and uncertain may need explanation and further practice.

A pupil who is wrong and highly confident may need something quite different. They may need cognitive conflict. They may need carefully chosen examples, counterexamples or experiences that force them to confront the limitations of their existing model.

In other words, confidence-weighted assessment is not merely telling us whether learning has happened; it is telling us what to do next.
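The four categories and their suggested teaching actions can be sketched as a small lookup in Python. This is an illustration under stated assumptions: confidence is taken to be a number between 0 and 1, and the 0.75 cut-off separating "confident" from "uncertain" is an arbitrary, hypothetical choice that a real system would need to calibrate.

```python
from enum import Enum


class State(Enum):
    RIGHT_CONFIDENT = "right and confident"
    RIGHT_UNCERTAIN = "right but uncertain"
    WRONG_UNCERTAIN = "wrong and uncertain"
    WRONG_CONFIDENT = "wrong and highly confident"


# Suggested next steps, paraphrased from the four cases described above.
NEXT_STEP = {
    State.RIGHT_CONFIDENT: "extension",
    State.RIGHT_UNCERTAIN: "reassurance and consolidation",
    State.WRONG_UNCERTAIN: "explanation and further practice",
    State.WRONG_CONFIDENT: "cognitive conflict via examples and counterexamples",
}


def classify(is_correct: bool, confidence: float, threshold: float = 0.75) -> State:
    """Place one response in one of the four states.

    `confidence` is assumed to lie in [0, 1]; `threshold` is a hypothetical
    cut-off at or above which a response counts as confident.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    confident = confidence >= threshold
    if is_correct:
        return State.RIGHT_CONFIDENT if confident else State.RIGHT_UNCERTAIN
    return State.WRONG_CONFIDENT if confident else State.WRONG_UNCERTAIN
```

For example, `NEXT_STEP[classify(False, 0.9)]` points towards cognitive conflict, whereas `NEXT_STEP[classify(False, 0.3)]` points towards explanation and further practice.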

That final category is especially important. The confidently wrong pupil represents one of the greatest challenges in teaching.

The pupil who knows they do not know something is often relatively easy to teach. They are receptive to explanation.

The pupil who believes they already understand is far more difficult. They have constructed a model of reality that appears coherent to them. New information must first dislodge the existing misconception before correct understanding can be established.

This is precisely where the hypercorrection literature becomes relevant. High-confidence errors are opportunities. They reveal misconceptions that might otherwise remain hidden. More importantly, when those misconceptions are exposed and corrected, the resulting learning can be particularly durable (Metcalfe, 2017).

There is fascinating potential when this is combined with modern technology.

The assessment systems we currently use are remarkably poor at distinguishing between a pupil who does not know, a pupil who partly knows, and a pupil who confidently believes something that is false. Yet these are entirely different educational states requiring entirely different interventions.

Imagine collecting millions of confidence-weighted responses. We could identify misconceptions across entire year groups, schools, trusts and nations. We could determine not merely which concepts pupils struggle with, but which misconceptions are most strongly held. We could identify the ideas that repeatedly generate confident errors and redesign teaching accordingly.
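As a rough sketch of what that analysis might look like, the Python below counts confident errors per question and chosen wrong answer, then ranks them. The `Response` record, its field names and the 0.75 confidence cut-off are all hypothetical, invented here for illustration; a real platform would have its own data model and would need to calibrate any threshold.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Response(NamedTuple):
    """One pupil's answer to one question (hypothetical record format)."""
    question_id: str
    chosen: str        # label of the answer the pupil selected
    correct: str       # label of the correct answer
    confidence: float  # assumed to lie in [0, 1]


def confident_error_counts(
    responses: Iterable[Response], threshold: float = 0.75
) -> List[Tuple[Tuple[str, str], int]]:
    """Rank (question, wrong answer) pairs by number of confident errors."""
    counts = Counter(
        (r.question_id, r.chosen)
        for r in responses
        if r.chosen != r.correct and r.confidence >= threshold
    )
    # most_common() returns the pairs sorted from most to least frequent.
    return counts.most_common()
```

The wrong answers at the top of the ranking are candidates for the most strongly held misconceptions, and so for the topics where teaching might be redesigned first.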

This seems to me a vastly richer source of information than simply recording whether an answer was correct or incorrect.

This leads me to a revised position.

In 2019, I viewed confidence-weighted multiple-choice questions primarily as a tool for improving learning through the hypercorrection effect. Today, I still believe that benefit exists. However, I increasingly view them as a sophisticated mechanism for identifying the confidently wrong.

The deeper lesson here, however, is not really about assessment at all. Rather, this is a discussion about intellectual humility.

We are all confidently wrong about many things. As adults, we can train ourselves to revisit our assumptions, stress test our beliefs and expose our ideas to new evidence. This is a high-level habit of mind that pupils are still developing. Teachers therefore need effective tools for creating the kinds of cognitive shocks that force misconceptions into the open.

Confidence-weighted multiple-choice questions are one such tool. They are simple to administer, engaging for pupils and strongly supported by the evidence base. More importantly, they remind us of something fundamental about learning.

Being wrong is not the problem. Being confidently wrong and never discovering it is.

 

References

Butterfield, B., & Metcalfe, J. (2001). Errors committed with high confidence are hypercorrected. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27(6), 1491-1494.

Foster, C., McMullen, J., Hirst, J., & Kelley, N. (2021). Exploring confidence and correctness in diagnostic mathematics assessment. Educational Studies in Mathematics, 108, 287-307.

McCourt, M. (2019). Teaching for Mastery. John Catt Educational.

Metcalfe, J. (2017). Learning from errors. Annual Review of Psychology, 68, 465-489.

Sparck, E. M., Bjork, E. L., & Bjork, R. A. (2016). On the learning benefits of confidence-weighted testing. Cognitive Research: Principles and Implications, 1(3).

Wu, J., De la Torre, J., & Wells, C. S. (2022). Certainty-based marking on multiple-choice items: Psychometrics meets decision theory. Psychometrika, 87, 1062-1087.