A Pearson BTEC Internal Quality Assurer recently described their grading cycle in one sentence: "Three weeks per cohort, six assessors, and a 19% disagreement rate on Merit-versus-Distinction calls." That is not a story about lazy assessors. It is the unavoidable reality of criteria-based assessment at scale — where each student submission must be evaluated against a hierarchy of Pass, Merit, and Distinction criteria, every claim backed by evidence, and every grade defensible at an awarding-body audit.
In 2026, that workflow no longer needs to take three weeks. The same RAG-enhanced grading technology that powers PrepareBuddy's AI Assessment module — with 94% human-grader alignment — now extends to threshold-based, criterion-by-criterion grading across 11 global frameworks. This guide walks through how it works, which frameworks are supported, and what changes for BTEC, SQA, IB, and ATHE providers when image-aware AI starts grading every criterion in minutes instead of hours.
What Makes Criteria-Based Assessment Hard to Automate
Traditional AI grading produces a percentage score. That is fine for a multiple-choice quiz and useless for a BTEC unit. Criteria-based qualifications operate on threshold logic: a student either meets every Pass criterion or they do not achieve a Pass at all. Merit and Distinction stack on top of Pass — you cannot reach Distinction by being excellent on one criterion if you missed three Pass criteria along the way.
This creates three problems that ordinary AI grading cannot solve:
- Per-criterion evaluation: Every criterion in the unit needs an independent Met / Not Met judgement with supporting evidence quoted directly from the submission.
- Threshold computation: Once criteria are judged, framework-specific rules determine the overall grade. BTEC, SQA, and ATHE use Pass/Merit/Distinction; NVQ uses Competent / Not Yet Competent; IB uses Elementary / Basic / Proficient / Advanced.
- Evidence audit trail: Awarding bodies require proof. Assessors must show where in the submission each criterion was demonstrated, with a paper trail that survives external verification.
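In data terms, each per-criterion judgement is a small, self-contained record combining the three requirements above. A minimal sketch — the field names here are illustrative, not PrepareBuddy's actual schema:

```python
from dataclasses import dataclass

@dataclass
class CriterionResult:
    code: str          # criterion code, e.g. "P1", "M2", "D1"
    status: str        # "Met", "Not Met", or "Partially Met"
    confidence: float  # model confidence in the judgement, 0.0-1.0
    evidence: str      # quote located in the student's submission
    feedback: str      # improvement-focused comment for the student

# One record per criterion; the collection forms the unit's audit trail.
result = CriterionResult(
    code="P1",
    status="Met",
    confidence=0.92,
    evidence="In section 2.3, the student identifies five types of business information.",
    feedback="Clear understanding demonstrated with practical examples.",
)
```

Because every record carries its own quoted evidence, the audit trail survives independently of the original grading session.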
11 Frameworks, Zero Reconfiguration
PrepareBuddy's Criteria-Based Assessment module supports 11 global frameworks out of the box. Selecting an awarding body auto-configures the grading levels, criterion hierarchy, and threshold logic — no spreadsheet templates, no manual mark scheme uploads.
| Framework | Grading Levels | Typical Use |
|---|---|---|
| Pearson BTEC (HNC/HND) | Pass / Merit / Distinction | UK higher national qualifications |
| SQA | Pass / Merit / Distinction | Scottish qualifications |
| ATHE | Pass / Merit / Distinction | Alternative higher education providers |
| Qualifi | Pass / Merit / Distinction | Professional qualifications |
| OTHM | Pass / Merit / Distinction | International higher education |
| NVQ / SVQ | Competent / Not Yet Competent | National vocational qualifications |
| IB | Elementary / Basic / Proficient / Advanced | International Baccalaureate |
| Cambridge International | Level 1 / Level 2 / Level 3 | IGCSE, AS/A-Level |
| AQA / OCR / Edexcel | Level 1–4 | UK exam board mark schemes |
| NAAC / NBA | Below Average to Excellent | Indian accreditation |
| Competency-Based Education | Mastery | Outcome-based programs |
For institutions running bespoke frameworks — a corporate training body, a sector-specific licensing scheme — a custom grading option lets the centre define its own criteria hierarchy and threshold rules without losing the rest of the pipeline.
Image-Aware Evaluation: Where Generic AI Grading Fails
BTEC engineering units, IT projects, and design qualifications routinely include diagrams, wireframes, screenshots, ER models, and chart-based evidence. Text-only AI grading silently ignores all of it — which means the criterion that says "justify your network topology with a clearly labelled diagram" gets a Not Met even when the diagram is perfect.
The CBA module solves this by extracting images from uploaded PDFs, DOCX, and PPTX files in their original positions, then analysing them in context with a vision model alongside the text. The grading layer uses two models in tandem:
| Component | Model | Purpose |
|---|---|---|
| Text evaluation | 120B-parameter generative model | Criterion matching, evidence extraction, feedback |
| Image analysis | 17B vision-language model | Diagram interpretation, chart analysis, screenshot reading |
Evidence from visual content is no longer missed. For programme leaders running BTEC HND Computing, Engineering, or Creative Media Production, that is the difference between AI grading being a curiosity and being usable.
Seven Submission Types, One Pipeline
A single BTEC unit can demand a written report, a presentation, a code project, and an audio reflection. Forcing all of that through a text-only grading flow is a non-starter. The CBA module accepts:
| Submission Type | Formats | Processing |
|---|---|---|
| Document | PDF, DOCX | Text extraction with inline image analysis |
| Presentation | PPTX | Slide content + speaker notes + embedded images |
| Audio | MP3, WAV, M4A | Transcription + content analysis |
| Code project | ZIP | File extraction, code review, documentation analysis |
| Rich text | Online entry | Direct text evaluation |
| Portfolio | Multiple files | Aggregated evidence across files |
| Video | MP4, MOV | Transcript extraction + visual analysis |
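Routing a mixed bag of files into one pipeline comes down to dispatch on file type. A sketch of that dispatch, assuming extension-based routing — the pipeline names are placeholders, not PrepareBuddy API identifiers:

```python
from pathlib import Path

# Illustrative mapping from file extension to processing pipeline,
# mirroring the table above.
PIPELINES = {
    ".pdf": "document", ".docx": "document",
    ".pptx": "presentation",
    ".mp3": "audio", ".wav": "audio", ".m4a": "audio",
    ".zip": "code_project",
    ".mp4": "video", ".mov": "video",
}

def route_submission(filename: str) -> str:
    """Pick the processing pipeline for a submitted file by extension."""
    return PIPELINES.get(Path(filename).suffix.lower(), "rich_text")
```

Portfolios are simply multiple routed files whose evidence is aggregated after each file has been processed by its own pipeline.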
How Threshold-Based Grading Actually Computes
The grading engine evaluates each criterion independently — Met, Not Met, or Partially Met — then applies the framework's threshold rule. The Pearson BTEC logic is the cleanest illustration:
```
If ALL Pass + ALL Merit + ALL Distinction criteria met -> Distinction
If ALL Pass + ALL Merit met                            -> Merit
If ALL Pass met                                        -> Pass
Otherwise                                              -> Refer
```

The same engine adapts automatically to NVQ (all criteria must be Competent), IB (tiered achievement levels), and NAAC/NBA (four-tier quality grading). Programme leaders never write the threshold logic themselves: they select the awarding body and the rule fires.
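The BTEC rule can be expressed as a short function. A minimal sketch, assuming each criterion code carries its band letter ("P", "M", "D") and a boolean Met status — this is an illustration of the threshold logic, not the engine's actual code:

```python
def btec_grade(results: dict[str, bool]) -> str:
    """Apply BTEC threshold logic to per-criterion Met/Not Met results.

    Keys are criterion codes such as "P1", "M2", "D1"; the leading
    letter identifies the band. Returns the overall unit grade.
    """
    def band_met(prefix: str) -> bool:
        # True only if every criterion in the band is Met
        return all(met for code, met in results.items() if code.startswith(prefix))

    if not band_met("P"):
        return "Refer"  # any missed Pass criterion blocks every higher grade
    if not band_met("M"):
        return "Pass"
    if not band_met("D"):
        return "Merit"
    return "Distinction"
```

Note how the threshold stacks: a single missed Pass criterion produces Refer regardless of how many Merit or Distinction criteria were met.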
Per-Criterion Evidence: What Assessors Actually See
Every criterion lands with a Met/Not Met status, a confidence score, the exact quoted evidence, and improvement-focused feedback. Here is the kind of output an assessor reviews:
- **Criterion:** P1 — Explain the features and uses of business information
- **Status:** Met
- **Confidence:** 0.92
- **Evidence:** "In section 2.3, the student identifies five distinct types of business information and explains the specific use case for each within their chosen organization."
- **Feedback:** The response demonstrates clear understanding with practical examples. To strengthen the submission for Merit criteria, consider adding comparative analysis between information types and their interdependencies in organisational decision-making.
For Internal Quality Assurers, this is the audit trail that makes external verification straightforward — every grade is reproducible from the stored evidence.
Batch Processing for Whole Cohorts
Single-student grading is a demo. Real institutions need to grade entire cohorts in the days between submission deadline and results release. The batch pipeline handles this with parallel processing and cohort-level consistency analysis:
| Cohort Size | Processing Mode | Added Features |
|---|---|---|
| 1–10 students | Sequential | Individual reports |
| 11–50 students | Parallel | Progress tracking + reports |
| 51+ students | Parallel with monitoring | Consistency analysis + cross-tab grid |
The consistency analysis is what IQAs find most useful: a per-criterion Met percentage across the cohort, a grade distribution view, and a cross-tab of every criterion against every student. Criteria with unusually low or high pass rates — the classic signals of an ambiguous brief or grading drift — surface automatically.
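The core of that consistency check is straightforward to state. A sketch, assuming per-student Met/Not Met results and illustrative outlier thresholds — the function names and cutoffs here are assumptions, not the product's tuned values:

```python
def criterion_pass_rates(cohort: dict[str, dict[str, bool]]) -> dict[str, float]:
    """Per-criterion Met fraction across a cohort.

    `cohort` maps student id -> {criterion code: met?}.
    """
    counts: dict[str, int] = {}
    met: dict[str, int] = {}
    for judgements in cohort.values():
        for code, is_met in judgements.items():
            counts[code] = counts.get(code, 0) + 1
            met[code] = met.get(code, 0) + int(is_met)
    return {code: met[code] / counts[code] for code in counts}

def flag_outliers(rates: dict[str, float], low: float = 0.2, high: float = 0.95) -> list[str]:
    """Criteria with unusually low or high pass rates -- the classic
    signals of an ambiguous brief or grading drift."""
    return sorted(code for code, rate in rates.items() if rate < low or rate > high)
```

A criterion that nearly everyone fails usually points at the brief, not the students; one that everyone passes may not be discriminating at all. Either way, it is a sampling target for the IQA.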
The 3-Step Wizard: From Brief to Grades in One Sitting
Programme leaders who do not want to model the unit hierarchy by hand can use the quick-setup wizard:
1. Upload the assignment brief. AI extracts the criteria, learning outcomes, and scenario from the brief PDF.
2. Review extracted criteria. Edit or confirm the AI-extracted criterion hierarchy — this is where a programme leader sanity-checks the codes and wording.
3. Upload student files and evaluate. Bulk upload submissions, run the batch, and review results with per-criterion evidence.
The same wizard works for a single BTEC unit, an IB internal assessment, or an SQA Higher National Unit — the framework switch only changes the threshold logic underneath.
Reports and Audit Trail
Every batch run produces a history snapshot preserving the criterion evaluations, grade distribution, AI model identity, and evaluator details. This enables three things that paper-based assessment cannot:
- Grade appeals with full evidence trail — the original evaluation is reproducible months later.
- IQA sampling driven by cross-tab analysis instead of random selection.
- Regulatory compliance documentation for awarding-body verification visits.
Reports export to PDF, DOCX, CSV, or a ZIP bundle for batch distribution. Email delivery can be automated with the institution's branding, signature, and a custom message body.
Manual CBA Grading vs AI-Powered CBA
| Aspect | Manual CBA Grading | AI-Powered CBA |
|---|---|---|
| Per-criterion evaluation | Hours per student | Minutes per student |
| Evidence documentation | Assessor handwrites notes | AI extracts and quotes evidence |
| Consistency across cohort | Varies by assessor | Identical standards applied |
| Threshold calculation | Manual tally | Automatic with framework logic |
| Report generation | Manual Word documents | One-click PDF/DOCX/CSV |
| Image/diagram evidence | Often overlooked | Automatically analysed |
| Audit trail | Paper-based | Digital history snapshots |
| IQA sampling | Manual selection | Cross-tab identifies outliers |
| Framework switching | Full reconfiguration | Select from dropdown |
Who This Is For
Criteria-based AI grading is most valuable for:
- BTEC, SQA, ATHE, OTHM, and Qualifi centres — whose entire grading model is threshold-based and whose IQA process is currently a bottleneck.
- IB and Cambridge International schools — where tiered grading and evidence-based assessment dominate internal assessment workflows.
- Vocational training providers — running NVQ/SVQ portfolios where the question is always Competent vs Not Yet Competent against many criteria.
- Universities running NAAC/NBA accreditation — needing systematic criterion-level evidence for institutional reviews.
- Corporate L&D teams — using competency-based education frameworks where mastery against defined outcomes is the unit of progress.
For B2C language test takers, this is not the right product — that workflow lives in AI Assessment and AI Writing Analysis. CBA is built specifically for institutions whose grades are defended at awarding-body audits.
What It Means for Institutions in 2026
The institutions running BTEC, SQA, IB, and similar qualifications carry an unusual burden: every grade has to be defensible long after the student has moved on. Manual grading was already labour-intensive; rising cohort sizes and tighter results-release windows have made it untenable.
Criteria-based AI grading does not replace the assessor — it gives the assessor a first pass with per-criterion evidence, a consistency view across the cohort, and a clean audit trail. The human still owns the final grade decision, especially on borderline Merit-versus-Distinction calls. What changes is where the time goes: less on transcribing evidence into Word documents, more on the judgement calls that actually require human expertise.
Combined with PrepareBuddy's wider platform for universities — adaptive language testing, psychometric assessments, and student journey management — criteria-based AI grading slots into a complete higher-education stack that 200+ institutions already use.
Try It on a Real Unit
The fastest way to see whether criteria-based AI grading fits your institution is to run a single unit through the pipeline. Upload one assignment brief and a small batch of student submissions, review the per-criterion evidence, and compare the grade distribution to your current manual cycle.
Schedule a demo or sign up to start with a free first month — no credit card required, no lock-in contracts. For procurement-level conversations, the For Universities page covers deployment, white-label branding, and the 24–48 hour onboarding window.