Criteria-Based Assessment: AI grading for BTEC, SQA, ATHE, IB, and 7 more frameworks

A Pearson BTEC Internal Quality Assurer recently described their grading cycle in one sentence: "Three weeks per cohort, six assessors, and a 19% disagreement rate on Merit-versus-Distinction calls." That is not a story about lazy assessors. It is the unavoidable reality of criteria-based assessment at scale — where each student submission must be evaluated against a hierarchy of Pass, Merit, and Distinction criteria, every claim backed by evidence, and every grade defensible at an awarding-body audit.

In 2026, that workflow no longer needs to take three weeks. The same RAG-enhanced grading technology that powers PrepareBuddy's AI Assessment module — with 94% human-grader alignment — now extends to threshold-based, criterion-by-criterion grading across 11 global frameworks. This guide walks through how it works, which frameworks are supported, and what changes for BTEC, SQA, IB, and ATHE providers when image-aware AI starts grading every criterion in minutes instead of hours.

What Makes Criteria-Based Assessment Hard to Automate

Traditional AI grading produces a percentage score. That is fine for a multiple-choice quiz and useless for a BTEC unit. Criteria-based qualifications operate on threshold logic: a student either meets every Pass criterion or they do not achieve a Pass at all. Merit and Distinction stack on top of Pass — you cannot reach Distinction by being excellent on one criterion if you missed three Pass criteria along the way.

This creates three problems that ordinary AI grading cannot solve:

  • Per-criterion evaluation: Every criterion in the unit needs an independent Met / Not Met judgement with supporting evidence quoted directly from the submission.
  • Threshold computation: Once criteria are judged, framework-specific rules determine the overall grade. BTEC, SQA, and ATHE use Pass/Merit/Distinction; NVQ uses Competent / Not Yet Competent; IB uses Elementary / Basic / Proficient / Advanced.
  • Evidence audit trail: Awarding bodies require proof. Assessors must show where in the submission each criterion was demonstrated, with a paper trail that survives external verification.
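The per-criterion requirement above can be sketched as a simple record: one independent judgement per criterion, with quoted evidence attached. The field names here are illustrative, not PrepareBuddy's actual schema.

```python
from dataclasses import dataclass

# Hypothetical record for a single criterion judgement.
# Field names are illustrative, not the module's real data model.
@dataclass
class CriterionResult:
    code: str          # e.g. "P1", "M2", "D1"
    status: str        # "met", "not_met", or "partially_met"
    evidence: str      # quote lifted from the student submission
    confidence: float  # 0.0-1.0

result = CriterionResult(
    code="P1",
    status="met",
    evidence="Section 2.3 identifies five types of business information.",
    confidence=0.92,
)
print(result.code, result.status)  # P1 met
```

A list of such records is all the threshold step needs: the grade is computed from the statuses, and the evidence strings become the audit trail.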

11 Frameworks, Zero Reconfiguration

PrepareBuddy's Criteria-Based Assessment module supports 11 global frameworks out of the box. Selecting an awarding body auto-configures the grading levels, criterion hierarchy, and threshold logic — no spreadsheet templates, no manual mark scheme uploads.

Framework | Grading Levels | Typical Use
Pearson BTEC (HNC/HND) | Pass / Merit / Distinction | UK higher national qualifications
SQA | Pass / Merit / Distinction | Scottish qualifications
ATHE | Pass / Merit / Distinction | Alternative higher education providers
Qualifi | Pass / Merit / Distinction | Professional qualifications
OTHM | Pass / Merit / Distinction | International higher education
NVQ / SVQ | Competent / Not Yet Competent | National vocational qualifications
IB | Elementary / Basic / Proficient / Advanced | International Baccalaureate
Cambridge International | Level 1 / Level 2 / Level 3 | IGCSE, AS/A-Level
AQA / OCR / Edexcel | Level 1–4 | UK exam board mark schemes
NAAC / NBA | Below Average to Excellent | Indian accreditation
Competency-Based Education | Mastery | Outcome-based programs

For institutions running bespoke frameworks — a corporate training body, a sector-specific licensing scheme — a custom grading option lets the centre define its own criteria hierarchy and threshold rules without losing the rest of the pipeline.
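As a rough sketch, a custom framework of this kind could be expressed as plain configuration data: levels, a criterion hierarchy, and a threshold rule. The shape below is hypothetical, not the product's actual configuration format.

```python
# Hypothetical custom-framework definition — illustrative only.
# Criterion codes are grouped by prefix ("C" = core, "E" = extended).
custom_framework = {
    "name": "Acme Corporate Licensing",
    "levels": ["Not Yet Competent", "Competent", "Expert"],
    "criteria": {
        "C1": "Demonstrates safe equipment handling",
        "C2": "Completes the licensing checklist",
        "E1": "Coaches a junior colleague through the procedure",
    },
    # Threshold rule: all C* met -> Competent; all C* and E* met -> Expert.
    "rule": {"Competent": ["C"], "Expert": ["C", "E"]},
}
print(custom_framework["levels"][1])  # Competent
```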

Image-Aware Evaluation: Where Generic AI Grading Fails

BTEC engineering units, IT projects, and design qualifications routinely include diagrams, wireframes, screenshots, ER models, and chart-based evidence. Text-only AI grading silently ignores all of it — which means the criterion that says "justify your network topology with a clearly labelled diagram" gets a Not Met even when the diagram is perfect.

The CBA module solves this by extracting images from uploaded PDFs, DOCX, and PPTX files in their original positions, then analysing them in context with a vision model alongside the text. The grading layer uses two models in tandem:

Component | Model | Purpose
Text evaluation | 120B-parameter generative model | Criterion matching, evidence extraction, feedback
Image analysis | 17B vision-language model | Diagram interpretation, chart analysis, screenshot reading

Evidence from visual content is no longer missed. For programme leaders running BTEC HND Computing, Engineering, or Creative Media Production, that is the difference between AI grading being a curiosity and being usable.

Seven Submission Types, One Pipeline

A single BTEC unit can demand a written report, a presentation, a code project, and an audio reflection. Forcing all of that through a text-only grading flow is a non-starter. The CBA module accepts:

Submission Type | Formats | Processing
Document | PDF, DOCX | Text extraction with inline image analysis
Presentation | PPTX | Slide content + speaker notes + embedded images
Audio | MP3, WAV, M4A | Transcription + content analysis
Code project | ZIP | File extraction, code review, documentation analysis
Rich text | Online entry | Direct text evaluation
Portfolio | Multiple files | Aggregated evidence across files
Video | MP4, MOV | Transcript extraction + visual analysis
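A minimal sketch of how a pipeline like this might route each submission type, assuming a simple extension-to-handler map. The handler names are invented for illustration and are not the module's real API.

```python
from pathlib import Path

# Hypothetical extension -> processing-step map, mirroring the table above.
HANDLERS = {
    ".pdf": "text_and_inline_images",
    ".docx": "text_and_inline_images",
    ".pptx": "slides_notes_images",
    ".mp3": "transcribe", ".wav": "transcribe", ".m4a": "transcribe",
    ".zip": "code_review",
    ".mp4": "transcript_and_frames", ".mov": "transcript_and_frames",
}

def route(filename: str) -> str:
    """Pick a processing step from the file extension; default to rich text."""
    return HANDLERS.get(Path(filename).suffix.lower(), "direct_text")

print(route("report.PDF"))  # text_and_inline_images
print(route("pitch.pptx"))  # slides_notes_images
```

Anything unrecognised falls through to direct text evaluation, which matches how a rich-text online entry would be handled.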

How Threshold-Based Grading Actually Computes

The grading engine evaluates each criterion independently — Met, Not Met, or Partially Met — then applies the framework's threshold rule. The Pearson BTEC logic is the cleanest illustration:

If ALL Pass + ALL Merit + ALL Distinction criteria met -> Distinction
If ALL Pass + ALL Merit met                            -> Merit
If ALL Pass met                                        -> Pass
Otherwise                                              -> Refer

The same engine adapts automatically to NVQ (all criteria must be Competent), IB (tiered achievement levels), and NAAC/NBA (four-tier quality grading). Programme leaders never write the threshold logic themselves — they select the awarding body and the rule fires.
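The BTEC rule can be sketched in a few lines of Python, assuming criterion codes begin with their band letter (P, M, D) and each maps to a met/not-met boolean. This is a minimal illustration of the threshold logic, not the engine's actual implementation.

```python
# Sketch of the BTEC threshold rule: each band only counts if every
# lower band is fully met. Codes are assumed to start with P, M, or D.
def btec_grade(results: dict[str, bool]) -> str:
    def band_met(prefix: str) -> bool:
        return all(met for code, met in results.items() if code.startswith(prefix))

    if not band_met("P"):
        return "Refer"        # any missed Pass criterion blocks everything
    if not band_met("M"):
        return "Pass"
    if not band_met("D"):
        return "Merit"
    return "Distinction"

print(btec_grade({"P1": True, "P2": True, "M1": True, "D1": False}))  # Merit
print(btec_grade({"P1": True, "P2": False, "M1": True, "D1": True}))  # Refer
```

The second call shows the stacking property from earlier: a perfect Distinction criterion cannot rescue a missed Pass criterion. The NVQ rule is the degenerate case with a single band; IB and NAAC/NBA swap the band check for tiered level mapping.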

Per-Criterion Evidence: What Assessors Actually See

Every criterion lands with a Met/Not Met status, a confidence score, the exact quoted evidence, and improvement-focused feedback. Here is the kind of output an assessor reviews:

Criterion: P1 — Explain the features and uses of business information
Status: Met
Confidence: 0.92
Evidence: "In section 2.3, the student identifies five distinct types of business information and explains the specific use case for each within their chosen organization."
Feedback: The response demonstrates clear understanding with practical examples. To strengthen the submission for Merit criteria, consider adding comparative analysis between information types and their interdependencies in organisational decision-making.

For Internal Quality Assurers, this is the audit trail that makes external verification straightforward — every grade is reproducible from the stored evidence.

Batch Processing for Whole Cohorts

Single-student grading is a demo. Real institutions need to grade entire cohorts in the days between submission deadline and results release. The batch pipeline handles this with parallel processing and cohort-level consistency analysis:

Cohort Size | Processing Mode | Adds
1–10 students | Sequential | Individual reports
10–50 students | Parallel | Progress tracking + reports
50+ students | Parallel with monitoring | Consistency analysis + cross-tab grid

The consistency analysis is what IQAs find most useful: a per-criterion Met percentage across the cohort, a grade distribution view, and a cross-tab of every criterion against every student. Criteria with unusually low or high pass rates — the classic signals of an ambiguous brief or a grading drift — surface automatically.
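The per-criterion Met percentage and outlier flagging can be sketched as follows. The 40%/95% flag thresholds are invented for illustration; the product may use different signals.

```python
# Cohort consistency sketch: per-criterion Met percentage across students,
# flagging criteria with unusually low or high pass rates.
def criterion_pass_rates(cohort: dict[str, dict[str, bool]]) -> dict[str, float]:
    """cohort maps student -> {criterion code: met?}; returns code -> Met %."""
    codes = {c for results in cohort.values() for c in results}
    return {
        code: 100 * sum(r.get(code, False) for r in cohort.values()) / len(cohort)
        for code in sorted(codes)
    }

cohort = {
    "alice": {"P1": True, "P2": True, "M1": False},
    "bob":   {"P1": True, "P2": False, "M1": False},
    "cara":  {"P1": True, "P2": True, "M1": True},
}
rates = criterion_pass_rates(cohort)
print({c: round(p, 1) for c, p in rates.items()})  # {'M1': 33.3, 'P1': 100.0, 'P2': 66.7}

# Hypothetical flag thresholds: very low or very high rates suggest an
# ambiguous brief or grading drift worth IQA sampling.
flagged = [c for c, pct in rates.items() if pct < 40 or pct > 95]
print(flagged)  # ['M1', 'P1']
```

The cross-tab grid described above is the same matrix viewed the other way: every criterion against every student, with the flagged columns as the natural starting point for an IQA sample.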

The 3-Step Wizard: From Brief to Grades in One Sitting

Programme leaders who do not want to model the unit hierarchy by hand can use the quick-setup wizard:

  1. Upload the assignment brief. AI extracts the criteria, learning outcomes, and scenario from the brief PDF.
  2. Review extracted criteria. Edit or confirm the AI-extracted criterion hierarchy — this is where a programme leader sanity-checks the codes and wording.
  3. Upload student files and evaluate. Bulk upload submissions, run the batch, and review results with per-criterion evidence.

The same wizard works for a single BTEC unit, an IB internal assessment, or an SQA Higher National Unit — the framework switch only changes the threshold logic underneath.

Reports and Audit Trail

Every batch run produces a history snapshot preserving the criterion evaluations, grade distribution, AI model identity, and evaluator details. This enables three things that paper-based assessment cannot:

  • Grade appeals with full evidence trail — the original evaluation is reproducible months later.
  • IQA sampling driven by cross-tab analysis instead of random selection.
  • Regulatory compliance documentation for awarding-body verification visits.

Reports export to PDF, DOCX, CSV, or a ZIP bundle for batch distribution. Email delivery can be automated with the institution's branding, signature, and a custom message body.

Manual CBA Grading vs AI-Powered CBA

Aspect | Manual CBA Grading | AI-Powered CBA
Per-criterion evaluation | Hours per student | Minutes per student
Evidence documentation | Assessor handwrites notes | AI extracts and quotes evidence
Consistency across cohort | Varies by assessor | Identical standards applied
Threshold calculation | Manual tally | Automatic with framework logic
Report generation | Manual Word documents | One-click PDF/DOCX/CSV
Image/diagram evidence | Often overlooked | Automatically analysed
Audit trail | Paper-based | Digital history snapshots
IQA sampling | Manual selection | Cross-tab identifies outliers
Framework switching | Full reconfiguration | Select from dropdown

Who This Is For

Criteria-based AI grading is most valuable for:

  • BTEC, SQA, ATHE, OTHM, and Qualifi centres — whose entire grading model is threshold-based and whose IQA process is currently a bottleneck.
  • IB and Cambridge International schools — where tiered grading and evidence-based assessment dominate internal assessment workflows.
  • Vocational training providers — running NVQ/SVQ portfolios where the question is always Competent vs Not Yet Competent against many criteria.
  • Universities running NAAC/NBA accreditation — needing systematic criterion-level evidence for institutional reviews.
  • Corporate L&D teams — using competency-based education frameworks where mastery against defined outcomes is the unit of progress.

For B2C language test takers, this is not the right product — that workflow lives in AI Assessment and AI Writing Analysis. CBA is built specifically for institutions whose grades are defended at awarding-body audits.

What It Means for Institutions in 2026

The institutions running BTEC, SQA, IB, and similar qualifications carry an unusual burden: every grade has to be defensible long after the student has moved on. Manual grading was already labour-intensive; rising cohort sizes and tighter results-release windows have made it untenable.

Criteria-based AI grading does not replace the assessor — it gives the assessor a first pass with per-criterion evidence, a consistency view across the cohort, and a clean audit trail. The human still owns the final grade decision, especially on borderline Merit-versus-Distinction calls. What changes is where the time goes: less on transcribing evidence into Word documents, more on the judgement calls that actually require human expertise.

Combined with PrepareBuddy's wider platform for universities (adaptive language testing, psychometric assessments, and student journey management), criteria-based AI grading slots into a complete higher-education stack that 200+ institutions already use.

Try It on a Real Unit

The fastest way to see whether criteria-based AI grading fits your institution is to run a single unit through the pipeline. Upload one assignment brief and a small batch of student submissions, review the per-criterion evidence, and compare the grade distribution to your current manual cycle.

Schedule a demo or sign up to start with a free first month — no credit card required, no lock-in contracts. For procurement-level conversations, the For Universities page covers deployment, white-label branding, and the 24–48 hour onboarding window.
