RAG Configuration API Guide

Overview

RAG (Retrieval-Augmented Generation) enhances teacher feedback evaluation quality by providing the AI with relevant reference examples during evaluation. This guide explains how to configure and use RAG via the Assessment API.

What is RAG?

RAG finds similar high-quality feedback examples from your reference library and provides them as context to the AI evaluator. This results in:

  • +18-28% accuracy improvement in evaluation consistency
  • More specific feedback based on your quality standards
  • Context-aware scoring that learns from your examples
  • Actionable recommendations aligned with your rubric

Prerequisites

Before using RAG via API, you need:

User Requirements

  1. Valid account on the platform
  2. Organization membership with one of these roles:
       • Admin - Full organization access
       • Examiner - Can create evaluations and manage assessments
       • Super Admin - Platform-wide access
  3. Active membership status
  4. API authentication token - See Authentication Guide for details

Note: If you don't have the required membership, contact your organization administrator.

RAG Setup Requirements

  1. Create Reference Feedbacks in the web interface at /assessment/teacher-evaluation/references/
  2. Link to Rubric - Reference feedbacks must be associated with your evaluation rubric
  3. Quality Examples - Include 10-20 examples covering different quality levels
  4. Vector Embeddings - System automatically generates embeddings when you create references

RAG API Parameters

When creating a batch evaluation, include these RAG parameters:

{
  "name": "My Batch",
  "rubric_id": 1,
  "organization_id": 1,
  "feedback_ids": [101, 102, 103],

  "use_reference_matching": true,
  "reference_matching_mode": "dynamic",
  "reference_detail_level": "detailed",
  "max_references": 5,
  "evaluation_instructions": "Additional AI context..."
}
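
For programmatic use, the same request can be sent from any HTTP client. The following is a minimal Python sketch using the requests library; the base URL, token, and IDs are placeholders to replace with your own values.

# Minimal sketch: creating a RAG-enabled batch from Python. BASE_URL, API_TOKEN
# and the IDs below are placeholders, not real values.
import requests

BASE_URL = "https://your-domain.com"
API_TOKEN = "your-token"

payload = {
    "name": "My Batch",
    "rubric_id": 1,
    "organization_id": 1,
    "feedback_ids": [101, 102, 103],
    "use_reference_matching": True,
    "reference_matching_mode": "dynamic",
    "reference_detail_level": "detailed",
    "max_references": 5,
    "evaluation_instructions": "Additional AI context...",
}

response = requests.post(
    f"{BASE_URL}/api/batches/create/",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json()["batch"]["id"])  # batch id to track the evaluation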

Parameter Reference

| Parameter | Type | Default | Description |
|---|---|---|---|
| use_reference_matching | boolean | false | Enable/disable RAG |
| reference_matching_mode | string | "dynamic" | Matching strategy |
| reference_detail_level | string | "summary" | Analysis depth |
| max_references | integer | 5 | Number of examples (1-10) |
| evaluation_instructions | string | "" | Additional AI context |

Parameter Details

use_reference_matching

Type: boolean
Default: false
Description: Master switch to enable/disable RAG

{
  "use_reference_matching": true
}

When to Enable:
  • ✓ You have 10+ reference feedbacks for the rubric
  • ✓ You want more consistent evaluations
  • ✓ You need context-aware scoring
  • ✓ You have established quality standards

When to Disable:
  • ✗ No reference feedbacks available
  • ✗ First-time testing/exploration
  • ✗ Generic evaluation without specific standards


reference_matching_mode

Type: string
Values: "static" | "dynamic"
Default: "dynamic"
Description: How reference examples are selected

Dynamic Mode

{
  "reference_matching_mode": "dynamic"
}
  • Uses vector similarity search to find most relevant examples
  • Automatically adapts to each feedback's context
  • Best for varied feedback types
  • Recommended for most use cases

How it works (a conceptual sketch follows the example below):
  1. Converts feedback to vector embedding
  2. Searches reference library for similar examples
  3. Returns top N most similar references
  4. AI uses these as evaluation context

Example:

Feedback: "Student needs to improve grammar and clarity"
↓
Vector Search finds similar references:
- "Grammar needs attention" (similarity: 0.92)
- "Writing clarity could improve" (similarity: 0.88)
- "Structure and grammar issues" (similarity: 0.85)

Static Mode

{
  "reference_matching_mode": "static"
}
  • Uses predefined set of reference examples
  • Same examples for all feedbacks in batch
  • Faster but less context-aware
  • Useful for standardized evaluations

When to use Static:
  • All feedbacks have similar context
  • You've curated specific examples for this batch type
  • Speed is more important than precision


reference_detail_level

Type: string
Values: "summary" | "detailed"
Default: "summary"
Description: How much reference detail the AI receives

Summary Level

{
  "reference_detail_level": "summary"
}
  • Provides key highlights from references
  • Faster processing (fewer tokens)
  • Good for straightforward evaluations
  • Reduces API costs

Summary includes:
  • Overall score
  • Key strengths
  • Main areas for improvement
  • Brief recommendations

Detailed Level

{
  "reference_detail_level": "detailed"
}
  • Provides complete reference feedback
  • Better evaluation quality
  • More accurate comparisons
  • Slightly higher API cost

Detailed includes:
  • Full criterion-by-criterion feedback
  • Complete strengths and weaknesses
  • Detailed recommendations
  • Consistency notes
  • Reference comparison insights

Recommendation: Use "detailed" unless processing costs are a concern. The quality improvement outweighs the marginal cost increase.


max_references

Type: integer
Range: 1-10
Default: 5
Description: Maximum number of reference examples to use

{
  "max_references": 5
}

Choosing the Right Number:

| Value | Use Case | Token Cost | Quality |
|---|---|---|---|
| 1-2 | Quick evaluation, limited references | Low | Basic |
| 3-5 | Recommended - Balanced approach | Medium | High |
| 6-8 | Complex evaluations, large reference library | High | Very High |
| 9-10 | Maximum context (rarely needed) | Very High | Maximum |

Guidelines:
  • Start with 5 - Good balance of quality and cost
  • Increase to 7-8 if you have 20+ diverse references
  • Use 1-3 for simple, standardized evaluations
  • Never exceed 10 - diminishing returns


evaluation_instructions

Type: string (optional)
Default: ""
Description: Additional context or instructions for the AI evaluator

{
  "evaluation_instructions": "Pay special attention to specific, measurable recommendations for student improvement. Focus on actionability over general praise."
}

Use Cases:
  • Emphasize specific rubric criteria
  • Provide batch-specific context
  • Highlight organizational standards
  • Guide tone or style preferences

Examples:

// Focus on specificity
{
  "evaluation_instructions": "Prioritize feedback specificity. Score higher for concrete examples and actionable suggestions."
}

// Emphasis on growth mindset
{
  "evaluation_instructions": "Evaluate how well feedback promotes student growth mindset and provides clear paths for improvement."
}

// Subject-specific context
{
  "evaluation_instructions": "For mathematics feedback, assess clarity of problem-solving explanations and step-by-step guidance."
}

Complete Configuration Examples

Example 1: Recommended Configuration

curl -X POST https://your-domain.com/api/batches/create/ \
  -H "Authorization: Bearer your-token" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Weekly Teacher Evaluations",
    "rubric_id": 5,
    "organization_id": 1,
    "feedback_ids": [101, 102, 103, 104, 105],

    "use_reference_matching": true,
    "reference_matching_mode": "dynamic",
    "reference_detail_level": "detailed",
    "max_references": 5
  }'

Best for: Most use cases, balanced quality and cost


Example 2: High-Quality Configuration

curl -X POST https://your-domain.com/api/batches/create/ \
  -H "Authorization: Bearer your-token" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Annual Performance Review",
    "rubric_id": 5,
    "organization_id": 1,
    "feedback_ids": [201, 202, 203],

    "use_reference_matching": true,
    "reference_matching_mode": "dynamic",
    "reference_detail_level": "detailed",
    "max_references": 8,
    "evaluation_instructions": "This is a comprehensive annual review. Provide thorough analysis with specific examples and detailed improvement recommendations."
  }'

Best for: Critical evaluations, annual reviews, comprehensive assessments


Example 3: Fast Processing Configuration

curl -X POST https://your-domain.com/api/batches/create/ \
  -H "Authorization: Bearer your-token" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Quick Check Evaluations",
    "rubric_id": 5,
    "organization_id": 1,
    "feedback_ids": [301, 302, 303, 304, 305, 306, 307, 308, 309, 310],

    "use_reference_matching": true,
    "reference_matching_mode": "static",
    "reference_detail_level": "summary",
    "max_references": 3
  }'

Best for: Large batches, quick turnaround, standardized evaluations


Example 4: No RAG (Baseline)

curl -X POST https://your-domain.com/api/batches/create/ \
  -H "Authorization: Bearer your-token" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Baseline Evaluations",
    "rubric_id": 5,
    "organization_id": 1,
    "feedback_ids": [401, 402, 403],

    "use_reference_matching": false
  }'

Best for: Testing, comparison baselines, when no references exist


Understanding RAG Results

When RAG is enabled, evaluation results include reference comparison data:

{
  "batch": {...},
  "results": [
    {
      "feedback": {...},
      "evaluation": {
        "overall_score": 42.5,
        "percentage": 85.0,
        "reference_comparisons": {
          "similar_references_found": 5,
          "top_reference": {
            "id": 45,
            "title": "Excellent Math Feedback Example",
            "quality_level": "excellent",
            "similarity_score": 0.89,
            "score": 48.0
          },
          "comparison_insights": "This feedback shows similar clarity to reference #45 but could benefit from more specific examples as demonstrated in reference #23..."
        },
        ...
      }
    }
  ]
}

Key Fields:
  • similar_references_found - Number of references used
  • top_reference - Most similar reference example
  • similarity_score - How similar (0.0-1.0, higher is more similar)
  • comparison_insights - AI-generated comparison analysis
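
If you post-process results in code, a small helper like the hypothetical one below can surface these fields. It assumes results_json is the parsed response shown above; how you fetch it depends on the results endpoint your integration uses.

# Hypothetical helper: print a one-line RAG summary per evaluated feedback.
# `results_json` is assumed to be the parsed batch results shown above.
def summarize_rag_usage(results_json: dict) -> None:
    for item in results_json["results"]:
        evaluation = item["evaluation"]
        comparisons = evaluation.get("reference_comparisons") or {}
        if comparisons.get("similar_references_found", 0) == 0:
            print("no references matched for this feedback")
            continue
        top = comparisons["top_reference"]
        print(
            f"score={evaluation['overall_score']} "
            f"refs={comparisons['similar_references_found']} "
            f"top='{top['title']}' similarity={top['similarity_score']:.2f}"
        )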


Cost Analysis

RAG adds minimal cost to evaluations:

| Configuration | Additional Tokens | Cost per Evaluation | Annual Cost (1,000 evals) |
|---|---|---|---|
| No RAG | 0 | $0 | $0 |
| RAG (3 refs, summary) | ~800 | $0.0024 | $2.40 |
| RAG (5 refs, detailed) | ~1,800 | $0.0054 | $5.40 |
| RAG (8 refs, detailed) | ~2,800 | $0.0084 | $8.40 |

Note: These are incremental costs on top of base evaluation costs. RAG adds approximately $0.002-$0.008 per evaluation depending on configuration.

ROI: The quality improvement (+18-28% accuracy) typically far outweighs the marginal cost increase.
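
For budgeting larger volumes, the incremental cost scales linearly with the additional tokens in the table above. The sketch below uses the per-1K-token rate implied by that table (about $0.003); treat the rate as an assumption, since actual model pricing may differ.

# Back-of-envelope estimate of RAG's incremental cost. Token counts come from
# the table above; the per-1K-token rate is derived from it and is an assumption.
TOKENS_PER_CONFIG = {
    "no_rag": 0,
    "rag_3_refs_summary": 800,
    "rag_5_refs_detailed": 1800,
    "rag_8_refs_detailed": 2800,
}
RATE_PER_1K_TOKENS = 0.003  # USD, implied by the table; actual pricing may differ

def incremental_cost(config: str, evaluations: int) -> float:
    return TOKENS_PER_CONFIG[config] / 1000 * RATE_PER_1K_TOKENS * evaluations

print(incremental_cost("rag_5_refs_detailed", 1000))  # ≈ 5.40 USD per 1,000 evaluations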


Best Practices

1. Reference Library Management

  • ✓ Maintain 10-20 quality references per rubric
  • ✓ Include examples across all performance levels (excellent, good, average, poor)
  • ✓ Update references quarterly based on new exemplars
  • ✓ Tag references with subject area and grade level for better matching

2. Configuration Selection

  • ✓ Start with default settings (dynamic, detailed, 5 references)
  • ✓ A/B test different configurations to optimize for your use case
  • ✓ Use detailed level for important evaluations
  • ✓ Increase max_references if you have 20+ diverse examples

3. Performance Optimization

  • ✓ Batch similar feedbacks together for consistent reference matching
  • ✓ Use static mode for large batches of similar feedbacks
  • ✓ Monitor token usage and adjust max_references if needed
  • ✓ Cache frequently used reference sets

4. Quality Assurance

  • ✓ Review RAG results periodically to ensure quality
  • ✓ Check reference_comparisons to see which references are being matched
  • ✓ Add new references when gaps are identified
  • ✓ Compare RAG vs non-RAG evaluations to measure improvement
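
One way to run that comparison is to create two batches over the same feedback_ids, one with RAG enabled and one without, and compare the finished scores. The sketch below creates both batches; BASE_URL and API_TOKEN are placeholders, and retrieving and comparing the results is left to whichever results endpoint your integration already uses.

# Sketch: create paired batches (RAG on / RAG off) for an A/B comparison.
# BASE_URL, API_TOKEN, and the IDs are placeholders.
import requests

BASE_URL = "https://your-domain.com"
API_TOKEN = "your-token"
HEADERS = {"Authorization": f"Bearer {API_TOKEN}"}
COMMON = {"rubric_id": 5, "organization_id": 1, "feedback_ids": [101, 102, 103]}

def create_batch(name: str, **rag_options) -> int:
    payload = {"name": name, **COMMON, **rag_options}
    resp = requests.post(
        f"{BASE_URL}/api/batches/create/", headers=HEADERS, json=payload, timeout=30
    )
    resp.raise_for_status()
    return resp.json()["batch"]["id"]

rag_batch = create_batch(
    "QA check - RAG",
    use_reference_matching=True,
    reference_matching_mode="dynamic",
    reference_detail_level="detailed",
    max_references=5,
)
baseline_batch = create_batch("QA check - baseline", use_reference_matching=False)
print(rag_batch, baseline_batch)  # compare the two batches' scores when complete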

Troubleshooting

No References Found

Symptom: similar_references_found: 0 in results

Causes:
  • No reference feedbacks created for this rubric
  • References not linked to correct rubric
  • References not activated (is_active=false)

Solution:
  1. Go to /assessment/teacher-evaluation/references/
  2. Create 5-10 reference feedbacks
  3. Link them to your rubric
  4. Ensure is_active=true

Poor Reference Matching

Symptom: similarity_score consistently below 0.5

Causes:
  • Reference examples don't match feedback context
  • Limited reference library diversity
  • Subject/grade level mismatch

Solution:
  • Add more diverse reference examples
  • Tag references with subject area and grade level
  • Create subject-specific reference sets

RAG Not Improving Results

Symptom: Similar scores with or without RAG

Causes:
  • Reference examples not high quality
  • Too few references (< 5)
  • Wrong configuration for use case

Solution:
  • Review and improve reference quality
  • Increase to 7-8 references
  • Try detailed level if using summary
  • Add more specific evaluation instructions


API Response Schema

Batch Creation Response with RAG:

{
  "message": "Batch evaluation created successfully",
  "batch": {
    "id": 123,
    "name": "My Batch",
    "status": "pending",
    "total_feedbacks": 10,
    "created_at": "2025-01-15T10:30:00Z",
    "rag_configuration": {
      "use_reference_matching": true,
      "reference_matching_mode": "dynamic",
      "reference_detail_level": "detailed",
      "max_references": 5,
      "reference_snapshot_date": "2025-01-15T10:30:00Z",
      "reference_snapshot_ids": [45, 46, 47, 48, 49, 50, 51, 52],
      "reference_snapshot_version": "v1.0"
    }
  }
}

Evaluation Result with RAG:

{
  "result_id": 456,
  "feedback_id": 101,
  "overall_score": 42.5,
  "percentage": 85.0,
  "reference_comparisons": {
    "enabled": true,
    "mode": "dynamic",
    "similar_references_found": 5,
    "references_used": [
      {
        "id": 45,
        "title": "Excellent Math Feedback",
        "quality_level": "excellent",
        "similarity_score": 0.89,
        "score": 48.0
      },
      {
        "id": 47,
        "title": "Good Constructive Feedback",
        "quality_level": "good",
        "similarity_score": 0.82,
        "score": 43.5
      }
    ],
    "comparison_insights": "This feedback demonstrates strong clarity similar to reference #45...",
    "average_reference_score": 44.2
  }
}

Related Documentation

  • Workflow Guide: api_core_workflow.md - Complete API workflow including RAG
  • RAG Implementation: RAG_IMPLEMENTATION_GUIDE.md - Detailed RAG architecture and best practices
  • Email API: email_api.md - Automatic email delivery after RAG evaluation
  • API Reference: API_DOCUMENTATION.md - Complete API endpoint documentation

FAQs

Q: How many reference feedbacks do I need?
A: Minimum 5, recommended 10-20 for best results. Include examples across all quality levels.

Q: Does RAG work with any rubric?
A: Yes, but references must be linked to the specific rubric you're evaluating against.

Q: Can I use RAG without reference feedbacks?
A: No. If use_reference_matching=true but no references exist, the system falls back to standard evaluation.

Q: How do I know if RAG is working?
A: Check reference_comparisons in evaluation results. similar_references_found > 0 means it's working.

Q: What's the performance impact?
A: Adds 10-30 seconds per batch (one-time cost for reference retrieval). Each individual evaluation is unchanged.

Q: Can I update references after creating a batch?
A: Yes, but the batch uses a snapshot of references from creation time. New batches will use updated references.


Last Updated: January 2025