RAG Configuration API Guide
Overview
RAG (Retrieval-Augmented Generation) enhances teacher feedback evaluation quality by providing the AI with relevant reference examples during evaluation. This guide explains how to configure and use RAG via the Assessment API.
What is RAG?
RAG finds similar high-quality feedback examples from your reference library and provides them as context to the AI evaluator. This results in:
- An 18-28% improvement in evaluation accuracy and consistency
- More specific feedback based on your quality standards
- Context-aware scoring that learns from your examples
- Actionable recommendations aligned with your rubric
Prerequisites
Before using RAG via API, you need:
User Requirements
- Valid account on the platform
- Organization membership with one of these roles:
  - Admin - Full organization access
  - Examiner - Can create evaluations and manage assessments
  - Super Admin - Platform-wide access
- Active membership status
- API authentication token - See Authentication Guide for details
Note: If you don't have the required membership, contact your organization administrator.
RAG Setup Requirements
- Create Reference Feedbacks - Add reference feedbacks in the web interface at /assessment/teacher-evaluation/references/
- Link to Rubric - Reference feedbacks must be associated with your evaluation rubric
- Quality Examples - Include 10-20 examples covering different quality levels
- Vector Embeddings - System automatically generates embeddings when you create references
RAG API Parameters
When creating a batch evaluation, include these RAG parameters:
{
"name": "My Batch",
"rubric_id": 1,
"organization_id": 1,
"feedback_ids": [101, 102, 103],
"use_reference_matching": true,
"reference_matching_mode": "dynamic",
"reference_detail_level": "detailed",
"max_references": 5,
"evaluation_instructions": "Additional AI context..."
}
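If you prefer a scripted client over raw curl, the same request can be sent from Python. This is a minimal sketch, not an official client: the endpoint path and bearer token mirror the curl examples later in this guide, and the requests library is an assumed dependency.

```python
# Minimal sketch: create a batch evaluation with RAG enabled.
# The domain, token, and IDs are placeholders; adjust for your environment.
import requests

API_URL = "https://your-domain.com/api/batches/create/"  # placeholder domain
HEADERS = {
    "Authorization": "Bearer your-token",  # see the Authentication Guide
    "Content-Type": "application/json",
}

payload = {
    "name": "My Batch",
    "rubric_id": 1,
    "organization_id": 1,
    "feedback_ids": [101, 102, 103],
    "use_reference_matching": True,
    "reference_matching_mode": "dynamic",
    "reference_detail_level": "detailed",
    "max_references": 5,
    "evaluation_instructions": "Additional AI context...",
}

response = requests.post(API_URL, json=payload, headers=HEADERS, timeout=30)
response.raise_for_status()
print(response.json()["batch"]["id"])  # batch ID from the creation response
```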
Parameter Reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| use_reference_matching | boolean | false | Enable/disable RAG |
| reference_matching_mode | string | "dynamic" | Matching strategy |
| reference_detail_level | string | "summary" | Analysis depth |
| max_references | integer | 5 | Number of examples (1-10) |
| evaluation_instructions | string | "" | Additional AI context |
Parameter Details
use_reference_matching
Type: boolean
Default: false
Description: Master switch to enable/disable RAG
{
"use_reference_matching": true
}
When to Enable:
- ✓ You have 10+ reference feedbacks for the rubric
- ✓ You want more consistent evaluations
- ✓ You need context-aware scoring
- ✓ You have established quality standards
When to Disable:
- ✗ No reference feedbacks available
- ✗ First-time testing/exploration
- ✗ Generic evaluation without specific standards
reference_matching_mode
Type: string
Values: "static" | "dynamic"
Default: "dynamic"
Description: How reference examples are selected
Dynamic Mode (Recommended)
{
"reference_matching_mode": "dynamic"
}
- Uses vector similarity search to find most relevant examples
- Automatically adapts to each feedback's context
- Best for varied feedback types
- Recommended for most use cases
How it works:
1. Converts the feedback to a vector embedding
2. Searches the reference library for similar examples
3. Returns the top N most similar references
4. The AI uses these as evaluation context
A conceptual sketch of this retrieval step follows the example below.
Example:
Feedback: "Student needs to improve grammar and clarity"
↓
Vector Search finds similar references:
- "Grammar needs attention" (similarity: 0.92)
- "Writing clarity could improve" (similarity: 0.88)
- "Structure and grammar issues" (similarity: 0.85)
Static Mode
{
"reference_matching_mode": "static"
}
- Uses predefined set of reference examples
- Same examples for all feedbacks in batch
- Faster but less context-aware
- Useful for standardized evaluations
When to use Static:
- All feedbacks have similar context
- You've curated specific examples for this batch type
- Speed is more important than precision
reference_detail_level
Type: string
Values: "summary" | "detailed"
Default: "summary"
Description: How much reference detail the AI receives
Summary Level
{
"reference_detail_level": "summary"
}
- Provides key highlights from references
- Faster processing (fewer tokens)
- Good for straightforward evaluations
- Reduces API costs
Summary includes:
- Overall score
- Key strengths
- Main areas for improvement
- Brief recommendations
Detailed Level (Recommended)
{
"reference_detail_level": "detailed"
}
- Provides complete reference feedback
- Better evaluation quality
- More accurate comparisons
- Slightly higher API cost
Detailed includes:
- Full criterion-by-criterion feedback
- Complete strengths and weaknesses
- Detailed recommendations
- Consistency notes
- Reference comparison insights
Recommendation: Use "detailed" unless processing costs are a concern. The quality improvement outweighs the marginal cost increase.
max_references
Type: integer
Range: 1-10
Default: 5
Description: Maximum number of reference examples to use
{
"max_references": 5
}
Choosing the Right Number:
| Value | Use Case | Token Cost | Quality |
|---|---|---|---|
| 1-2 | Quick evaluation, limited references | Low | Basic |
| 3-5 | Recommended - Balanced approach | Medium | High |
| 6-8 | Complex evaluations, large reference library | High | Very High |
| 9-10 | Maximum context (rarely needed) | Very High | Maximum |
Guidelines:
- Start with 5 - Good balance of quality and cost
- Increase to 7-8 if you have 20+ diverse references
- Use 1-3 for simple, standardized evaluations
- Never exceed 10 - diminishing returns
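If you set max_references programmatically, the rules of thumb above can be captured in a small helper. This is a sketch of this section's guidance, not an API feature; tune the thresholds to your own reference library.

```python
# Sketch of the rule of thumb above; not part of the API.
def suggest_max_references(library_size: int, standardized: bool = False) -> int:
    if standardized or library_size < 5:
        return max(1, min(3, library_size))  # 1-3 for simple, standardized runs
    if library_size >= 20:
        return 8  # larger, diverse libraries can support more context
    return 5      # recommended default: balanced quality and cost
```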
evaluation_instructions
Type: string
Optional
Description: Additional context or instructions for the AI evaluator
{
"evaluation_instructions": "Pay special attention to specific, measurable recommendations for student improvement. Focus on actionability over general praise."
}
Use Cases:
- Emphasize specific rubric criteria
- Provide batch-specific context
- Highlight organizational standards
- Guide tone or style preferences
Examples:
// Focus on specificity
{
"evaluation_instructions": "Prioritize feedback specificity. Score higher for concrete examples and actionable suggestions."
}
// Emphasis on growth mindset
{
"evaluation_instructions": "Evaluate how well feedback promotes student growth mindset and provides clear paths for improvement."
}
// Subject-specific context
{
"evaluation_instructions": "For mathematics feedback, assess clarity of problem-solving explanations and step-by-step guidance."
}
Complete Configuration Examples
Example 1: Standard Configuration (Recommended)
curl -X POST https://your-domain.com/api/batches/create/ \
-H "Authorization: Bearer your-token" \
-H "Content-Type: application/json" \
-d '{
"name": "Weekly Teacher Evaluations",
"rubric_id": 5,
"organization_id": 1,
"feedback_ids": [101, 102, 103, 104, 105],
"use_reference_matching": true,
"reference_matching_mode": "dynamic",
"reference_detail_level": "detailed",
"max_references": 5
}'
Best for: Most use cases, balanced quality and cost
Example 2: High-Quality Configuration
curl -X POST https://your-domain.com/api/batches/create/ \
-H "Authorization: Bearer your-token" \
-H "Content-Type: application/json" \
-d '{
"name": "Annual Performance Review",
"rubric_id": 5,
"organization_id": 1,
"feedback_ids": [201, 202, 203],
"use_reference_matching": true,
"reference_matching_mode": "dynamic",
"reference_detail_level": "detailed",
"max_references": 8,
"evaluation_instructions": "This is a comprehensive annual review. Provide thorough analysis with specific examples and detailed improvement recommendations."
}'
Best for: Critical evaluations, annual reviews, comprehensive assessments
Example 3: Fast Processing Configuration
curl -X POST https://your-domain.com/api/batches/create/ \
-H "Authorization: Bearer your-token" \
-H "Content-Type: application/json" \
-d '{
"name": "Quick Check Evaluations",
"rubric_id": 5,
"organization_id": 1,
"feedback_ids": [301, 302, 303, 304, 305, 306, 307, 308, 309, 310],
"use_reference_matching": true,
"reference_matching_mode": "static",
"reference_detail_level": "summary",
"max_references": 3
}'
Best for: Large batches, quick turnaround, standardized evaluations
Example 4: No RAG (Baseline)
curl -X POST https://your-domain.com/api/batches/create/ \
-H "Authorization: Bearer your-token" \
-H "Content-Type: application/json" \
-d '{
"name": "Baseline Evaluations",
"rubric_id": 5,
"organization_id": 1,
"feedback_ids": [401, 402, 403],
"use_reference_matching": false
}'
Best for: Testing, comparison baselines, when no references exist
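If you switch between these configurations programmatically, the four examples above can be kept as reusable presets and merged into the request payload. A sketch, assuming the same payload fields shown in the curl examples:

```python
# The four example configurations above, expressed as reusable presets.
RAG_PRESETS = {
    "standard": {
        "use_reference_matching": True,
        "reference_matching_mode": "dynamic",
        "reference_detail_level": "detailed",
        "max_references": 5,
    },
    "high_quality": {
        "use_reference_matching": True,
        "reference_matching_mode": "dynamic",
        "reference_detail_level": "detailed",
        "max_references": 8,
    },
    "fast": {
        "use_reference_matching": True,
        "reference_matching_mode": "static",
        "reference_detail_level": "summary",
        "max_references": 3,
    },
    "baseline": {"use_reference_matching": False},
}

def build_payload(name, rubric_id, organization_id, feedback_ids, preset="standard"):
    payload = {
        "name": name,
        "rubric_id": rubric_id,
        "organization_id": organization_id,
        "feedback_ids": feedback_ids,
    }
    payload.update(RAG_PRESETS[preset])
    return payload
```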
Understanding RAG Results
When RAG is enabled, evaluation results include reference comparison data:
{
"batch": {...},
"results": [
{
"feedback": {...},
"evaluation": {
"overall_score": 42.5,
"percentage": 85.0,
"reference_comparisons": {
"similar_references_found": 5,
"top_reference": {
"id": 45,
"title": "Excellent Math Feedback Example",
"quality_level": "excellent",
"similarity_score": 0.89,
"score": 48.0
},
"comparison_insights": "This feedback shows similar clarity to reference #45 but could benefit from more specific examples as demonstrated in reference #23..."
},
...
}
}
]
}
Key Fields:
- similar_references_found - Number of references used
- top_reference - Most similar reference example
- similarity_score - How similar (0.0-1.0, higher is more similar)
- comparison_insights - AI-generated comparison analysis
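A short sketch for reading these fields from a batch results payload. The key names follow the example above; the lookups are defensive because reference_comparisons is only present when RAG is enabled.

```python
# Sketch: print reference-comparison highlights from a batch results payload,
# using the field names shown in the example above.
def summarize_rag_results(results: list[dict]) -> None:
    for item in results:
        evaluation = item.get("evaluation", {})
        comparisons = evaluation.get("reference_comparisons") or {}
        top = comparisons.get("top_reference") or {}
        print(
            f"score={evaluation.get('overall_score')} "
            f"references_found={comparisons.get('similar_references_found', 0)} "
            f"top_reference={top.get('title')} "
            f"similarity={top.get('similarity_score')}"
        )
```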
Cost Analysis
RAG adds minimal cost to evaluations:
| Configuration | Additional Tokens | Cost per Evaluation | Annual Cost (1000 evals) |
|---|---|---|---|
| No RAG | 0 | $0 | $0 |
| RAG (3 refs, summary) | ~800 | $0.0024 | $2.40 |
| RAG (5 refs, detailed) | ~1,800 | $0.0054 | $5.40 |
| RAG (8 refs, detailed) | ~2,800 | $0.0084 | $8.40 |
Note: These are incremental costs on top of base evaluation costs. RAG adds approximately $0.002-$0.008 per evaluation depending on configuration.
ROI: The quality improvement (+18-28% accuracy) typically far outweighs the marginal cost increase.
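The incremental figures above imply a flat rate of roughly $3 per million tokens ($0.0024 / ~800 tokens). If you want a quick estimate for other configurations, here is a sketch under that assumption; it is an approximation derived from the table, not a published price.

```python
# Rough estimate of incremental RAG cost, assuming the ~$3 per million tokens
# implied by the table above (an approximation, not a published rate).
TOKEN_RATE_USD = 3.0 / 1_000_000

def estimated_rag_cost(extra_tokens_per_eval: int, evaluations: int) -> float:
    return extra_tokens_per_eval * TOKEN_RATE_USD * evaluations

# Example: 5 detailed references (~1,800 extra tokens) over 1,000 evaluations
print(round(estimated_rag_cost(1_800, 1_000), 2))  # ~5.40
```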
Best Practices
1. Reference Library Management
- ✓ Maintain 10-20 quality references per rubric
- ✓ Include examples across all performance levels (excellent, good, average, poor)
- ✓ Update references quarterly based on new exemplars
- ✓ Tag references with subject area and grade level for better matching
2. Configuration Selection
- ✓ Start with default settings (dynamic, detailed, 5 references)
- ✓ A/B test different configurations to optimize for your use case
- ✓ Use detailed level for important evaluations
- ✓ Increase max_references if you have 20+ diverse examples
3. Performance Optimization
- ✓ Batch similar feedbacks together for consistent reference matching
- ✓ Use static mode for large batches of similar feedbacks
- ✓ Monitor token usage and adjust max_references if needed
- ✓ Cache frequently used reference sets
4. Quality Assurance
- ✓ Review RAG results periodically to ensure quality
- ✓ Check reference_comparisons to see which references are being matched
- ✓ Add new references when gaps are identified
- ✓ Compare RAG vs non-RAG evaluations to measure improvement
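For the last point, one simple approach is to submit the same feedback IDs as two batches, one with use_reference_matching enabled and one without, and compare the resulting scores. A sketch reusing the request shape from the configuration examples above; the endpoint, token, and the create_batch helper are assumptions for illustration.

```python
# Sketch: create paired batches (RAG vs baseline) for the same feedbacks
# so their scores can be compared. Domain and token are placeholders.
import requests

API_URL = "https://your-domain.com/api/batches/create/"
HEADERS = {"Authorization": "Bearer your-token", "Content-Type": "application/json"}

def create_batch(name, feedback_ids, use_rag):
    payload = {
        "name": name,
        "rubric_id": 5,
        "organization_id": 1,
        "feedback_ids": feedback_ids,
        "use_reference_matching": use_rag,
    }
    if use_rag:
        payload.update({
            "reference_matching_mode": "dynamic",
            "reference_detail_level": "detailed",
            "max_references": 5,
        })
    response = requests.post(API_URL, json=payload, headers=HEADERS, timeout=30)
    response.raise_for_status()
    return response.json()["batch"]["id"]

feedback_ids = [101, 102, 103]
rag_batch_id = create_batch("A/B test - RAG", feedback_ids, use_rag=True)
baseline_batch_id = create_batch("A/B test - baseline", feedback_ids, use_rag=False)
```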
Troubleshooting
No References Found
Symptom: similar_references_found: 0 in results
Causes:
- No reference feedbacks created for this rubric
- References not linked to correct rubric
- References not activated (is_active=false)
Solution:
1. Go to /assessment/teacher-evaluation/references/
2. Create 5-10 reference feedbacks
3. Link them to your rubric
4. Ensure is_active=true
Poor Reference Matching
Symptom: similarity_score consistently below 0.5
Causes:
- Reference examples don't match feedback context
- Limited reference library diversity
- Subject/grade level mismatch
Solution:
- Add more diverse reference examples
- Tag references with subject area and grade level
- Create subject-specific reference sets
RAG Not Improving Results
Symptom: Similar scores with or without RAG
Causes:
- Reference examples not high quality
- Too few references (< 5)
- Wrong configuration for use case
Solution:
- Review and improve reference quality
- Increase to 7-8 references
- Try detailed level if using summary
- Add more specific evaluation instructions
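These symptoms can also be spotted automatically when you post-process batch results. A sketch of such a check, using the field names from the "Evaluation Result with RAG" schema below; the 0.5 threshold mirrors the guidance above.

```python
# Sketch: flag results showing the RAG symptoms described above.
# Field names follow the "Evaluation Result with RAG" schema below.
def diagnose_rag(results: list[dict], min_similarity: float = 0.5) -> list[str]:
    warnings = []
    for result in results:
        comparisons = result.get("reference_comparisons") or {}
        feedback_id = result.get("feedback_id")
        if comparisons.get("similar_references_found", 0) == 0:
            warnings.append(f"feedback {feedback_id}: no references matched")
            continue
        best = max(
            (ref.get("similarity_score", 0.0)
             for ref in comparisons.get("references_used", [])),
            default=0.0,
        )
        if best < min_similarity:
            warnings.append(
                f"feedback {feedback_id}: weak reference match ({best:.2f})"
            )
    return warnings
```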
API Response Schema
Batch Creation Response with RAG:
{
"message": "Batch evaluation created successfully",
"batch": {
"id": 123,
"name": "My Batch",
"status": "pending",
"total_feedbacks": 10,
"created_at": "2025-01-15T10:30:00Z",
"rag_configuration": {
"use_reference_matching": true,
"reference_matching_mode": "dynamic",
"reference_detail_level": "detailed",
"max_references": 5,
"reference_snapshot_date": "2025-01-15T10:30:00Z",
"reference_snapshot_ids": [45, 46, 47, 48, 49, 50, 51, 52],
"reference_snapshot_version": "v1.0"
}
}
}
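Because each batch stores a snapshot of the references it will use (see the FAQ at the end of this guide), it can be useful to log that snapshot at creation time. A sketch that reads the fields shown in the response above:

```python
# Sketch: log the reference snapshot recorded in the batch creation response.
def log_rag_snapshot(creation_response: dict) -> None:
    rag = creation_response["batch"].get("rag_configuration") or {}
    if not rag.get("use_reference_matching"):
        print("RAG disabled for this batch")
        return
    ids = rag.get("reference_snapshot_ids", [])
    print(
        f"mode={rag.get('reference_matching_mode')} "
        f"references={len(ids)} "
        f"snapshot_date={rag.get('reference_snapshot_date')}"
    )
```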
Evaluation Result with RAG:
{
"result_id": 456,
"feedback_id": 101,
"overall_score": 42.5,
"percentage": 85.0,
"reference_comparisons": {
"enabled": true,
"mode": "dynamic",
"similar_references_found": 5,
"references_used": [
{
"id": 45,
"title": "Excellent Math Feedback",
"quality_level": "excellent",
"similarity_score": 0.89,
"score": 48.0
},
{
"id": 47,
"title": "Good Constructive Feedback",
"quality_level": "good",
"similarity_score": 0.82,
"score": 43.5
}
],
"comparison_insights": "This feedback demonstrates strong clarity similar to reference #45...",
"average_reference_score": 44.2
}
}
Related Documentation
- Workflow Guide: api_core_workflow.md - Complete API workflow including RAG
- RAG Implementation: RAG_IMPLEMENTATION_GUIDE.md - Detailed RAG architecture and best practices
- Email API: email_api.md - Automatic email delivery after RAG evaluation
- API Reference: API_DOCUMENTATION.md - Complete API endpoint documentation
FAQs
Q: How many reference feedbacks do I need?
A: Minimum 5, recommended 10-20 for best results. Include examples across all quality levels.
Q: Does RAG work with any rubric?
A: Yes, but references must be linked to the specific rubric you're evaluating against.
Q: Can I use RAG without reference feedbacks?
A: No. If use_reference_matching=true but no references exist, the system falls back to standard evaluation.
Q: How do I know if RAG is working?
A: Check reference_comparisons in evaluation results. similar_references_found > 0 means it's working.
Q: What's the performance impact?
A: RAG adds roughly 10-30 seconds per batch (a one-time cost for reference retrieval); per-evaluation processing time is otherwise unchanged.
Q: Can I update references after creating a batch?
A: Yes, but the batch uses a snapshot of references from creation time. New batches will use updated references.
Last Updated: January 2025
