AI Course Generator - Post-MVP Improvements

This document outlines improvements that can be implemented after the MVP launch. Each improvement is categorized by priority and includes implementation considerations.

Overview

The MVP implementation provides a functional AI Course Generator with:

Course outline generation via Amazon Bedrock (Claude)
Lesson placeholder creation with AI-generated prompts
Intro video generation via HeyGen
Real-time progress tracking
Basic editing capabilities

The improvements below will enhance the feature’s capabilities, performance, and user experience.

High Priority Improvements

1. Real-time Progress Updates via WebSocket

Current State: Frontend polls the API every 2 seconds for status updates.

Improvement: Implement real-time updates using AWS AppSync subscriptions or API Gateway WebSocket.

Benefits:

Reduced API calls and server load
Instant updates to UI
Better user experience

Implementation Approach:

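// Using AWS AppSync subscriptions (gqlClient is assumed to be an Apollo-style GraphQL client)
const subscription = gqlClient.subscribe({
  query: GENERATION_STATUS_SUBSCRIPTION,
  variables: { jobId }
}).subscribe({
  next: (data) => updateProgress(data),
  error: (err) => handleError(err)
});

// Stop listening once the job finishes or the view unmounts:
// subscription.unsubscribe();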
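
Pushed updates can arrive late or more than once, so the `updateProgress` handler may need to guard against stale messages. The following is a minimal sketch of such a guard; the message shape (`sequence`, `percentComplete`, `step`) and the function names are hypothetical and not part of the current API.

```typescript
// Hypothetical shape of a status message pushed over the subscription or WebSocket.
interface GenerationStatusMessage {
  jobId: string;
  sequence: number;        // assumed to increase with every update for a job
  percentComplete: number; // 0-100
  step: string;
}

interface ProgressState {
  jobId: string;
  lastSequence: number;
  percentComplete: number;
  step: string;
}

function createProgress(jobId: string): ProgressState {
  return { jobId, lastSequence: -1, percentComplete: 0, step: 'QUEUED' };
}

// Returns the next UI state, ignoring messages for other jobs and
// messages that are duplicates or arrive out of order.
function applyStatusMessage(state: ProgressState, msg: GenerationStatusMessage): ProgressState {
  if (msg.jobId !== state.jobId || msg.sequence <= state.lastSequence) {
    return state;
  }
  return {
    jobId: state.jobId,
    lastSequence: msg.sequence,
    percentComplete: Math.min(100, Math.max(0, msg.percentComplete)),
    step: msg.step,
  };
}
```

Returning the same state object for ignored messages lets a UI framework skip re-rendering on stale updates.
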

Estimated Effort: 2-3 days
Dependencies: AppSync or API Gateway WebSocket configuration

2. Lesson Content Generation

Current State: MVP generates lesson prompts/placeholders, not full content.

Improvement: Add “Generate Content” button for each lesson that uses the stored prompt to generate full lesson content.

Features:

Generate full HTML lesson content
Generate 3-5 action items
Generate key takeaways
Generate resources/links

Implementation:

// New endpoint: POST /admin/generate/lesson/{lessonId}/content
interface GenerateLessonContentRequest {
  lessonId: string;
  overridePrompt?: string; // Optional: use custom prompt
}

interface GenerateLessonContentResponse {
  content: string; // HTML content
  actionItems: string[];
  keyTakeaways: string[];
  resources: { title: string; url: string }[];
  tokensUsed: number;
}
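
Model output should be checked before it is stored against a lesson. The sketch below assumes the model is prompted to return a JSON object with the same field names as the response interface; `parseLessonContent` is a hypothetical helper, and the 3-5 action item rule comes from the feature list above.

```typescript
interface LessonResource { title: string; url: string }

interface LessonContent {
  content: string;
  actionItems: string[];
  keyTakeaways: string[];
  resources: LessonResource[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string');
}

// Parses raw model output (assumed to be a JSON object) and validates its shape.
function parseLessonContent(rawModelOutput: string): LessonContent {
  const parsed: unknown = JSON.parse(rawModelOutput);
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('Model output is not a JSON object');
  }
  const { content, actionItems, keyTakeaways, resources } = parsed as Record<string, unknown>;
  if (typeof content !== 'string' || content.trim() === '') {
    throw new Error('Missing lesson content');
  }
  if (!isStringArray(actionItems) || actionItems.length < 3 || actionItems.length > 5) {
    throw new Error('Expected 3-5 action items');
  }
  if (!isStringArray(keyTakeaways)) {
    throw new Error('keyTakeaways must be an array of strings');
  }
  // Resources are optional; malformed entries are dropped rather than failing the lesson.
  const candidates = Array.isArray(resources) ? resources : [];
  const validResources = candidates.filter(
    (r): r is LessonResource =>
      typeof r === 'object' && r !== null &&
      typeof r.title === 'string' && typeof r.url === 'string',
  );
  return { content, actionItems, keyTakeaways, resources: validResources };
}
```
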

Estimated Effort: 3-4 days
Dependencies: Backend Lambda function, Bedrock integration

3. Retry Logic with Exponential Backoff

Current State: Generation failures require manual retry.

Improvement: Implement automatic retry with exponential backoff in Step Functions.

Implementation:

{
  "GenerateCourseOutline": {
    "Type": "Task",
    "Retry": [
      {
        "ErrorEquals": ["States.TaskFailed", "Lambda.ServiceException"],
        "IntervalSeconds": 2,
        "MaxAttempts": 3,
        "BackoffRate": 2
      }
    ],
    "Catch": [
      {
        "ErrorEquals": ["States.ALL"],
        "ResultPath": "$.error",
        "Next": "HandleError"
      }
    ]
  }
}
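
To see what this retry policy means in wall-clock terms: Step Functions waits `IntervalSeconds` before the first retry and multiplies the wait by `BackoffRate` for each later one. The small helper below (a hypothetical name, for illustration only) works out the schedule.

```typescript
interface RetryPolicy {
  intervalSeconds: number;
  maxAttempts: number; // number of retries after the initial attempt
  backoffRate: number;
}

// Wait before retry n (1-based) is intervalSeconds * backoffRate^(n - 1).
function retryDelays(policy: RetryPolicy): number[] {
  const waits: number[] = [];
  for (let retry = 0; retry < policy.maxAttempts; retry++) {
    waits.push(policy.intervalSeconds * Math.pow(policy.backoffRate, retry));
  }
  return waits;
}

const outlineRetryWaits = retryDelays({ intervalSeconds: 2, maxAttempts: 3, backoffRate: 2 });
console.log(outlineRetryWaits); // [2, 4, 8]
```

With the values above, a task that keeps failing adds 14 seconds of waiting before the `Catch` block routes to `HandleError`.
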

Estimated Effort: 1 day
Dependencies: Step Functions state machine update

4. Cost Estimation Before Generation

Current State: Costs are calculated after generation.

Improvement: Show estimated cost before user starts generation.

Features:

Token estimation based on course length
Display estimated cost in USD
Require confirmation for high-cost generations

Implementation:

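interface CostEstimate {
  estimatedTokens: number;
  bedrockCost: number; // USD
  heygenCost: number;  // USD
  totalCost: number;   // USD
}

function estimateCost(durationDays: number, generateVideo: boolean): CostEstimate {
  const baseTokens = 2000; // Course outline
  const lessonTokens = durationDays * 400; // Per lesson
  const videoTokens = generateVideo ? 500 : 0; // Video script

  const totalTokens = baseTokens + lessonTokens + videoTokens;

  // Claude 3 Sonnet pricing: $3/million input tokens + $15/million output tokens.
  // Charging every token at the combined ~$18/million rate overestimates the cost,
  // which is the safer direction for a pre-generation estimate.
  const bedrockCost = (totalTokens / 1_000_000) * 18;

  // HeyGen cost: ~$0.50 per minute of video; a 90-second video is 1.5 minutes
  const heygenCost = generateVideo ? 0.5 * (90 / 60) : 0; // $0.75

  return {
    estimatedTokens: totalTokens,
    bedrockCost,
    heygenCost,
    totalCost: bedrockCost + heygenCost
  };
}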
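
The "require confirmation for high-cost generations" feature could be a simple threshold check on the estimate. This is a sketch only: the $1.00 threshold and the function names are hypothetical, and the real limit would be a product decision.

```typescript
interface CostEstimateTotal {
  totalCost: number; // USD
}

// Hypothetical default: ask for confirmation at or above $1.00.
const CONFIRMATION_THRESHOLD_USD = 1.0;

function formatUsd(amount: number): string {
  return `$${amount.toFixed(2)}`;
}

function needsConfirmation(
  estimate: CostEstimateTotal,
  thresholdUsd: number = CONFIRMATION_THRESHOLD_USD,
): boolean {
  return estimate.totalCost >= thresholdUsd;
}

// Text for the confirmation dialog, or null when no confirmation is needed.
function confirmationPrompt(estimate: CostEstimateTotal): string | null {
  if (!needsConfirmation(estimate)) {
    return null;
  }
  return `This generation is estimated to cost ${formatUsd(estimate.totalCost)}. Continue?`;
}
```
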

Estimated Effort: 1 day
Dependencies: None

Medium Priority Improvements

5. Multiple AI Model Support

Current State: Uses Claude 3 Sonnet only.

Improvement: Allow admins to choose between different models.

Implementation:

interface GenerationOptions {
  modelTier: 'fast' | 'balanced' | 'quality';
  // Maps to: Haiku, Sonnet, Opus
}

// costPer1MTokens is the output-token price in USD; input tokens are billed at a lower rate
const MODEL_CONFIG = {
  fast: {
    modelId: 'anthropic.claude-3-haiku-20240307-v1:0',
    maxTokens: 4096,
    costPer1MTokens: 1.25
  },
  balanced: {
    modelId: 'anthropic.claude-3-sonnet-20240229-v1:0',
    maxTokens: 4096,
    costPer1MTokens: 15
  },
  quality: {
    modelId: 'anthropic.claude-3-opus-20240229-v1:0',
    maxTokens: 4096,
    costPer1MTokens: 75
  }
};
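
Because Bedrock bills input and output tokens at different rates, a per-tier estimate is more accurate when the two are priced separately. The sketch below uses the published Claude 3 on-demand rates at the time of writing (USD per million tokens); treat the numbers as assumptions to re-check against current Bedrock pricing, and the names as hypothetical.

```typescript
type ModelTier = 'fast' | 'balanced' | 'quality';

interface TierPricing {
  inputPer1M: number;  // USD per million input tokens
  outputPer1M: number; // USD per million output tokens
}

// Assumed rates for Haiku, Sonnet and Opus; verify before relying on them.
const TIER_PRICING: Record<ModelTier, TierPricing> = {
  fast: { inputPer1M: 0.25, outputPer1M: 1.25 },
  balanced: { inputPer1M: 3, outputPer1M: 15 },
  quality: { inputPer1M: 15, outputPer1M: 75 },
};

function estimateModelCost(tier: ModelTier, inputTokens: number, outputTokens: number): number {
  const pricing = TIER_PRICING[tier];
  return (inputTokens / 1e6) * pricing.inputPer1M + (outputTokens / 1e6) * pricing.outputPer1M;
}

// Compare all tiers for the same workload, e.g. to show admins a cost column.
function compareTiers(inputTokens: number, outputTokens: number): Record<ModelTier, number> {
  return {
    fast: estimateModelCost('fast', inputTokens, outputTokens),
    balanced: estimateModelCost('balanced', inputTokens, outputTokens),
    quality: estimateModelCost('quality', inputTokens, outputTokens),
  };
}
```
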

Estimated Effort: 2 days
Dependencies: Bedrock model access

6. Prompt Versioning and A/B Testing

Current State: Prompt templates are hardcoded, with a single version of each.

Improvement: Store prompts in database with versioning, enable A/B testing.

Features:

Version control for prompts
A/B testing between prompt versions
Analytics on prompt performance
Easy prompt updates without deployment

Database Schema:

CREATE TABLE prompt_templates (
  id UUID PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  version INTEGER NOT NULL,
  category VARCHAR(50) NOT NULL, -- 'course_outline', 'lesson_prompt', 'video_script'
  content TEXT NOT NULL,
  is_active BOOLEAN DEFAULT false,
  ab_test_weight DECIMAL(3,2) DEFAULT 1.0,
  success_rate DECIMAL(5,4),
  average_quality_score DECIMAL(4,2), -- DECIMAL(3,2) tops out at 9.99 and cannot hold a 10.00 score
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
  created_by UUID REFERENCES users(id),
  UNIQUE(name, version)
);
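
Given rows from this table, A/B selection can be a weighted random pick among the active versions, using `ab_test_weight` as the relative weight. The following is a minimal sketch under that assumption; the `PromptVariant` shape and `pickPromptVariant` are hypothetical names.

```typescript
interface PromptVariant {
  id: string;
  version: number;
  isActive: boolean;
  abTestWeight: number; // relative weight; weights need not sum to 1
}

// `rand` is injectable so the selection can be tested deterministically.
function pickPromptVariant(
  variants: PromptVariant[],
  rand: () => number = Math.random,
): PromptVariant {
  const active = variants.filter((v) => v.isActive && v.abTestWeight > 0);
  if (active.length === 0) {
    throw new Error('No active prompt variants');
  }
  const totalWeight = active.reduce((sum, v) => sum + v.abTestWeight, 0);
  let remaining = rand() * totalWeight;
  for (const variant of active) {
    remaining -= variant.abTestWeight;
    if (remaining < 0) {
      return variant;
    }
  }
  // Only reached through floating-point drift; fall back to the last variant.
  return active[active.length - 1];
}
```

Recording the chosen variant's `id` with each generation job is what makes the later `success_rate` and `average_quality_score` columns attributable to a version.
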

Estimated Effort: 4-5 days
Dependencies: Database migration, Admin UI for prompt management

7. Content Quality Validation

Current State: Basic quality score based on completeness.

Improvement: Use AI to validate and score generated content quality.

Features:

Readability score (Flesch-Kincaid)
Topic relevance check
Content coherence analysis
Plagiarism detection (optional)
Suggested improvements

Implementation:

async function validateContent(content: string): Promise<QualityReport> {
  const validationPrompt = `
    Analyze this educational content and provide:
    1. Readability score (1-10)
    2. Topic coherence (1-10)
    3. Educational value (1-10)
    4. Specific improvement suggestions

    Content: ${content}
  `;

  const response = await invokeClaudeModel({
    prompt: validationPrompt,
    maxTokens: 1024
  });

  return parseQualityReport(response);
}
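
`parseQualityReport` is not defined above. One way to sketch it, assuming the validation prompt is extended to ask for a JSON reply with the keys `readability`, `coherence`, `educationalValue` and `suggestions` (all hypothetical names), is:

```typescript
interface QualityReport {
  readability: number;      // 1-10
  coherence: number;        // 1-10
  educationalValue: number; // 1-10
  suggestions: string[];
}

// Coerces a score to a number and clamps it into the 1-10 range.
function clampScore(value: unknown): number {
  const n = typeof value === 'number' ? value : Number(value);
  if (!isFinite(n)) {
    throw new Error(`Invalid score: ${String(value)}`);
  }
  return Math.min(10, Math.max(1, n));
}

function parseQualityReport(response: string): QualityReport {
  // Models often wrap JSON in prose, so take the outermost {...} span.
  const start = response.indexOf('{');
  const end = response.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in model response');
  }
  const data = JSON.parse(response.slice(start, end + 1)) as Record<string, unknown>;
  const rawSuggestions = data.suggestions;
  const suggestions = Array.isArray(rawSuggestions)
    ? rawSuggestions.filter((s): s is string => typeof s === 'string')
    : [];
  return {
    readability: clampScore(data.readability),
    coherence: clampScore(data.coherence),
    educationalValue: clampScore(data.educationalValue),
    suggestions,
  };
}
```
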

Estimated Effort: 3 days
Dependencies: Additional Bedrock calls

8. Batch Course Generation

Current State: Generate one course at a time.

Improvement: Allow generating multiple courses in parallel.

Features:

Upload CSV with course details
Process multiple courses concurrently
Dashboard for batch job monitoring
Error handling per course

Implementation:

// New endpoint: POST /admin/generate/batch
interface BatchGenerationRequest {
  courses: CourseInput[];
  options: GenerationOptions;
}

// Uses SQS queue for processing
// Step Functions Map state for parallel execution
{
  "ProcessCourses": {
    "Type": "Map",
    "ItemsPath": "$.courses",
    "MaxConcurrency": 3,
    "Iterator": {
      "StartAt": "GenerateSingleCourse",
      "States": { /* ... */ }
    }
  }
}
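
For the CSV upload, per-course error handling can start at parse time: valid rows become batch inputs and invalid rows are reported individually instead of rejecting the whole file. The sketch below assumes a header row and two hypothetical columns (`title`, `durationDays`), and deliberately ignores quoted fields and embedded commas.

```typescript
interface CourseRow {
  title: string;
  durationDays: number;
}

interface RowError {
  line: number; // 1-based, counting non-blank lines including the header
  message: string;
}

function parseCourseCsv(csv: string): { courses: CourseRow[]; errors: RowError[] } {
  const lines = csv.split(/\r?\n/).filter((l) => l.trim() !== '');
  const courses: CourseRow[] = [];
  const errors: RowError[] = [];

  // Skip the header row, which is line 1.
  lines.slice(1).forEach((line, index) => {
    const lineNumber = index + 2;
    const [title = '', duration = ''] = line.split(',').map((cell) => cell.trim());
    const durationDays = Number(duration);
    if (title === '') {
      errors.push({ line: lineNumber, message: 'Missing title' });
    } else if (!isFinite(durationDays) || Math.floor(durationDays) !== durationDays || durationDays <= 0) {
      errors.push({ line: lineNumber, message: 'durationDays must be a positive integer' });
    } else {
      courses.push({ title, durationDays });
    }
  });

  return { courses, errors };
}
```

A production version would more likely use a CSV library that handles quoting; the point here is the split between accepted rows and per-row errors.
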

Estimated Effort: 5-6 days
Dependencies: SQS queue, Step Functions Map state

9. Template System

Current State: Start from scratch each time.

Improvement: Save and reuse generation settings as templates.

Features:

Save current settings as template
Load templates to pre-fill form
Share templates between admins
Default templates for common course types

Database Schema:

CREATE TABLE generation_templates (
  id UUID PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  description TEXT,
  settings JSONB NOT NULL,
  is_public BOOLEAN DEFAULT false,
  created_by UUID REFERENCES users(id),
  usage_count INTEGER DEFAULT 0,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

Estimated Effort: 2-3 days
Dependencies: Database migration, UI components

Lower Priority Improvements

10. Collaborative Editing

Current State: Only one admin can edit a course at a time.

Improvement: Allow multiple admins to edit the same course simultaneously.

Features:

Real-time cursor presence
Conflict resolution
Edit history
Comments and suggestions

Technology Options:

Yjs (CRDT-based collaboration)
AWS AppSync with optimistic locking
Operational Transform (OT)

Estimated Effort: 10+ days
Dependencies: Significant architecture changes

11. Video Generation Queue Management

Current State: HeyGen video generation is fire-and-forget.

Improvement: Proper queue management with status tracking.

Features:

Track video generation status
Retry failed video generations
Webhook for video completion
Preview generated video

Implementation:

// Store video generation status
interface VideoGenerationJob {
  id: string;
  courseId: string;
  heygenVideoId: string;
  status: 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED';
  scriptUsed: string;
  videoUrl?: string;
  thumbnailUrl?: string;
  duration?: number;
  createdAt: Date;
  completedAt?: Date;
}

// Webhook endpoint for HeyGen callbacks
// POST /webhooks/heygen
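
Webhook deliveries can be duplicated or arrive out of order, so the handler should only move a job forward through its status values. The sketch below shows one possible transition table; the allowed transitions are an assumption about the desired workflow, not HeyGen's documented behavior, and the names are hypothetical.

```typescript
type VideoStatus = 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED';

// Which statuses each status may move to.
const ALLOWED_TRANSITIONS: Record<VideoStatus, VideoStatus[]> = {
  PENDING: ['PROCESSING', 'COMPLETED', 'FAILED'],
  PROCESSING: ['COMPLETED', 'FAILED'],
  COMPLETED: [],       // terminal
  FAILED: ['PENDING'], // a retry re-queues the job
};

// Returns the status to store after receiving `incoming`.
// Duplicate or out-of-order events leave the current status unchanged.
function nextVideoStatus(current: VideoStatus, incoming: VideoStatus): VideoStatus {
  if (current === incoming) {
    return current;
  }
  return ALLOWED_TRANSITIONS[current].indexOf(incoming) !== -1 ? incoming : current;
}
```
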

Estimated Effort: 3-4 days
Dependencies: HeyGen webhook configuration

12. Advanced Analytics Dashboard

Current State: Basic metrics displayed after generation.

Improvement: Comprehensive analytics for AI generation.

Metrics to Track:

Generation success rate
Average quality scores
Cost per course
Time to generate
Model usage breakdown
Popular categories
Admin usage patterns

Dashboard Features:

Time-series charts
Comparison between models
Export reports
Alert on anomalies
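
Several of the metrics above are simple aggregates over stored generation records. The following is a sketch of that aggregation; the `GenerationRecord` shape is hypothetical and would map onto whatever the analytics data collection stores.

```typescript
interface GenerationRecord {
  modelTier: string;
  succeeded: boolean;
  costUsd: number;
  durationSeconds: number;
}

interface GenerationSummary {
  total: number;
  successRate: number; // 0-1
  averageCostUsd: number;
  averageDurationSeconds: number;
  countByModel: Record<string, number>;
}

function summarizeGenerations(records: GenerationRecord[]): GenerationSummary {
  const total = records.length;
  const countByModel: Record<string, number> = {};
  let successes = 0;
  let costSum = 0;
  let durationSum = 0;

  for (const record of records) {
    if (record.succeeded) successes += 1;
    costSum += record.costUsd;
    durationSum += record.durationSeconds;
    countByModel[record.modelTier] = (countByModel[record.modelTier] || 0) + 1;
  }

  // Avoid dividing by zero when there are no records yet.
  const perRecord = (sum: number): number => (total === 0 ? 0 : sum / total);

  return {
    total,
    successRate: perRecord(successes),
    averageCostUsd: perRecord(costSum),
    averageDurationSeconds: perRecord(durationSum),
    countByModel,
  };
}
```
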

Estimated Effort: 5-6 days
Dependencies: Analytics data collection, charting library

13. Import Reference Materials

Current State: Reference materials are supplied as URLs or text descriptions.

Improvement: Upload and process reference documents.

Features:

Upload PDF, DOCX, TXT files
Extract text content
Use as context for generation
Cite sources in generated content

Implementation:

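// S3 upload + text extraction: read the uploaded file back and extract its text (AWS SDK v3 assumed)
import { S3 } from '@aws-sdk/client-s3';

const s3 = new S3({});
const REFERENCE_BUCKET = process.env.REFERENCE_BUCKET ?? ''; // assumption: bucket name comes from config

async function processReferenceMaterial(fileKey: string): Promise<string> {
  const file = await s3.getObject({ Bucket: REFERENCE_BUCKET, Key: fileKey });
  if (!file.Body) throw new Error(`Empty S3 object: ${fileKey}`);
  // Body is a stream in SDK v3, so read it fully before extracting text
  const bytes = Buffer.from(await file.Body.transformToByteArray());
  const key = fileKey.toLowerCase(); // also match .PDF / .DOCX

  if (key.endsWith('.pdf')) {
    return extractPdfText(bytes);
  } else if (key.endsWith('.docx')) {
    return extractDocxText(bytes);
  }

  return bytes.toString('utf-8');
}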
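
Extracted text still has to fit in the model's context alongside the prompt. A rough way to cap it is sketched below, assuming about four characters per token; that figure is a common rule of thumb for English text rather than a property of any specific tokenizer, and the function name is hypothetical.

```typescript
// Assumption: roughly 4 characters per token for English text.
const APPROX_CHARS_PER_TOKEN = 4;

// Trims reference text to an approximate token budget, preferring to end on a
// paragraph break when one falls in the second half of the allowed span.
function truncateToTokenBudget(text: string, maxTokens: number): string {
  const maxChars = maxTokens * APPROX_CHARS_PER_TOKEN;
  if (text.length <= maxChars) {
    return text;
  }
  const cut = text.slice(0, maxChars);
  const lastParagraphBreak = cut.lastIndexOf('\n\n');
  return lastParagraphBreak > maxChars / 2 ? cut.slice(0, lastParagraphBreak) : cut;
}
```

For long documents, chunking with retrieval would preserve more of the source than truncation; this sketch covers only the simplest case.
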

Estimated Effort: 3-4 days
Dependencies: Text extraction libraries

14. Localization Support

Current State: English only.

Improvement: Generate courses in multiple languages.

Features:

Select target language
Translate existing courses
Localized prompts
RTL language support

Languages to Support (Phase 1):

Spanish
French
German
Portuguese
Japanese

Estimated Effort: 4-5 days
Dependencies: Prompt localization, UI localization

15. Custom AI Avatars

Current State: Default HeyGen avatar.

Improvement: Allow custom avatars for intro videos.

Features:

Upload instructor photos for avatar
Select from avatar library
Custom voice selection
Brand-specific styling

Estimated Effort: 2-3 days
Dependencies: HeyGen avatar management

Implementation Roadmap

Phase 1 (Next Sprint)

Real-time Progress Updates (#1)
Retry Logic (#3)
Cost Estimation (#4)

Phase 2 (Following Sprint)

Lesson Content Generation (#2)
Template System (#9)
Content Quality Validation (#7)

Phase 3 (Future)

Multiple AI Model Support (#5)
Prompt Versioning (#6)
Video Queue Management (#11)

Phase 4 (Long-term)

Batch Generation (#8)
Analytics Dashboard (#12)
Collaborative Editing (#10)

Technical Debt Notes

The following items were noted during MVP implementation and should be cleaned up:

Hardcoded Categories: Move to database or API
Polling Interval: Make configurable
Error Messages: Improve user-facing error messages
Loading States: Add skeleton loaders
Form Validation: Add more comprehensive validation
Type Safety: Ensure all API responses are properly typed
Test Coverage: Add more integration tests
Logging: Improve structured logging
Caching: Add caching for frequently accessed data
Rate Limiting: Implement per-user rate limits

Security Considerations for Improvements

Content Moderation: Add AI-based content moderation for generated text
Audit Logging: Log all generation activities
Cost Limits: Set per-user generation limits
Input Sanitization: Validate all user inputs before sending to AI
Output Filtering: Filter any potentially harmful generated content
Access Control: Ensure proper RBAC for all new endpoints

Monitoring and Observability

Add these monitors post-MVP:

CloudWatch Alarms:
- High error rate (>5%)
- Slow generation time (>5 minutes)
- Bedrock throttling
- Step Functions failures
X-Ray Tracing:
- End-to-end request tracing
- Identify bottlenecks
- Cold start analysis
Custom Metrics:
- Generation success rate
- Average tokens per course
- Cost per generation
- Queue depth (if using SQS)
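
One low-overhead way to publish the custom metrics from Lambda is the CloudWatch Embedded Metric Format (EMF), where a structured JSON log line is turned into metrics without extra API calls. The sketch below builds such a line; the namespace, metric names and `GenerationOutcome` shape are hypothetical.

```typescript
interface GenerationOutcome {
  modelTier: string;
  succeeded: boolean;
  tokensUsed: number;
  costUsd: number;
}

// Builds one EMF log line. In Lambda, writing it to stdout (console.log)
// is enough for CloudWatch to extract the metrics.
function toEmfLogLine(outcome: GenerationOutcome, timestampMs: number): string {
  return JSON.stringify({
    _aws: {
      Timestamp: timestampMs,
      CloudWatchMetrics: [
        {
          Namespace: 'AICourseGenerator', // hypothetical namespace
          Dimensions: [['ModelTier']],
          Metrics: [
            { Name: 'GenerationSuccess', Unit: 'Count' },
            { Name: 'TokensUsed', Unit: 'Count' },
            { Name: 'CostUsd', Unit: 'None' },
          ],
        },
      ],
    },
    ModelTier: outcome.modelTier,
    GenerationSuccess: outcome.succeeded ? 1 : 0,
    TokensUsed: outcome.tokensUsed,
    CostUsd: outcome.costUsd,
  });
}
```

Because `GenerationSuccess` is emitted as 0 or 1, its average over a period is the generation success rate, and the high-error-rate alarm can be defined on that statistic.
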

Last Updated: December 2024
Document Version: 1.0