BC2: AI Resume Architecture

Status: CA-APPROVED (96%) | Date: 2026-03-07 | Business Case: BC2 AI Resume Optimizer

Architecture Approval

This architecture was reviewed and approved by the cloud-architect agent at 96% agreement. Gap modules are tracked at the bottom of this page.

Architecture Overview

Two ECS Fargate services on the same cluster, fronted by a single ALB from terraform-aws-web:

| Service | Size | Role |
| --- | --- | --- |
| resume-nextjs | 1 vCPU / 2 GB | AG-UI + Next.js 15 frontend |
| resume-api | 2 vCPU / 4 GB | CrewAI 5-agent crew + FastAPI |

ALB path routing:

  • /api/* → resume-api target group
  • /* → resume-nextjs target group

CrewAI 5-Agent Topology

| Agent | Role | Input | Output | Est. Tokens |
| --- | --- | --- | --- | --- |
| ResumeParserAgent | Extract structured data from PDF | PDF (via S3 presigned URL) | JSON resume object | ~2k |
| ScoringAgent | Score resume against job description | Resume JSON + JD text | Score 0–100 + gap list | ~3k |
| GapIdentifierAgent | Identify skill/experience gaps | Score output | Prioritized gap list | ~2k |
| CoverLetterWriter | Generate targeted cover letter | Resume + JD + gaps | Cover letter text | ~4k |
| InterviewPrepCoach | Prepare behavioral interview Q&A | Resume + JD + gaps | 10 Q&A pairs | ~5k |

Total per run (single-pass, no HITL rejection): ~16k tokens. Anthropic prompt caching on the JD system prompt reduces ScoringAgent cost by ~40%.
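The caching works by marking the large, reused JD system block as cacheable in the Anthropic Messages API request. A minimal sketch, assuming a helper like the `build_scoring_request()` below (the function name and model string are illustrative, not part of the actual ScoringAgent code):

```python
# Sketch: mark the job-description system prompt as cacheable so repeated
# ScoringAgent runs against the same JD hit Anthropic's prompt cache.
# build_scoring_request() and the model name are illustrative assumptions.

def build_scoring_request(jd_text: str, resume_json: str) -> dict:
    """Build an Anthropic Messages API payload with a cached JD system block."""
    return {
        "model": "claude-sonnet-4-5",
        "max_tokens": 1024,
        "system": [
            {"type": "text", "text": "You score resumes against a job description."},
            {
                "type": "text",
                "text": f"Job description:\n{jd_text}",
                # Everything up to and including this block is cached, so on
                # reruns only the resume tokens are billed at the full rate.
                "cache_control": {"type": "ephemeral"},
            },
        ],
        "messages": [
            {"role": "user", "content": f"Score this resume:\n{resume_json}"}
        ],
    }
```

The same payload shape is passed to `client.messages.create(**payload)`; the JD block stays constant across runs, which is what makes it cache-friendly.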

AG-UI Event Flow

Browser (Next.js 15)             FastAPI (AG-UI Adapter)          CrewAI (ECS)
         |                                |                            |
         |--- useCoAgent() -------------->|                            |
         |                                |--- start_crew() ---------->|
         |                                |                            |
         |<-- RUN_STARTED ----------------|<-- crew_started -----------|
         |<-- TOOL_CALL (parser) ---------|<-- agent_started(parser) --|
         |<-- TEXT_MESSAGE_CONTENT -------|<-- agent_output -----------|
         |                                |                            |
         |    [HITL: user approves]       |                            |
         |--- approve() ----------------->|--- resume_crew() --------->|
         |                                |                            |
         |<-- TOOL_CALL (scorer) ---------|<-- agent_started(scorer) --|
         |<-- STATE_DELTA (score:87) -----|<-- agent_output -----------|
         |            ...continues for each agent...                   |
         |<-- RUN_FINISHED ---------------|<-- crew_completed ---------|

The AG-UI useCoAgent() hook in Next.js 15 drives the real-time event stream. The FastAPI adapter translates CrewAI callbacks into AG-UI protocol events over SSE.
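The translation layer can be sketched as a generator that maps each CrewAI callback event to an AG-UI event and frames it for SSE. The crew-event shape (`{"kind": ..., "data": ...}`) is an assumption for illustration; the actual CrewAI callback payloads differ:

```python
import json
from typing import Iterator

# Sketch of the FastAPI-side adapter: CrewAI callback events in, AG-UI
# protocol events out as Server-Sent Events frames. The input event shape
# ({"kind": ..., "data": ...}) is an illustrative assumption.

CREW_TO_AGUI = {
    "crew_started": "RUN_STARTED",
    "agent_started": "TOOL_CALL",
    "agent_output": "TEXT_MESSAGE_CONTENT",
    "state_update": "STATE_DELTA",
    "crew_completed": "RUN_FINISHED",
}

def to_sse(crew_events: Iterator[dict]) -> Iterator[str]:
    """Yield one SSE frame per CrewAI event, tagged with its AG-UI type."""
    for event in crew_events:
        agui_type = CREW_TO_AGUI.get(event["kind"])
        if agui_type is None:
            continue  # drop events AG-UI has no mapping for
        payload = {"type": agui_type, **event.get("data", {})}
        yield f"data: {json.dumps(payload)}\n\n"
```

In FastAPI this generator would be wrapped in a `StreamingResponse(to_sse(...), media_type="text/event-stream")` so the browser's event stream stays open for the full crew run.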

ECS Task Sizing

| Service | vCPU | Memory | Min Replicas | Max Replicas | Scaling Trigger |
| --- | --- | --- | --- | --- | --- |
| resume-nextjs | 1 | 2 GB | 1 | 4 | Target 70% CPU |
| resume-api | 2 | 4 GB | 1 | 4 | Target 60% CPU |

Both services run on Graviton3 (ARM64), roughly 20% cheaper than the x86 equivalents. The resume-api minimum replica count of 1 ensures no cold-start delay for CrewAI crew initialization.

Aurora PG16 Schema

Resume state is persisted in Aurora Serverless v2 PG16. Two core tables:

CREATE TABLE resumes (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id     TEXT NOT NULL,
    s3_key      TEXT NOT NULL,
    parsed_json JSONB,
    created_at  TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE job_applications (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    resume_id       UUID REFERENCES resumes(id),
    job_description TEXT NOT NULL,
    score           INTEGER,
    gaps            JSONB,
    cover_letter    TEXT,
    interview_prep  JSONB,
    status          TEXT DEFAULT 'pending',
    created_at      TIMESTAMPTZ DEFAULT now()
);

parsed_json, gaps, and interview_prep use JSONB to allow flexible schema evolution without migrations as agent output formats mature.
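In practice that means agent output dicts are serialized straight into the JSONB columns, so a new field in an agent's output needs no migration. A minimal sketch, assuming a DB-API-style driver behind `db.execute()` (the helper and the fake UUID are illustrative):

```python
import json

# Sketch: agent outputs are serialized to JSON and bound as parameters;
# a future key in the gap objects (e.g. "confidence") lands in the gaps
# JSONB column with no ALTER TABLE. db.execute() stands in for whatever
# driver the service uses (e.g. psycopg).

UPDATE_SQL = """
UPDATE job_applications
   SET score = %s, gaps = %s::jsonb, status = 'scored'
 WHERE id = %s
"""

def persist_score(db, app_id: str, score: int, gaps: list[dict]) -> None:
    """Write ScoringAgent output into the job_applications row."""
    db.execute(UPDATE_SQL, (score, json.dumps(gaps), app_id))
```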

terraform-aws-web Dependency

BC2 uses the same terraform-aws-web module as BC1 for the ALB + listener rules. The target groups and path-based routing are configured via module inputs:

module "web" {
  source = "app.terraform.io/oceansoft/web/aws"

  vpc_id          = module.vpc.vpc_id
  subnet_ids      = module.vpc.public_subnet_ids
  certificate_arn = aws_acm_certificate.resume.arn

  target_groups = {
    nextjs = {
      port              = 3000
      protocol          = "HTTP"
      health_check_path = "/api/health"
    }
    api = {
      port              = 8000
      protocol          = "HTTP"
      health_check_path = "/health"
    }
  }

  listener_rules = [
    {
      priority         = 100
      path_pattern     = ["/api/*"]
      target_group_key = "api"
    }
  ]

  tags = {
    CostCenter  = "resume"
    Environment = "production"
    Project     = "ai-resume"
  }
}

The default listener (priority 1000) catches all other traffic and routes to the nextjs target group. The api listener rule at priority 100 takes precedence for /api/* paths.

S3 Document Storage

Resume PDFs are uploaded via S3 presigned URLs (direct browser PUT):

  • Avoids routing large PDF binaries through ECS memory
  • ResumeParserAgent reads via presigned GET URL — no ECS-to-S3 gateway required
  • S3 Intelligent-Tiering applied to the resume bucket for cost optimization on older documents

HITL Checkpoint

The CrewAI crew includes a HITL pause after ScoringAgent:

  1. User reviews the score (0–100) and gap list before the crew proceeds
  2. AG-UI renders an approve/reject UI via the TOOL_CALL event type
  3. Approve → crew continues to GapIdentifierAgent → CoverLetterWriter → InterviewPrepCoach
  4. Reject → returns user to resume edit flow; re-score on next submit

This checkpoint prevents downstream token spend (~11k tokens across the final three agents) when the initial score or gap identification is incorrect.
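The control flow of that pause can be sketched with an `asyncio.Event` in the FastAPI adapter: the crew task blocks after ScoringAgent until the approve/reject endpoint fires. CrewAI's own pause/resume hooks may differ; this shows only the gating mechanism:

```python
import asyncio

# Sketch of the HITL gate: the crew coroutine awaits the user's decision
# after ScoringAgent; /approve and /reject endpoints flip the event.
# Illustrative only -- not CrewAI's built-in human-input mechanism.

class HitlGate:
    def __init__(self) -> None:
        self._decided = asyncio.Event()
        self.rejected = False

    async def wait_for_user(self) -> bool:
        """Block the crew until the user decides; True means proceed."""
        await self._decided.wait()
        return not self.rejected

    def approve(self) -> None:          # called by POST /approve
        self._decided.set()

    def reject(self) -> None:           # called by POST /reject
        self.rejected = True
        self._decided.set()

async def demo() -> bool:
    gate = HitlGate()
    # Simulate the user clicking "approve" shortly after the score renders.
    asyncio.get_running_loop().call_later(0.01, gate.approve)
    return await gate.wait_for_user()
```

On `True` the adapter calls back into the crew to run the remaining three agents; on `False` it ends the run and the UI returns to the resume edit flow.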

Cost Model

| Environment | Infra | AI (per run) | Total |
| --- | --- | --- | --- |
| LOCAL-DEV | $0 | $0 (Ollama) | $0/month |
| TEST/SIT | Variable — run task plan:cost | Variable | Run task plan:cost |
| PROD | Variable — run task plan:cost | Variable | Run task plan:cost |

Cost Estimation

Cloud environment costs depend on resume volume, replica counts, and Claude API token usage per run. Run task plan:cost against the target environment tfvars for a current Infracost estimate.

Cost Optimizations Applied

  • Anthropic prompt caching on JD system prompt → ~40% ScoringAgent token cost reduction
  • S3 presigned URL upload → zero ECS bandwidth charge for PDF ingestion
  • CrewAI crew runs only on demand — no idle compute between requests
  • Graviton3 ARM64 on all ECS tasks → ~20% cheaper vs x86 equivalent

Gap Modules Needed

The following modules are required before BC2 reaches production readiness:

| Gap Module | Purpose | Priority |
| --- | --- | --- |
| terraform-aws-aurora-serverless | Resume + job application state persistence | High — blocks production |
| terraform-aws-s3-document-store | Resume PDF lifecycle management (upload policy, Intelligent-Tiering, presigned URL TTL) | High — blocks PDF ingestion |
| MCP server: linkedin-scraper | Job description extraction from LinkedIn postings | Medium |
| MCP server: job-board-aggregator | Multi-board job search (Seek, Indeed, LinkedIn) | Medium |

Shared Gap with BC1

terraform-aws-aurora-serverless is a gap blocker for both BC1 and BC2. Building it once satisfies both business cases.