BC2: AI Resume Architecture

Status: CA-APPROVED (96%) | Date: 2026-03-07 | Business Case: BC2 AI Resume Optimizer

Architecture Approval

This architecture was reviewed and approved by the cloud-architect agent at 96% agreement. Gap modules are tracked at the bottom of this page.

Architecture Overview

Two ECS Fargate services on the same cluster, fronted by a single ALB from terraform-aws-web:

| Service | Size | Role |
| --- | --- | --- |
| resume-nextjs | 1 vCPU / 2 GB | AG-UI + Next.js 15 frontend |
| resume-api | 2 vCPU / 4 GB | CrewAI 5-agent crew + FastAPI |

ALB path routing:

  • /api/* → resume-api target group
  • /* → resume-nextjs target group

CrewAI 5-Agent Topology

| Agent | Role | Input | Output | Est. Tokens |
| --- | --- | --- | --- | --- |
| ResumeParserAgent | Extract structured data from PDF | PDF (via S3 presigned URL) | JSON resume object | ~2k |
| ScoringAgent | Score resume against job description | Resume JSON + JD text | Score 0–100 + gap list | ~3k |
| GapIdentifierAgent | Identify skill/experience gaps | Score output | Prioritized gap list | ~2k |
| CoverLetterWriter | Generate targeted cover letter | Resume + JD + gaps | Cover letter text | ~4k |
| InterviewPrepCoach | Prepare behavioral interview Q&A | Resume + JD + gaps | 10 Q&A pairs | ~5k |

Total per run (single-pass, no HITL rejection): ~16k tokens. Anthropic prompt caching on the JD system prompt reduces ScoringAgent cost by ~40%.
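The caching works by marking the large, reused JD system block as cacheable in the Anthropic Messages API request. A minimal sketch, assuming a helper like the `build_scoring_request()` below (the function name and model string are illustrative, not part of the actual ScoringAgent code):

```python
# Sketch: mark the job-description system prompt as cacheable so repeated
# ScoringAgent runs against the same JD hit Anthropic's prompt cache.
# build_scoring_request() and the model name are illustrative assumptions.

def build_scoring_request(jd_text: str, resume_json: str) -> dict:
    """Build an Anthropic Messages API payload with a cached JD system block."""
    return {
        "model": "claude-sonnet-4-5",
        "max_tokens": 1024,
        "system": [
            {"type": "text", "text": "You score resumes against a job description."},
            {
                "type": "text",
                "text": f"Job description:\n{jd_text}",
                # Everything up to and including this block is cached, so on
                # reruns only the resume tokens are billed at the full rate.
                "cache_control": {"type": "ephemeral"},
            },
        ],
        "messages": [
            {"role": "user", "content": f"Score this resume:\n{resume_json}"}
        ],
    }
```

The same payload shape is passed to `client.messages.create(**payload)`; the JD block stays constant across runs, which is what makes it cache-friendly.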

AG-UI Event Flow

Browser (Next.js 15)             FastAPI (AG-UI Adapter)          CrewAI (ECS)
         |                                |                            |
         |--- useCoAgent() -------------->|                            |
         |                                |--- start_crew() ---------->|
         |                                |                            |
         |<-- RUN_STARTED ----------------|<-- crew_started -----------|
         |<-- TOOL_CALL (parser) ---------|<-- agent_started(parser) --|
         |<-- TEXT_MESSAGE_CONTENT -------|<-- agent_output -----------|
         |                                |                            |
         |    [HITL: user approves]       |                            |
         |--- approve() ----------------->|--- resume_crew() --------->|
         |                                |                            |
         |<-- TOOL_CALL (scorer) ---------|<-- agent_started(scorer) --|
         |<-- STATE_DELTA (score:87) -----|<-- agent_output -----------|
         |            ...continues for each agent...                   |
         |<-- RUN_FINISHED ---------------|<-- crew_completed ---------|

The AG-UI useCoAgent() hook in Next.js 15 drives the real-time event stream. The FastAPI adapter translates CrewAI callbacks into AG-UI protocol events over SSE.
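The translation layer can be sketched as a generator that maps each CrewAI callback event to an AG-UI event and frames it for SSE. The crew-event shape (`{"kind": ..., "data": ...}`) is an assumption for illustration; the actual CrewAI callback payloads differ:

```python
import json
from typing import Iterator

# Sketch of the FastAPI-side adapter: CrewAI callback events in, AG-UI
# protocol events out as Server-Sent Events frames. The input event shape
# ({"kind": ..., "data": ...}) is an illustrative assumption.

CREW_TO_AGUI = {
    "crew_started": "RUN_STARTED",
    "agent_started": "TOOL_CALL",
    "agent_output": "TEXT_MESSAGE_CONTENT",
    "state_update": "STATE_DELTA",
    "crew_completed": "RUN_FINISHED",
}

def to_sse(crew_events: Iterator[dict]) -> Iterator[str]:
    """Yield one SSE frame per CrewAI event, tagged with its AG-UI type."""
    for event in crew_events:
        agui_type = CREW_TO_AGUI.get(event["kind"])
        if agui_type is None:
            continue  # drop events AG-UI has no mapping for
        payload = {"type": agui_type, **event.get("data", {})}
        yield f"data: {json.dumps(payload)}\n\n"
```

In FastAPI this generator would be wrapped in a `StreamingResponse(to_sse(...), media_type="text/event-stream")` so the browser's event stream stays open for the full crew run.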

ECS Task Sizing

| Service | vCPU | Memory | Min Replicas | Max Replicas | Scaling Trigger |
| --- | --- | --- | --- | --- | --- |
| resume-nextjs | 1 | 2 GB | 1 | 4 | Target 70% CPU |
| resume-api | 2 | 4 GB | 1 | 4 | Target 60% CPU |

Both services run on Graviton3 (ARM64), roughly 20% cheaper than the x86 equivalents. The resume-api minimum replica count of 1 ensures no cold-start delay for CrewAI crew initialization.

Aurora PG16 Schema

Resume state is persisted in Aurora Serverless v2 PG16. Two core tables:

CREATE TABLE resumes (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id     TEXT NOT NULL,
    s3_key      TEXT NOT NULL,
    parsed_json JSONB,
    created_at  TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE job_applications (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    resume_id       UUID REFERENCES resumes(id),
    job_description TEXT NOT NULL,
    score           INTEGER,
    gaps            JSONB,
    cover_letter    TEXT,
    interview_prep  JSONB,
    status          TEXT DEFAULT 'pending',
    created_at      TIMESTAMPTZ DEFAULT now()
);

parsed_json, gaps, and interview_prep use JSONB to allow flexible schema evolution without migrations as agent output formats mature.
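In practice that means agent output dicts are serialized straight into the JSONB columns, so a new field in an agent's output needs no migration. A minimal sketch, assuming a DB-API-style driver behind `db.execute()` (the helper and the fake UUID are illustrative):

```python
import json

# Sketch: agent outputs are serialized to JSON and bound as parameters;
# a future key in the gap objects (e.g. "confidence") lands in the gaps
# JSONB column with no ALTER TABLE. db.execute() stands in for whatever
# driver the service uses (e.g. psycopg).

UPDATE_SQL = """
UPDATE job_applications
   SET score = %s, gaps = %s::jsonb, status = 'scored'
 WHERE id = %s
"""

def persist_score(db, app_id: str, score: int, gaps: list[dict]) -> None:
    """Write ScoringAgent output into the job_applications row."""
    db.execute(UPDATE_SQL, (score, json.dumps(gaps), app_id))
```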

terraform-aws-web Dependency

BC2 uses the same terraform-aws-web module as BC1 for the ALB + listener rules. The target groups and path-based routing are configured via module inputs:

module "web" {
  source = "app.terraform.io/oceansoft/web/aws"

  vpc_id          = module.vpc.vpc_id
  subnet_ids      = module.vpc.public_subnet_ids
  certificate_arn = aws_acm_certificate.resume.arn

  target_groups = {
    nextjs = {
      port              = 3000
      protocol          = "HTTP"
      health_check_path = "/api/health"
    }
    api = {
      port              = 8000
      protocol          = "HTTP"
      health_check_path = "/health"
    }
  }

  listener_rules = [
    {
      priority         = 100
      path_pattern     = ["/api/*"]
      target_group_key = "api"
    }
  ]

  tags = {
    CostCenter  = "resume"
    Environment = "production"
    Project     = "ai-resume"
  }
}

The default listener (priority 1000) catches all other traffic and routes to the nextjs target group. The api listener rule at priority 100 takes precedence for /api/* paths.

S3 Document Storage

Resume PDFs are uploaded via S3 presigned URLs (direct browser PUT):

  • Avoids routing large PDF binaries through ECS memory
  • ResumeParserAgent reads via presigned GET URL — no ECS-to-S3 gateway required
  • S3 Intelligent-Tiering applied to the resume bucket for cost optimization on older documents

HITL Checkpoint

The CrewAI crew includes a HITL pause after ScoringAgent:

  1. User reviews the score (0–100) and gap list before the crew proceeds
  2. AG-UI renders an approve/reject UI via the TOOL_CALL event type
  3. Approve → crew continues to GapIdentifierAgent → CoverLetterWriter → InterviewPrepCoach
  4. Reject → returns user to resume edit flow; re-score on next submit

This checkpoint prevents downstream token spend (~11k tokens across the final three agents) when the initial score or gap identification is incorrect.
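The control flow of that pause can be sketched with an `asyncio.Event` in the FastAPI adapter: the crew task blocks after ScoringAgent until the approve/reject endpoint fires. CrewAI's own pause/resume hooks may differ; this shows only the gating mechanism:

```python
import asyncio

# Sketch of the HITL gate: the crew coroutine awaits the user's decision
# after ScoringAgent; /approve and /reject endpoints flip the event.
# Illustrative only -- not CrewAI's built-in human-input mechanism.

class HitlGate:
    def __init__(self) -> None:
        self._decided = asyncio.Event()
        self.rejected = False

    async def wait_for_user(self) -> bool:
        """Block the crew until the user decides; True means proceed."""
        await self._decided.wait()
        return not self.rejected

    def approve(self) -> None:          # called by POST /approve
        self._decided.set()

    def reject(self) -> None:           # called by POST /reject
        self.rejected = True
        self._decided.set()

async def demo() -> bool:
    gate = HitlGate()
    # Simulate the user clicking "approve" shortly after the score renders.
    asyncio.get_running_loop().call_later(0.01, gate.approve)
    return await gate.wait_for_user()
```

On `True` the adapter calls back into the crew to run the remaining three agents; on `False` it ends the run and the UI returns to the resume edit flow.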

Cost Model

| Environment | Infra | AI (per run) | Total |
| --- | --- | --- | --- |
| LOCAL-DEV | $0 | $0 (Ollama) | $0/month |
| TEST/SIT | Variable — run task plan:cost | Variable | Run task plan:cost |
| PROD | Variable — run task plan:cost | Variable | Run task plan:cost |

Cost Estimation

Cloud environment costs depend on resume volume, replica counts, and Claude API token usage per run. Run task plan:cost against the target environment tfvars for a current Infracost estimate.

Cost Optimizations Applied

  • Anthropic prompt caching on JD system prompt → ~40% ScoringAgent token cost reduction
  • S3 presigned URL upload → zero ECS bandwidth charge for PDF ingestion
  • CrewAI crew runs only on demand — no idle compute between requests
  • Graviton3 ARM64 on all ECS tasks → ~20% cheaper vs x86 equivalent

Gap Modules Needed

The following modules are required before BC2 reaches production readiness:

| Gap Module | Purpose | Priority |
| --- | --- | --- |
| terraform-aws-aurora-serverless | Resume + job application state persistence | High — blocks production |
| terraform-aws-s3-document-store | Resume PDF lifecycle management (upload policy, Intelligent-Tiering, presigned URL TTL) | High — blocks PDF ingestion |
| MCP server: linkedin-scraper | Job description extraction from LinkedIn postings | Medium |
| MCP server: job-board-aggregator | Multi-board job search (Seek, Indeed, LinkedIn) | Medium |

Shared Gap with BC1

terraform-aws-aurora-serverless is a gap blocker for both BC1 and BC2. Building it once satisfies both business cases.