BC2: AI Resume Architecture
Status: CA-APPROVED (96%) | Date: 2026-03-07 | Business Case: BC2 AI Resume Optimizer
This architecture was reviewed and approved by the cloud-architect agent at 96% agreement. Gap modules are tracked at the bottom of this page.
Architecture Overview
Two ECS Fargate services on the same cluster, fronted by a single ALB from terraform-aws-web:
| Service | Size | Role |
|---|---|---|
| resume-nextjs | 1 vCPU / 2 GB | AG-UI + Next.js 15 frontend |
| resume-api | 2 vCPU / 4 GB | CrewAI 5-agent crew + FastAPI |
ALB path routing:
- `/api/*` → `resume-api` target group
- `/*` → `resume-nextjs` target group
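The routing precedence can be sketched as a tiny resolver: the more specific `/api/*` rule wins, and everything else falls through to the default rule. This is illustrative only, not how the ALB evaluates rules internally:

```python
def resolve_target_group(path: str) -> str:
    """Mirror the ALB listener rules above: /api/* routes to the API
    service; all other paths fall through to the Next.js frontend."""
    if path.startswith("/api/"):
        return "resume-api"
    return "resume-nextjs"
```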
CrewAI 5-Agent Topology
| Agent | Role | Input | Output | Est. Tokens |
|---|---|---|---|---|
| ResumeParserAgent | Extract structured data from PDF | PDF (via S3 presigned URL) | JSON resume object | ~2k |
| ScoringAgent | Score resume against job description | Resume JSON + JD text | Score 0–100 + gap list | ~3k |
| GapIdentifierAgent | Identify skill/experience gaps | Score output | Prioritized gap list | ~2k |
| CoverLetterWriter | Generate targeted cover letter | Resume + JD + gaps | Cover letter text | ~4k |
| InterviewPrepCoach | Prepare behavioral interview Q&A | Resume + JD + gaps | 10 Q&A pairs | ~5k |
Total per run (single-pass, no HITL rejection): ~16k tokens. Anthropic prompt caching on the JD system prompt reduces ScoringAgent cost by ~40%.
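The per-run budget above can be sanity-checked with a small estimator. Note a simplification: prompt caching actually reduces the billed price of cached input tokens rather than the token count itself; the ~40% figure is modeled here as a flat saving on ScoringAgent for illustration:

```python
# Per-agent token estimates from the topology table above.
AGENT_TOKENS = {
    "ResumeParserAgent": 2_000,
    "ScoringAgent": 3_000,
    "GapIdentifierAgent": 2_000,
    "CoverLetterWriter": 4_000,
    "InterviewPrepCoach": 5_000,
}

def run_token_estimate(cache_hit: bool = False,
                       scoring_cache_saving: float = 0.40) -> int:
    """Single-pass total; a JD prompt-cache hit trims ~40% off
    ScoringAgent (modeled as a token reduction for simplicity)."""
    total = 0
    for agent, tokens in AGENT_TOKENS.items():
        if cache_hit and agent == "ScoringAgent":
            tokens = int(tokens * (1 - scoring_cache_saving))
        total += tokens
    return total
```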
AG-UI Event Flow
Browser (Next.js 15) FastAPI (AG-UI Adapter) CrewAI (ECS)
| | |
|--- useCoAgent() ----------->| |
| |--- start_crew() ----------->|
| | |
|<-- RUN_STARTED -------------|<-- crew_started ------------|
|<-- TOOL_CALL (parser) ------|<-- agent_started(parser) ---|
|<-- TEXT_MESSAGE_CONTENT ----|<-- agent_output ------------|
| | |
| [HITL: user approves] | |
|--- approve() --------------->|--- resume_crew() ---------->|
| | |
|<-- TOOL_CALL (scorer) ------|<-- agent_started(scorer) ---|
|<-- STATE_DELTA (score:87) --|<-- agent_output ------------|
| ...continues for each agent... |
|<-- RUN_FINISHED ------------|<-- crew_completed ----------|
The AG-UI useCoAgent() hook in Next.js 15 drives the real-time event stream. The FastAPI adapter translates CrewAI callbacks into AG-UI protocol events over SSE.
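The adapter's translation step can be sketched as a serializer from an event name plus payload into an SSE frame. The event names follow the diagram above; the exact AG-UI wire format (field names, envelope shape) is an assumption here:

```python
import json

def to_sse(event_type: str, payload: dict) -> str:
    """Serialize one AG-UI-style event as a Server-Sent Events frame.
    Each frame is an `event:` line, a `data:` line with a JSON body,
    and a blank line terminator."""
    body = json.dumps({"type": event_type, **payload}, separators=(",", ":"))
    return f"event: {event_type}\ndata: {body}\n\n"
```

In FastAPI this would typically be yielded from an async generator behind a `StreamingResponse` with `media_type="text/event-stream"`.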
ECS Task Sizing
| Service | vCPU | Memory | Min Replicas | Max Replicas | Scaling Trigger |
|---|---|---|---|---|---|
| resume-nextjs | 1 | 2 GB | 1 | 4 | Target 70% CPU |
| resume-api | 2 | 4 GB | 1 | 4 | Target 60% CPU |
Both services run on Graviton3 (ARM64), roughly 20% cheaper than equivalent x86 tasks. The resume-api minimum of 1 replica ensures no cold-start delay for CrewAI crew initialization.
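The target-tracking behavior in the table can be approximated as a proportional scaling step clamped to the replica bounds. This is a rough illustration, not the actual ECS/CloudWatch target-tracking algorithm:

```python
import math

def desired_replicas(current: int, cpu_pct: float, target: float,
                     min_r: int = 1, max_r: int = 4) -> int:
    """Approximate target tracking: scale replica count proportionally
    to observed CPU vs the target, clamped to the min/max bounds from
    the sizing table above."""
    if cpu_pct <= 0:
        return min_r
    desired = math.ceil(current * cpu_pct / target)
    return max(min_r, min(max_r, desired))
```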
Aurora PG16 Schema
Resume state is persisted in Aurora Serverless v2 PG16. Two core tables:
CREATE TABLE resumes (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT NOT NULL,
s3_key TEXT NOT NULL,
parsed_json JSONB,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE job_applications (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
resume_id UUID REFERENCES resumes(id),
job_description TEXT NOT NULL,
score INTEGER,
gaps JSONB,
cover_letter TEXT,
interview_prep JSONB,
status TEXT DEFAULT 'pending',
created_at TIMESTAMPTZ DEFAULT now()
);
parsed_json, gaps, and interview_prep use JSONB to allow flexible schema evolution without migrations as agent output formats mature.
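As a concrete illustration of that flexibility, a reader can tolerate multiple generations of the `gaps` payload without a schema migration. Both payload shapes below are hypothetical examples, not the agents' actual output format:

```python
def top_gaps(gaps_jsonb: dict, limit: int = 3) -> list[str]:
    """Read the `gaps` JSONB column tolerantly: an early agent version
    might emit a bare list of skill strings, a later one a list of
    objects carrying a priority field. Normalize both, highest
    priority first."""
    normalized = []
    for item in gaps_jsonb.get("gaps", []):
        if isinstance(item, str):
            normalized.append((0, item))  # legacy shape: plain string
        else:
            normalized.append((item.get("priority", 0), item["skill"]))
    normalized.sort(key=lambda pair: -pair[0])
    return [skill for _, skill in normalized[:limit]]
```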
terraform-aws-web Dependency
BC2 uses the same terraform-aws-web module as BC1 for the ALB + listener rules. The target groups and path-based routing are configured via module inputs:
module "web" {
source = "app.terraform.io/oceansoft/web/aws"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.public_subnet_ids
certificate_arn = aws_acm_certificate.resume.arn
target_groups = {
nextjs = {
port = 3000
protocol = "HTTP"
health_check_path = "/api/health"
}
api = {
port = 8000
protocol = "HTTP"
health_check_path = "/health"
}
}
listener_rules = [
{
priority = 100
path_pattern = ["/api/*"]
target_group_key = "api"
}
]
tags = {
CostCenter = "resume"
Environment = "production"
Project = "ai-resume"
}
}
The default listener (priority 1000) catches all other traffic and routes to the nextjs target group. The api listener rule at priority 100 takes precedence for /api/* paths.
S3 Document Storage
Resume PDFs are uploaded via S3 presigned URLs (direct browser PUT):
- Avoids routing large PDF binaries through ECS memory
- `ResumeParserAgent` reads via a presigned GET URL — no ECS-to-S3 gateway required
- S3 Intelligent-Tiering is applied to the resume bucket for cost optimization on older documents
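Generating the upload URL server-side is a one-liner against the boto3 S3 client. The bucket/key naming and the 15-minute TTL are assumptions for illustration:

```python
def presigned_upload_url(s3_client, bucket: str, key: str,
                         ttl_seconds: int = 900) -> str:
    """Return a presigned PUT URL so the browser uploads the PDF
    directly to S3, bypassing ECS entirely. `s3_client` is a boto3
    S3 client (injected for testability)."""
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key, "ContentType": "application/pdf"},
        ExpiresIn=ttl_seconds,
    )
```

The browser then issues a `PUT` of the raw PDF bytes to the returned URL with `Content-Type: application/pdf`.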
HITL Checkpoint
The CrewAI crew includes a HITL pause after ScoringAgent:
- User reviews the score (0–100) and gap list before the crew proceeds
- AG-UI renders an approve/reject UI via the `TOOL_CALL` event type
- Approve → crew continues to `GapIdentifierAgent` → `CoverLetterWriter` → `InterviewPrepCoach`
- Reject → returns the user to the resume edit flow; re-score on next submit
This checkpoint prevents downstream token spend (~11k tokens across the final three agents) when the initial score or gap identification is incorrect.
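The checkpoint reduces to a two-branch gate; the return values (next action, committed token budget) are illustrative, not the actual CrewAI resume API:

```python
# Downstream budget: GapIdentifier (~2k) + CoverLetterWriter (~4k)
# + InterviewPrepCoach (~5k), per the topology table.
DOWNSTREAM_TOKENS = 11_000

def checkpoint(decision: str) -> tuple[str, int]:
    """HITL gate after ScoringAgent: approve resumes the crew and
    commits the downstream token budget; reject sends the user back
    to the edit flow with no further spend."""
    if decision == "approve":
        return ("resume_crew", DOWNSTREAM_TOKENS)
    if decision == "reject":
        return ("edit_resume", 0)
    raise ValueError(f"unknown decision: {decision}")
```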
Cost Model
| Environment | Infra | AI (per run) | Total |
|---|---|---|---|
| LOCAL-DEV | $0 | $0 (Ollama) | $0/month |
| TEST/SIT | Variable — run `task plan:cost` | Variable | Run `task plan:cost` |
| PROD | Variable — run `task plan:cost` | Variable | Run `task plan:cost` |
Cloud environment costs depend on resume volume, replica counts, and Claude API token usage per run. Run `task plan:cost` against the target environment tfvars for a current Infracost estimate.
Cost Optimizations Applied
- Anthropic prompt caching on the JD system prompt → ~40% `ScoringAgent` token cost reduction
- S3 presigned URL upload → zero ECS bandwidth charge for PDF ingestion
- CrewAI crew runs only on demand — no idle compute between requests
- Graviton3 ARM64 on all ECS tasks → ~20% cheaper vs x86 equivalent
Gap Modules Needed
The following modules are required before BC2 reaches production readiness:
| Gap Module | Purpose | Priority |
|---|---|---|
| terraform-aws-aurora-serverless | Resume + job application state persistence | High — blocks production |
| terraform-aws-s3-document-store | Resume PDF lifecycle management (upload policy, Intelligent-Tiering, presigned URL TTL) | High — blocks PDF ingestion |
| MCP server: linkedin-scraper | Job description extraction from LinkedIn postings | Medium |
| MCP server: job-board-aggregator | Multi-board job search (Seek, Indeed, LinkedIn) | Medium |
terraform-aws-aurora-serverless is a gap blocker for both BC1 and BC2. Building it once satisfies both business cases.