Decide on AWS Multiple Accounts and Organizational Units

When setting up your AWS infrastructure, you need to decide how to organize your workloads across multiple AWS accounts to ensure optimal isolation and management. This involves deciding the appropriate account structure and organizational units (OUs) that align with your operational needs and security requirements.

Context and Problem Statement

The AWS Well-Architected Framework recommends splitting workloads across multiple AWS accounts.

When moving to an infrastructure-as-code (IaC) model of infrastructure provisioning, many of the best practices that apply to regular software development should also apply to IaC. One of them is not making changes to a production environment that haven't been tested in a staging environment. If the production and staging environments are in the same account, there are insufficient protections in place to prevent breaking production.

Constructs like VPCs provide only network-level isolation, not IAM-level isolation. Within a single AWS account, there’s no practical way to manage IAM-level boundaries between multiple stages like dev/staging/prod. For example, provisioning most Terraform modules requires “administrative” level access, because provisioning any IAM roles requires admin privileges. That would mean a developer needs to be an “admin” in order to iterate on a module.

Leveraging multiple AWS accounts within an AWS Organization is the only way to satisfy these requirements. Guardrails can be put in place to restrict what can happen in an account and by whom.

We must decide how to organize the flat account structure into organizational units. Organizational units can then leverage things like Service Control Policies to restrict what can happen inside the accounts.

Multiple AWS accounts should be used to provide a higher degree of isolation by segmenting workloads. There is no additional cost for operating multiple AWS accounts. It does add management overhead, as a standard set of components must be provisioned to manage each account. AWS support only applies to one account, so it may need to be purchased for each account unless the organization upgrades to Enterprise Support.

Multiple AWS accounts are all managed underneath an AWS Organization and organized into multiple organizational units (OUs). Service Control Policies can restrict what runs in an account and place boundaries around an account that even account-level administrators cannot bypass.
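To make the idea of a Service Control Policy concrete, here is a minimal sketch of one common guardrail: denying member accounts the ability to leave the organization. The function name and `Sid` are hypothetical and chosen for illustration; the policy would still need to be attached to an OU or account through AWS Organizations, which is not shown here.

```python
import json


def deny_leave_organization_scp() -> dict:
    """Build an SCP document that stops member accounts from leaving the organization.

    The Sid is an illustrative label; the action name is the standard
    AWS Organizations action for leaving an organization.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyLeaveOrganization",
                "Effect": "Deny",
                "Action": ["organizations:LeaveOrganization"],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON that would be supplied as the policy content.
    print(json.dumps(deny_leave_organization_scp(), indent=2))
```

Because an SCP sets the maximum permissions available in the accounts it is attached to, an explicit `Deny` like this one applies even to account-level administrators.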

Considered Options

AWS Well-Architected Account Designations

Here are some common account designations. Not all are required.

This is our recommended approach.

Management Account
Infrastructure OU
Security OU
Exceptions OU
Workloads OU

👑 Core-Root Management Account (parent, billing): `aws-admin` | `core-root`

The "root" account creates all child accounts. The root account has unique abilities not available in any other account.

An administrator in the root account can, by default, assume the OrganizationAccountAccessRole in all other accounts (admin access).
Service Control Policies can only be set in this account.
It’s the only account that can have member accounts associated with it.
It’s the only account that can manage the AWS Organization.
Organizational CloudTrails can only be provisioned in this account.

ou-shared-services & Shared Services Accounts

| OU | Environment | Account Alias | Root Email | Description |
| --- | --- | --- | --- | --- |
| `management account` | prod | aws-admin | [email protected] | Management account + Centralized identity and user management |
| Shared Services | prod | aws-shared-services | [email protected] | Centralized networking (Transit Gateway, VPCs) |
| Centralized Ops | prod | aws-centralized-ops | [email protected] | Centralized operational services |
| Shared Services | prod | aws-backup | [email protected] | Backup services |

⛑️ Log archive account:

inbound-outbound-sec acts as a consolidation point for log data gathered from all the accounts in the organization. It is primarily used by your security, operations, audit, and compliance teams.
- VPC Flow Logs
- Amazon Security Lake

⛑️ Security Tooling (Audit) accounts: `aws-audit`

Provides centralized delegated admin access to AWS security tooling and consoles, as well as view-only access for investigative purposes into all accounts in the organization. The security tooling account should be restricted to authorized security and compliance personnel and related security.

ou-security OU & Security Accounts

| OU | Environment | Account Alias | Root Email | Description |
| --- | --- | --- | --- | --- |
| Security (ou-security) | prod | aws-log-archive | [email protected] | Centralized logging (CloudTrail, AWS Config) |
| Security (ou-security) | prod | aws-audit | [email protected] | Security tooling (Security Hub, GuardDuty) |

🧑‍🎄 Service Control Policies and scrutiny: `aws-poc`

ou-exceptions & Exceptions Accounts

| OU | Environment | Account Alias | Root Email | Description |
| --- | --- | --- | --- | --- |
| Exceptions | prod | aws-exception | [email protected] | Accounts requiring policy exceptions |

🤠 DataHub Platform Workloads/Applications account:

ou-prod 🥇 plat-prod: The "production" account is where you run your most mission-critical applications.
ou-non-prod 🥈 plat-sit | plat-staging: The "staging" account, used for System Integration Testing (SIT), is where QA and integration tests will run for public consumption. This is production for QA engineers and partners doing integration tests. It must be stable for third parties to test. It runs a Kubernetes cluster.
ou-non-prod 🥈 plat-dev: The "dev" account is where to run automated tests, load tests, and infrastructure code. This is where the entire engineering organization operates daily. It needs to be stable for developers. This environment is production for developers writing code.
ou-non-prod 🥈 plat-uat: Additional or alternative platform accounts.
ou-sandbox 🥉 plat-sandbox: The "sandbox" account is where you let your developers have fun and break things. Developers get admin. This is where changes happen first. It will be used by developers who need the bleeding edge. Only DevOps engineers work here, or developers trying to get net-new applications added to tools like slice.

🤠 `data-prod`, `data-staging`, `data-dev`

The "data" account is where the quants live =) It runs systems like Airflow, JupyterHub, batch processing, Redshift, etc.

ou-applications & Workload (Applications) Accounts

| OU | Environment | Account Alias | Root Email | Description |
| --- | --- | --- | --- | --- |
| Applications (ou-applications) | prod | aws-apps-prod | [email protected] | Production workloads |
| Applications (ou-applications) | non-prod | --- | [email protected] | Pre-production workloads (SIT and UAT) |
| Applications (ou-applications) | sandbox | aws-sandbox | [email protected] | Dev/Test workloads (Development, QA, Test) |
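The OU tables above can be summarized as a small data model, which is handy for reviewing the layout or feeding it into provisioning tooling. This is only a sketch: the `Account` class and `accounts_by_ou` helper are hypothetical names, and the non-prod applications account is omitted because the table gives it no alias.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Account:
    """One row of the OU tables: alias, owning OU, and environment."""
    alias: str
    ou: str
    environment: str


# Rows copied from the tables above (root emails left out).
ACCOUNTS = [
    Account("aws-admin", "management account", "prod"),
    Account("aws-shared-services", "Shared Services", "prod"),
    Account("aws-centralized-ops", "Centralized Ops", "prod"),
    Account("aws-backup", "Shared Services", "prod"),
    Account("aws-log-archive", "Security", "prod"),
    Account("aws-audit", "Security", "prod"),
    Account("aws-exception", "Exceptions", "prod"),
    Account("aws-apps-prod", "Applications", "prod"),
    Account("aws-sandbox", "Applications", "sandbox"),
]


def accounts_by_ou(accounts: Iterable[Account]) -> Dict[str, List[str]]:
    """Group account aliases by OU, preserving the order of the input."""
    grouped: Dict[str, List[str]] = {}
    for account in accounts:
        grouped.setdefault(account.ou, []).append(account.alias)
    return grouped


if __name__ == "__main__":
    for ou, aliases in accounts_by_ou(ACCOUNTS).items():
        print(f"{ou}: {', '.join(aliases)}")
```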

AWS Control-Tower-Compatible Multi-Organization, Multi-Account Naming Convention

The recommended approach for email addresses uses subaddressing (also called plus addressing, see RFC 5233), which makes it easy to manage a unique root email address per account without creating multiple mailboxes:

| Account Alias | Recommended Root Email (subaddressed) | Email Example | Description |
| --- | --- | --- | --- |
| `aws-admin` | [email protected] | [email protected] | `management account` Central identity and access management account |
| aws-shared-services | [email protected] | [email protected] | Central network management (Transit Gateway), File/Print Servers |
| aws-centralized-ops | [email protected] | [email protected] | Central operations management account |
| aws-backup | [email protected] | [email protected] | Backup services account |
| aws-log-archive | [email protected] | [email protected] | Centralized logs account (CloudTrail, Config logs) |
| aws-audit | [email protected] | [email protected] | Security tools management (GuardDuty, Security Hub) |
| aws-apps-prod | [email protected] | [email protected] | Production applications account |
| --- | [email protected] | [email protected] | Non-production applications account |
| aws-sandbox | [email protected] | [email protected] | Sandbox environment |
| aws-exception | [email protected] | [email protected] | Account requiring security policy exceptions |
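As an illustration of the subaddressing convention, the sketch below derives a root email from a shared mailbox and an account alias. The mailbox `aws` and the domain `example.com` are placeholders, not values from this document, and the `+` separator is assumed to be the one your mail provider supports.

```python
def root_email(mailbox: str, domain: str, account_alias: str) -> str:
    """Return a plus-addressed root email such as mailbox+alias@domain.

    All mail still lands in the single `mailbox@domain` inbox, while each
    AWS account gets the unique address it requires.
    """
    local_part = f"{mailbox}+{account_alias}"
    # RFC 5321 limits the local part of an address to 64 octets.
    if len(local_part.encode("utf-8")) > 64:
        raise ValueError(f"local part too long: {local_part!r}")
    return f"{local_part}@{domain}"


if __name__ == "__main__":
    for alias in ("aws-admin", "aws-audit", "aws-sandbox"):
        print(root_email("aws", "example.com", alias))
```

Generating the addresses from the alias keeps the two in sync, so an address in the inbox immediately identifies the account it belongs to.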

TODO:
- The core-identity account is where to add users and delegate access to the other accounts and is where users log in.
  - IAM Identity Center
- The core-audit account is where all logs end up
- The core-security account is where to run automated security scanning software that might operate in a read-only fashion against the audit account.
- The core-dns account is the owner for all zones (may have a legal role with Route53Registrar.* permissions). Cannot touch zones or anything else. Includes billing.
  - Example use-case: Legal team needs to manage DNS and it’s easier to give them access to an account specific to DNS rather than multiple sets of resources.
- The core-automation account is where any gitops automation will live. Some automation (like Spacelift) has “god” mode in this account.
  - The network account will typically have transit gateway access to all other accounts, therefore we want to limit what is deployed in the automation account to only those services which need it.
- The core-artifacts account is where we recommend centralizing and storing artifacts (e.g. ECR, assets, etc.) for CI/CD.
- The core-public account is for public S3 buckets, public ECRs, public AMIs, anything public. This will be the only account that doesn’t have an SCP that blocks public S3 buckets.
  - Use-cases: All S3 buckets are private by default using an SCP in every account except for the public account.
- The “$tenant” account is a symbolic account representing a dedicated account environment. Its architecture will likely resemble prod.
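The "public account is the only exception" rule from the list above can be sketched as follows. This is one possible approach, not necessarily the one intended here: it assumes S3 Block Public Access is already enabled at the account level, and uses an SCP only to stop anyone from changing that setting. The helper name `scp_targets` and the sample account list are hypothetical.

```python
import json
from typing import Iterable, List

# SCP that prevents changes to the account-level S3 Block Public Access
# settings. It does not enable the settings itself (assumed done elsewhere).
DENY_S3_PUBLIC_ACCESS_CHANGES = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyChangingS3PublicAccessBlock",
            "Effect": "Deny",
            "Action": ["s3:PutAccountPublicAccessBlock"],
            "Resource": "*",
        }
    ],
}


def scp_targets(accounts: Iterable[str], exempt: Iterable[str] = ("core-public",)) -> List[str]:
    """Return the accounts that should receive the SCP, skipping exempt ones."""
    exempt_set = set(exempt)
    return [name for name in accounts if name not in exempt_set]


if __name__ == "__main__":
    accounts = ["core-identity", "core-audit", "core-public", "core-artifacts"]
    print("Attach to:", scp_targets(accounts))
    print(json.dumps(DENY_S3_PUBLIC_ACCESS_CHANGES, indent=2))
```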

note

It is advised to keep the names of accounts as short as possible, because account names are often embedded in the names of resources that have low maximum character limits (see AWS Resources Limitations).
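A quick way to see why short account names matter is to check composed resource names against a length limit. The sketch below assumes a hypothetical naming scheme that joins name parts with hyphens, and uses the 32-character limit on Elastic Load Balancing target group names as an example; other resource types have different limits.

```python
# Example limit: ELB target group names may be at most 32 characters.
TARGET_GROUP_NAME_MAX = 32


def compose_name(*parts: str, delimiter: str = "-") -> str:
    """Join name parts (for example namespace, account, stage, component)."""
    return delimiter.join(parts)


def fits(name: str, limit: int = TARGET_GROUP_NAME_MAX) -> bool:
    """Return True if the name is within the given character limit."""
    return len(name) <= limit


if __name__ == "__main__":
    for candidate in (
        compose_name("acme", "plat", "prod", "api"),
        compose_name("acme", "platform", "production", "payments-api"),
    ):
        status = "ok" if fits(candidate) else "too long"
        print(f"{candidate} ({len(candidate)} chars): {status}")
```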

Multi-Account (Production, Staging, Dev)

Not recommended because there’s not enough isolation.

Strict, enforceable boundaries between multiple environments (aka stages) at the IAM layer
Ability to create a release process whereby we stage changes in one account before applying them to the next account
Ability to grant developers administrative access to sandbox account (dev) so that they can develop/iterate on IAM policies. These policies then are committed as code and submitted as part of a Pull Request, where they get code reviewed.
API limits are scoped to an account. A bug in staging can't take out production.
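The release process described above, where a change is staged in one account before being applied to the next, can be expressed as a simple gate. The promotion order below is an assumption for illustration (sandbox, then dev, then staging, then prod); the function name is hypothetical.

```python
from typing import Iterable

# Assumed promotion order; adjust to match the accounts actually in use.
PROMOTION_ORDER = ["sandbox", "dev", "staging", "prod"]


def can_apply(target: str, already_applied: Iterable[str]) -> bool:
    """Allow a change in `target` only after every earlier stage has it."""
    applied = set(already_applied)
    index = PROMOTION_ORDER.index(target)  # raises ValueError for unknown stages
    return all(stage in applied for stage in PROMOTION_ORDER[:index])


if __name__ == "__main__":
    print(can_apply("staging", {"sandbox", "dev"}))
    print(can_apply("prod", {"sandbox", "dev"}))
```

With separate accounts per stage, a gate like this can be enforced by the pipeline's credentials rather than by convention, since each stage requires assuming a role in a different account.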

Single-Account Strategy (Production = Staging = Dev) - NOT RECOMMENDED

Editing live IAM permissions in the mono account is the equivalent of "cowboy coding" in production; we don't do this with our software, so we should not do this with our infrastructure.
No strict separation between stages; copying and pasting infrastructure could accidentally lead to catastrophic outcomes
Very difficult to write/manage complex IAM policies (especially without a staging organization!)
No way to grant someone IAM permissions to create/manage policies while also restricting access to other production resources using IAM policies. This makes it very slow/tedious for developers to work on AWS and puts all the burden to develop IAM policies on a select few individuals, which often leads to a bottleneck
VPCs only provide network-level isolation. We need IAM level isolation.
AWS API limits are at the account level. A bug in staging/dev can directly DoS production services.

References

Here are some great videos for context
