APEX Skills — Agentic Platform Engineering eXperience
Curated EKS platform engineering skills that compress onboarding from months to weeks. Domain knowledge authored by senior AWS SSAs, delivered through agentic AI tools (Claude Code, Kiro CLI, etc.).
APEX combines agentic AI (frontier models and an agent harness such as Claude Code) with curated "skills" to give engineers SSA-grade platform engineering output.
Agent Skills are organized folders of instructions, scripts, and resources that frontier LLMs can discover and load dynamically to perform specialized tasks. By codifying expert platform engineering knowledge as Agent Skills, we amplify best practices and scale them across teams while reducing toil. They follow the Agent Skills open standard and are compatible with any supported agent harness.
What's in This Repo
sample-apex-skills/
├── skills/ → 📚 Domain knowledge (EKS best practices, Terraform, skill creation)
├── steering/ → 🎯 Guided workflows (optional — structured engagement playbooks)
├── examples/ → 🏗️ Hands-on exercises (deployable labs with planted issues)
└── misc/ → 🔧 Maintenance and tooling
    ├── evals/ → 🧪 Per-skill evaluation inputs (triggering + task prompts)
    └── (scripts) → Sync skills from sources, update cross-references
| Directory | Purpose | Think of it as... |
|---|---|---|
| skills/ | What the agent knows: reusable domain knowledge | An expert's brain |
| steering/ | How the agent runs an engagement: slash commands, questionnaires, checkpoints, routing | A senior SA's playbook |
| examples/ | How to try it: deploy, run APEX against it, see results | A workshop lab |
| misc/ | Maintenance tooling and per-skill evaluation inputs | The toolbox |
Key principle: Skills provide the knowledge. Steering provides the structure.
Quick Start
Option A: Just the Skills
Use the skills with any agent that supports the Agent Skills standard. Skills are self-contained — clone and point your tool at them.
    git clone https://github.com/aws-samples/sample-apex-skills.git
    cd sample-apex-skills
Each skill lives in skills/{skill-name}/ with a SKILL.md (frontmatter + instructions) and optional references/, scripts/, and assets/ directories. See skills/README.md for details.
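To make the layout above concrete, here is a minimal sketch of a check that walks a skills directory and reports whether each skill folder has a SKILL.md that opens with a frontmatter delimiter. The function name `check_skill_layout` is hypothetical and not part of this repo, and the check assumes frontmatter starts with a `---` line at the top of the file.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the repo): list every skill folder
# under a skills directory and flag any without a usable SKILL.md.
check_skill_layout() {
  local skills_dir="${1:-skills}"
  local status=0
  local dir name
  for dir in "$skills_dir"/*/; do
    [ -d "$dir" ] || continue            # glob matched nothing
    name="$(basename "$dir")"
    if [ ! -f "${dir}SKILL.md" ]; then
      echo "MISSING  $name (no SKILL.md)"
      status=1
    elif [ "$(head -n 1 "${dir}SKILL.md")" != "---" ]; then
      # Assumption: frontmatter begins with '---' on the first line.
      echo "WARN     $name (SKILL.md does not start with frontmatter)"
      status=1
    else
      echo "OK       $name"
    fi
  done
  return "$status"
}
```

From the repo root, `check_skill_layout skills` prints one line per skill and returns non-zero if any folder is flagged.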
Option B: Skills + Steering (Guided Experience)
For a structured engagement experience — where the agent follows a questionnaire, enforces checkpoints, and validates output quality — add the steering files.
Claude Code
    git clone https://github.com/aws-samples/sample-apex-skills.git
    cd sample-apex-skills

    # One-time setup: symlink skills, steering, and commands into .claude/
    mkdir -p .claude/skills .claude/commands
    for skill in skills/*/; do ln -sfn "../../$skill" ".claude/skills/$(basename "$skill")"; done
    ln -sfn ../../steering/commands/apex .claude/commands/apex
    ln -sfn ../steering .claude/steering

    # Make steering available at a fixed absolute path for slash commands
    ln -sfn "$(pwd)/steering" ~/.claude/apex-steering
Claude Code walks up to the git root to find .claude/, so commands work from any subdirectory in the repo. The ~/.claude/apex-steering symlink gives slash commands an absolute path to load steering files instantly.
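Because the setup relies on relative symlinks, a dangling link is the most likely failure. The following is a small sketch of a sanity check, assuming only standard POSIX `test` flags; `verify_links` is a hypothetical name, not something the repo provides.

```bash
#!/usr/bin/env bash
# Hypothetical check: confirm each argument is a symlink whose target exists.
verify_links() {
  local link status=0
  for link in "$@"; do
    if [ -L "$link" ] && [ -e "$link" ]; then
      echo "ok      $link"
    elif [ -L "$link" ]; then
      echo "broken  $link"               # symlink exists, target does not
      status=1
    else
      echo "absent  $link"               # missing, or not a symlink
      status=1
    fi
  done
  return "$status"
}
```

Run from the repo root after the setup, a call such as `verify_links .claude/skills/* .claude/commands/apex .claude/steering "$HOME/.claude/apex-steering"` should report `ok` for every entry.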
Usage:
- Start a Claude Code session from anywhere in the repo
- Use slash commands:
  - /apex:eks -- hub that auto-routes based on your request
  - /apex:eks-design -- "Help me design an EKS cluster"
  - /apex:eks-upgrade -- "Upgrade my cluster from 1.30 to 1.33"
Kiro CLI
    git clone https://github.com/aws-samples/sample-apex-skills.git
    cd sample-apex-skills

    # Skills: symlink into .kiro/skills/
    mkdir -p .kiro/skills
    for skill in skills/*/; do
      name=$(basename "$skill")
      ln -sfn "../../skills/$name" ".kiro/skills/$name"
    done

    # Steering: copy for Kiro IDE slash commands
    mkdir -p .kiro/steering
    cp steering/eks.md .kiro/steering/eks.md
Usage:
    kiro-cli chat
    # Then in the session:
    /model claude-opus-4.5
    /context add steering/eks.md
    # Ask: "Help me design an EKS cluster"
Skills Reference
This table is auto-generated by misc/update-skills-references.sh. Do not edit manually.
| Skill | What It Covers |
|---|---|
| eks-best-practices | Use this skill whenever someone is making an Amazon EKS design, architecture, or configuration decision — even phrased casually as "how should we set up...", "what's the right way to...", "should we use X or Y", "we're about to redesign/consolidate/migrate...", or "is this reasonable?". Covers compute strategy (Karpenter, MNG, Fargate, Auto Mode, self-managed), multi-tenant platform design and tenant isolation (namespaces, node pools, RBAC, network policies, quotas), VPC/IP planning, ingress, IAM/Pod Identity/IRSA, pod security, PDBs and reliability, upgrade strategy (in-place vs blue-green), cost (Spot, Graviton, consolidation), autoscaling, and observability. Also triggers for Terraform with terraform-aws-modules/terraform-aws-eks (access entries, addons, node groups, IRSA). Trigger even if "best practice" is never said — any EKS planning or architectural judgment call qualifies. Skip for step-by-step upgrade execution (eks-upgrader) or pure Kubernetes questions unrelated to EKS. |
| eks-mcp-server | Set up and configure the EKS MCP Server for live cluster operations. Use this skill when the user wants to interact with real EKS clusters (list clusters, read K8s resources, troubleshoot pods, deploy workloads, check upgrade insights) but MCP tools are not available or not working. Also activate if user mentions "eks mcp", "mcp server", or asks how to connect their AI assistant to EKS. |
| eks-recon | EKS cluster reconnaissance and environment discovery. Detects compute strategy (Karpenter, MNG, Auto Mode, Fargate), IaC tooling (Terraform, CloudFormation, CDK, eksctl), CI/CD pipelines (GitHub Actions, GitLab, ArgoCD, Flux), add-on inventory, networking, security posture, and observability. Use this skill whenever someone asks about their EKS cluster, wants to understand their setup, is planning an upgrade or migration, needs cluster context for any reason, asks "what version am I running", mentions wanting to review or document their cluster, or is about to make any EKS-related decision - even if they don't explicitly say "reconnaissance" or "discovery". When in doubt about cluster state, run recon first. |
| eks-upgrader | EKS cluster upgrade companion. Add-on compatibility matrices, upgrade procedures (in-place and blue-green), and component-specific guidance for Karpenter, Istio, and other EKS add-ons and ecosystem controllers (CoreDNS, kube-proxy, VPC CNI, ingress controllers, cluster-autoscaler). Use when planning or executing an EKS version upgrade, checking add-on compatibility, or troubleshooting upgrade issues. |
| skill-creator | Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy. |
| steering-workflow-creator | Author a new steering workflow for any AWS service and pair it with a matching slash-command shim. Use when the user asks to create a steering workflow, add a workflow to apex, standardize steering, write a new workflow for EKS / RDS / Lambda / IAM / any AWS service, or build a phased playbook that plugs into a service hub. Covers the convention (frontmatter, header block, required sections), tool routing (knowledge vs. live MCP vs. setup-bridge), and the lint pass before handoff. |
| terraform-skill | Use when working with Terraform or OpenTofu - creating modules, writing tests (native test framework, Terratest), setting up CI/CD pipelines, reviewing configurations, choosing between testing approaches, debugging state issues, implementing security scanning (trivy, checkov), or making infrastructure-as-code architecture decisions |
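
As a rough illustration of how a row in the table above relates to a skill's frontmatter, the sketch below builds one row from a SKILL.md file. It is not the actual misc/update-skills-references.sh script. It assumes the frontmatter carries single-line `name:` and `description:` fields, and `skill_table_row` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical sketch, not the repo's generator script.
# Assumption: SKILL.md has single-line 'name:' and 'description:' fields.
skill_table_row() {
  local file="$1" name desc
  # Take the first line starting with each key and strip the key itself.
  name="$(sed -n 's/^name:[[:space:]]*//p' "$file" | head -n 1)"
  desc="$(sed -n 's/^description:[[:space:]]*//p' "$file" | head -n 1)"
  printf '| %s | %s |\n' "$name" "$desc"
}
```

Looping this over `skills/*/SKILL.md` would produce rows in the same two-column shape; multi-line YAML descriptions would need a real YAML parser instead of `sed`.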
Steering (Optional)
This table is auto-generated by misc/update-steering-references.sh. Do not edit manually.
| Steering File | Description |
|---|---|
| apex | APEX meta hub. Routes contributor requests about the repo itself — adding a new skill, authoring a new steering workflow, and other maintenance actions that are not tied to a specific AWS service. |
| eks | EKS platform engineering hub. Routes to design and upgrade workflows. Use as the entry point for any EKS-related request. |
| design | Day 0 architecture design workflow. 8-phase questionnaire for EKS cluster design, architecture reviews, and option comparisons. |
| new-skill | Meta contributor workflow. Onboards a new skill end-to-end — scope intake, optional skill-creator drafting, sibling-graph survey, repo fan-out diff, eval scaffold and finalization, and baseline PR prep. Bimodal — greenfield authoring or retrofit on an existing skill that skipped the process. |
| upgrade | Day 2 upgrade workflow. Pre-flight validation, upgrade planning, guided execution with checkpoints, and post-upgrade validation. |
Slash Commands (Claude Code)
| Command | Description |
|---|---|
| /apex:eks | EKS platform engineering hub. Routes to design or upgrade workflows based on your request. Use for any EKS-related task -- architecture design, cluster upgrades, reviews, comparisons, or general EKS questions. |
| /apex:eks-design | Design a new EKS cluster architecture. 8-phase questionnaire covering compute, networking, security, observability, cost, reliability, and multi-tenancy. Also handles architecture reviews and option comparisons. |
| /apex:eks-upgrade | Plan and execute an EKS cluster upgrade. Pre-flight validation, Terraform or CLI path detection, step-by-step execution with checkpoints, and post-upgrade validation. Supports in-place and blue-green strategies. |
| /apex:new-skill | Onboard a new skill end-to-end — draft it, survey siblings, fan out the repo edits, scaffold and finalize the eval set, and baseline the scorecard. Bimodal — greenfield authoring or retrofit on an existing skill. |
Steering files control how the agent runs an engagement — they don't contain domain knowledge (that's in skills), but define the interaction pattern. The hub (eks.md) is the entry point — it detects what the user wants and routes to the appropriate workflow. Each workflow follows a structured sequence with checkpoints and STOP gates. The commands/ directory provides agent-harness-specific entry points (e.g., Claude Code slash commands) that map to the hub and workflows.
The key test: If you removed all steering files, would the agent still know the right answers? Yes — skills provide the knowledge. But the agent wouldn't know how to run the engagement.
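The relationship between slash commands and steering files can be summarized as a lookup. The sketch below only mirrors the two tables above for illustration: in practice the agent does the routing by reading the steering files, and `steering_for_command` is a hypothetical name that does not exist in the repo.

```bash
#!/usr/bin/env bash
# Illustrative only: maps each Claude Code slash command to the steering
# file (by table name) that it enters. Not part of the repo.
steering_for_command() {
  case "$1" in
    /apex:eks)         echo "eks" ;;        # hub, routes onward
    /apex:eks-design)  echo "design" ;;     # Day 0 design workflow
    /apex:eks-upgrade) echo "upgrade" ;;    # Day 2 upgrade workflow
    /apex:new-skill)   echo "new-skill" ;;  # contributor workflow
    *)                 echo "unknown command: $1" >&2; return 1 ;;
  esac
}
```

For example, `steering_for_command /apex:eks-upgrade` prints `upgrade`, while an unlisted command returns a non-zero status.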
Examples
This table is auto-generated by misc/update-examples-references.sh. Do not edit manually.
| Example | Description | Workflow |
|---|---|---|
| In-Place EKS Upgrade | Deploy an EKS 1.32 cluster with planted issues and upgrade to 1.33 using the APEX EKS upgrade workflow. Covers deprecated API detection, blocking PDB remediation, and Terraform-aware upgrades. | upgrade |
Contributing
See CONTRIBUTING.md for guidelines on:
- Where new content goes (skills vs steering vs examples)
- How to create a new skill
- How to create a new steering workflow
- How to create a new example
- How to add evals for a new skill
Sources
All best-practices content is sourced from public documentation, primarily from AWS:
- AWS EKS Best Practices Guide
- AWS Prescriptive Guidance — HA and Resiliency for EKS
- AWS Well-Architected Framework
- terraform-aws-modules/terraform-aws-eks
- ArgoCD Documentation
- EKS Workshop
- AWS EKS Capabilities
License
This project is licensed under the MIT-0 License. See the LICENSE file.