coder
diff --git a/.github/workflows/pre-commit-hooks.yml b/.github/workflows/pre-commit-hooks.yml
Lines changed: 3 additions & 3 deletions
diff --git a/.github/workflows/secret-scanning.yml b/.github/workflows/secret-scanning.yml
Lines changed: 3 additions & 3 deletions
diff --git a/.github/workflows/terraform-apply.yml b/.github/workflows/terraform-apply.yml
Lines changed: 5 additions & 5 deletions
diff --git a/.github/workflows/terraform-destroy.yml b/.github/workflows/terraform-destroy.yml
Lines changed: 1 addition & 1 deletion
diff --git a/.github/workflows/terraform-plan.yml b/.github/workflows/terraform-plan.yml
Lines changed: 3 additions & 3 deletions
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
Lines changed: 4 additions & 4 deletions
diff --git a/docs/cost-optimization-strategy.md b/docs/cost-optimization-strategy.md
Lines changed: 130 additions & 0 deletions
diff --git a/infra/aws/us-east-2/README.md b/infra/aws/us-east-2/README.md
Lines changed: 7 additions & 0 deletions
@@ -6,8 +6,8 @@ name: Pre-commit Validation
 on:
   pull_request:
     paths:
-      - '.pre-commit-config.yaml'
-      - '.github/workflows/pre-commit-hooks.yml'
+      - ".pre-commit-config.yaml"
+      - ".github/workflows/pre-commit-hooks.yml"
 
 jobs:
   validate-pre-commit:
@@ -19,7 +19,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v4
         with:
-          python-version: '3.11'
+          python-version: "3.11"
 
       - name: Install pre-commit
         run: |
 
@@ -7,8 +7,8 @@ on:
   push:
     branches:
       - main
-      - 'feature/**'
-      - 'fix/**'
+      - "feature/**"
+      - "fix/**"
 
 permissions:
   contents: write
@@ -23,7 +23,7 @@ jobs:
       - name: Checkout code
         uses: actions/checkout@v4
         with:
-          fetch-depth: 0  # Fetch all history for accurate scanning
+          fetch-depth: 0 # Fetch all history for accurate scanning
 
       - name: Run Gitleaks
         uses: gitleaks/gitleaks-action@v2
 
@@ -5,13 +5,13 @@ on:
     branches:
       - main
     paths:
-      - 'infra/aws/**/*.tf'
-      - 'infra/aws/**/*.tfvars'
-      - '.github/workflows/terraform-*.yml'
+      - "infra/aws/**/*.tf"
+      - "infra/aws/**/*.tfvars"
+      - ".github/workflows/terraform-*.yml"
   workflow_dispatch:
     inputs:
       module:
-        description: 'Specific module to apply (leave empty for all changed)'
+        description: "Specific module to apply (leave empty for all changed)"
         required: false
         type: string
 
@@ -65,7 +65,7 @@ jobs:
       matrix:
         module: ${{ fromJson(needs.detect-changes.outputs.modules) }}
       fail-fast: false
-      max-parallel: 1  # Apply modules one at a time to avoid conflicts
+      max-parallel: 1 # Apply modules one at a time to avoid conflicts
     defaults:
       run:
         working-directory: ${{ matrix.module }}
 
@@ -4,7 +4,7 @@ on:
   workflow_dispatch:
     inputs:
       module:
-        description: 'Module to destroy (e.g., infra/aws/us-east-2/eks)'
+        description: "Module to destroy (e.g., infra/aws/us-east-2/eks)"
         required: true
         type: string
       confirm:
 
@@ -5,9 +5,9 @@ on:
     branches:
       - main
     paths:
-      - 'infra/aws/**/*.tf'
-      - 'infra/aws/**/*.tfvars'
-      - '.github/workflows/terraform-*.yml'
+      - "infra/aws/**/*.tf"
+      - "infra/aws/**/*.tfvars"
+      - ".github/workflows/terraform-*.yml"
 
 permissions:
   contents: read
 
@@ -17,13 +17,13 @@ repos:
         exclude: '\.md$'
       - id: end-of-file-fixer
       - id: check-yaml
-        args: ['--unsafe']  # Allow custom YAML tags
+        args: ["--unsafe"] # Allow custom YAML tags
       - id: check-added-large-files
-        args: ['--maxkb=1000']
+        args: ["--maxkb=1000"]
       - id: check-merge-conflict
       - id: detect-private-key
       - id: detect-aws-credentials
-        args: ['--allow-missing-credentials']
+        args: ["--allow-missing-credentials"]
 
   # Terraform
   - repo: https://github.com/antonbabenko/pre-commit-terraform
@@ -47,7 +47,7 @@ repos:
     rev: v4.5.0
     hooks:
       - id: no-commit-to-branch
-        args: ['--branch', 'main', '--branch', 'master']
+        args: ["--branch", "main", "--branch", "master"]
         stages: [commit]
 
 # Global settings
 
@@ -0,0 +1,130 @@
+# Cost Optimization Strategy for Coder Demo
+
+## Mixed Capacity Approach
+
+### Node Group Strategy
+
+**System Nodes (ON_DEMAND)**
+
+- **Purpose**: Run critical Kubernetes infrastructure
+- **Workloads**: CoreDNS, kube-proxy, metrics-server, cert-manager, AWS LB Controller
+- **Size**: t4g.medium (ARM Graviton)
+- **Count**: 1-2 nodes minimum
+- **Cost**: ~$24/month (1 node) to $48/month (2 nodes)
+
+**Application Nodes (MIXED: 20% On-Demand, 80% Spot via Karpenter)**
+
+- **Purpose**: Run Coder server and workspaces
+- **Spot Savings**: 70-90% cost reduction
+- **Interruption Risk**: Mitigated by:
+  - Multiple instance types (diversified Spot pools)
+  - Karpenter auto-rebalancing
+  - Pod Disruption Budgets
+
+### Karpenter NodePool Configuration
+
+#### 1. Coder Server NodePool (ON_DEMAND Priority)
+
+```yaml
+# Illustrative shorthand, not the literal Karpenter NodePool schema
+capacity_type: ["on-demand", "spot"] # Prefer On-Demand, fallback to Spot
+weight:
+  on-demand: 100 # Higher priority
+  spot: 10
+```
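+
+One way this preference might be expressed in real manifests, as a minimal sketch assuming Karpenter v1 (`karpenter.sh/v1`), where `weight` is a single integer per NodePool: two pools, with the On-Demand pool weighted higher so it is considered first. The pool names and the `default` EC2NodeClass are hypothetical.
+
+```yaml
+# Sketch only: pool names and the EC2NodeClass reference are assumptions.
+apiVersion: karpenter.sh/v1
+kind: NodePool
+metadata:
+  name: coder-server-on-demand
+spec:
+  weight: 100 # Higher weight: considered first
+  template:
+    spec:
+      nodeClassRef:
+        group: karpenter.k8s.aws
+        kind: EC2NodeClass
+        name: default
+      requirements:
+        - key: karpenter.sh/capacity-type
+          operator: In
+          values: ["on-demand"]
+---
+apiVersion: karpenter.sh/v1
+kind: NodePool
+metadata:
+  name: coder-server-spot
+spec:
+  weight: 10 # Lower weight: used when the pool above cannot provision
+  template:
+    spec:
+      nodeClassRef:
+        group: karpenter.k8s.aws
+        kind: EC2NodeClass
+        name: default
+      requirements:
+        - key: karpenter.sh/capacity-type
+          operator: In
+          values: ["spot"]
+```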
+
+#### 2. Coder Workspace NodePool (SPOT Priority)
+
+```yaml
+# Illustrative shorthand, not the literal Karpenter NodePool schema
+capacity_type: ["spot", "on-demand"] # Prefer Spot, fallback to On-Demand
+weight:
+  spot: 100 # Higher priority
+  on-demand: 10
+```
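+
+For the Spot-first case a single pool may be enough: when a NodePool allows both capacity types, Karpenter prioritizes Spot and falls back to On-Demand. The sketch below assumes Karpenter v1; the pool name, the `default` EC2NodeClass, and the choice of instance families are hypothetical.
+
+```yaml
+# Sketch only: name, EC2NodeClass, and instance families are assumptions.
+apiVersion: karpenter.sh/v1
+kind: NodePool
+metadata:
+  name: coder-workspaces
+spec:
+  template:
+    spec:
+      nodeClassRef:
+        group: karpenter.k8s.aws
+        kind: EC2NodeClass
+        name: default
+      requirements:
+        # Both types allowed: Spot is tried first, On-Demand is the fallback
+        - key: karpenter.sh/capacity-type
+          operator: In
+          values: ["spot", "on-demand"]
+        # Several families widen the set of Spot pools Karpenter can draw from
+        - key: karpenter.k8s.aws/instance-family
+          operator: In
+          values: ["t4g", "t3a", "m6g", "m6a"]
+        # Mixing Graviton and x86 families requires multi-arch images
+        - key: kubernetes.io/arch
+          operator: In
+          values: ["arm64", "amd64"]
+```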
+
+### Risk Mitigation
+
+**Spot Interruption Handling:**
+
+1. **2-minute warning** → Karpenter automatically provisions replacement
+2. **Multiple instance types** → Diversifying across 15+ types spreads requests over more Spot pools, lowering the interruption rate (target: <1%)
+3. **Pod Disruption Budgets** → Ensures minimum replicas always running
+4. **Karpenter interruption handling** → Cordons and drains the affected node so pods reschedule before termination (consolidation separately repacks underused nodes)
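+
+A Pod Disruption Budget for the Coder server could look like the sketch below. It assumes at least two server replicas and a hypothetical `app.kubernetes.io/name: coder` label; adjust the selector to match the labels the deployment actually uses.
+
+```yaml
+# Sketch only: the name and selector label are assumptions.
+apiVersion: policy/v1
+kind: PodDisruptionBudget
+metadata:
+  name: coder-server
+spec:
+  minAvailable: 1 # Node drains may not take the last running replica
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: coder
+```
+
+With a single replica, `minAvailable: 1` would block voluntary drains entirely, so the budget is only useful once the service runs more than one pod.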
+
+**Example Instance Type Diversity:**
+
+```
+Spot Pool: t4g.medium, t4g.large, t3a.medium, t3a.large,
+           m6g.medium, m6g.large, m6a.large, m6a.xlarge
+```
+
+### Cost Breakdown
+
+| Component          | Instance Type | Capacity  | Monthly Cost  |
+| ------------------ | ------------- | --------- | ------------- |
+| System Nodes (2)   | t4g.medium    | ON_DEMAND | $48           |
+| Coder Server (2)   | t4g.large     | 80% SPOT  | $28 (vs $140) |
+| Workspaces (avg 5) | t4g.xlarge    | 90% SPOT  | $75 (vs $750) |
+| **Total**          |               | **Mixed** | **$151/mo**   |
+
+**vs All On-Demand:** $938/month → **84% savings**
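+
+The totals follow directly from the table rows; the arithmetic, using only the figures above, is:
+
+```math
+\begin{aligned}
+\text{Mixed} &= 48 + 28 + 75 = 151 \\
+\text{All On-Demand} &= 48 + 140 + 750 = 938 \\
+\text{Savings} &= \frac{938 - 151}{938} = \frac{787}{938} \approx 0.84
+\end{aligned}
+```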
+
+### Dynamic Scaling
+
+**Low Usage (nights/weekends):**
+
+- Scale to zero workspaces
+- Keep 1 system node + 1 Coder server node
+- Cost: ~$48/month during idle
+
+**High Usage (business hours):**
+
+- Auto-scale workspaces on Spot
+- Karpenter provisions nodes in <60 seconds
+- Cost: ~$150-200/month during peak
+
+### Monitoring & Alerts
+
+**CloudWatch Alarms:**
+
+- Spot interruption rate > 5%
+- Available On-Demand capacity < 20%
+- Karpenter provisioning failures
+
+**Response:**
+
+- Automatic fallback to On-Demand
+- Email alerts to ops team
+- Karpenter adjusts instance type mix
+
+## Implementation Timeline
+
+1. ✅ Deploy EKS with ON_DEMAND system nodes
+2. ⏳ Deploy Karpenter
+3. ⏳ Configure mixed-capacity NodePools
+4. ⏳ Deploy Coder with node affinity rules
+5. ⏳ Test Spot interruption handling
+6. ⏳ Enable auto-scaling policies
+
+## Fallback Plan
+
+If Spot becomes unreliable (rare):
+
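+1. Update the Karpenter NodePool to 100% On-Demand
+2. Apply the change: `kubectl apply -f nodepool-ondemand.yaml`
+3. Karpenter gracefully migrates pods to the new nodes
+4. Expect ~5 minutes; multi-replica services protected by Pod Disruption Budgets stay available throughout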
+
+## Best Practices
+
+✅ **DO:**
+
+- Use multiple Spot instance types (10+)
+- Set Pod Disruption Budgets
+- Monitor Spot interruption rates
+- Test failover regularly
+
+❌ **DON'T:**
+
+- Run databases on Spot (use RDS)
+- Use Spot for single-replica critical services
+- Rely on a single instance type for Spot
@@ -7,6 +7,7 @@ This directory uses remote S3 backend for state management, but **backend config
 ## Local Setup
 
 1. **Get backend configuration from a teammate** or **retrieve it from AWS**:
+
    ```bash
    # Get S3 bucket name (it contains the account ID)
    aws s3 ls | grep terraform-state
@@ -24,6 +25,7 @@ This directory uses remote S3 backend for state management, but **backend config
    ```
 
    Create `backend.tf`:
+
    ```hcl
    terraform {
      backend "s3" {
@@ -62,6 +64,7 @@ These are configured in: Repository Settings > Secrets and variables > Actions
 Instead of creating backend.tf, you can use a config file:
 
 1. Create `backend.conf` (gitignored):
+
    ```
    bucket         = "YOUR-BUCKET-NAME"
    dynamodb_table = "YOUR-TABLE-NAME"
@@ -86,12 +89,14 @@ Instead of creating backend.tf, you can use a config file:
 This repository has automated secret scanning to prevent accidental exposure of credentials:
 
 ### GitHub Actions (Automated)
+
 - **Gitleaks** - Scans every PR and push for secrets
 - **TruffleHog** - Additional verification layer
 - **Custom Pattern Matching** - Catches common secret patterns
 - **Auto-Revert** - Automatically reverts commits to main that contain secrets
 
 ### Pre-commit Hooks (Local)
+
 Catch secrets before they reach GitHub:
 
 ```bash
@@ -106,6 +111,7 @@ pre-commit run --all-files
 ```
 
 ### What Gets Detected
+
 - AWS Access Keys (AKIA...)
 - API Keys and Tokens
 - Private Keys (RSA, SSH, etc.)
@@ -115,6 +121,7 @@ pre-commit run --all-files
 - High-entropy strings (likely secrets)
 
 ### If Secrets Are Detected
+
 1. **PR is blocked** - Cannot merge until secrets are removed
 2. **Automatic notification** - PR comment explains the issue
 3. **Required actions**: