# Infrastructure Setup

## Architecture

Two EC2 instances during scanning:

- **c5.2xlarge** (`everytab`) — compute: runs pipeline, stores icons on 1TB EBS
- **i3.large** (`everytab-db`) — database: runs Postgres on 475GB local NVMe (100K+ IOPS)

Both provisioned by Terraform with `user_data` scripts that auto-run on first boot:

- Compute: `ec2-userdata.sh` — installs Go, DuckDB, Unbound, swap; clones repo; builds binaries; applies DB schema
- Database: `db-setup.sh` — formats NVMe, installs Postgres, creates database + schema

## Quick Start

Everything runs from your local machine unless noted.

```bash
# 1. Create infrastructure
cd infra
cp terraform.tfvars.example terraform.tfvars  # fill in your values (including repo_url)
terraform init
terraform apply

# 2. Save SSH key
terraform output -raw ssh_private_key > everytab-key && chmod 600 everytab-key

# 3. Wait ~3-5 minutes for both instances to auto-provision, then verify
ssh -i everytab-key ec2-user@$(terraform output -raw ec2_public_ip) \
  'pg_isready -h $(grep DATABASE_URL ~/.bashrc | cut -d@ -f2 | cut -d: -f1)'
```

If `repo_url` is set in tfvars, the compute instance automatically:

- Clones the repo
- Builds all Go binaries
- Waits for the DB to be ready
- Applies the schema

## Running the Pipeline

SSH to the compute instance — everything is ready:

```bash
ssh -i everytab-key ec2-user@$(terraform output -raw ec2_public_ip)

# DATABASE_URL is already in .bashrc, binaries already built
# Start the pipeline (see pipeline/README.md for full guide)
./pipeline/01_cc_index/query.sh --db-url "$DATABASE_URL" --limit 0
```

## Debugging (if auto-provision fails)

Check cloud-init logs on either instance:

```bash
# Compute instance
ssh -i everytab-key ec2-user@$(terraform output -raw ec2_public_ip) \
  'tail -30 /var/log/cloud-init-output.log'

# DB instance
ssh -i everytab-key ec2-user@$(terraform output -raw db_public_ip) \
  'tail -30 /var/log/cloud-init-output.log'
```

## Pinning the EC2 AMI

The `data.aws_ami` lookup fetches the
latest Amazon Linux 2023 AMI. Pin it to prevent instance replacement on unrelated changes:

```bash
aws ec2 describe-instances --filters "Name=tag:Name,Values=everytab" \
  --query "Reservations[0].Instances[0].ImageId" --output text

# Add to terraform.tfvars
echo 'ec2_ami = "ami-XXXXXXXXXXXX"' >> terraform.tfvars
```

Remove the line when you want fresh instances with the latest AMI.

## Teardown

From the compute instance, back up before tearing down:

```bash
# Back up database (quote the URL so the shell does not split or glob it)
pg_dump "$DATABASE_URL" -Fc > ~/everytab_dump.pgfc

# Back up icons to homelab
rsync -avP ~/icons/ homelab:/backups/everytab/icons/
```

From your local machine:

```bash
# Destroy scanning infrastructure (keeps CloudFront + site bucket)
terraform apply -var="scanning=false"

# Or full destroy (including the live site)
terraform destroy
```

**IMPORTANT:** The i3's local NVMe is ephemeral — all data is lost on stop/terminate. Always `pg_dump` before teardown.
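
A pre-teardown check for the warning above, as an added example that is not part of this repository's scripts: it confirms the dump is a recent custom-format archive that `pg_restore` can read. It assumes the dump path from the `pg_dump` command in this section; `MAX_AGE_MIN` and the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Added example: verify the database backup before destroying the i3 instance.
# Not part of the everytab repo; names and the threshold below are hypothetical.

MAX_AGE_MIN="${MAX_AGE_MIN:-60}"   # assumed freshness window, in minutes

# pg_dump -Fc (custom format) archives begin with the 5-byte magic "PGDMP".
# An empty file, a missing file, or a plain-SQL dump fails this check.
has_pgdmp_magic() {
  [ -s "$1" ] && [ "$(head -c 5 "$1" 2>/dev/null)" = "PGDMP" ]
}

# True if file $1 was modified less than $2 minutes ago.
is_fresh() {
  [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}

# Full check: header, age, then have pg_restore parse the table of contents.
# A truncated dump usually fails at the pg_restore step.
verify_dump() {
  has_pgdmp_magic "$1" || {
    echo "FAIL: $1 is missing, empty, or not a -Fc archive" >&2; return 1; }
  is_fresh "$1" "$MAX_AGE_MIN" || {
    echo "FAIL: $1 is older than $MAX_AGE_MIN minutes" >&2; return 1; }
  pg_restore --list "$1" >/dev/null || {
    echo "FAIL: pg_restore cannot read $1" >&2; return 1; }
  echo "OK: $1 ($(du -h "$1" | cut -f1))"
}

# Usage: bash check-dump.sh ~/everytab_dump.pgfc
if [ "$#" -gt 0 ]; then
  verify_dump "$1"
fi
```

Run it on the compute instance after `pg_dump` finishes and only proceed to the teardown commands if it prints `OK`.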