# Infrastructure Setup ## Architecture Two EC2 instances during scanning: - **c5.2xlarge** (`everytab`) — compute: runs pipeline, stores icons on 1TB EBS - **i3.large** (`everytab-db`) — database: runs Postgres on 475GB local NVMe (100K+ IOPS) Both provisioned by Terraform with `user_data` scripts that run on first boot: - Compute: `ec2-userdata.sh` (Go, DuckDB, Unbound, swap) - Database: `db-setup.sh` (NVMe format, Postgres install + config) ## 1. Terraform ```bash cd infra cp terraform.tfvars.example terraform.tfvars # fill in your values terraform init terraform apply ``` This creates both instances. They auto-provision via user_data (~3 minutes). ## 2. SSH Key ```bash terraform output -raw ssh_private_key > everytab-key && chmod 600 everytab-key terraform output ssh_command # SSH to compute instance terraform output ssh_command_db # SSH to database instance ``` ## 3. Verify Database is Ready ```bash # From your local machine or the compute instance pg_isready -h $(terraform output -raw db_private_ip) ``` If not ready yet, SSH to the DB instance and check `cloud-init` logs: ```bash tail -f /var/log/cloud-init-output.log ``` ## 4. Clone Repo + Build on Compute Instance ```bash ssh -i everytab-key ec2-user@$(terraform output -raw ec2_public_ip) git clone ~/everytab cd ~/everytab go build -o ~/warc_parse ./pipeline/02_warc_parse/ go build -o ~/icon_download ./pipeline/03_icon_download/ go build -o ~/bundle_gen ./pipeline/05_bundle_gen/ ``` ## 5. Connect to Database + Apply Schema ```bash # Get the connection string export DATABASE_URL=$(terraform output -raw database_url) echo "export DATABASE_URL='$DATABASE_URL'" >> ~/.bashrc # Test connectivity psql $DATABASE_URL -c 'SELECT 1;' # Apply schema psql $DATABASE_URL -f ~/everytab/pipeline/01_cc_index/schema.sql ``` ## 6. Run Pipeline See `pipeline/README.md` for the full stage-by-stage guide. ## Pinning the EC2 AMI The `data.aws_ami` lookup fetches the latest Amazon Linux 2023 AMI. If Amazon publishes a new one between applies, Terraform will want to replace your instances. To prevent this, pin the AMI after initial creation: ```bash # Get the current AMI aws ec2 describe-instances --filters "Name=tag:Name,Values=everytab" \ --query "Reservations[0].Instances[0].ImageId" --output text # Add to terraform.tfvars echo 'ec2_ami = "ami-XXXXXXXXXXXX"' >> terraform.tfvars ``` Remove the `ec2_ami` line from tfvars when you want fresh instances with the latest AMI. ## Teardown (after backup) ```bash # Back up the database (run from compute instance) pg_dump $DATABASE_URL -Fc > ~/everytab_dump.pgfc # Back up icons to homelab rsync -avP ~/icons/ homelab:/backups/everytab/icons/ ``` Switch to serving-only mode (destroys both EC2 instances): ```bash terraform apply -var="scanning=false" ``` Full destroy (including the live site): ```bash terraform destroy ``` **IMPORTANT:** The i3's local NVMe is ephemeral — all data is lost on stop/terminate. Always pg_dump before teardown.