# infra-party

**Repository Path**: mirrors_elastic/infra-party

## Basic Information

- **Project Name**: infra-party
- **Description**: Terraform and supporting utilities for reproducing cloud infra for dev (in otel etc.)
- **License**: Apache-2.0
- **Default Branch**: main
- **Created**: 2025-11-19
- **Last Updated**: 2025-12-06

## README

# Infra Party

Infra Party contains Terraform configurations and automation scripts for creating cloud infrastructure scenarios on Google Cloud Platform. It currently ships with:

- **VPC Flow Logs**: Generates internal VM traffic and exports subnet flow logs.
- **Proxy Network Load Balancer Logs**: Provisions a regional external proxy TCP Network Load Balancer, drives client traffic through the forwarding rule, and exports connection logs.
- **Application Load Balancer Logs**: Provisions either a global or regional external Application Load Balancer (HTTP/HTTPS), generates traffic through the load balancer, and exports request logs with optional TLS metadata.
- **External Passthrough Network Load Balancer Logs**: Provisions a regional external passthrough Network Load Balancer, drives client traffic through the forwarding rule, and exports connection logs.

## Prerequisites

- Terraform **v1.5+**
- Google provider **v5.0+** (downloaded automatically by Terraform)
- An authenticated `gcloud` session with access to your GCP project
  - Make sure you are logged into gcloud in TWO different ways:
    - `gcloud auth login`
    - `gcloud auth application-default login` (for Terraform)
- Fish shell **v3.6+** (the helper scripts are written in fish; bash/zsh are not supported)
- `jq` (for JSON processing)
- `curl` and `netcat` (used to generate NLB traffic from your workstation)
- Go **1.21+** (only required for the VPC flow scenario traffic runner)

> After Terraform completes, the helper scripts automatically generate traffic.
> The VPC flow scenario uses a Go traffic runner over SSH, while the NLB scenario
> drives curl/netcat traffic from your local machine.

> **Warning:** Running any scenario provisions billable Google Cloud resources.
> Proxy Network Load Balancers incur hourly forwarding rule and proxy-only subnet
> costs even when idle. Destroy the scenario as soon as you finish exporting logs.

## Quick Start

1. Copy the example environment file and adjust the values:

   ```bash
   cp config.env.example config.env
   $EDITOR config.env
   ```

   Update `PROJECT_ID`, `REGION`, and `ZONE`. Set `SCENARIO` to `vpc-flow`, `nlb`, `alb`, or `nlb-passthrough` if you plan to run Terraform manually; the helper scripts force the correct value. For the Application Load Balancer scenario, also set `LOAD_BALANCER_SCOPE` to either `global` or `regional` in `config.env`.

### VPC Flow Logs Scenario

```bash
./run.fish generate --scenario=vpc-flow
# wait ~10 minutes for flow logs to aggregate
./run.fish export --scenario=vpc-flow
```

Results are written to `./vpc-fixtures-out/vpc_logs.jsonl`.

### Proxy Network Load Balancer Scenario

```bash
./run.fish generate --scenario=nlb
# wait a few minutes for load balancer logs to aggregate
./run.fish export --scenario=nlb
```

Results are written to `./nlb-fixtures-out/nlb_logs.jsonl`.

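Because load balancer logs take a few minutes to aggregate, a first export may come back empty. One way to handle that is to re-run the export until the output file contains at least one entry. The sketch below is a hypothetical wrapper: `retry_until_nonempty` is not part of the repository, and it assumes `./run.fish` can be launched from your current shell through its shebang.

```bash
# Hypothetical helper, not part of infra-party.
# Re-runs a command until FILE contains at least one non-blank line.
# Usage: retry_until_nonempty FILE ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_nonempty() {
  local file="$1"
  local attempts="$2"
  local delay="$3"
  shift 3
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the export command; a failed run is treated like an empty result.
    "$@" || true
    if [ -s "$file" ] && grep -q '[^[:space:]]' "$file"; then
      echo "found $(grep -c '[^[:space:]]' "$file") entries after $i attempt(s)"
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "no entries in $file after $attempts attempt(s)" >&2
  return 1
}

# Example (assumes the nlb scenario has already been generated):
# retry_until_nonempty ./nlb-fixtures-out/nlb_logs.jsonl 5 60 \
#   ./run.fish export --scenario=nlb
```

The attempt count and delay are arbitrary; pick values that match the ingestion delay of the scenario you are running.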
### Application Load Balancer Scenario

```bash
./run.fish generate --scenario=alb
# wait a few minutes for load balancer logs to aggregate
./run.fish export --scenario=alb
```

Results are written to `./alb-fixtures-out/alb_logs.jsonl`.

The Application Load Balancer can be deployed in two modes:

- **Global**: Set `LOAD_BALANCER_SCOPE=global` in `config.env` for a global external Application Load Balancer
- **Regional**: Set `LOAD_BALANCER_SCOPE=regional` in `config.env` for a regional external Application Load Balancer

By default, TLS is enabled with self-signed certificates. Traffic is generated over HTTPS and logs include TLS protocol and cipher information.

### External Passthrough Network Load Balancer Scenario

```bash
./run.fish generate --scenario=nlb-passthrough
# wait a few minutes for load balancer logs to aggregate
./run.fish export --scenario=nlb-passthrough
```

Results are written to `./nlb-fixtures-out/nlb_passthrough_logs.jsonl`.

### Destroy

Destroy whichever scenario is active:

```bash
./run.fish destroy --scenario=vpc-flow --dry-run=false
# or select a different scenario explicitly
./run.fish destroy --scenario=nlb --dry-run=false
./run.fish destroy --scenario=alb --dry-run=false
./run.fish destroy --scenario=nlb-passthrough --dry-run=false
```

## How It Works

1. **Generate**: `./run.fish generate --scenario=` runs Terraform with the selected scenario, validates outputs, and automatically kicks off traffic generation.
2. **Traffic**:
   - *VPC Flow Logs*: A Go helper connects to MIG instances over SSH to create east-west traffic.
   - *NLB Logs*: The script waits for backend readiness and for the proxy to respond, then fires curl/netcat traffic from the local machine.
   - *ALB Logs*: The script waits for backend instances and load balancer readiness, then generates HTTP/HTTPS traffic from the local machine using curl.
   - *External Passthrough NLB Logs*: The script waits for backend readiness and for the load balancer to respond, then fires curl/netcat traffic from the local machine.
3. **Ingestion Delay**: Logs do not appear immediately. Expect ~10 minutes for VPC flow logs and a few minutes for load balancer logs (both NLB and ALB).
4. **Export**: `./run.fish export --scenario=` reuses Terraform outputs, applies a default 20-minute window (`START_TIME` = now-20m, `END_TIME` = now), and writes JSON Lines files to `./vpc-fixtures-out`, `./nlb-fixtures-out`, `./alb-fixtures-out`, or `./nlb-passthrough-fixtures-out`.
5. **Destroy**: `./run.fish destroy --scenario=` cleans up the Terraform resources. By default it runs in dry-run mode until you pass `--dry-run=false`.

## Configuration

### Environment Variables

- `START_TIME` / `END_TIME`: UTC timestamps (`YYYY-MM-DDTHH:MM:SSZ`) used when exporting logs. The default window runs from 20 minutes ago until now.
- `MAX_RESULTS`: Caps log entries returned by `gcloud logging read` (default `2000`).
- `OUTPUT_DIR`: Directory where exports are written (`./vpc-fixtures-out`, `./nlb-fixtures-out`, `./alb-fixtures-out`, or `./nlb-passthrough-fixtures-out` by default).
- `RESOURCE_PREFIX`: Prefix for Terraform resource names (`gcp-fixture` if unset).
- `LOAD_BALANCER_SCOPE`: For the ALB scenario only. Set to `global` or `regional` (default: `regional`).

### Destroy Options

- The `--dry-run` flag controls whether `destroy` issues `terraform plan -destroy` (default) or a full `terraform destroy`.
- To actually delete resources, pass `--dry-run=false`.

## Log Output Format

Every `export` command produces a JSON Lines file (`*.jsonl`). Each line is a complete JSON object that is safe to ingest into downstream tooling.

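To make the export step and the `START_TIME`/`END_TIME` defaults concrete, the sketch below reproduces the documented 20-minute window and builds a Cloud Logging filter for one resource type. The function names are hypothetical and this is not necessarily how `run.fish` does it. The real scripts also filter by forwarding rule name, which is omitted here. It assumes GNU `date` (for `-d "@epoch"`).

```bash
# Sketch only; function names are illustrative, not from the repository.

# Format a Unix epoch as the UTC layout used by START_TIME/END_TIME.
format_utc() {
  date -u -d "@$1" +%Y-%m-%dT%H:%M:%SZ
}

# Print "START END" for a window of MINUTES (default 20) ending at epoch NOW.
export_window() {
  local now="$1"
  local minutes="${2:-20}"
  echo "$(format_utc $((now - minutes * 60))) $(format_utc "$now")"
}

# Build a Cloud Logging filter for one resource type and time window.
build_filter() {
  local resource_type="$1"
  local start="$2"
  local end="$3"
  printf 'resource.type="%s" AND timestamp>="%s" AND timestamp<="%s"' \
    "$resource_type" "$start" "$end"
}

# Example usage (needs an authenticated gcloud session and jq):
# window=$(export_window "$(date +%s)")
# START_TIME=${window% *}; END_TIME=${window#* }
# gcloud logging read "$(build_filter l4_proxy_rule "$START_TIME" "$END_TIME")" \
#   --project="$PROJECT_ID" --limit="${MAX_RESULTS:-2000}" --format=json \
#   | jq -c '.[]' > nlb_logs.jsonl
```

`gcloud logging read --format=json` returns a single JSON array, so the commented example pipes it through `jq -c '.[]'` to get one object per line, matching the JSON Lines layout described above.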
### Proxy Network Load Balancer Logs

- `resource.type="l4_proxy_rule"`
- Key labels include:
  - `project_id`, `network_name`, `region`, `load_balancing_scheme`, `protocol`
  - `forwarding_rule_name`, `target_proxy_name`
  - `backend_target_name`, `backend_target_type`
  - `backend_name`, `backend_type`, `backend_scope`, `backend_scope_type`
- `jsonPayload.connection` records client/server IPs, ports, protocol numbers, byte counts, start/end timestamps, and latency

### VPC Flow Logs

- `resource.type="gce_subnetwork"`
- `jsonPayload` matches the VPC Flow Logs schema (5-minute aggregation, `reporter`, `connection`, `src/dest` metadata)
- Includes bytes, packets, and compute metadata (instance ID, tags, subnet)

### Application Load Balancer Logs (Export Output)

- `resource.type="http_load_balancer"` (global) or `"http_external_regional_lb_rule"` (regional)
- Key labels include:
  - `project_id`, `url_map_name`, `backend_service_name`, `region` (regional only)
  - `matched_url_path_rule`, `target_proxy_name`, `forwarding_rule_name`
- `httpRequest` contains method, URL, status, response size, user agent, latency
- When TLS is enabled, `jsonPayload` includes:
  - `tls.protocol`: TLS protocol version (e.g., "TLS 1.3")
  - `tls.cipher`: Cipher suite used for the connection

### External Passthrough Network Load Balancer Logs

- `resource.type="loadbalancing.googleapis.com/ExternalNetworkLoadBalancerRule"`
- Key labels include:
  - `project_id`, `network_name`, `region`, `load_balancing_scheme`, `protocol`
  - `forwarding_rule_name`, `target_proxy_name`
  - `backend_target_name`, `backend_target_type`
  - `backend_name`, `backend_type`, `backend_scope`, `backend_scope_type`
- `jsonPayload.connection` records client/server IPs, ports, protocol numbers, byte counts, start/end timestamps, and latency

## Infrastructure Details

### VPC Flow Logs Infra

- **VPC Network**: Custom mode network with a single subnet (`10.10.0.0/20`)
- **VPC Flow Logs**: Enabled with 5-minute aggregation and full metadata sampling
- **Firewall Rules**:
  - Internal traffic (all protocols within the subnet)
  - SSH access (from anywhere)
- **Managed Instance Group**: Regional MIG with 2 Debian 12 instances
- **Traffic Generation**: Automated intra-VPC traffic plus calls to Google Cloud APIs

### Proxy Network Load Balancer Infra

- **VPC Network**: Custom mode network with subnet (`10.20.0.0/20`)
- **Backend MIG**: Zonal managed instance group (2 Debian 12 VMs) running a simple HTTP server
- **Health Checks**: TCP health check on port 80 with firewall rules for Google LB ranges
- **Client VM**: Dedicated client instance that generates HTTP and raw TCP traffic
- **Proxy-only Subnet**: Dedicated `/24` subnet (`10.20.16.0/24`) with `REGIONAL_MANAGED_PROXY` purpose for the LB control plane
- **Target Proxy**: Regional target TCP proxy resource that fronts the backend service
- **Load Balancer**: Regional external proxy Network Load Balancer (EXTERNAL_MANAGED) using a TCP proxy with 100% connection logging
- **Network Tier**: STANDARD tier addresses to keep costs low during testing
- **Readiness Waits**: Helper script waits up to 5 minutes for backend instances and the proxy to start responding before traffic generation
- **Logging**: Connection logs exported via `resource.type="l4_proxy_rule"` and filtered by forwarding rule name
- **Firewall Rules**: Internal traffic, SSH access, client-to-backend allow list

### Application Load Balancer Infra

- **VPC Network**: Custom mode network with subnet (`10.20.0.0/20`)
- **Backend MIG**: Zonal managed instance group (2 Debian 12 VMs) running nginx
- **Health Checks**: HTTP health check on port 80 with firewall rules for Google LB ranges
- **Client VM**: Dedicated client instance for traffic generation
- **Proxy-only Subnet**: Regional-only subnet (`10.20.16.0/24`) with `REGIONAL_MANAGED_PROXY` purpose (created only for regional ALB)
- **Load Balancer Types**:
  - **Global**: Uses global resources (`google_compute_*`) with Premium network tier and global IP
  - **Regional**: Uses regional resources (`google_compute_region_*`) with Standard network tier and regional IP
- **TLS Configuration**: Self-signed certificates generated via Terraform's TLS provider, with separate regional/global SSL certificate resources
- **Target Proxy**: HTTP or HTTPS proxy (conditional based on TLS setting) that routes to the backend service
- **Load Balancer**: External managed Application Load Balancer (EXTERNAL_MANAGED) with configurable logging
- **Logging**:
  - Global: `resource.type="http_load_balancer"`
  - Regional: `resource.type="http_external_regional_lb_rule"`
  - Backend service logs include TLS metadata when TLS is enabled (protocol, cipher)
  - Sample rate: 100% (configurable via variables)
- **Firewall Rules**: Internal traffic, SSH access, health check ranges, client-to-backend allow list
- **Traffic Generation**: HTTPS requests from local machine using curl with `--insecure` flag for self-signed certificates

### External Passthrough Network Load Balancer Infra

- **VPC Network**: Custom mode network with subnet (`10.30.0.0/20`)
- **Backend MIG**: Zonal managed instance group (2 Debian 12 VMs) running NGINX as a simple HTTP server
- **Health Checks**: Regional TCP health check on port 80 with firewall rules allowing traffic from all sources
- **Client VM**: Dedicated client instance that generates HTTP and raw TCP traffic (iperf3, curl, netcat)
- **Load Balancer**: Regional external passthrough Network Load Balancer (`EXTERNAL` scheme) using a backend service with 100% connection logging
- **Network Tier**: STANDARD tier addresses to minimize cost during testing
- **Readiness Waits**: Startup scripts ensure backend and client VMs are ready before traffic generation
- **Logging**: Connection logs exported via `resource.type="loadbalancing.googleapis.com/ExternalNetworkLoadBalancerRule"` and filtered by forwarding rule name
- **Firewall Rules**: Internal traffic, health check access, client-to-backend allow list, SSH access

## Troubleshooting

- **No logs exported yet**: Flow logs take about 10 minutes to appear; proxy NLB connection logs typically take 2–5 minutes. Re-run export or adjust `START_TIME`/`END_TIME`.
- **Load balancer not responding**: Backends might still be initializing. `run.fish` already waits for readiness, but you can confirm status via `gcloud compute instance-groups managed list-instances`.
- **Global ALB takes longer to provision**: Global load balancers need to propagate configuration across Google's global network, which can take 10-15 minutes. Regional ALBs typically provision faster.
- **TLS certificate warnings**: The ALB scenario uses self-signed certificates for testing. This is expected and traffic generation uses `curl --insecure` to bypass certificate validation.
- **Destroy fails with `resourceInUseByAnotherResource`**: Forwarding rules may still reference the proxy-only subnet. Wait a minute and re-run `./run.fish destroy --scenario= --dry-run=false`.
- **Costs creeping up**: Proxy load balancers incur per-hour forwarding rule and proxy-only subnet charges. Always destroy the scenario after exporting the data you need.

## Adding New Scenarios

Note that usage of an LLM is highly recommended for this repo.

The repository is structured so additional scenarios can reuse the same tooling:

1. Create a Terraform module under `terraform/modules//`.
2. Update `terraform/main.tf`, `variables.tf`, and `outputs.tf` to expose the scenario.
3. Add a `lib/scenarios/.fish` helper that implements:
   - `scenario::validate_outputs`: pulls required Terraform outputs into shell variables.
   - `scenario::run_traffic`: generates the scenario-specific traffic after Terraform apply.
   - `scenario::export_logs`: runs the correct `gcloud logging read` query and writes JSONL.
   - `scenario::print_next_steps`: displays post-run instructions (e.g., wait times, destroy reminders).
4. Document the workflow in this README.