Building Compliant Infrastructure: How We Log and Store Data at Twinnlinks
A deep dive into our technical infrastructure — from Terraform org policies to Workload Identity, and how PIPEDA compliance shapes every architectural decision.
Healthcare data in Canada doesn't just need to work — it needs to work within strict legal frameworks. When we built Twinnlinks, we didn't start with features. We started with compliance.
PIPEDA isn't a checklist you tack on at the end. It's the foundation. Every logging decision, every storage choice, every access control flows from those ten principles.
Here's how we built a compliant data pipeline from day one.
Our Stack
- Compute: Google Cloud Run (serverless, regional)
- Storage: Cloud SQL (regional) + Cloud Storage (object)
- Logging: Cloud Logging with Log Router sinks
- Secrets: Secret Manager with automatic rotation
- IAM: Workload Identity federation (no long-lived credentials)
- IaC: Terraform with organization policy constraints
- CI/CD: GitHub Actions with OIDC auth
PIPEDA Principles → Technical Controls
Here's the mapping that actually matters. These aren't abstract concepts — they're concrete configurations we enforce.
| PIPEDA Principle | Technical Control | Implementation |
|---|---|---|
| Safeguarding | Encryption at rest + in transit | Cloud SQL automatic encryption, TLS 1.3 enforced |
| Openness | Audit logging for all data access | Cloud Logging with organization sinks |
| Individual Access | Data export + self-service deletion | GDPR/PIPEDA dashboard with API triggers |
| Accuracy | Write-once logs + immutable storage | Cloud Storage bucket with object versioning |
| Limiting Collection | Schema validation + PII redaction | Pre-commit hooks + DLP API inspection |
| Consent | Feature flags tied to consent scopes | LaunchDarkly integration with consent graph |
| Accountability | Traceability from request to log | Correlation IDs across all services |
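The accountability row above hinges on correlation IDs flowing from request to log. A minimal sketch of how one can be attached to inbound requests (the header name and helper are illustrative, not our exact middleware):

```typescript
import { randomUUID } from 'crypto';

// Illustrative helper: reuse the caller's correlation ID if present,
// otherwise mint a fresh one, so every downstream log line can carry it.
export function withCorrelationId(headers: Record<string, string>): string {
  const id = headers['x-correlation-id'] ?? randomUUID();
  headers['x-correlation-id'] = id;
  return id;
}
```

Every service in the request path echoes this header, which is what makes the table's "traceability from request to log" claim checkable in practice.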
Terraform Organization Policy
We don't rely on developers to remember compliance. We enforce it at the organization level.
resource "google_org_policy_policy" "restrict_resource_locations" {
  name   = "organizations/${var.organization_id}/policies/gcp.resourceLocations"
  parent = "organizations/${var.organization_id}"

  spec {
    rules {
      values {
        # Allow-list Canadian regions only (Montreal and Toronto);
        # everything not listed is denied by default.
        allowed_values = [
          "in:northamerica-northeast1-locations",
          "in:northamerica-northeast2-locations"
        ]
      }
    }
  }
}

resource "google_org_policy_policy" "require_shielded_vm" {
  name   = "organizations/${var.organization_id}/policies/compute.requireShieldedVm"
  parent = "organizations/${var.organization_id}"

  spec {
    rules {
      enforce = "TRUE"
    }
  }
}
This means: nothing gets deployed outside Canada, ever. No developer can accidentally spin up a US region. No "just this once" exception.
Workload Identity: No More Service Account Keys
We don't have long-lived credentials. Period. Every deployment uses short-lived tokens via Workload Identity federation.
# .github/workflows/deploy.yml
name: Deploy to Production
on:
  push:
    branches: [main]
permissions:
  contents: read
  id-token: write
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Authenticate to Google Cloud
        uses: google-github-actions/auth@v2
        with:
          # Full provider resource name; pool and provider IDs shown here
          # are placeholders for our actual pool configuration
          workload_identity_provider: "projects/${{ secrets.PROJECT_NUMBER }}/locations/global/workloadIdentityPools/github-pool/providers/github-provider"
          service_account: "deploy@${{ secrets.PROJECT_ID }}.iam.gserviceaccount.com"
      - name: Deploy to Cloud Run
        run: |
          gcloud run deploy twinnlinks-api \
            --region=northamerica-northeast1 \
            --source=. \
            --allow-unauthenticated
The token expires in an hour, so even if credentials leak, they're useless within 60 minutes and the blast radius stays small.
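To make the short-lived-token point concrete: the federation exchange hands back a JWT whose `exp` claim caps its lifetime. A hedged sketch of reading that claim (no signature verification, illustration only):

```typescript
// Decode a JWT's payload and report seconds until its exp claim.
// Illustration only: does not verify the token's signature.
export function secondsUntilExpiry(
  jwt: string,
  nowSec: number = Math.floor(Date.now() / 1000)
): number {
  const payload = JSON.parse(
    Buffer.from(jwt.split('.')[1], 'base64url').toString('utf8')
  );
  return payload.exp - nowSec;
}
```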
Logging Pipeline Architecture
Every request gets logged. Not just errors — everything. We need to prove who accessed what, when.
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Client │────▶│ Cloud Run │────▶│ Cloud │
│ │ │ Service │ │ Logging │
└─────────────┘ └─────────────┘ └──────┬──────┘
│
┌───────▼────────┐
│ Log Router │
│ Sink │
└───────┬────────┘
│
┌───────────────────────┼───────────────────────┐
│ │ │
┌────▼────┐ ┌──────▼─────┐ ┌───────▼──────┐
│ BigQuery│ │ Cloud │ │ Pub/Sub │
│(Analysis│ │ Storage │ │ (Alerting) │
│ + Audit)│ │ (Long-term │ └──────────────┘
└─────────┘ │ Retention)│
└────────────┘
The critical path: Log Router sinks route logs to immutable Cloud Storage buckets before any processing happens. This gives us tamper-evident audit logs — if someone tries to modify logs, we'll know.
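The bucket gives us tamper evidence at the storage layer; the same property can be checked end to end with a hash chain over exported log lines. This is a sketch of the idea, not our production verifier:

```typescript
import { createHash } from 'crypto';

// Fold each line's hash into the next, so editing any earlier line
// changes the final digest and the tampering becomes detectable.
export function chainDigest(lines: string[]): string {
  return lines.reduce(
    (prev, line) =>
      createHash('sha256').update(prev).update(line).digest('hex'),
    ''
  );
}
```

Recomputing the digest over an export and comparing it to a stored value is enough to detect any in-place edit to the archived logs.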
Schema-Level PII Controls
We don't just log strings. We log structured data with PII awareness.
import { Logging } from '@google-cloud/logging';
import { DlpServiceClient } from '@google-cloud/dlp';

const PROJECT_ID = process.env.GOOGLE_CLOUD_PROJECT!;

interface AuditEvent {
  action: string;
  user_id: string;
  correlation_id: string;
}

const dlp = new DlpServiceClient();
const logging = new Logging();
const auditLog = logging.log('audit-events');

export async function logAuditEvent(event: AuditEvent) {
  // De-identify PII before anything reaches the logging layer
  const [response] = await dlp.deidentifyContent({
    parent: `projects/${PROJECT_ID}/locations/northamerica-northeast1`,
    inspectConfig: {
      infoTypes: [
        { name: 'PHONE_NUMBER' },
        { name: 'EMAIL_ADDRESS' },
        { name: 'PERSON_NAME' }
      ]
    },
    deidentifyConfig: {
      infoTypeTransformations: {
        transformations: [
          // Replace each finding with its infoType name, e.g. [PERSON_NAME]
          { primitiveTransformation: { replaceWithInfoTypeConfig: {} } }
        ]
      }
    },
    item: { value: JSON.stringify(event) }
  });

  const entry = auditLog.entry(
    {
      resource: {
        type: 'cloud_run_revision',
        labels: {
          service_name: 'twinnlinks-api',
          location: 'northamerica-northeast1'
        }
      }
    },
    {
      ...JSON.parse(response.item!.value!), // redacted copy of the event
      timestamp: new Date().toISOString(),
      correlation_id: event.correlation_id
    }
  );

  await auditLog.write(entry);
}
The raw event with PII never leaves the service boundary. The DLP API handles redaction before anything hits the logging layer.
Data Retention: The 7-Year Reality
Between PIPEDA's retention principle and provincial health-records rules, certain records must be kept for up to 7 years. We handle this through:
- Cloud Storage object lifecycle policies for log archival
- BigQuery partition expiration for queryable logs
- Scheduled jobs that enforce deletion timelines
resource "google_storage_bucket" "audit_logs" {
  name          = "twinnlinks-audit-logs"
  location      = "NORTHAMERICA-NORTHEAST1"
  force_destroy = false
  lifecycle_rule {
    condition {
      age = 2555 # 7 years in days
    }
    action {
      type = "Delete"
    }
  }
  # Without this rule, versioning would keep deleted objects forever
  lifecycle_rule {
    condition {
      days_since_noncurrent_time = 2555
    }
    action {
      type = "Delete"
    }
  }
  uniform_bucket_level_access = true
  versioning {
    enabled = true
  }
}
After 7 years, logs are automatically deleted. We don't manually intervene. We don't make judgment calls. The policy runs, and compliance happens.
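Deletion-timeline enforcement outside the bucket reduces to an age check against the same 2555-day threshold. A minimal sketch (the helper name is ours, for illustration):

```typescript
const RETENTION_DAYS = 7 * 365; // 2555, matching the bucket lifecycle rule

// Decide whether a record's age has crossed the retention threshold.
export function isPastRetention(
  createdAt: Date,
  now: Date = new Date()
): boolean {
  const ageDays = (now.getTime() - createdAt.getTime()) / 86_400_000;
  return ageDays > RETENTION_DAYS;
}
```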
Monitoring: We Know Before You Do
We don't wait for patients to tell us something's wrong.
# alerting-policies/compliance-breaches.yml
- alert: PII_Access_Anomaly
  expr: |
    rate(log_entries{category="pii_access"}[5m])
      > 1.5 * rate(log_entries{category="pii_access"}[1h] offset 1h)
  for: 5m
  annotations:
    summary: "Unusual PII access pattern detected"
    description: "PII access rate is 50% above the trailing-hour baseline"
- alert: Non_Canadian_Data_Access
  expr: |
    sum(log_entries{region!~"northamerica-northeast.*"}) > 0
  annotations:
    summary: "Data accessed from non-Canadian region"
    description: "Immediate investigation required"
If PII access spikes, if data leaves Canada, if anything looks off — we get paged. Compliance isn't reactive.
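The first rule's arithmetic is easy to sanity-check outside PromQL. A small sketch of the same comparison (the function name is ours, for illustration):

```typescript
// Same test as the PromQL rule: recent rate vs. 1.5x the baseline rate.
export function isPiiAccessAnomalous(
  recentCount: number,
  recentWindowSec: number,
  baselineCount: number,
  baselineWindowSec: number,
  factor = 1.5
): boolean {
  const recentRate = recentCount / recentWindowSec;
  const baselineRate = baselineCount / baselineWindowSec;
  return recentRate > factor * baselineRate;
}
```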
What's Next
We're actively working on:
- SOC 2 Type II certification (target: Q2 2026)
- FedRAMP readiness for public sector clinics
- Automated compliance checks in CI/CD pipelines
Resources
- PIPEDA Compliance Guide
- Google Cloud Healthcare compliance
- Our team is always happy to talk infra: contact@twinnlinks.com
Disclaimer: This article describes our current infrastructure. It's not legal or compliance advice. For specific guidance, consult with qualified professionals.