Operations Guide

This operations guide covers EA Consent Delta Writer deployment and administration in 8 sections:

Getting Started (1-2): Overview and prerequisites
Setup & Configuration (3-4): Docker deployment and configuration
Operations (5-7): Health monitoring, security controls, and logging
Maintenance (8): Troubleshooting and incident response

New to deploying EA Consent Delta Writer? Start with sections 1-4 for initial deployment, then reference sections 5-8 for ongoing operations and troubleshooting.

EA Consent Delta Writer – Operations Guide

1. Overview

The EA Consent Delta Writer is a Docker containerized service that transforms EA Consent Trust Blocks into Delta tables for the Privacy Network. It operates as a stateless HTTP API that ingests TrustBlock JWTs containing Verifiable Credentials, decodes them, and writes structured data to two Delta tables conforming to the EA Consent schema.

What it does:

Receives batches of Trust Block JWTs via HTTP API (1-100 blocks per request)
Decodes TrustBlock and ConsentVC JWTs without signature verification (Phase 1)
Transforms consent data into two Delta Share tables:
- ea-consent-tb - Complete Verifiable Credential data
- ea-consent-verification-keys - Cryptographic public keys for VC verification
Writes Delta tables to S3 storage using append-only transactions
Provides health monitoring and runtime statistics
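
To make "decodes without signature verification" concrete, the following is a minimal sketch of how the payload of a compact JWT can be read using only the Python standard library. It is not the service's implementation. The helper names and the example claims (`iss`, `vc`) are illustrative assumptions, and the actual TrustBlock payload fields are not shown here.

```python
import base64
import json


def decode_jwt_payload_unverified(token: str) -> dict:
    """Return the JSON payload of a compact JWT WITHOUT checking its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a compact JWT with three dot-separated segments")
    payload_b64 = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def _b64url(data: dict) -> str:
    """Hypothetical helper: encode a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Throwaway token built only for illustration; the third segment is never checked.
example_token = ".".join([
    _b64url({"alg": "ES256", "typ": "JWT"}),
    _b64url({"iss": "example-issuer", "vc": {"type": ["VerifiableCredential"]}}),
    "signature-not-checked",
])
payload = decode_jwt_payload_unverified(example_token)
print(payload["iss"])
```

Because nothing in this path validates the signature, anyone who can reach the API with a valid bearer token can submit arbitrary payloads, which is one reason the network isolation described later matters.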

How it runs:

Single Docker container with two HTTP applications on separate ports
Port 8080: API endpoints for Trust Block ingestion
Port 8081: Monitoring endpoints for health checks and statistics
Tini init system for proper multi-process signal handling
Non-root user (UID 1000) for container security

How it fits in the Privacy Network:

Upstream dependency: EA Consent Issuer produces Trust Block JWTs
Downstream consumers: Delta Sharing Server (delta-io/delta-sharing) serves the Delta tables
Storage: AWS S3 for Delta table persistence
Deployment boundary: Should run in private network behind API gateway or VPC

Important: This service writes Delta tables to S3. It does NOT serve the Delta Share protocol. For Delta Share serving, deploy the separate open-source delta-io/delta-sharing server.

2. Prerequisites

Required

Docker (version 20.10 or later)
- Install from docker.com
- Verify installation: docker --version
AWS S3 Access
- AWS account with S3 service enabled
- IAM user or role with S3 permissions (see IAM policy below)
- S3 bucket created in your AWS region (e.g., us-west-2)
- Note your bucket name and desired prefix path
Network Configuration
- Outbound HTTPS access to AWS S3 endpoints
- Inbound access control for ports 8080 and 8081 (private network recommended)
Configuration File
- YAML configuration file defining storage backend, ports, and environment settings
- See Configuration section for structure

Recommended

Private network deployment - Deploy in VPC with private subnets and security groups
Centralized logging - Forward container logs to CloudWatch, Splunk, or similar
API Gateway (optional) - Adds authentication, rate limiting, and additional security layer
IAM roles (production) - Use EC2/ECS instance roles instead of static credentials

IAM Permissions Required

The service requires these S3 permissions on your target bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}

3. Deployment

This section documents the container’s requirements so you can deploy it using your chosen infrastructure. The minimal docker run example below shows essential flags that partners can adapt to their deployment tools (docker-compose, Kubernetes, ECS, etc.).

Container Image

Registry: Docker Hub
Image name: webshield/ea-consent-delta-writer
Tag: 1.2

Pull the image:

docker pull webshield/ea-consent-delta-writer:1.2

Minimal Docker Run Example

This example shows the container’s essential requirements:

docker run -d \
  --name ea-consent-delta-writer \
  -p 8080:8080 \
  -p 8081:8081 \
  -v /path/to/config.yaml:/config/config.yaml:ro \
  -e CONFIG_PATH=/config/config.yaml \
  -e CLIENT_BEARER_TOKEN=your-secret-token-here \
  -e AWS_ACCESS_KEY_ID=YOUR_AWS_KEY \
  -e AWS_SECRET_ACCESS_KEY=YOUR_AWS_SECRET \
  -e AWS_REGION=us-west-2 \
  webshield/ea-consent-delta-writer:1.2

Required Flags:

Ports (-p): Expose both 8080 (API) and 8081 (monitoring)
Volume (-v): Mount YAML config file (read-only recommended)
Environment (-e): CONFIG_PATH, CLIENT_BEARER_TOKEN, AWS credentials, AWS region

Container Requirements

Exposed Ports:

8080 - API application (Trust Block ingestion endpoints)
8081 - Monitoring application (health checks, status, metrics)

Volume Mounts:

/config/config.yaml - YAML configuration file (required, read-only recommended)

Environment Variables: See Configuration section for complete variable reference.

Base Image:

Python 3.12 slim
Non-root user (UID 1000)
Tini init system for signal handling

Health Check Endpoint:

GET http://localhost:8081/ea-consent-delta-writer/health
Container includes built-in health check (30s interval, 40s start period)

Dependencies:

AWS S3 connectivity (outbound HTTPS)
No database required (stateless service)

Verification Steps

After starting the container, verify it’s running correctly:

# 1. Check container is running
docker ps | grep ea-consent-delta-writer

# 2. Check health endpoint (should return {"status":"healthy"})
curl http://localhost:8081/ea-consent-delta-writer/health

# 3. Check status endpoint (returns detailed system info)
curl http://localhost:8081/ea-consent-delta-writer/status

# 4. View container logs
docker logs ea-consent-delta-writer

Expected startup behavior:

Container starts two Uvicorn processes (API on 8080, monitoring on 8081)
Validates configuration file on startup
Validates S3 bucket connectivity (unless SKIP_S3_VALIDATION=true)
Health endpoint returns 200 OK within 40 seconds
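
For scripted deployments, the manual checks above can be replaced by a small readiness wait. The sketch below polls the health endpoint until it reports "healthy" or the 40-second start period elapses. The function names, the 2-second polling interval, and the injectable `fetch`/`sleep`/`clock` parameters are assumptions made for illustration and testability.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

HEALTH_URL = "http://localhost:8081/ea-consent-delta-writer/health"


def fetch_status(url: str, timeout_s: float = 5.0) -> Optional[str]:
    """Return the "status" field from the health endpoint, or None on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            body = json.loads(resp.read())
    except (OSError, ValueError):
        # OSError covers connection errors and HTTP errors; ValueError covers bad JSON.
        return None
    return body.get("status") if isinstance(body, dict) else None


def wait_until_healthy(
    url: str = HEALTH_URL,
    deadline_s: float = 40.0,
    interval_s: float = 2.0,
    fetch: Callable[[str], Optional[str]] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the endpoint reports "healthy"; return False once the deadline passes."""
    start = clock()
    while True:
        if fetch(url) == "healthy":
            return True
        if clock() - start >= deadline_s:
            return False
        sleep(interval_s)


# Typical use in a deploy script (requires the container to be running):
#     if not wait_until_healthy():
#         raise SystemExit("service did not become healthy within 40 seconds")
```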

Network Requirements

Outbound (container to external services):

HTTPS to AWS S3 endpoints (port 443)
DNS resolution for s3.*.amazonaws.com

Inbound (external to container):

Port 8080 - Should be restricted to trusted sources (API gateway, internal network)
Port 8081 - Can be restricted to monitoring systems only

Recommended network isolation:

Deploy in private subnet with no public IP
Use security groups to allow only known sources
Front with API gateway or load balancer for additional security

4. Configuration

The service is configured via a YAML configuration file and environment variables.

Configuration File Structure

Create a YAML configuration file (e.g., config.yaml):

# Port configuration
api_port: 8080
monitor_port: 8081

# Debug settings (set false in production)
debug: false
aws_debug: false
spark_debug: false

# Custom URI prefix (optional)
custom_uri_prefix: ""

# Metadata endpoint base URL (for service discovery URIs)
metadata_endpoint_base_url: "https://your-api-gateway.example.com"

# Environment identifier (must appear in storage.uri_base after "://")
environment: "production"

# Storage configuration
storage:
  provider: "s3"
  uri_base: "s3://your-bucket-name/production/delta-tables"
  settings:
    region: "us-west-2"
    endpoint_url: null  # For S3-compatible storage (future feature)

Mount this file into the container:

-v /path/to/config.yaml:/config/config.yaml:ro

Configuration Fields

Field	Type	Description	Default/Required
`api_port`	integer	Port for API application	8080
`monitor_port`	integer	Port for monitoring application	8081
`debug`	boolean	Enable debug mode (allows `?debug=true` query param, verbose logging)	false
`aws_debug`	boolean	Enable AWS SDK debug logging	false
`spark_debug`	boolean	Enable Spark/Delta/py4j debug logging	false
`custom_uri_prefix`	string	Custom URI prefix for endpoints (e.g., "/ea-consent")	""
`metadata_endpoint_base_url`	string	Base URL for metadata endpoint URIs (used in service discovery)	✅ Required
`environment`	string	Environment identifier (e.g., “production”, “staging”). Must appear in `storage.uri_base` after `://` for validation.	✅ Required
`storage.provider`	string	Storage provider (currently only “s3” supported)	“s3”
`storage.uri_base`	string	Base S3 URI for Delta tables. Must contain the `environment` value (e.g., "s3://bucket/production/path" when `environment` is "production").	✅ Required
`storage.settings.region`	string	AWS region for S3 operations	✅ Required
`storage.settings.endpoint_url`	string	Custom S3 endpoint URL (for S3-compatible storage, future feature)	null

Environment Variables

These variables control runtime behavior and credentials. All secrets should be provided via environment variables or secret stores—never committed to source control.

Variable	Description	Required	Example
`CONFIG_PATH`	Path to YAML configuration file inside container	✅ Required	`/config/config.yaml`
`CLIENT_BEARER_TOKEN`	Bearer token for API authentication. Clients must include `Authorization: Bearer <token>` header in all API requests.	✅ Required	`your-secret-token-here`
`AWS_ACCESS_KEY_ID`	AWS access key for S3 storage	✅ Required	`AKIAIOSFODNN7EXAMPLE`
`AWS_SECRET_ACCESS_KEY`	AWS secret key paired with the access key	✅ Required	`wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY`
`AWS_REGION`	AWS region for S3 operations (can also be set in config file)	⚙️ Optional	`us-west-2`
`SKIP_S3_VALIDATION`	Skip S3 connectivity check on startup (for local testing only)	⚙️ Optional	`true`

Security notes:

Use strong, randomly generated tokens for CLIENT_BEARER_TOKEN (minimum 32 characters)
Use IAM roles attached to your container platform (ECS, Kubernetes) instead of static AWS credentials in production
Rotate CLIENT_BEARER_TOKEN and AWS credentials every 90 days if IAM roles are not available
Never log or expose credentials in application logs or responses
Store credentials in secure secret stores (AWS Secrets Manager, HashiCorp Vault, etc.)
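
One way to produce a token that satisfies the "strong, randomly generated, minimum 32 characters" guidance is Python's `secrets` module. This is a sketch only: the function name is hypothetical, and the guide does not prescribe a particular token format.

```python
import secrets


def generate_bearer_token(num_bytes: int = 32) -> str:
    """Generate a URL-safe token from `num_bytes` of cryptographically secure randomness."""
    if num_bytes < 32:
        raise ValueError("use at least 32 random bytes")
    # token_urlsafe base64url-encodes the bytes, so 32 bytes yield a 43-character string.
    return secrets.token_urlsafe(num_bytes)


token = generate_bearer_token()
print(len(token))  # prints the length only; never print or log the token itself
```

Store the generated value directly in your secret store and inject it as CLIENT_BEARER_TOKEN at container start.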

S3 Bucket Structure

The service writes Delta tables to S3 in this structure:

s3://your-bucket-name/production/delta-tables/
├── ea-consent-tb/
│   ├── _delta_log/
│   │   └── 00000000000000000000.json
│   └── part-00000-*.parquet
└── ea-consent-verification-keys/
    ├── _delta_log/
    │   └── 00000000000000000000.json
    └── part-00000-*.parquet

Important:

Set storage.uri_base to the parent directory (e.g., s3://your-bucket-name/production/delta-tables), NOT to individual table directories
Table names use hyphens (not underscores): ea-consent-tb, ea-consent-verification-keys
The service automatically creates the two table subdirectories

Startup Validation

On startup, the service performs these validations:

Configuration file syntax - Parses YAML and validates required fields
Environment identifier validation - Ensures environment value appears in storage.uri_base after ://
S3 bucket connectivity - Tests S3 access (unless SKIP_S3_VALIDATION=true)
AWS credentials validity - Verifies credentials can authenticate

If validation fails:

Container will not start
Error logged to stdout/stderr
Exit code non-zero

Example validation failure:

ValueError: Environment 'production' must appear in storage.uri_base after '://'.
Found: 's3://bucket/staging/delta-tables'

Resolution: Fix configuration file and restart container.
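
To catch this class of error before starting the container, a pre-flight check along these lines may help. It is a sketch, not the service's own validator: it assumes the rule is a plain substring match on the part of storage.uri_base after "://", and it takes the configuration as an already-parsed dictionary (for example, the result of `yaml.safe_load`).

```python
def check_environment_in_uri(config: dict) -> None:
    """Raise ValueError if `environment` does not appear in storage.uri_base after '://'."""
    environment = config["environment"]
    uri_base = config["storage"]["uri_base"]
    _scheme, separator, rest = uri_base.partition("://")
    if not separator:
        raise ValueError(f"storage.uri_base has no '://' separator: {uri_base!r}")
    # Assumption: substring match; the service may apply a stricter path-segment rule.
    if environment not in rest:
        raise ValueError(
            f"Environment {environment!r} must appear in storage.uri_base after '://'. "
            f"Found: {uri_base!r}"
        )


good_config = {
    "environment": "production",
    "storage": {"uri_base": "s3://your-bucket-name/production/delta-tables"},
}
bad_config = {
    "environment": "production",
    "storage": {"uri_base": "s3://bucket/staging/delta-tables"},
}

check_environment_in_uri(good_config)  # passes silently
try:
    check_environment_in_uri(bad_config)
except ValueError as exc:
    print(exc)
```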

5. Health Monitoring

The service provides monitoring endpoints on port 8081 (monitoring application) for health checks, liveness probes, and operational metrics.

Available Monitoring Endpoints

Health Check Endpoint

Purpose: Lightweight health check for container orchestration and load balancers
Endpoint: GET /ea-consent-delta-writer/health
Port: 8081

curl http://localhost:8081/ea-consent-delta-writer/health

Response (200 OK):

{
  "status": "healthy"
}

Use cases:

Docker/Kubernetes liveness probes
Load balancer health checks
Basic uptime monitoring
Container startup verification

Recommended probe configuration:

Interval: 30 seconds
Timeout: 10 seconds
Start period: 40 seconds (allows startup time)
Failure threshold: 3 consecutive failures

Status Endpoint

Purpose: Comprehensive system status with process details, memory usage, and ingestion statistics
Endpoint: GET /ea-consent-delta-writer/status
Port: 8081

curl http://localhost:8081/ea-consent-delta-writer/status

Response (200 OK):

{
  "status": "healthy",
  "version": "1.2.0",
  "system_info": {
    "process_id": 12345,
    "hostname": "ea-consent-delta-writer-7d8f9c",
    "python_version": "3.12.0",
    "fastapi_version": "0.115.0",
    "memory_usage_mb": 156.7,
    "uptime_seconds": 3600.5
  },
  "statistics": {
    "total_requests": 42,
    "successful_requests": 38,
    "partial_requests": 3,
    "failed_requests": 1,
    "total_trust_blocks_submitted": 1250,
    "total_trust_blocks_succeeded": 1235,
    "total_trust_blocks_failed": 15
  }
}

Use cases:

Operational dashboards and monitoring
Performance analysis and capacity planning
Debugging and troubleshooting
Service health verification

Important notes:

Statistics are tracked in-memory and reset on service restart
partial_requests indicates requests where some blocks succeeded and some failed (best-effort processing)
Memory usage should be monitored for trends (unexpected growth may indicate issues)

Container Health Checks

The Docker container includes a built-in health check that polls the /health endpoint. To run the same checks manually:

Check service health:

curl http://localhost:8081/ea-consent-delta-writer/health
# Expected: {"status":"healthy"}

Check detailed status:

curl http://localhost:8081/ea-consent-delta-writer/status | jq

Monitoring Best Practices

Liveness Probes: Use /health endpoint for quick liveness checks (lightweight, fast response)
Metrics Collection: Poll /status endpoint every 30-60 seconds for operational metrics
Log Aggregation: Forward container stdout/stderr logs to centralized logging (CloudWatch, Splunk, Datadog)
Alerting Thresholds: Set up alerts for:
- Service downtime - 3+ consecutive health check failures
- High failure rates - total_trust_blocks_failed increasing significantly
- Memory growth - memory_usage_mb trending upward over time
- Zero activity - total_requests not increasing (possible network isolation)
- Partial success spike - Sudden increase in partial_requests (data quality issues)
Dashboard Metrics: Track these key metrics over time:
- Request success rate: successful_requests / total_requests
- Block success rate: total_trust_blocks_succeeded / total_trust_blocks_submitted
- Memory usage trend: memory_usage_mb (should be stable)
- Uptime: uptime_seconds (resets on restarts)
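
The two success-rate formulas above can be computed from a /status response as in the sketch below. The function name is hypothetical, and the sample values are taken from the example response in the Status Endpoint section. Rates are returned as `None` when the denominator is zero, which is the case immediately after a restart because the counters are held in memory.

```python
from typing import Optional


def success_rates(status: dict) -> dict:
    """Compute request and block success rates from a parsed /status response."""
    stats = status["statistics"]

    def ratio(numerator: int, denominator: int) -> Optional[float]:
        # Avoid ZeroDivisionError right after a restart, when all counters are 0.
        return numerator / denominator if denominator else None

    return {
        "request_success_rate": ratio(
            stats["successful_requests"], stats["total_requests"]
        ),
        "block_success_rate": ratio(
            stats["total_trust_blocks_succeeded"], stats["total_trust_blocks_submitted"]
        ),
    }


sample_status = {
    "statistics": {
        "total_requests": 42,
        "successful_requests": 38,
        "partial_requests": 3,
        "failed_requests": 1,
        "total_trust_blocks_submitted": 1250,
        "total_trust_blocks_succeeded": 1235,
        "total_trust_blocks_failed": 15,
    }
}
rates = success_rates(sample_status)
print(rates)
```

Because the counters reset on restart, a dashboard that stores these rates should also record uptime_seconds so that resets are distinguishable from real drops.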

Accessing Logs

View container logs for detailed operational information:

# Follow logs in real-time
docker logs -f ea-consent-delta-writer

# View recent logs (last 100 lines)
docker logs --tail=100 ea-consent-delta-writer

# View logs with timestamps
docker logs -t ea-consent-delta-writer

Log format: JSON structured logging to stdout/stderr with correlation IDs for request tracing.

6. Authorization & Security

The service is designed to operate within a private network or VPC behind an API gateway or reverse proxy.

Security Model

Layer	Implementation	Status
Network Isolation	Private VPC / API Gateway	✅ Required
API Authentication	Built-in CLIENT_BEARER_TOKEN validation	✅ Implemented
TLS/HTTPS	Reverse proxy or gateway	✅ Recommended
Rate Limiting	Gateway or reverse proxy	⚙️ Optional

API Authentication

API endpoints require bearer token authentication via CLIENT_BEARER_TOKEN
Configured via environment variable at container startup
All API requests must include Authorization: Bearer <token> header
Unauthenticated requests are rejected with 401 Unauthorized
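
A client-side sketch of an authenticated request follows. It builds the request without sending it. The endpoint path and the `trust_blocks` field are the ones referenced in the Troubleshooting section; the POST method, the helper names, and the placeholder base URL are assumptions for illustration.

```python
import json
import urllib.request
from typing import Iterator, List

API_PATH = "/ea-consent-delta-writer/api/v1/trust-block-batches"
MAX_BLOCKS_PER_REQUEST = 100  # the API accepts 1-100 blocks per request


def chunk(blocks: List[str], size: int = MAX_BLOCKS_PER_REQUEST) -> Iterator[List[str]]:
    """Split a list of Trust Block JWTs into batches of at most `size` items."""
    for start in range(0, len(blocks), size):
        yield blocks[start:start + size]


def build_batch_request(base_url: str, token: str, trust_blocks: List[str]) -> urllib.request.Request:
    """Build (but do not send) an authenticated batch request. POST is an assumption."""
    if not 1 <= len(trust_blocks) <= MAX_BLOCKS_PER_REQUEST:
        raise ValueError("a batch must contain between 1 and 100 trust blocks")
    body = json.dumps({"trust_blocks": trust_blocks}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + API_PATH,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder JWT strings; real values come from the EA Consent Issuer.
blocks = [f"jwt-{i}" for i in range(250)]
requests_to_send = [
    build_batch_request("http://localhost:8080", "your-secret-token-here", batch)
    for batch in chunk(blocks)
]
print(len(requests_to_send))
# To send one: urllib.request.urlopen(requests_to_send[0], timeout=30)
```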

Additional Gateway Security (Optional):

API Gateway can provide additional layers: mTLS, IP allowlisting, OAuth2, rate limiting
TLS/HTTPS termination should be handled by reverse proxy or API gateway
Monitoring port (8081) should be accessible only to internal monitoring systems

S3 encryption:

Use HTTPS for all S3 operations (enforced by boto3 default)
Enable S3 bucket encryption at rest (AES-256 or KMS)

Security Features Summary

The following table shows current security capabilities:

Security Checklist for Operators

Before deploying to production, ensure these security controls are in place:

7. Logs & Metrics

Log Output

Where logs are written:

Container stdout/stderr (standard Docker logging)
JSON structured logging format
No log files inside container (follows 12-factor app principles)

Log types:

Startup logs: Configuration loading, S3 validation, port binding
API access logs: HTTP requests with method, path, status, duration
Processing logs: Trust Block decoding, transformation, Delta table writes
Error logs: Validation failures, S3 errors, processing errors
Debug logs: Verbose application, AWS SDK, and Spark logging (when debug, aws_debug, or spark_debug is set to true)

Log format:

{
  "timestamp": "2024-11-23T10:30:45.123Z",
  "level": "INFO",
  "message": "Trust block processed successfully",
  "correlation_id": "abc123-def456",
  "trust_block_id": "tb_789",
  "ea_consent_id": "consent_012"
}

Correlation IDs: Each API request includes a correlation ID for tracing requests across log entries (via asgi-correlation-id middleware).
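
Because each log line is a JSON object, requests can be traced with structured filtering rather than plain text search. The sketch below assumes one JSON object per line with the field names shown in the example above; the function name and sample lines are illustrative.

```python
import json
from typing import Iterable, List, Optional


def filter_log_lines(
    lines: Iterable[str],
    correlation_id: Optional[str] = None,
    level: Optional[str] = None,
) -> List[dict]:
    """Return parsed log records matching the given correlation ID and/or level."""
    matches = []
    for line in lines:
        try:
            record = json.loads(line)
        except ValueError:
            continue  # skip non-JSON lines such as tracebacks or banner output
        if not isinstance(record, dict):
            continue
        if correlation_id is not None and record.get("correlation_id") != correlation_id:
            continue
        if level is not None and record.get("level") != level:
            continue
        matches.append(record)
    return matches


sample_lines = [
    '{"level": "INFO", "message": "Trust block processed successfully", "correlation_id": "abc123-def456"}',
    '{"level": "ERROR", "message": "S3 write failed", "correlation_id": "abc123-def456"}',
    '{"level": "INFO", "message": "Trust block processed successfully", "correlation_id": "zzz999"}',
    "plain text line that is not JSON",
]
for record in filter_log_lines(sample_lines, correlation_id="abc123-def456"):
    print(record["level"], record["message"])
```

In practice the lines would come from the output of `docker logs ea-consent-delta-writer 2>&1`, for example by iterating over `sys.stdin`.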

Accessing Logs

View container logs:

# Follow logs in real-time
docker logs -f ea-consent-delta-writer

# View recent logs (last 100 lines)
docker logs --tail=100 ea-consent-delta-writer

# View logs with timestamps
docker logs -t ea-consent-delta-writer

# Search logs for errors
docker logs ea-consent-delta-writer 2>&1 | grep -i error

# Search logs for specific correlation ID
docker logs ea-consent-delta-writer 2>&1 | grep "abc123-def456"

Log Verbosity Control

Control log verbosity via configuration file:

# Minimal logging (production default)
debug: false
aws_debug: false
spark_debug: false

# Verbose logging (debugging/troubleshooting)
debug: true
aws_debug: true
spark_debug: true

Log levels:

debug: false - INFO level and above (recommended for production)
debug: true - DEBUG level (verbose, use for troubleshooting only)
aws_debug: true - AWS SDK debug logging (very verbose, use for S3 connectivity issues)
spark_debug: true - Spark/Delta/py4j debug logging (very verbose, use for Delta table issues)

Warning: Enabling debug modes significantly increases log volume and may impact performance. Use only for troubleshooting.

Metrics Endpoints

Status endpoint: GET /ea-consent-delta-writer/status (port 8081)

Returns comprehensive metrics in JSON format (see Health Monitoring for full response schema).

Key metrics to track:

Request metrics:
- total_requests - Total API requests received
- successful_requests - Requests where all blocks succeeded
- partial_requests - Requests with some successes and some failures
- failed_requests - Requests that completely failed
Trust Block metrics:
- total_trust_blocks_submitted - Total blocks submitted across all requests
- total_trust_blocks_succeeded - Blocks successfully processed
- total_trust_blocks_failed - Blocks that failed processing
System metrics:
- memory_usage_mb - Current process memory usage
- uptime_seconds - Seconds since service started
- version - Service version

Log Aggregation and Retention

The service outputs JSON logs to `stdout`, making it easy to integrate with log aggregation platforms such as CloudWatch, Splunk, or Datadog.

8. Troubleshooting

This section covers common operational issues, error scenarios, and resolution steps.

Common Issues

Symptom	Likely Cause	Resolution
Container exits immediately after startup	Missing or invalid configuration file	Verify `CONFIG_PATH` points to valid YAML file. Check volume mount: `docker inspect ea-consent-delta-writer` and verify config file exists on host. Validate YAML syntax: `python -c "import yaml; yaml.safe_load(open('config.yaml'))"`
`/health` endpoint returns 500 or connection refused	Service failed to initialize or dependencies unavailable	Check container logs: `docker logs ea-consent-delta-writer`. Look for S3 connectivity errors, configuration validation errors, or Python exceptions. Verify all required environment variables are set (CONFIG_PATH, CLIENT_BEARER_TOKEN, AWS credentials).
API calls return 401 Unauthorized	Missing or invalid bearer token	Ensure `Authorization: Bearer <token>` header is included in request. Verify token matches `CLIENT_BEARER_TOKEN` environment variable set at container startup. Check token for typos or extra whitespace. Test: `curl -H "Authorization: Bearer your-token" http://localhost:8080/ea-consent-delta-writer/api/v1/trust-block-batches`.
S3 connectivity errors on startup	Invalid AWS credentials, incorrect bucket name, or network issues	Verify AWS credentials are set correctly. Check IAM permissions include `s3:ListBucket` (which also authorizes the HeadBucket API call; `s3:HeadBucket` is not a separate IAM action) and `s3:PutObject`. Verify `storage.uri_base` bucket exists and region matches `storage.settings.region`. Test S3 access from host: `aws s3 ls s3://your-bucket-name/`.
Environment validation error on startup	`environment` value not found in `storage.uri_base`	Ensure `environment` field in config matches a segment in `storage.uri_base`. Example: if `environment: "production"`, then `storage.uri_base` must contain “production” after `s3://` (e.g., `s3://bucket/production/path`).
API calls return 422 Unprocessable Entity	Invalid request payload or missing required fields	Ensure `Content-Type: application/json` header is set. Verify JSON syntax (no trailing commas). Ensure `trust_blocks` array contains 1-100 items. Verify each item is a non-empty string. Review error response for specific field validation failures.
Partial success responses (`status: "partial_success"`)	Some Trust Blocks failed validation or processing	This is expected behavior (best-effort processing). Review `results` array in response for per-block status and error messages. Common causes: invalid JWT format, missing required fields in JWT payload, Pydantic validation failures. Fix problematic JWTs and resubmit.
High memory usage (`memory_usage_mb` increasing)	Large batch sizes, debug mode enabled, or memory leak	Reduce batch size below the 100-block maximum (for example, 50 blocks per request). Set `debug: false` in config (debug responses significantly increase memory). Restart container and monitor memory over time. Report issue if memory continues growing.
S3 write failures (errors in logs)	Insufficient IAM permissions, incorrect bucket config, or network issues	Verify IAM policy includes `s3:PutObject`, `s3:ListBucket`. Check `storage.uri_base` in config matches actual bucket name and region. Verify VPC/security group allows outbound HTTPS to S3. Check S3 bucket versioning/lifecycle policies for conflicts. Enable AWS debug logging: `aws_debug: true` in config.
Health check shows “unhealthy” status	Service not fully started, port conflicts, or application crash	Wait 40 seconds for startup period. Check logs for startup errors. Verify ports 8080 and 8081 are not in use by other processes: `netstat -tuln | grep 808`. Verify monitoring port 8081 is exposed in Docker run command.
Zero requests in statistics (`total_requests: 0`)	No traffic reaching container, network isolation, or wrong endpoint URL	Verify network connectivity to port 8080. Check firewall/security group rules allow inbound traffic. Verify clients are using correct endpoint URL. Test connectivity: `curl http://container-ip:8080/ea-consent-delta-writer/api/v1/trust-block-batches`.
Logs not appearing or empty	Container crashed at startup, wrong container name, or logging driver issue	List running containers: `docker ps -a` (includes stopped). Check container status and exit code. Try running in foreground to see errors: `docker run` without `-d` flag. Verify correct container name in logs command.
