EA Consent Delta Writer – Operations Guide

Quick Navigation

This operations guide covers EA Consent Delta Writer deployment and administration in 8 sections:

  • Getting Started (1-2): Overview and prerequisites
  • Setup & Configuration (3-4): Docker deployment and configuration
  • Operations (5-7): Health monitoring, security controls, and logging
  • Maintenance (8): Troubleshooting and incident response

New to deploying EA Consent Delta Writer? Start with sections 1-4 for initial deployment, then reference sections 5-8 for ongoing operations and troubleshooting.

Table of Contents

  1. EA Consent Delta Writer – Operations Guide
    1. Quick Navigation
    2. 1. Overview
    3. 2. Prerequisites
      1. Required
      2. Recommended
      3. IAM Permissions Required
    4. 3. Deployment
      1. Container Image
      2. Minimal Docker Run Example
      3. Container Requirements
      4. Verification Steps
      5. Network Requirements
    5. 4. Configuration
      1. Configuration File Structure
      2. Configuration Fields
      3. Environment Variables
      4. S3 Bucket Structure
      5. Startup Validation
    6. 5. Health Monitoring
      1. Available Monitoring Endpoints
        1. Health Check Endpoint
        2. Status Endpoint
      2. Container Health Checks
      3. Monitoring Best Practices
      4. Accessing Logs
    7. 6. Authorization & Security
      1. Security Model
      2. API Authentication
      3. Security Features Summary
      4. Security Checklist for Operators
    8. 7. Logs & Metrics
      1. Log Output
      2. Accessing Logs
      3. Log Verbosity Control
      4. Metrics Endpoints
      5. Log Aggregation and Retention
    9. 8. Troubleshooting
      1. Common Issues

1. Overview

The EA Consent Delta Writer is a Docker-containerized service that transforms EA Consent Trust Blocks into Delta tables for the Privacy Network. It operates as a stateless HTTP API that ingests TrustBlock JWTs containing Verifiable Credentials, decodes them, and writes structured data to two Delta tables conforming to the EA Consent schema.

What it does:

  • Receives batches of Trust Block JWTs via HTTP API (1-100 blocks per request)
  • Decodes TrustBlock and ConsentVC JWTs without signature verification (Phase 1)
  • Transforms consent data into two Delta tables:
    • ea-consent-tb - Complete Verifiable Credential data
    • ea-consent-verification-keys - Cryptographic public keys for VC verification
  • Writes Delta tables to S3 storage using append-only transactions
  • Provides health monitoring and runtime statistics
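
Decoding a JWT without signature verification amounts to base64url-decoding its middle segment and parsing the result as JSON. The sketch below illustrates that step with the Python standard library only. It is not the service's actual implementation, and the function name decode_jwt_payload_unverified is hypothetical.

```python
import base64
import json


def decode_jwt_payload_unverified(token: str) -> dict:
    """Return the JSON payload of a compact JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a compact JWT with three dot-separated segments")
    payload_b64 = parts[1]
    # base64url segments omit '=' padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```

Because the signature is never checked, a payload decoded this way should be treated as untrusted input.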

How it runs:

  • Single Docker container with two HTTP applications on separate ports
  • Port 8080: API endpoints for Trust Block ingestion
  • Port 8081: Monitoring endpoints for health checks and statistics
  • Tini init system for proper multi-process signal handling
  • Non-root user (UID 1000) for container security

How it fits in the Privacy Network:

  • Upstream dependency: EA Consent Issuer produces Trust Block JWTs
  • Downstream consumers: Delta Sharing Server (delta-io/delta-sharing) serves the Delta tables
  • Storage: AWS S3 for Delta table persistence
  • Deployment boundary: Should run in a private network (VPC), behind an API gateway

Important: This service writes Delta tables to S3. It does NOT serve the Delta Share protocol. For Delta Share serving, deploy the separate open-source delta-io/delta-sharing server.


2. Prerequisites

Required

  1. Docker (version 20.10 or later)
    • Install from docker.com
    • Verify installation: docker --version
  2. AWS S3 Access
    • AWS account with S3 service enabled
    • IAM user or role with S3 permissions (see IAM policy below)
    • S3 bucket created in your AWS region (e.g., us-west-2)
    • Note your bucket name and desired prefix path
  3. Network Configuration
    • Outbound HTTPS access to AWS S3 endpoints
    • Inbound access control for ports 8080 and 8081 (private network recommended)
  4. Configuration File
    • YAML configuration file defining storage backend, ports, and environment settings
    • See Configuration section for structure

Recommended

  • Private network deployment - Deploy in VPC with private subnets and security groups
  • Centralized logging - Forward container logs to CloudWatch, Splunk, or similar
  • API Gateway (optional) - Adds authentication, rate limiting, and additional security layer
  • IAM roles (production) - Use EC2/ECS instance roles instead of static credentials

IAM Permissions Required

The service requires these S3 permissions on your target bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket",
        "s3:HeadBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}

3. Deployment

This section documents the container’s requirements so you can deploy it using your chosen infrastructure. The minimal docker run example below shows essential flags that partners can adapt to their deployment tools (docker-compose, Kubernetes, ECS, etc.).

Container Image

  • Registry: Docker Hub
  • Image name: webshield/ea-consent-delta-writer
  • Tag: 1.2 (version 1.2)

Pull the image:

docker pull webshield/ea-consent-delta-writer:1.2

Minimal Docker Run Example

This example shows the container’s essential requirements:

docker run -d \
  --name ea-consent-delta-writer \
  -p 8080:8080 \
  -p 8081:8081 \
  -v /path/to/config.yaml:/config/config.yaml:ro \
  -e CONFIG_PATH=/config/config.yaml \
  -e CLIENT_BEARER_TOKEN=your-secret-token-here \
  -e AWS_ACCESS_KEY_ID=YOUR_AWS_KEY \
  -e AWS_SECRET_ACCESS_KEY=YOUR_AWS_SECRET \
  -e AWS_REGION=us-west-2 \
  webshield/ea-consent-delta-writer:1.2

Required Flags:

  • Ports (-p): Expose both 8080 (API) and 8081 (monitoring)
  • Volume (-v): Mount YAML config file (read-only recommended)
  • Environment (-e): CONFIG_PATH, CLIENT_BEARER_TOKEN, AWS credentials, AWS region

Container Requirements

Exposed Ports:

  • 8080 - API application (Trust Block ingestion endpoints)
  • 8081 - Monitoring application (health checks, status, metrics)

Volume Mounts:

  • /config/config.yaml - YAML configuration file (required, read-only recommended)

Environment Variables: See Configuration section for complete variable reference.

Base Image:

  • Python 3.12 slim
  • Non-root user (UID 1000)
  • Tini init system for signal handling

Health Check Endpoint:

  • GET http://localhost:8081/ea-consent-delta-writer/health
  • Container includes built-in health check (30s interval, 40s start period)

Dependencies:

  • AWS S3 connectivity (outbound HTTPS)
  • No database required (stateless service)

Verification Steps

After starting the container, verify it’s running correctly:

# 1. Check container is running
docker ps | grep ea-consent-delta-writer

# 2. Check health endpoint (should return {"status":"healthy"})
curl http://localhost:8081/ea-consent-delta-writer/health

# 3. Check status endpoint (returns detailed system info)
curl http://localhost:8081/ea-consent-delta-writer/status

# 4. View container logs
docker logs ea-consent-delta-writer

Expected startup behavior:

  • Container starts two Uvicorn processes (API on 8080, monitoring on 8081)
  • Validates configuration file on startup
  • Validates S3 bucket connectivity (unless SKIP_S3_VALIDATION=true)
  • Health endpoint returns 200 OK within 40 seconds
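
The startup check can also be scripted. The following is a minimal sketch that polls the health endpoint until it reports healthy or the 40-second start period elapses. The function names, the 2-second poll interval, and the injectable fetch/sleep/clock parameters are assumptions made for illustration and testing; only the URL and the expected {"status":"healthy"} body come from this guide.

```python
import json
import time
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8081/ea-consent-delta-writer/health"


def fetch_health(url: str = HEALTH_URL, timeout: float = 10.0) -> dict:
    """GET the health endpoint and return the parsed JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(
    fetch: Callable[[], dict] = fetch_health,
    deadline_seconds: float = 40.0,
    poll_seconds: float = 2.0,  # assumed poll interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the service reports healthy or the deadline passes."""
    start = clock()
    while True:
        try:
            if fetch().get("status") == "healthy":
                return True
        except (OSError, ValueError):
            # Connection refused or a non-JSON body: treat as "still starting".
            pass
        if clock() - start >= deadline_seconds:
            return False
        sleep(poll_seconds)
```

Calling wait_until_healthy() with no arguments polls the local monitoring port; a False result is a cue to inspect docker logs ea-consent-delta-writer.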

Network Requirements

Outbound (container to external services):

  • HTTPS to AWS S3 endpoints (port 443)
  • DNS resolution for s3.*.amazonaws.com

Inbound (external to container):

  • Port 8080 - Should be restricted to trusted sources (API gateway, internal network)
  • Port 8081 - Should be restricted to monitoring systems only

Recommended network isolation:

  • Deploy in private subnet with no public IP
  • Use security groups to whitelist known sources
  • Front with API gateway or load balancer for additional security

4. Configuration

The service is configured via a YAML configuration file and environment variables.

Configuration File Structure

Create a YAML configuration file (e.g., config.yaml):

# Port configuration
api_port: 8080
monitor_port: 8081

# Debug settings (set false in production)
debug: false
aws_debug: false
spark_debug: false

# Custom URI prefix (optional)
custom_uri_prefix: ""

# Metadata endpoint base URL (for service discovery URIs)
metadata_endpoint_base_url: "https://your-api-gateway.example.com"

# Environment identifier (must appear in storage.uri_base after "://")
environment: "production"

# Storage configuration
storage:
  provider: "s3"
  uri_base: "s3://your-bucket-name/production/delta-tables"
  settings:
    region: "us-west-2"
    endpoint_url: null  # For S3-compatible storage (future feature)

Mount this file into the container:

-v /path/to/config.yaml:/config/config.yaml:ro

Configuration Fields

| Field | Type | Description | Default/Required |
|---|---|---|---|
| api_port | integer | Port for API application | 8080 |
| monitor_port | integer | Port for monitoring application | 8081 |
| debug | boolean | Enable debug mode (allows ?debug=true query param, verbose logging) | false |
| aws_debug | boolean | Enable AWS SDK debug logging | false |
| spark_debug | boolean | Enable Spark/Delta/py4j debug logging | false |
| custom_uri_prefix | string | Custom URI prefix for endpoints (e.g., "/ea-consent") | "" |
| metadata_endpoint_base_url | string | Base URL for metadata endpoint URIs (used in service discovery) | ✅ Required |
| environment | string | Environment identifier (e.g., "production", "staging"). Must appear in storage.uri_base after :// for validation. | ✅ Required |
| storage.provider | string | Storage provider (currently only "s3" supported) | "s3" |
| storage.uri_base | string | Base S3 URI for Delta tables. Must contain the environment value (e.g., "s3://bucket/production/path" when environment="production"). | ✅ Required |
| storage.settings.region | string | AWS region for S3 operations | ✅ Required |
| storage.settings.endpoint_url | string | Custom S3 endpoint URL (for S3-compatible storage, future feature) | null |

Environment Variables

These variables control runtime behavior and credentials. All secrets should be provided via environment variables or secret stores—never committed to source control.

| Variable | Description | Required | Example |
|---|---|---|---|
| CONFIG_PATH | Path to YAML configuration file inside container | ✅ Required | /config/config.yaml |
| CLIENT_BEARER_TOKEN | Bearer token for API authentication. Clients must include Authorization: Bearer <token> header in all API requests. | ✅ Required | your-secret-token-here |
| AWS_ACCESS_KEY_ID | AWS access key for S3 storage | ✅ Required | AKIAIOSFODNN7EXAMPLE |
| AWS_SECRET_ACCESS_KEY | AWS secret key paired with the access key | ✅ Required | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
| AWS_REGION | AWS region for S3 operations (can also be set in config file) | ⚙️ Optional | us-west-2 |
| SKIP_S3_VALIDATION | Skip S3 connectivity check on startup (for local testing only) | ⚙️ Optional | true |

Security notes:

  • Use strong, randomly generated tokens for CLIENT_BEARER_TOKEN (minimum 32 characters)
  • Use IAM roles attached to your container platform (ECS, Kubernetes) instead of static AWS credentials in production
  • Rotate CLIENT_BEARER_TOKEN and AWS credentials every 90 days if IAM roles are not available
  • Never log or expose credentials in application logs or responses
  • Store credentials in secure secret stores (AWS Secrets Manager, HashiCorp Vault, etc.)
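
One way to produce a token that meets the 32-character minimum is Python's secrets module. This is a sketch rather than a mandated procedure; the function name generate_bearer_token and its 24-byte lower bound are illustrative choices.

```python
import secrets


def generate_bearer_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token; 32 random bytes yield 43 characters."""
    if num_bytes < 24:
        # 24 random bytes encode to exactly 32 base64url characters.
        raise ValueError("use at least 24 random bytes to reach 32 characters")
    return secrets.token_urlsafe(num_bytes)
```

The resulting value would then be stored in a secret store and passed to the container as CLIENT_BEARER_TOKEN.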

S3 Bucket Structure

The service writes Delta tables to S3 in this structure:

s3://your-bucket-name/production/delta-tables/
├── ea-consent-tb/
│   ├── _delta_log/
│   │   └── 00000000000000000000.json
│   └── part-00000-*.parquet
└── ea-consent-verification-keys/
    ├── _delta_log/
    │   └── 00000000000000000000.json
    └── part-00000-*.parquet

Important:

  • Set storage.uri_base to the parent directory (e.g., s3://your-bucket-name/production/delta-tables), NOT to individual table directories
  • Table names use hyphens (not underscores): ea-consent-tb, ea-consent-verification-keys
  • The service automatically creates the two table subdirectories

Startup Validation

On startup, the service performs these validations:

  1. Configuration file syntax - Parses YAML and validates required fields
  2. Environment identifier validation - Ensures environment value appears in storage.uri_base after ://
  3. S3 bucket connectivity - Tests S3 access (unless SKIP_S3_VALIDATION=true)
  4. AWS credentials validity - Verifies credentials can authenticate

If validation fails:

  • Container will not start
  • Error logged to stdout/stderr
  • Exit code non-zero

Example validation failure:

ValueError: Environment 'production' must appear in storage.uri_base after '://'.
Found: 's3://bucket/staging/delta-tables'

Resolution: Fix configuration file and restart container.
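
The environment check can be approximated before deployment with a few lines of Python. This sketch assumes a plain substring match on the part of the URI after "://"; the guide does not state whether the service matches a substring or a whole path segment, and the function name validate_environment_in_uri is hypothetical.

```python
def validate_environment_in_uri(environment: str, uri_base: str) -> None:
    """Raise ValueError unless `environment` appears in `uri_base` after '://'."""
    _scheme, sep, remainder = uri_base.partition("://")
    # Assumption: substring match; the service's exact rule may be stricter.
    if not environment or not sep or environment not in remainder:
        raise ValueError(
            f"Environment '{environment}' must appear in storage.uri_base "
            f"after '://'. Found: '{uri_base}'"
        )
```

Running this against the values in config.yaml before starting the container surfaces a mismatch without waiting for a failed startup.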


5. Health Monitoring

The service provides monitoring endpoints on port 8081 (monitoring application) for health checks, liveness probes, and operational metrics.

Available Monitoring Endpoints

Health Check Endpoint

  • Purpose: Lightweight health check for container orchestration and load balancers
  • Endpoint: GET /ea-consent-delta-writer/health
  • Port: 8081

curl http://localhost:8081/ea-consent-delta-writer/health

Response (200 OK):

{
  "status": "healthy"
}

Use cases:

  • Docker/Kubernetes liveness probes
  • Load balancer health checks
  • Basic uptime monitoring
  • Container startup verification

Recommended probe configuration:

  • Interval: 30 seconds
  • Timeout: 10 seconds
  • Start period: 40 seconds (allows startup time)
  • Failure threshold: 3 consecutive failures

Status Endpoint

  • Purpose: Comprehensive system status with process details, memory usage, and ingestion statistics
  • Endpoint: GET /ea-consent-delta-writer/status
  • Port: 8081

curl http://localhost:8081/ea-consent-delta-writer/status

Response (200 OK):

{
  "status": "healthy",
  "version": "1.2.0",
  "system_info": {
    "process_id": 12345,
    "hostname": "ea-consent-delta-writer-7d8f9c",
    "python_version": "3.12.0",
    "fastapi_version": "0.115.0",
    "memory_usage_mb": 156.7,
    "uptime_seconds": 3600.5
  },
  "statistics": {
    "total_requests": 42,
    "successful_requests": 38,
    "partial_requests": 3,
    "failed_requests": 1,
    "total_trust_blocks_submitted": 1250,
    "total_trust_blocks_succeeded": 1235,
    "total_trust_blocks_failed": 15
  }
}

Use cases:

  • Operational dashboards and monitoring
  • Performance analysis and capacity planning
  • Debugging and troubleshooting
  • Service health verification

Important notes:

  • Statistics are tracked in-memory and reset on service restart
  • partial_requests indicates requests where some blocks succeeded and some failed (best-effort processing)
  • Memory usage should be monitored for trends (unexpected growth may indicate issues)

Container Health Checks

The Docker container includes a built-in health check that polls the /health endpoint (30s interval, 40s start period). To run the same checks manually:

Check service health:

curl http://localhost:8081/ea-consent-delta-writer/health
# Expected: {"status":"healthy"}

Check detailed status:

curl http://localhost:8081/ea-consent-delta-writer/status | jq

Monitoring Best Practices

  1. Liveness Probes: Use /health endpoint for quick liveness checks (lightweight, fast response)
  2. Metrics Collection: Poll /status endpoint every 30-60 seconds for operational metrics
  3. Log Aggregation: Forward container stdout/stderr logs to centralized logging (CloudWatch, Splunk, Datadog)
  4. Alerting Thresholds: Set up alerts for:
    • Service downtime - 3+ consecutive health check failures
    • High failure rates - total_trust_blocks_failed increasing significantly
    • Memory growth - memory_usage_mb trending upward over time
    • Zero activity - total_requests not increasing (possible network isolation)
    • Partial success spike - Sudden increase in partial_requests (data quality issues)
  5. Dashboard Metrics: Track these key metrics over time:
    • Request success rate: successful_requests / total_requests
    • Block success rate: total_trust_blocks_succeeded / total_trust_blocks_submitted
    • Memory usage trend: memory_usage_mb (should be stable)
    • Uptime: uptime_seconds (resets on restarts)
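
The two success-rate metrics above can be computed directly from a /status response. The sketch below assumes the response shape shown in the Status Endpoint example; the function names are illustrative, and returning None when nothing has been submitted yet is a design choice of this sketch.

```python
from typing import Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def success_rates(status: dict) -> dict:
    """Compute request and block success rates from a parsed /status body."""
    stats = status["statistics"]
    return {
        "request_success_rate": _ratio(
            stats["successful_requests"], stats["total_requests"]
        ),
        "block_success_rate": _ratio(
            stats["total_trust_blocks_succeeded"],
            stats["total_trust_blocks_submitted"],
        ),
    }
```

Because statistics reset on restart, rates computed shortly after startup rest on very few requests and are best read alongside uptime_seconds.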

Accessing Logs

View container logs for detailed operational information:

# Follow logs in real-time
docker logs -f ea-consent-delta-writer

# View recent logs (last 100 lines)
docker logs --tail=100 ea-consent-delta-writer

# View logs with timestamps
docker logs -t ea-consent-delta-writer

Log format: JSON structured logging to stdout/stderr with correlation IDs for request tracing.


6. Authorization & Security

The service is designed to operate within a private network or VPC behind an API gateway or reverse proxy.

Security Model

| Layer | Implementation | Status |
|---|---|---|
| Network Isolation | Private VPC / API Gateway | ✅ Required |
| API Authentication | Built-in CLIENT_BEARER_TOKEN validation | ✅ Implemented |
| TLS/HTTPS | Reverse proxy or gateway | ✅ Recommended |
| Rate Limiting | Gateway or reverse proxy | ⚙️ Optional |

API Authentication

  • API endpoints require bearer token authentication via CLIENT_BEARER_TOKEN
  • Configured via environment variable at container startup
  • All API requests must include Authorization: Bearer <token> header
  • Unauthenticated requests are rejected with 401 Unauthorized
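
A client-side sketch of an authenticated batch submission follows. The endpoint path and the trust_blocks field are taken from the Troubleshooting section of this guide; the POST method, the helper names, and the default URL are assumptions for illustration. The code only builds the request object, so nothing is sent until it is passed to urllib.request.urlopen.

```python
import json
import urllib.request

# Assumed default; the path appears in the Troubleshooting section.
BATCH_URL = "http://localhost:8080/ea-consent-delta-writer/api/v1/trust-block-batches"
MAX_BATCH = 100  # the API accepts 1-100 Trust Blocks per request


def chunk_trust_blocks(jwts: list, size: int = MAX_BATCH) -> list:
    """Split a list of JWT strings into batches of at most `size` items."""
    return [jwts[i:i + size] for i in range(0, len(jwts), size)]


def build_batch_request(jwts: list, token: str, url: str = BATCH_URL) -> urllib.request.Request:
    """Build (but do not send) an authenticated batch-ingestion request."""
    if not 1 <= len(jwts) <= MAX_BATCH:
        raise ValueError("a batch must contain 1-100 Trust Block JWTs")
    body = json.dumps({"trust_blocks": jwts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",  # assumption: the guide does not state the HTTP method
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

When such a request is sent with urllib.request.urlopen, a 401 response surfaces as urllib.error.HTTPError, which is a quick signal that the token does not match CLIENT_BEARER_TOKEN.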

Additional Gateway Security (Optional):

  • API Gateway can provide additional layers: mTLS, IP allowlisting, OAuth2, rate limiting
  • TLS/HTTPS termination should be handled by reverse proxy or API gateway
  • Monitoring port (8081) should be accessible only to internal monitoring systems

S3 encryption:

  • Use HTTPS for all S3 operations (enforced by boto3 default)
  • Enable S3 bucket encryption at rest (AES-256 or KMS)

Security Features Summary

Current security capabilities are summarized in the Security Model table above.

Security Checklist for Operators

Before deploying to production, ensure these security controls are in place:

  • Deploy inside private subnet or behind API gateway (no public internet access)
  • Configure CLIENT_BEARER_TOKEN environment variable with strong secret token
  • Enable HTTPS/TLS for all endpoints (via gateway or load balancer)
  • Configure security groups/firewalls to restrict ingress to trusted IP ranges
  • Use IAM roles instead of static AWS credentials
  • Rotate CLIENT_BEARER_TOKEN and AWS credentials every 90 days
  • Enable S3 bucket encryption at rest (AES-256 or KMS)
  • Forward logs to centralized SIEM or security monitoring system
  • Set up alerts for anomalous traffic patterns or failure spikes
  • Implement rate limiting at API gateway or load balancer level
  • Document and test incident response procedures
  • Review and approve all configuration changes before deployment
  • Implement network egress controls (restrict outbound to AWS S3 only)

7. Logs & Metrics

Log Output

Where logs are written:

  • Container stdout/stderr (standard Docker logging)
  • JSON structured logging format
  • No log files inside container (follows 12-factor app principles)

Log types:

  • Startup logs: Configuration loading, S3 validation, port binding
  • API access logs: HTTP requests with method, path, status, duration
  • Processing logs: Trust Block decoding, transformation, Delta table writes
  • Error logs: Validation failures, S3 errors, processing errors
  • Debug logs: Verbose application, AWS SDK, and Spark/Delta logging (when debug, aws_debug, or spark_debug is set to true)

Log format:

{
  "timestamp": "2024-11-23T10:30:45.123Z",
  "level": "INFO",
  "message": "Trust block processed successfully",
  "correlation_id": "abc123-def456",
  "trust_block_id": "tb_789",
  "ea_consent_id": "consent_012"
}

Correlation IDs: Each API request includes a correlation ID for tracing requests across log entries (via asgi-correlation-id middleware).
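
Since every log entry is a JSON object, tracing one request is a matter of filtering on correlation_id. The sketch below assumes one JSON object per line with the field names shown in the sample entry above; the function name is hypothetical, and non-JSON lines are skipped rather than treated as errors.

```python
import json
from typing import Iterable, Iterator


def filter_by_correlation_id(lines: Iterable[str], correlation_id: str) -> Iterator[dict]:
    """Yield parsed log entries whose correlation_id matches."""
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        if isinstance(entry, dict) and entry.get("correlation_id") == correlation_id:
            yield entry
```

Passing sys.stdin as lines lets the function consume piped docker logs output.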

Accessing Logs

View container logs:

# Follow logs in real-time
docker logs -f ea-consent-delta-writer

# View recent logs (last 100 lines)
docker logs --tail=100 ea-consent-delta-writer

# View logs with timestamps
docker logs -t ea-consent-delta-writer

# Search logs for errors
docker logs ea-consent-delta-writer 2>&1 | grep -i error

# Search logs for specific correlation ID
docker logs ea-consent-delta-writer 2>&1 | grep "abc123-def456"

Log Verbosity Control

Control log verbosity via configuration file:

# Minimal logging (production default)
debug: false
aws_debug: false
spark_debug: false

# Verbose logging (debugging/troubleshooting)
debug: true
aws_debug: true
spark_debug: true

Log levels:

  • debug: false - INFO level and above (recommended for production)
  • debug: true - DEBUG level (verbose, use for troubleshooting only)
  • aws_debug: true - AWS SDK debug logging (very verbose, use for S3 connectivity issues)
  • spark_debug: true - Spark/Delta/py4j debug logging (very verbose, use for Delta table issues)

Warning: Enabling debug modes significantly increases log volume and may impact performance. Use only for troubleshooting.

Metrics Endpoints

Status endpoint: GET /ea-consent-delta-writer/status (port 8081)

Returns comprehensive metrics in JSON format (see Health Monitoring for full response schema).

Key metrics to track:

  • Request metrics:
    • total_requests - Total API requests received
    • successful_requests - Requests where all blocks succeeded
    • partial_requests - Requests with some successes and some failures
    • failed_requests - Requests that completely failed
  • Trust Block metrics:
    • total_trust_blocks_submitted - Total blocks submitted across all requests
    • total_trust_blocks_succeeded - Blocks successfully processed
    • total_trust_blocks_failed - Blocks that failed processing
  • System metrics:
    • memory_usage_mb - Current process memory usage
    • uptime_seconds - Seconds since service started
    • version - Service version

Log Aggregation and Retention

The service outputs JSON logs to stdout, which makes it straightforward to integrate with log aggregation platforms such as CloudWatch, Splunk, or Datadog.

8. Troubleshooting

This section covers common operational issues, error scenarios, and resolution steps.

Common Issues

Symptom: Container exits immediately after startup
  • Likely cause: Missing or invalid configuration file
  • Resolution: Verify CONFIG_PATH points to valid YAML file. Check volume mount: docker inspect ea-consent-delta-writer and verify config file exists on host. Validate YAML syntax: python -c "import yaml; yaml.safe_load(open('config.yaml'))"

Symptom: /health endpoint returns 500 or connection refused
  • Likely cause: Service failed to initialize or dependencies unavailable
  • Resolution: Check container logs: docker logs ea-consent-delta-writer. Look for S3 connectivity errors, configuration validation errors, or Python exceptions. Verify all required environment variables are set (CONFIG_PATH, CLIENT_BEARER_TOKEN, AWS credentials).

Symptom: API calls return 401 Unauthorized
  • Likely cause: Missing or invalid bearer token
  • Resolution: Ensure Authorization: Bearer <token> header is included in request. Verify token matches CLIENT_BEARER_TOKEN environment variable set at container startup. Check token for typos or extra whitespace. Test: curl -H "Authorization: Bearer your-token" http://localhost:8080/ea-consent-delta-writer/api/v1/trust-block-batches.

Symptom: S3 connectivity errors on startup
  • Likely cause: Invalid AWS credentials, incorrect bucket name, or network issues
  • Resolution: Verify AWS credentials are set correctly. Check IAM permissions include s3:HeadBucket, s3:PutObject, s3:ListBucket. Verify storage.uri_base bucket exists and region matches storage.settings.region. Test S3 access from host: aws s3 ls s3://your-bucket-name/.

Symptom: Environment validation error on startup
  • Likely cause: environment value not found in storage.uri_base
  • Resolution: Ensure environment field in config matches a segment in storage.uri_base. Example: if environment: "production", then storage.uri_base must contain "production" after s3:// (e.g., s3://bucket/production/path).

Symptom: API calls return 422 Unprocessable Entity
  • Likely cause: Invalid request payload or missing required fields
  • Resolution: Ensure Content-Type: application/json header is set. Verify JSON syntax (no trailing commas). Ensure trust_blocks array contains 1-100 items. Verify each item is a non-empty string. Review error response for specific field validation failures.

Symptom: Partial success responses (status: "partial_success")
  • Likely cause: Some Trust Blocks failed validation or processing
  • Resolution: This is expected behavior (best-effort processing). Review results array in response for per-block status and error messages. Common causes: invalid JWT format, missing required fields in JWT payload, Pydantic validation failures. Fix problematic JWTs and resubmit.

Symptom: High memory usage (memory_usage_mb increasing)
  • Likely cause: Large batch sizes, debug mode enabled, or memory leak
  • Resolution: Reduce batch size to 50-100 blocks per request. Set debug: false in config (debug responses significantly increase memory). Restart container and monitor memory over time. Report issue if memory continues growing.

Symptom: S3 write failures (errors in logs)
  • Likely cause: Insufficient IAM permissions, incorrect bucket config, or network issues
  • Resolution: Verify IAM policy includes s3:PutObject, s3:ListBucket. Check storage.uri_base in config matches actual bucket name and region. Verify VPC/security group allows outbound HTTPS to S3. Check S3 bucket versioning/lifecycle policies for conflicts. Enable AWS debug logging: aws_debug: true in config.

Symptom: Health check shows "unhealthy" status
  • Likely cause: Service not fully started, port conflicts, or application crash
  • Resolution: Wait 40 seconds for startup period. Check logs for startup errors. Verify ports 8080 and 8081 are not in use by other processes: netstat -tuln | grep 808. Verify monitoring port 8081 is exposed in Docker run command.

Symptom: Zero requests in statistics (total_requests: 0)
  • Likely cause: No traffic reaching container, network isolation, or wrong endpoint URL
  • Resolution: Verify network connectivity to port 8080. Check firewall/security group rules allow inbound traffic. Verify clients are using correct endpoint URL. Test connectivity: curl http://container-ip:8080/ea-consent-delta-writer/api/v1/trust-block-batches.

Symptom: Logs not appearing or empty
  • Likely cause: Container crashed at startup, wrong container name, or logging driver issue
  • Resolution: List all containers, including stopped ones: docker ps -a. Check container status and exit code. Try running in foreground to see errors: docker run without -d flag. Verify correct container name in logs command.