EA Consent Delta Writer – Operations Guide
Quick Navigation
This operations guide covers EA Consent Delta Writer deployment and administration in 8 sections:
- Getting Started (1-2): Overview and prerequisites
- Setup & Configuration (3-4): Docker deployment and configuration
- Operations (5-7): Health monitoring, security controls, and logging
- Maintenance (8): Troubleshooting and incident response
New to deploying EA Consent Delta Writer? Start with sections 1-4 for initial deployment, then reference sections 5-8 for ongoing operations and troubleshooting.
1. Overview
The EA Consent Delta Writer is a Docker containerized service that transforms EA Consent Trust Blocks into Delta tables for the Privacy Network. It operates as a stateless HTTP API that ingests TrustBlock JWTs containing Verifiable Credentials, decodes them, and writes structured data to two Delta tables conforming to the EA Consent schema.
What it does:
- Receives batches of Trust Block JWTs via HTTP API (1-100 blocks per request)
- Decodes TrustBlock and ConsentVC JWTs without signature verification (Phase 1)
- Transforms consent data into two Delta Share tables:
  - ea-consent-tb - Complete Verifiable Credential data
  - ea-consent-verification-keys - Cryptographic public keys for VC verification
- Writes Delta tables to S3 storage using append-only transactions
- Provides health monitoring and runtime statistics
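To illustrate what "decoding without signature verification" means in practice, here is a minimal sketch using only the Python standard library. It is not the service's actual code: the function name `decode_jwt_unverified` and the sample claims are hypothetical, and the real TrustBlock and ConsentVC payloads will contain different fields.

```python
import base64
import json


def decode_jwt_unverified(token: str) -> tuple[dict, dict]:
    """Return (header, payload) of a compact JWT without checking its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")

    def b64url_json(segment: str) -> dict:
        # JWT segments omit base64 padding; restore it before decoding.
        padded = segment + "=" * (-len(segment) % 4)
        return json.loads(base64.urlsafe_b64decode(padded))

    return b64url_json(parts[0]), b64url_json(parts[1])


def _encode_segment(obj: dict) -> str:
    """Helper for the demo only: base64url-encode a JSON object without padding."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Build a throwaway token with made-up claims; the signature part is never read.
sample_token = ".".join(
    [_encode_segment({"alg": "none"}), _encode_segment({"vc": {"id": "example"}}), "sig"]
)
header, payload = decode_jwt_unverified(sample_token)
print(header, payload)
```

Because the third segment is never inspected, a tampered token decodes just as readily as a genuine one, which is why the network isolation described later in this guide matters in Phase 1.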
How it runs:
- Single Docker container with two HTTP applications on separate ports
- Port 8080: API endpoints for Trust Block ingestion
- Port 8081: Monitoring endpoints for health checks and statistics
- Tini init system for proper multi-process signal handling
- Non-root user (UID 1000) for container security
How it fits in the Privacy Network:
- Upstream dependency: EA Consent Issuer produces Trust Block JWTs
- Downstream consumers: Delta Sharing Server (delta-io/delta-sharing) serves the Delta tables
- Storage: AWS S3 for Delta table persistence
- Deployment boundary: Should run in private network behind API gateway or VPC
Important: This service writes Delta tables to S3. It does NOT serve the Delta Share protocol. For Delta Share serving, deploy the separate open-source delta-io/delta-sharing server.
2. Prerequisites
Required
- Docker (version 20.10 or later)
- Install from docker.com
- Verify installation:
docker --version
- AWS S3 Access
- AWS account with S3 service enabled
- IAM user or role with S3 permissions (see IAM policy below)
- S3 bucket created in your AWS region (e.g., us-west-2)
- Note your bucket name and desired prefix path
- Network Configuration
- Outbound HTTPS access to AWS S3 endpoints
- Inbound access control for ports 8080 and 8081 (private network recommended)
- Configuration File
- YAML configuration file defining storage backend, ports, and environment settings
- See Configuration section for structure
Recommended
- Private network deployment - Deploy in VPC with private subnets and security groups
- Centralized logging - Forward container logs to CloudWatch, Splunk, or similar
- API Gateway (optional) - Adds authentication, rate limiting, and additional security layer
- IAM roles (production) - Use EC2/ECS instance roles instead of static credentials
IAM Permissions Required
The service requires these S3 permissions on your target bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket",
        "s3:HeadBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
3. Deployment
This section documents the container’s requirements so you can deploy it using your chosen infrastructure. The minimal docker run example below shows essential flags that partners can adapt to their deployment tools (docker-compose, Kubernetes, ECS, etc.).
Container Image
Registry: Docker Hub
Image name: webshield/ea-consent-delta-writer
Tag: 1.2
Pull the image:
docker pull webshield/ea-consent-delta-writer:1.2
Minimal Docker Run Example
This example shows the container’s essential requirements:
docker run -d \
  --name ea-consent-delta-writer \
  -p 8080:8080 \
  -p 8081:8081 \
  -v /path/to/config.yaml:/config/config.yaml:ro \
  -e CONFIG_PATH=/config/config.yaml \
  -e CLIENT_BEARER_TOKEN=your-secret-token-here \
  -e AWS_ACCESS_KEY_ID=YOUR_AWS_KEY \
  -e AWS_SECRET_ACCESS_KEY=YOUR_AWS_SECRET \
  -e AWS_REGION=us-west-2 \
  webshield/ea-consent-delta-writer:1.2
Required Flags:
- Ports (-p): Expose both 8080 (API) and 8081 (monitoring)
- Volume (-v): Mount YAML config file (read-only recommended)
- Environment (-e): CONFIG_PATH, CLIENT_BEARER_TOKEN, AWS credentials, AWS region
Container Requirements
Exposed Ports:
- 8080 - API application (Trust Block ingestion endpoints)
- 8081 - Monitoring application (health checks, status, metrics)
Volume Mounts:
- /config/config.yaml - YAML configuration file (required, read-only recommended)
Environment Variables: See Configuration section for complete variable reference.
Base Image:
- Python 3.12 slim
- Non-root user (UID 1000)
- Tini init system for signal handling
Health Check Endpoint:
- GET http://localhost:8081/ea-consent-delta-writer/health
- Container includes built-in health check (30s interval, 40s start period)
Dependencies:
- AWS S3 connectivity (outbound HTTPS)
- No database required (stateless service)
Verification Steps
After starting the container, verify it’s running correctly:
# 1. Check container is running
docker ps | grep ea-consent-delta-writer
# 2. Check health endpoint (should return {"status":"healthy"})
curl http://localhost:8081/ea-consent-delta-writer/health
# 3. Check status endpoint (returns detailed system info)
curl http://localhost:8081/ea-consent-delta-writer/status
# 4. View container logs
docker logs ea-consent-delta-writer
Expected startup behavior:
- Container starts two Uvicorn processes (API on 8080, monitoring on 8081)
- Validates configuration file on startup
- Validates S3 bucket connectivity (unless SKIP_S3_VALIDATION=true)
- Health endpoint returns 200 OK within 40 seconds
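The startup expectation above (a healthy response within 40 seconds) can be scripted for deployment pipelines. The following is a sketch only, using the Python standard library; the function names, the 2-second poll interval, and the injectable `fetch`, `sleep`, and `clock` parameters are illustrative choices, not part of the service.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

HEALTH_URL = "http://localhost:8081/ea-consent-delta-writer/health"


def fetch_status(url: str = HEALTH_URL, timeout: float = 10.0) -> Optional[str]:
    """Return the "status" field from the health endpoint, or None on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp).get("status")
    except (OSError, ValueError, AttributeError):
        # Connection refused, HTTP errors, timeouts, or a non-JSON/non-object body.
        return None


def wait_until_healthy(
    fetch: Callable[[], Optional[str]] = fetch_status,
    deadline_s: float = 40.0,   # matches the documented start period
    interval_s: float = 2.0,    # assumed poll interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the service reports "healthy" or the deadline passes."""
    start = clock()
    while True:
        if fetch() == "healthy":
            return True
        if clock() - start >= deadline_s:
            return False
        sleep(interval_s)
```

Calling `wait_until_healthy()` right after `docker run` returns `True` once the endpoint reports `{"status":"healthy"}`, or `False` if the start period elapses first, at which point the container logs are the next place to look.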
Network Requirements
Outbound (container to external services):
- HTTPS to AWS S3 endpoints (port 443)
- DNS resolution for s3.*.amazonaws.com
Inbound (external to container):
- Port 8080 - Should be restricted to trusted sources (API gateway, internal network)
- Port 8081 - Can be restricted to monitoring systems only
Recommended network isolation:
- Deploy in private subnet with no public IP
- Use security groups to whitelist known sources
- Front with API gateway or load balancer for additional security
4. Configuration
The service is configured via a YAML configuration file and environment variables.
Configuration File Structure
Create a YAML configuration file (e.g., config.yaml):
# Port configuration
api_port: 8080
monitor_port: 8081
# Debug settings (set false in production)
debug: false
aws_debug: false
spark_debug: false
# Custom URI prefix (optional)
custom_uri_prefix: ""
# Metadata endpoint base URL (for service discovery URIs)
metadata_endpoint_base_url: "https://your-api-gateway.example.com"
# Environment identifier (must appear in storage.uri_base after "://")
environment: "production"
# Storage configuration
storage:
  provider: "s3"
  uri_base: "s3://your-bucket-name/production/delta-tables"
  settings:
    region: "us-west-2"
    endpoint_url: null # For S3-compatible storage (future feature)
Mount this file into the container:
-v /path/to/config.yaml:/config/config.yaml:ro
Configuration Fields
| Field | Type | Description | Default/Required |
|---|---|---|---|
| api_port | integer | Port for API application | 8080 |
| monitor_port | integer | Port for monitoring application | 8081 |
| debug | boolean | Enable debug mode (allows ?debug=true query param, verbose logging) | false |
| aws_debug | boolean | Enable AWS SDK debug logging | false |
| spark_debug | boolean | Enable Spark/Delta/py4j debug logging | false |
| custom_uri_prefix | string | Custom URI prefix for endpoints (e.g., "/ea-consent") | "" |
| metadata_endpoint_base_url | string | Base URL for metadata endpoint URIs (used in service discovery) | ✅ Required |
| environment | string | Environment identifier (e.g., "production", "staging"). Must appear in storage.uri_base after :// for validation. | ✅ Required |
| storage.provider | string | Storage provider (currently only "s3" supported) | "s3" |
| storage.uri_base | string | Base S3 URI for Delta tables. Must contain the environment value (e.g., "s3://bucket/production/path" when environment="production"). | ✅ Required |
| storage.settings.region | string | AWS region for S3 operations | ✅ Required |
| storage.settings.endpoint_url | string | Custom S3 endpoint URL (for S3-compatible storage, future feature) | null |
Environment Variables
These variables control runtime behavior and credentials. All secrets should be provided via environment variables or secret stores—never committed to source control.
| Variable | Description | Required | Example |
|---|---|---|---|
| CONFIG_PATH | Path to YAML configuration file inside container | ✅ Required | /config/config.yaml |
| CLIENT_BEARER_TOKEN | Bearer token for API authentication. Clients must include Authorization: Bearer <token> header in all API requests. | ✅ Required | your-secret-token-here |
| AWS_ACCESS_KEY_ID | AWS access key for S3 storage | ✅ Required | AKIAIOSFODNN7EXAMPLE |
| AWS_SECRET_ACCESS_KEY | AWS secret key paired with the access key | ✅ Required | wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY |
| AWS_REGION | AWS region for S3 operations (can also be set in config file) | ⚙️ Optional | us-west-2 |
| SKIP_S3_VALIDATION | Skip S3 connectivity check on startup (for local testing only) | ⚙️ Optional | true |
Security notes:
- Use strong, randomly generated tokens for CLIENT_BEARER_TOKEN (minimum 32 characters)
- Use IAM roles attached to your container platform (ECS, Kubernetes) instead of static AWS credentials in production
- Rotate CLIENT_BEARER_TOKEN and AWS credentials every 90 days if IAM roles are not available
- Never log or expose credentials in application logs or responses
- Store credentials in secure secret stores (AWS Secrets Manager, HashiCorp Vault, etc.)
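One way to produce a token that satisfies the 32-character minimum is Python's standard `secrets` module. This is a sketch of one option, not a requirement of the service; the function name `generate_bearer_token` is illustrative.

```python
import secrets


def generate_bearer_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes of OS-level randomness."""
    return secrets.token_urlsafe(num_bytes)


token = generate_bearer_token()
# 32 random bytes encode to roughly 43 URL-safe characters.
print(len(token))
```

Write the generated value straight into your secret store rather than into shell history or a committed file, then pass it to the container as CLIENT_BEARER_TOKEN.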
S3 Bucket Structure
The service writes Delta tables to S3 in this structure:
s3://your-bucket-name/production/delta-tables/
├── ea-consent-tb/
│   ├── _delta_log/
│   │   └── 00000000000000000000.json
│   └── part-00000-*.parquet
└── ea-consent-verification-keys/
    ├── _delta_log/
    │   └── 00000000000000000000.json
    └── part-00000-*.parquet
Important:
- Set storage.uri_base to the parent directory (e.g., s3://your-bucket-name/production/delta-tables), NOT to individual table directories
- Table names use hyphens (not underscores): ea-consent-tb, ea-consent-verification-keys
- The service automatically creates the two table subdirectories
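The relationship between storage.uri_base and the two table locations can be sketched as follows. This only illustrates the layout described above; the helper name `table_uris` is hypothetical and the service's internal path handling may differ.

```python
TABLE_NAMES = ("ea-consent-tb", "ea-consent-verification-keys")


def table_uris(uri_base: str) -> dict[str, str]:
    """Map each table name to its expected location under the parent uri_base."""
    base = uri_base.rstrip("/")
    last_segment = base.rsplit("/", 1)[-1]
    if last_segment in TABLE_NAMES:
        # Catches the misconfiguration called out above: uri_base must be the parent.
        raise ValueError(f"uri_base points at table directory '{last_segment}', not its parent")
    return {name: f"{base}/{name}" for name in TABLE_NAMES}


uris = table_uris("s3://your-bucket-name/production/delta-tables")
for name, uri in uris.items():
    print(name, "->", uri)
```

The resulting URIs are also the locations a downstream Delta Sharing server configuration would need to point at.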
Startup Validation
On startup, the service performs these validations:
- Configuration file syntax - Parses YAML and validates required fields
- Environment identifier validation - Ensures environment value appears in storage.uri_base after ://
- S3 bucket connectivity - Tests S3 access (unless SKIP_S3_VALIDATION=true)
- AWS credentials validity - Verifies credentials can authenticate
If validation fails:
- Container will not start
- Error logged to stdout/stderr
- Exit code non-zero
Example validation failure:
ValueError: Environment 'production' must appear in storage.uri_base after '://'.
Found: 's3://bucket/staging/delta-tables'
Resolution: Fix configuration file and restart container.
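To catch the environment mismatch before starting the container, the rule can be approximated in a pre-flight check. The sketch below assumes a plain substring match on the part of the URI after "://", which is how the error message reads; the service's actual check may be stricter (for example, matching a whole path segment), and the function name `check_environment` is hypothetical.

```python
def check_environment(environment: str, uri_base: str) -> None:
    """Raise ValueError if environment does not appear in uri_base after '://'."""
    _scheme, separator, remainder = uri_base.partition("://")
    if not environment or not separator or environment not in remainder:
        raise ValueError(
            f"Environment '{environment}' must appear in storage.uri_base after '://'. "
            f"Found: '{uri_base}'"
        )


# Passes silently: "production" appears after "s3://".
check_environment("production", "s3://bucket/production/delta-tables")

# Reproduces the failure shown above.
try:
    check_environment("production", "s3://bucket/staging/delta-tables")
except ValueError as exc:
    print(exc)
```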
5. Health Monitoring
The service provides monitoring endpoints on port 8081 (monitoring application) for health checks, liveness probes, and operational metrics.
Available Monitoring Endpoints
Health Check Endpoint
Purpose: Lightweight health check for container orchestration and load balancers
Endpoint: GET /ea-consent-delta-writer/health
Port: 8081
curl http://localhost:8081/ea-consent-delta-writer/health
Response (200 OK):
{
  "status": "healthy"
}
Use cases:
- Docker/Kubernetes liveness probes
- Load balancer health checks
- Basic uptime monitoring
- Container startup verification
Recommended probe configuration:
- Interval: 30 seconds
- Timeout: 10 seconds
- Start period: 40 seconds (allows startup time)
- Failure threshold: 3 consecutive failures
Status Endpoint
Purpose: Comprehensive system status with process details, memory usage, and ingestion statistics
Endpoint: GET /ea-consent-delta-writer/status
Port: 8081
curl http://localhost:8081/ea-consent-delta-writer/status
Response (200 OK):
{
  "status": "healthy",
  "version": "1.2.0",
  "system_info": {
    "process_id": 12345,
    "hostname": "ea-consent-delta-writer-7d8f9c",
    "python_version": "3.12.0",
    "fastapi_version": "0.115.0",
    "memory_usage_mb": 156.7,
    "uptime_seconds": 3600.5
  },
  "statistics": {
    "total_requests": 42,
    "successful_requests": 38,
    "partial_requests": 3,
    "failed_requests": 1,
    "total_trust_blocks_submitted": 1250,
    "total_trust_blocks_succeeded": 1235,
    "total_trust_blocks_failed": 15
  }
}
Use cases:
- Operational dashboards and monitoring
- Performance analysis and capacity planning
- Debugging and troubleshooting
- Service health verification
Important notes:
- Statistics are tracked in-memory and reset on service restart
- partial_requests indicates requests where some blocks succeeded and some failed (best-effort processing)
- Memory usage should be monitored for trends (unexpected growth may indicate issues)
Container Health Checks
The Docker container includes a built-in health check that polls the /health endpoint:
Check service health:
curl http://localhost:8081/ea-consent-delta-writer/health
# Expected: {"status":"healthy"}
Check detailed status:
curl http://localhost:8081/ea-consent-delta-writer/status | jq
Monitoring Best Practices
- Liveness Probes: Use /health endpoint for quick liveness checks (lightweight, fast response)
- Metrics Collection: Poll /status endpoint every 30-60 seconds for operational metrics
- Log Aggregation: Forward container stdout/stderr logs to centralized logging (CloudWatch, Splunk, Datadog)
- Alerting Thresholds: Set up alerts for:
  - Service downtime - 3+ consecutive health check failures
  - High failure rates - total_trust_blocks_failed increasing significantly
  - Memory growth - memory_usage_mb trending upward over time
  - Zero activity - total_requests not increasing (possible network isolation)
  - Partial success spike - Sudden increase in partial_requests (data quality issues)
- Dashboard Metrics: Track these key metrics over time:
  - Request success rate: successful_requests / total_requests
  - Block success rate: total_trust_blocks_succeeded / total_trust_blocks_submitted
  - Memory usage trend: memory_usage_mb (should be stable)
  - Uptime: uptime_seconds (resets on restarts)
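The two success-rate ratios listed above can be derived from the statistics object of the /status response. The following is a minimal sketch using the sample figures from the Status Endpoint section; the function name `success_rates` and the choice to return None when nothing has been submitted yet are illustrative.

```python
from typing import Optional


def success_rates(statistics: dict) -> dict[str, Optional[float]]:
    """Compute request and block success rates from the /status statistics object."""

    def ratio(numerator: int, denominator: int) -> Optional[float]:
        # Counters reset to zero on restart, so guard against division by zero.
        return numerator / denominator if denominator else None

    return {
        "request_success_rate": ratio(
            statistics["successful_requests"], statistics["total_requests"]
        ),
        "block_success_rate": ratio(
            statistics["total_trust_blocks_succeeded"],
            statistics["total_trust_blocks_submitted"],
        ),
    }


sample_statistics = {
    "total_requests": 42,
    "successful_requests": 38,
    "partial_requests": 3,
    "failed_requests": 1,
    "total_trust_blocks_submitted": 1250,
    "total_trust_blocks_succeeded": 1235,
    "total_trust_blocks_failed": 15,
}
rates = success_rates(sample_statistics)
print(rates)
```

Since the counters are in-memory and reset on restart, a dashboard that stores these ratios should treat a drop in uptime_seconds as the start of a new measurement window.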
Accessing Logs
View container logs for detailed operational information:
# Follow logs in real-time
docker logs -f ea-consent-delta-writer
# View recent logs (last 100 lines)
docker logs --tail=100 ea-consent-delta-writer
# View logs with timestamps
docker logs -t ea-consent-delta-writer
Log format: JSON structured logging to stdout/stderr with correlation IDs for request tracing.
6. Authorization & Security
The service is designed to operate within a private network or VPC behind an API gateway or reverse proxy.
Security Model
| Layer | Implementation | Status |
|---|---|---|
| Network Isolation | Private VPC / API Gateway | ✅ Required |
| API Authentication | Built-in CLIENT_BEARER_TOKEN validation | ✅ Implemented |
| TLS/HTTPS | Reverse proxy or gateway | ✅ Recommended |
| Rate Limiting | Gateway or reverse proxy | ⚙️ Optional |
API Authentication
- API endpoints require bearer token authentication via CLIENT_BEARER_TOKEN
- Configured via environment variable at container startup
- All API requests must include Authorization: Bearer <token> header
- Unauthenticated requests are rejected with 401 Unauthorized
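From the client side, an authenticated request might be assembled as in the sketch below. Several details are assumptions: the endpoint path and the trust_blocks field name are taken from the Troubleshooting table in this guide, while the POST method, the exact body shape, and the helper names are not confirmed by the source and should be checked against the API reference.

```python
import json
import urllib.request
from typing import Iterator

# Path as it appears in the Troubleshooting examples; host and port are placeholders.
API_URL = "http://localhost:8080/ea-consent-delta-writer/api/v1/trust-block-batches"
MAX_BATCH = 100  # documented limit: 1-100 blocks per request


def split_into_batches(trust_blocks: list[str], size: int = MAX_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` Trust Block JWTs."""
    for start in range(0, len(trust_blocks), size):
        yield trust_blocks[start:start + size]


def build_request(trust_blocks: list[str], token: str, url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) an authenticated batch request."""
    if not 1 <= len(trust_blocks) <= MAX_BATCH:
        raise ValueError("a batch must contain between 1 and 100 trust blocks")
    if any(not isinstance(tb, str) or not tb for tb in trust_blocks):
        raise ValueError("each trust block must be a non-empty string")
    body = json.dumps({"trust_blocks": trust_blocks}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        url,
        data=body,
        method="POST",  # assumed HTTP method
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen(...)` would send it; a missing or mismatched token should then surface as a 401 response.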
Additional Gateway Security (Optional):
- API Gateway can provide additional layers: mTLS, IP allowlisting, OAuth2, rate limiting
- TLS/HTTPS termination should be handled by reverse proxy or API gateway
- Monitoring port (8081) should be accessible only to internal monitoring systems
S3 encryption:
- Use HTTPS for all S3 operations (enforced by boto3 default)
- Enable S3 bucket encryption at rest (AES-256 or KMS)
Security Features Summary
The following table shows current security capabilities:
Security Checklist for Operators
Before deploying to production, ensure these security controls are in place:
- Deploy inside private subnet or behind API gateway (no public internet access)
- Configure CLIENT_BEARER_TOKEN environment variable with strong secret token
- Enable HTTPS/TLS for all endpoints (via gateway or load balancer)
- Configure security groups/firewalls to restrict ingress to trusted IP ranges
- Use IAM roles instead of static AWS credentials
- Rotate CLIENT_BEARER_TOKEN and AWS credentials every 90 days
- Enable S3 bucket encryption at rest (AES-256 or KMS)
- Forward logs to centralized SIEM or security monitoring system
- Set up alerts for anomalous traffic patterns or failure spikes
- Implement rate limiting at API gateway or load balancer level
- Document and test incident response procedures
- Review and approve all configuration changes before deployment
- Implement network egress controls (restrict outbound to AWS S3 only)
7. Logs & Metrics
Log Output
Where logs are written:
- Container stdout/stderr (standard Docker logging)
- JSON structured logging format
- No log files inside container (follows 12-factor app principles)
Log types:
- Startup logs: Configuration loading, S3 validation, port binding
- API access logs: HTTP requests with method, path, status, duration
- Processing logs: Trust Block decoding, transformation, Delta table writes
- Error logs: Validation failures, S3 errors, processing errors
- Debug logs: Verbose AWS/Spark logging (when debug=true or aws_debug=true)
Log format:
{
  "timestamp": "2024-11-23T10:30:45.123Z",
  "level": "INFO",
  "message": "Trust block processed successfully",
  "correlation_id": "abc123-def456",
  "trust_block_id": "tb_789",
  "ea_consent_id": "consent_012"
}
Correlation IDs: Each API request includes a correlation ID for tracing requests across log entries (via asgi-correlation-id middleware).
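Because each log entry is a JSON object, tracing one request is a matter of filtering on correlation_id. The sketch below assumes one JSON object per line, as in the sample entry above, and skips any line that is not valid JSON; the function name `filter_by_correlation_id` and the second sample entry are illustrative.

```python
import json
from typing import Iterable


def filter_by_correlation_id(lines: Iterable[str], correlation_id: str) -> list[dict]:
    """Return parsed log records whose correlation_id matches, in original order."""
    matches = []
    for line in lines:
        try:
            record = json.loads(line)
        except ValueError:
            continue  # tolerate non-JSON output such as tracebacks or blank lines
        if isinstance(record, dict) and record.get("correlation_id") == correlation_id:
            matches.append(record)
    return matches


sample_lines = [
    '{"level": "INFO", "message": "Trust block processed successfully", "correlation_id": "abc123-def456"}',
    "not a json line",
    '{"level": "ERROR", "message": "example failure", "correlation_id": "zzz999"}',
]
for record in filter_by_correlation_id(sample_lines, "abc123-def456"):
    print(record["level"], record["message"])
```

In practice the input lines would come from `docker logs ea-consent-delta-writer 2>&1` piped into a script that reads standard input.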
Accessing Logs
View container logs:
# Follow logs in real-time
docker logs -f ea-consent-delta-writer
# View recent logs (last 100 lines)
docker logs --tail=100 ea-consent-delta-writer
# View logs with timestamps
docker logs -t ea-consent-delta-writer
# Search logs for errors
docker logs ea-consent-delta-writer 2>&1 | grep -i error
# Search logs for specific correlation ID
docker logs ea-consent-delta-writer 2>&1 | grep "abc123-def456"
Log Verbosity Control
Control log verbosity via configuration file:
# Minimal logging (production default)
debug: false
aws_debug: false
spark_debug: false
# Verbose logging (debugging/troubleshooting)
debug: true
aws_debug: true
spark_debug: true
Log levels:
- debug: false - INFO level and above (recommended for production)
- debug: true - DEBUG level (verbose, use for troubleshooting only)
- aws_debug: true - AWS SDK debug logging (very verbose, use for S3 connectivity issues)
- spark_debug: true - Spark/Delta/py4j debug logging (very verbose, use for Delta table issues)
Warning: Enabling debug modes significantly increases log volume and may impact performance. Use only for troubleshooting.
Metrics Endpoints
Status endpoint: GET /ea-consent-delta-writer/status (port 8081)
Returns comprehensive metrics in JSON format (see Health Monitoring for full response schema).
Key metrics to track:
- Request metrics:
  - total_requests - Total API requests received
  - successful_requests - Requests where all blocks succeeded
  - partial_requests - Requests with some successes and some failures
  - failed_requests - Requests that completely failed
- Trust Block metrics:
  - total_trust_blocks_submitted - Total blocks submitted across all requests
  - total_trust_blocks_succeeded - Blocks successfully processed
  - total_trust_blocks_failed - Blocks that failed processing
- System metrics:
  - memory_usage_mb - Current process memory usage
  - uptime_seconds - Seconds since service started
  - version - Service version
Log Aggregation and Retention
The service outputs JSON logs to stdout, making it easy to integrate with log aggregation platforms such as CloudWatch, Splunk, or Datadog.
8. Troubleshooting
This section covers common operational issues, error scenarios, and resolution steps.
Common Issues
| Symptom | Likely Cause | Resolution |
|---|---|---|
| Container exits immediately after startup | Missing or invalid configuration file | Verify CONFIG_PATH points to valid YAML file. Check volume mount: docker inspect ea-consent-delta-writer and verify config file exists on host. Validate YAML syntax: python -c "import yaml; yaml.safe_load(open('config.yaml'))" |
| /health endpoint returns 500 or connection refused | Service failed to initialize or dependencies unavailable | Check container logs: docker logs ea-consent-delta-writer. Look for S3 connectivity errors, configuration validation errors, or Python exceptions. Verify all required environment variables are set (CONFIG_PATH, CLIENT_BEARER_TOKEN, AWS credentials). |
| API calls return 401 Unauthorized | Missing or invalid bearer token | Ensure Authorization: Bearer <token> header is included in request. Verify token matches CLIENT_BEARER_TOKEN environment variable set at container startup. Check token for typos or extra whitespace. Test: curl -H "Authorization: Bearer your-token" http://localhost:8080/ea-consent-delta-writer/api/v1/trust-block-batches. |
| S3 connectivity errors on startup | Invalid AWS credentials, incorrect bucket name, or network issues | Verify AWS credentials are set correctly. Check IAM permissions include s3:HeadBucket, s3:PutObject, s3:ListBucket. Verify storage.uri_base bucket exists and region matches storage.settings.region. Test S3 access from host: aws s3 ls s3://your-bucket-name/. |
| Environment validation error on startup | environment value not found in storage.uri_base | Ensure environment field in config matches a segment in storage.uri_base. Example: if environment: "production", then storage.uri_base must contain "production" after s3:// (e.g., s3://bucket/production/path). |
| API calls return 422 Unprocessable Entity | Invalid request payload or missing required fields | Ensure Content-Type: application/json header is set. Verify JSON syntax (no trailing commas). Ensure trust_blocks array contains 1-100 items. Verify each item is a non-empty string. Review error response for specific field validation failures. |
| Partial success responses (status: "partial_success") | Some Trust Blocks failed validation or processing | This is expected behavior (best-effort processing). Review results array in response for per-block status and error messages. Common causes: invalid JWT format, missing required fields in JWT payload, Pydantic validation failures. Fix problematic JWTs and resubmit. |
| High memory usage (memory_usage_mb increasing) | Large batch sizes, debug mode enabled, or memory leak | Reduce batch size to 50-100 blocks per request. Set debug: false in config (debug responses significantly increase memory). Restart container and monitor memory over time. Report issue if memory continues growing. |
| S3 write failures (errors in logs) | Insufficient IAM permissions, incorrect bucket config, or network issues | Verify IAM policy includes s3:PutObject, s3:ListBucket. Check storage.uri_base in config matches actual bucket name and region. Verify VPC/security group allows outbound HTTPS to S3. Check S3 bucket versioning/lifecycle policies for conflicts. Enable AWS debug logging: aws_debug: true in config. |
| Health check shows “unhealthy” status | Service not fully started, port conflicts, or application crash | Wait 40 seconds for startup period. Check logs for startup errors. Verify ports 8080 and 8081 are not in use by other processes: netstat -tuln \| grep 808. Verify monitoring port 8081 is exposed in Docker run command. |
| Zero requests in statistics (total_requests: 0) | No traffic reaching container, network isolation, or wrong endpoint URL | Verify network connectivity to port 8080. Check firewall/security group rules allow inbound traffic. Verify clients are using correct endpoint URL. Test connectivity: curl http://container-ip:8080/ea-consent-delta-writer/api/v1/trust-block-batches. |
| Logs not appearing or empty | Container crashed at startup, wrong container name, or logging driver issue | List running containers: docker ps -a (includes stopped). Check container status and exit code. Try running in foreground to see errors: docker run without -d flag. Verify correct container name in logs command. |