Security & Compliance

This page documents the security controls implemented in the system, including secrets management, API authentication, access logging, TLS configuration, and CI/CD security scanning.


Overview of Security Controls

| Control | Implementation |
|---|---|
| Secrets management | Docker Secrets (file-based, mounted at /run/secrets/) |
| API authentication | JWT Bearer tokens (HS256, 15-minute expiry) |
| Access logging | Structured JSON logs with anonymised client IPs |
| Transport security | TLS 1.3 via nginx (external termination) |
| Container scanning | Trivy on every push (CRITICAL/HIGH fails CI) |
| Dependency auditing | pip-audit on every push |


1. Docker Secrets Management

In production, the JWT signing key and the Grafana admin password are never stored in environment variables or committed to version control. Both are provisioned via Docker Secrets:

| Secret Name | File Path (container) | Used By |
|---|---|---|
| jwt_secret | /run/secrets/jwt_secret | ray-serve (API token signing) |
| grafana_admin_password | /run/secrets/grafana_admin_password | grafana (admin login) |

Generating secrets

./generate_secrets.sh
chmod 644 ./secrets/*

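The secrets/ directory is listed in .gitignore. Mode 644 leaves the secret files readable by every local user on the host, so restrict access to the host itself, or tighten the mode if the container user can still read the files.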

How secrets are loaded in code

The api/auth.py module reads secrets with a Docker Secrets → environment variable → hardcoded-default cascade:

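import os
from pathlib import Path

def _load_secret(name: str, env_fallback: str, default: str = "") -> str:
    # Prefer the Docker Secret file mounted into the container.
    secret_path = Path(f"/run/secrets/{name}")
    if secret_path.exists():
        return secret_path.read_text().strip()
    # Otherwise fall back to the environment variable, then the default.
    return os.environ.get(env_fallback, default)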

In production (Docker), /run/secrets/jwt_secret is always present. In development (local), set the JWT_SECRET environment variable.

Warning

The default development fallback key ("dev-only-insecure-key") must never be used in production. Always run generate_secrets.sh before deploying.
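One way to enforce this warning is a startup check that refuses to run with the development key. The sketch below is an assumption, not code taken from api/auth.py: the helper name `assert_production_secret` and the 32-character minimum are hypothetical.

```python
DEV_FALLBACK_KEY = "dev-only-insecure-key"  # default key named in the warning above


def assert_production_secret(secret: str, min_length: int = 32) -> None:
    """Raise if the signing key is empty, the dev fallback, or too short.

    Hypothetical helper; min_length is an illustrative threshold, not a
    value taken from the project.
    """
    if not secret or secret == DEV_FALLBACK_KEY:
        raise RuntimeError("JWT secret is missing or set to the development default")
    if len(secret) < min_length:
        raise RuntimeError(f"JWT secret is shorter than {min_length} characters")
```

Calling such a check once at application startup, only when the service runs in production mode, would turn a misconfiguration into an immediate failure rather than a silently weak signing key.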


2. JWT Bearer Token Authentication

All API endpoints except /health, /ready, /metrics, and the token endpoint /auth/token itself are protected by JWT Bearer token authentication.

Token lifecycle

  1. Client calls POST /auth/token with username and password.

  2. The API validates credentials against the configured hash.

  3. A signed HS256 JWT is returned with a 15-minute expiry (exp claim).

  4. The client includes the token in Authorization: Bearer <token> headers.

  5. The get_current_user FastAPI dependency validates the signature and expiry on every protected request.
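Step 2 of the lifecycle can be sketched as follows, assuming the configured hash is the SHA-256 hex digest described for API_PASSWORD_HASH. The function names `hash_password` and `verify_credentials` are hypothetical and may not match the actual code.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the SHA-256 hex digest of a password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify_credentials(username: str, password: str,
                       expected_username: str, expected_hash: str) -> bool:
    """Check a login attempt against the configured username and hash.

    Both comparisons use hmac.compare_digest so that the time taken does
    not depend on how many leading characters match.
    """
    user_ok = hmac.compare_digest(
        username.encode("utf-8"), expected_username.encode("utf-8")
    )
    hash_ok = hmac.compare_digest(
        hash_password(password).encode("ascii"),
        expected_hash.lower().encode("utf-8"),
    )
    return user_ok and hash_ok
```

The same `hash_password` helper could be used to produce the value stored in API_PASSWORD_HASH when the default password is replaced.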

Token structure

Tokens are standard JWT with these payload fields:

  • sub — username

  • iat — issued-at timestamp

  • exp — expiry timestamp (iat + 900 seconds by default)

Implementation details

The JWT implementation uses only the Python standard library (hmac, hashlib), so no external JWT library is required. Signature verification uses hmac.compare_digest to prevent timing attacks.
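A minimal sketch of what a standard-library HS256 implementation can look like is shown below. It is an illustration under assumptions, not the project's code: `create_token`, `verify_token`, and the `now` parameter (added to make the example deterministic) are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe Base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that was stripped during encoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("utf-8"),
                      hashlib.sha256).digest()
    return _b64url(digest)


def create_token(username: str, secret: str, expiry_seconds: int = 900,
                 now: Optional[int] = None) -> str:
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": username, "iat": issued_at, "exp": issued_at + expiry_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token: str, secret: str, now: Optional[int] = None) -> dict:
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    # Check the signature first, in constant time, before trusting any content.
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected.encode("ascii"), signature_b64.encode("utf-8")):
        raise ValueError("invalid signature")
    header = json.loads(_b64url_decode(header_b64))
    payload = json.loads(_b64url_decode(payload_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    current = int(time.time()) if now is None else now
    if current >= int(payload.get("exp", 0)):
        raise ValueError("token expired")
    return payload
```

Verifying the signature before decoding the payload means that tampered tokens are rejected without their contents ever being parsed.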

Disabling authentication for development

Set AUTH_ENABLED=false in the ray-serve environment:

environment:
  AUTH_ENABLED: "false"

Warning

This disables all authentication checks and returns "anonymous" for all requests. Never use this in production.


3. Access Logging

Every HTTP request is logged as a structured JSON record by the AccessLogMiddleware (api/middleware.py).

Log fields

| Field | Description |
|---|---|
| request_id | UUID4, unique per request; also returned in X-Request-ID header |
| timestamp | ISO 8601 UTC timestamp |
| method | HTTP method (GET, POST, etc.) |
| path | URL path |
| query | Query string (omitted if empty) |
| status | HTTP response status code |
| duration_ms | Request processing time in milliseconds |
| client_ip | Anonymised IPv4 address (last octet replaced with xxx) |
| user_agent | Truncated to 120 characters |

Example log entry

{
  "request_id": "a1b2c3d4-...",
  "timestamp": "2024-04-28T10:00:00.123+00:00",
  "method": "POST",
  "path": "/upload",
  "status": 202,
  "duration_ms": 45.2,
  "client_ip": "192.168.1.xxx",
  "user_agent": "Mozilla/5.0 ..."
}
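A record with these fields could be assembled roughly as follows. This is a sketch, not the AccessLogMiddleware code: the function name `build_access_log_record` and its parameters are hypothetical, and the client IP is assumed to be anonymised before it is passed in.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_access_log_record(method: str, path: str, query: str, status: int,
                            duration_ms: float, client_ip: str, user_agent: str,
                            request_id: Optional[str] = None) -> dict:
    """Build one access-log record using the fields from the table above."""
    record = {
        "request_id": request_id or str(uuid.uuid4()),
        # ISO 8601 in UTC with millisecond precision, e.g. ...T10:00:00.123+00:00
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "method": method,
        "path": path,
    }
    if query:  # the query field is omitted when the query string is empty
        record["query"] = query
    record.update(
        status=status,
        duration_ms=round(duration_ms, 1),
        client_ip=client_ip,         # assumed to be anonymised already
        user_agent=user_agent[:120]  # truncated to 120 characters
    )
    return record


if __name__ == "__main__":
    print(json.dumps(build_access_log_record(
        "POST", "/upload", "", 202, 45.2, "192.168.1.xxx", "Mozilla/5.0")))
```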

IP anonymisation

For IPv4, the last octet is replaced with xxx. For IPv6, the final segment is replaced. The remaining network prefix stays available for security audits, while full client addresses are never written to the logs.
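The rule can be sketched with the standard ipaddress module. The function name `anonymise_ip` is hypothetical, and two details are assumptions not stated above: IPv6 addresses are expanded to their full eight-group form and use the same xxx placeholder, and unparseable input is logged as "unknown".

```python
import ipaddress


def anonymise_ip(raw_ip: str) -> str:
    """Mask the last octet (IPv4) or last group (IPv6) of a client address."""
    try:
        ip = ipaddress.ip_address(raw_ip)
    except ValueError:
        return "unknown"  # assumption: never log a value that failed to parse
    if ip.version == 4:
        head, _, _ = str(ip).rpartition(".")
        return f"{head}.xxx"
    # The exploded form always has eight groups, so "::" shorthand cannot
    # hide the segment that needs to be masked.
    head, _, _ = ip.exploded.rpartition(":")
    return f"{head}:xxx"


print(anonymise_ip("192.168.1.42"))  # 192.168.1.xxx
```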


4. TLS 1.3 (Transport Security)

External traffic is protected by TLS 1.3 via nginx:

  • Port 80 redirects to HTTPS (except /health for Docker healthchecks).

  • Port 443 serves the frontend with ssl_protocols TLSv1.3.

  • Certificates are mounted from ./certs/ as a read-only Docker volume.
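A client can confirm the TLS 1.3 requirement by refusing to negotiate anything older. The sketch below uses Python's ssl module; the function name `make_tls13_client_context` is hypothetical, and the optional `ca_file` argument is an assumption for trusting a self-signed development certificate such as certs/server.crt.

```python
import ssl
from typing import Optional


def make_tls13_client_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a client context that only accepts TLS 1.3 connections.

    ca_file may point at a PEM certificate to trust in addition to the
    system store, for example a self-signed development certificate.
    """
    context = ssl.create_default_context(cafile=ca_file)
    # Hostname checking and certificate verification stay enabled (the
    # defaults); only the protocol floor is raised.
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context
```

A socket wrapped with this context fails the handshake against a server that offers only TLS 1.2 or older; on success, the wrapped socket's `version()` method reports "TLSv1.3".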

Generating self-signed certificates (development)

./generate-certs.sh

For production, replace certs/server.crt and certs/server.key with certificates from a trusted Certificate Authority (e.g., Let’s Encrypt).

The certs/ directory is listed in .gitignore.


5. CI/CD Security Scanning

The GitHub Actions workflow .github/workflows/security.yml runs on every push to main or develop and on every pull request.

Trivy container image scanning

Three Docker images are scanned:

  • ray-serve (api/Dockerfile)

  • scene3d-ui (frontend/Dockerfile)

  • ml-pipeline-builder (Dockerfile.pipeline)

The scan fails the CI pipeline on any CRITICAL or HIGH severity vulnerability that has a fix available. SARIF reports are uploaded to the GitHub Security tab.
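The pass/fail rule can be illustrated with a small filter over a Trivy JSON report. This is not how the workflow is implemented; it only restates the rule in code. The field names (`Results`, `Vulnerabilities`, `Severity`, `FixedVersion`, `VulnerabilityID`) follow Trivy's JSON report layout as assumed here, and the function name `blocking_findings` is hypothetical.

```python
from typing import List

BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}


def blocking_findings(report: dict) -> List[str]:
    """Return IDs of vulnerabilities that should fail the pipeline.

    A finding blocks only if it is CRITICAL or HIGH and a fixed version
    is available. Field names are assumed from Trivy's JSON output.
    """
    findings = []
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in BLOCKING_SEVERITIES and vuln.get("FixedVersion"):
                findings.append(vuln.get("VulnerabilityID", "unknown"))
    return findings
```

A CI step built on this idea would exit with a non-zero status whenever the returned list is non-empty.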

pip-audit dependency auditing

Python dependencies from both pyproject.toml (root pipeline) and api/pyproject.toml (API container) are audited for known CVEs. Reports are uploaded as workflow artifacts.


Protected API Endpoints

The following endpoints require a valid JWT token:

/upload, /drift, /drift/trigger-retrain, /status/{job_id}, /jobs/{job_id}, /download/*, /clusters/*, /jobs/{job_id}/insights

The following endpoints are unauthenticated (health and metrics endpoints for infrastructure use, plus the token endpoint itself):

/health, /ready, /metrics, /auth/token
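The split between the two lists amounts to a deny-by-default rule: any path not explicitly listed as public requires a token. The sketch below illustrates that rule only; it does not reflect how the FastAPI dependencies are actually wired, and `requires_token` is a hypothetical name.

```python
# Paths taken from the unauthenticated list above.
PUBLIC_PATHS = frozenset({"/health", "/ready", "/metrics", "/auth/token"})


def requires_token(path: str) -> bool:
    """Return True unless the path is one of the explicitly public endpoints."""
    normalised = path.rstrip("/") or "/"
    return normalised not in PUBLIC_PATHS
```

With an allowlist of public paths, a newly added endpoint is protected unless someone deliberately marks it as public.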


Alertmanager Authentication

The Alertmanager webhook that triggers Airflow retraining DAGs uses HTTP Basic Auth within the mlops_net Docker network:

basic_auth:
  username: 'airflow'
  password: 'airflow'

For external access, this should be replaced with a scoped service account token and the Airflow API should not be exposed outside the Docker network.
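The reason for keeping this webhook inside the Docker network is that HTTP Basic Auth only Base64-encodes the credentials; it does not encrypt them. The sketch below shows how the header is formed; `basic_auth_header` is a hypothetical helper name.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header used by HTTP Basic Auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


header = basic_auth_header("airflow", "airflow")
# Anyone who can read the header can recover the credentials:
recovered = base64.b64decode(header["Authorization"].split(" ", 1)[1]).decode("utf-8")
```

Here `recovered` is the original "username:password" string, which is why the header must never travel over an untrusted or unencrypted network.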


Environment Variables

The following environment variables affect security behaviour:

| Variable | Default | Description |
|---|---|---|
| AUTH_ENABLED | true | Set to false to disable JWT auth (dev only) |
| JWT_EXPIRY_SECONDS | 900 | Token lifetime in seconds |
| JWT_SECRET | (from Docker Secret) | HMAC signing key fallback for local dev |
| API_USERNAME | admin | API login username |
| API_PASSWORD_HASH | SHA-256 of “admin” | SHA-256 hex digest of the API password |
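Reading these variables with their documented defaults might look like the sketch below. The `SecuritySettings` class and `load_security_settings` function are hypothetical names, and treating every AUTH_ENABLED value other than "false" as enabled is an assumption chosen so that a typo cannot switch authentication off. JWT_SECRET is left out because it is resolved through the Docker Secrets cascade described earlier.

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SecuritySettings:
    auth_enabled: bool
    jwt_expiry_seconds: int
    api_username: str
    api_password_hash: str


def load_security_settings(env: Mapping[str, str] = os.environ) -> SecuritySettings:
    """Read the security-related variables, applying the documented defaults."""
    return SecuritySettings(
        # Assumption: only the literal value "false" disables authentication.
        auth_enabled=env.get("AUTH_ENABLED", "true").strip().lower() != "false",
        jwt_expiry_seconds=int(env.get("JWT_EXPIRY_SECONDS", "900")),
        api_username=env.get("API_USERNAME", "admin"),
        # Documented default: the SHA-256 hex digest of "admin".
        api_password_hash=env.get(
            "API_PASSWORD_HASH", hashlib.sha256(b"admin").hexdigest()
        ),
    )
```

Passing a plain dictionary instead of os.environ makes the loader easy to exercise in tests.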


Security Recommendations for Production

  • Replace the default admin/admin credentials with a strong password and store its SHA-256 hash in a Docker Secret (api_password_hash).

  • Use certificates from a trusted CA instead of self-signed certificates.

  • Restrict mlops_net to internal-only traffic; never expose Airflow, MLflow, or Prometheus ports to the public internet.

  • Disable anonymous access to Grafana (set GF_AUTH_ANONYMOUS_ENABLED=false) so that every user must log in.

  • Rotate jwt_secret and grafana_admin_password regularly.

  • After every push, review the Trivy findings in the GitHub Security tab and the pip-audit reports in the workflow artifacts.