API Reference

The API Gateway is a FastAPI application served via Ray Serve on port 8000. It provides endpoints for authentication, job management, inference, drift monitoring, and system health.

Base URL: http://localhost:8000


Authentication

Most endpoints require a JWT Bearer token. Obtain one via POST /auth/token and pass it in the Authorization: Bearer <token> header.

Tokens expire after 15 minutes by default (configurable via the JWT_EXPIRY_SECONDS environment variable). Requests with expired or missing tokens receive HTTP 401.
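Because tokens are short-lived, a client may want to track the expiry locally and refresh shortly before it. The following is a minimal sketch, assuming the expires_in value returned by POST /auth/token is the token lifetime in seconds; the Token class and the 30-second safety margin are illustrative choices, not part of the API.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    access_token: str
    expires_at: float  # absolute Unix time after which the token is rejected

    @classmethod
    def from_response(cls, payload: dict, now: Optional[float] = None) -> "Token":
        # `expires_in` is assumed to be the lifetime in seconds (900 by default).
        issued = time.time() if now is None else now
        return cls(payload["access_token"], issued + payload["expires_in"])

    def needs_refresh(self, now: Optional[float] = None, margin: float = 30.0) -> bool:
        # Refresh slightly early so an in-flight request does not hit HTTP 401.
        current = time.time() if now is None else now
        return current >= self.expires_at - margin

    def auth_header(self) -> dict:
        # Header expected by every authenticated endpoint.
        return {"Authorization": f"Bearer {self.access_token}"}
```

A caller would check needs_refresh() before each request and call POST /auth/token again when it returns True.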

The following endpoints are unauthenticated (infrastructure probes):

  • GET /health

  • GET /ready

  • GET /metrics


Endpoint Reference

POST /auth/token

Obtain a JWT access token.

Request Body (JSON)

{
  "username": "admin",
  "password": "admin"
}

Response 200

{
  "access_token": "eyJhbGciOiJIUzI1NiIs...",
  "token_type": "bearer",
  "expires_in": 900
}

Response 401

{ "detail": "Invalid credentials" }

Example

curl -X POST http://localhost:8000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "admin"}'
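The same request can be issued from Python using only the standard library. This is a sketch rather than an official client: the helper names build_token_request and fetch_token are hypothetical, and the base URL is the local default given at the top of this page.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local default; adjust for your deployment


def build_token_request(username: str, password: str) -> urllib.request.Request:
    # JSON body and Content-Type mirror the curl example above.
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/auth/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_token(username: str, password: str) -> str:
    # urlopen raises urllib.error.HTTPError for a 401 (invalid credentials).
    with urllib.request.urlopen(build_token_request(username, password)) as resp:
        return json.load(resp)["access_token"]
```

Splitting request construction from sending keeps the first function easy to inspect without a running server.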

GET /health

Basic liveness probe. No authentication required.

Response 200

{
  "status": "ok",
  "version": "2.0.0",
  "timestamp": 1714300000.123
}

GET /ready

Readiness probe that pings the GPU worker. No authentication required.

Response 200

{
  "status": "ready",
  "device": "NVIDIA A100 (40.0 GB)"
}

Response 503 — returned when the GPU worker has not finished loading model weights.
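A deployment script can wait for the 503 to clear before submitting work. The sketch below polls a probe until it reports 200; the function names, timeout, and interval are assumptions chosen for illustration, and http_probe returns 0 when the gateway cannot be reached at all.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str = "http://localhost:8000/ready") -> int:
    """Return the HTTP status of GET /ready, or 0 if the gateway is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:  # e.g. 503 while weights are loading
        return exc.code
    except urllib.error.URLError:
        return 0


def wait_until_ready(
    probe: Callable[[], int] = http_probe,
    timeout: float = 120.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until `probe` returns 200; return False once `timeout` seconds of waiting have passed."""
    waited = 0.0
    while True:
        if probe() == 200:
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```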


GET /metrics

Prometheus metrics endpoint (text/plain). No authentication required. Scraped automatically by Prometheus every 10 seconds.

Key metrics exposed:

Metric Name                 Description
--------------------------  ------------------------------------------------------------
api_requests_total          Total HTTP requests labelled by method, endpoint, and status
api_errors_total            Total 4xx/5xx responses labelled by endpoint
inference_latency_seconds   Histogram of end-to-end reconstruction wall-clock time
registered_images_ratio     Fraction of images placed in the last reconstruction
active_jobs_total           Number of currently running reconstruction jobs
model_server_ready          1 if the GPU worker is ready, 0 otherwise
data_valid_images_total     Number of valid images in the current dataset
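For ad-hoc checks outside Prometheus, the text output can be read with a few lines of Python. This is a simplified sketch, assuming each sample line ends with its value and carries no trailing timestamp; it is not a full parser for the Prometheus exposition format, and the function name parse_metrics is hypothetical.

```python
def parse_metrics(text: str) -> dict:
    """Map each sample (metric name plus optional {labels}) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: the value is the last space-separated token on the line.
        name, _, value = line.rpartition(" ")
        samples[name.strip()] = float(value)
    return samples
```

With the body of GET /metrics in a string named text, parse_metrics(text).get("model_server_ready") gives the readiness gauge, or None if the metric is absent.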


POST /upload

Upload a ZIP archive and start a reconstruction job. Auth required.

Request (multipart/form-data)

Field         Type    Description
------------  ------  ---------------------------------------------------------------------------
file          File    ZIP archive containing images (.jpg, .jpeg, .png, .tif, .tiff, .bmp, .webp)
dataset_name  string  Logical dataset name (default: "custom")
scene_name    string  Logical scene name (default: "scene_01")

Response 202

{
  "job_id": "3f7a91b2-1234-5678-abcd-ef0123456789",
  "message": "Pipeline started."
}

Response 400 — invalid or non-ZIP file.

Response 413 — upload exceeds the configured size limit.

Example

curl -X POST http://localhost:8000/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@photos.zip"
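To avoid a round trip that ends in a 400, a client can inspect the archive before uploading it. The sketch below lists the members whose extension is in the accepted set from the table above; matching extensions case-insensitively is an assumption about the server's behaviour, and list_supported_images is a hypothetical helper name.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Extensions listed for the `file` field of POST /upload.
SUPPORTED = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp", ".webp"}


def list_supported_images(zip_bytes: bytes) -> list:
    """Return archive members with an accepted image extension.

    Raises zipfile.BadZipFile if the bytes are not a ZIP archive.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if not info.is_dir()
            and PurePosixPath(info.filename).suffix.lower() in SUPPORTED
        ]
```

An empty result or a BadZipFile exception suggests the upload is likely to be rejected.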

GET /status/{job_id} | GET /jobs/{job_id}

Check the status of a reconstruction job. Both paths return identical responses. Auth required.

Path Parameters

  • job_id — UUID returned from POST /upload

Response 200

{
  "job_id": "3f7a91b2-...",
  "stage": "matching",
  "status": "matching",
  "progress": 30,
  "message": "Running MASt3R feature matching on GPU …",
  "created_at": 1714300000.0,
  "started_at": 1714300005.0,
  "finished_at": null,
  "n_images": 45,
  "n_points": 0,
  "registration_rate": null,
  "error": null,
  "download_url": null,
  "has_drift": false,
  "drift_severity": "low"
}

Stage values and their progress percentages:

Stage          Progress  Meaning
-------------  --------  -------------------------------------------
queued         0%        Waiting for the pipeline semaphore
extracting     10%       Unpacking the ZIP archive
matching       30%       Running MASt3R + ALIKED + SuperPoint on GPU
triangulating  70%       COLMAP incremental SfM in progress
decimating     85%       Voxel downsampling of point cloud
success        100%      Reconstruction complete
failed         0%        Pipeline error; see the error field

Response 404 — job ID not found.
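Clients typically poll this endpoint until the job reaches a terminal stage. The following sketch treats success and failed as terminal, as listed in the stage values above; get_status stands in for whatever function performs the authenticated GET and returns the decoded JSON, and the polling interval and limit are illustrative defaults.

```python
import time
from typing import Callable

TERMINAL_STAGES = {"success", "failed"}


def wait_for_job(
    get_status: Callable[[], dict],
    interval: float = 5.0,
    max_polls: int = 720,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job's stage is terminal and return the final status payload."""
    for _ in range(max_polls):
        status = get_status()
        if status["stage"] in TERMINAL_STAGES:
            return status
        sleep(interval)
    raise TimeoutError("job did not reach a terminal stage")
```

On return, the caller should inspect the stage: for failed, the error field carries the reason.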


GET /download/jobs/{job_id}

Download all PLY files for a completed job as a ZIP archive. Auth required.

Response 200 (application/zip): archive containing one or more .ply files.

Response 409 — job is not yet complete.

Response 404 — PLY file not available (reconstruction produced no 3D points).


GET /download/jobs/{job_id}/csv

Download the raw submission CSV in IMC2025 format. Auth required.

Response 200 (text/csv)

Columns: dataset, scene, image, rotation_matrix, translation_vector. For images that could not be registered, the rotation_matrix and translation_vector fields contain semicolon-separated nan values.
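The CSV can be loaded with the standard csv module. This is a minimal sketch, assuming the file starts with a header row naming the columns above and that both pose fields are semicolon-separated numbers; parse_poses and the keys of the returned dictionaries are hypothetical.

```python
import csv
import io
import math


def parse_poses(csv_text: str) -> list:
    """Parse the submission CSV into one dict per image."""
    poses = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rotation = [float(v) for v in row["rotation_matrix"].split(";")]
        translation = [float(v) for v in row["translation_vector"].split(";")]
        # An image counts as unregistered if any pose value is nan.
        registered = not any(math.isnan(v) for v in rotation + translation)
        poses.append(
            {
                "image": row["image"],
                "rotation": rotation,
                "translation": translation,
                "registered": registered,
            }
        )
    return poses
```

The share of rows with registered set to True gives a quick client-side estimate of the registration rate.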


GET /download/jobs/{job_id}/{filename}

Download a single named PLY file from a completed job. Auth required.

  • filename — e.g. cluster0_decimated_model0_3f7a91b2.ply

Response 200 (application/octet-stream)


GET /clusters/{job_id}

Retrieve per-cluster reconstruction statistics. Auth required.

Response 200

{
  "clusters": [
    {
      "id": 0,
      "name": "cluster0_model0",
      "num_points3D": 124532,
      "filename": "cluster0_decimated_model0_3f7a91b2.ply"
    }
  ]
}
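The filename field pairs naturally with GET /download/jobs/{job_id}/{filename}. As an illustration, the sketch below picks the cluster with the most 3D points and builds its download URL; largest_cluster_url is a hypothetical helper, and choosing by num_points3D is only one possible selection rule.

```python
from typing import Optional
from urllib.parse import quote


def largest_cluster_url(
    job_id: str, payload: dict, base_url: str = "http://localhost:8000"
) -> Optional[str]:
    """Return the download URL of the densest cluster, or None if there are none."""
    clusters = payload.get("clusters", [])
    if not clusters:
        return None
    best = max(clusters, key=lambda c: c["num_points3D"])
    # quote() percent-encodes any characters that are unsafe in a URL path.
    return f"{base_url}/download/jobs/{quote(job_id)}/{quote(best['filename'])}"
```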

GET /jobs/{job_id}/insights

Retrieve consolidated reconstruction and drift insights. Auth required.

Response 200

{
  "registration_rate": 0.9333,
  "n_points": 124532,
  "has_drift": false,
  "drift_severity": "low",
  "drift_report": {
    "drift_detected": false,
    "severity": "low",
    "checks": {}
  },
  "recommendation": "No action needed."
}

The recommendation field provides a plain-language action suggestion based on drift severity.


POST /drift

Check a ZIP archive for data drift without starting a reconstruction. Auth required.

Request (multipart/form-data)

  • file — ZIP archive of images

Response 200 — drift report JSON with per-feature drift flags and severity.


POST /drift/trigger-retrain

Manually trigger the Airflow experiment_pipeline_dag retraining DAG. Auth required.

Response 200

{ "status": "triggered" }

Response 502 — Airflow API is unreachable.


Error Responses

All error responses follow FastAPI’s standard format:

{
  "detail": "Human-readable error message"
}

Common HTTP status codes:

Code  Meaning
----  ------------------------------------------------------
400   Bad request (invalid file, malformed input)
401   Missing or expired JWT token
404   Resource not found (job ID, PLY file)
409   Conflict (e.g., download requested before job is done)
413   Upload too large
500   Internal server error in the pipeline
502   Upstream service (Airflow) unreachable
503   GPU worker not ready
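A client can use these codes to decide how to react. The sketch below is one possible policy, not something the API prescribes: it treats 502 and 503 as worth retrying, 401 as a signal to request a new token, and extracts the detail field for logging. All function names are hypothetical.

```python
import json

# Assumption: upstream or GPU unavailability is transient and worth retrying.
RETRYABLE = {502, 503}


def describe_error(status_code: int, body: bytes) -> str:
    """Build a log-friendly message from an error response."""
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, AttributeError):
        # Body was not a JSON object; fall back to the raw text.
        detail = body.decode("utf-8", errors="replace")
    return f"HTTP {status_code}: {detail}"


def should_retry(status_code: int) -> bool:
    return status_code in RETRYABLE


def should_reauthenticate(status_code: int) -> bool:
    # 401 means the token is missing or expired.
    return status_code == 401
```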