Installation Guide

This guide covers two installation paths: the recommended Docker-based setup and the native Python setup for development.


Prerequisites

  • Docker >= 24.0 and Docker Compose >= 2.20

  • NVIDIA GPU with CUDA 12.6 support (required for inference)

  • NVIDIA Container Toolkit installed on the host

  • Git and Git LFS

  • At least 16 GB GPU VRAM and 32 GB system RAM recommended
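A quick preflight sketch can confirm the tooling is on PATH before you start (a minimal check assuming a POSIX shell; it reports missing tools rather than enforcing versions):

```shell
#!/bin/sh
# Report whether each required tool is available on PATH.
check() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "MISSING: $1"
  fi
}
check docker
check git
check git-lfs
check nvidia-smi   # present only if the NVIDIA driver is installed on the host
```

Version constraints still need a separate look, e.g. `docker --version` and `docker compose version`.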


Step 1 — Clone the Repository

git clone https://github.com/your-org/MLOps-Project-ME22B214.git
cd MLOps-Project-ME22B214
git lfs pull          # Downloads pre-trained model weights
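If `git lfs pull` fails silently (for example on quota exhaustion), the weight files are left behind as small text pointer stubs. A hypothetical sanity check, assuming the weights live under extra/pretrained_models/ as described later in this guide (`find_lfs_stubs` is a helper name invented here):

```shell
# List files small enough to be unfetched Git LFS pointer stubs (~130 bytes).
find_lfs_stubs() {
  dir=$1
  if [ -d "$dir" ]; then
    find "$dir" -type f -size -200c   # -200c = strictly under 200 bytes
  else
    echo "no such directory: $dir"
  fi
}
find_lfs_stubs extra/pretrained_models
```

Any file this prints should be re-fetched with `git lfs pull` or downloaded manually.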

Step 2 — Download the Dataset

kaggle competitions download -c image-matching-challenge-2025
unzip image-matching-challenge-2025.zip -d data/
mv data/image-matching-challenge-2025/* data/
rm -r data/image-matching-challenge-2025

The data/ directory should now contain train/, test/, train_labels.csv, and train_thresholds.csv.
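The expected layout can be verified with a short check (a sketch; the entry names come from the list above, and `check_layout` is a helper name invented here):

```shell
# Print ok/missing for each expected entry under the given data root.
check_layout() {
  root=${1:-data}
  for entry in train test train_labels.csv train_thresholds.csv; do
    if [ -e "$root/$entry" ]; then
      echo "ok: $root/$entry"
    else
      echo "missing: $root/$entry"
    fi
  done
}
check_layout data
```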



Native Python Setup (Developer Mode)

Use this path if you need to develop or debug outside Docker.

Step 3a — Build ASMK

cd extra/
git clone https://github.com/jenicek/asmk
cd asmk/cython/
cythonize *.pyx
cd ..
python -m build --no-isolation
pip install dist/*.whl
cd ../../

Step 3b — Build CroCo / DUSt3R Kernels

DUSt3R relies on RoPE positional embeddings, which require compiled CUDA kernels:

cd extra/
git clone https://github.com/naver/croco.git
cd croco/models/curope/
python -m build --no-isolation
pip install dist/*.whl
cd ../../../../

Step 3c — Build Remaining Packages

Build any remaining required packages as .whl files by running python -m build --no-isolation in each package's directory, then move the compiled wheels into bundle/oss/.
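One way to script this step (a sketch, assuming each package sits in its own subdirectory and that python -m build leaves wheels in a dist/ folder; `collect_wheels` is a helper name invented here):

```shell
# Gather every built wheel under a source tree into a destination directory.
collect_wheels() {
  src=$1; dest=$2
  if [ ! -d "$src" ]; then
    echo "no such directory: $src"
    return 0
  fi
  mkdir -p "$dest"
  find "$src" -path '*/dist/*.whl' -exec cp {} "$dest"/ \;
}

# After running `python -m build --no-isolation` in each package directory:
collect_wheels extra bundle/oss
```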

Step 3d — Create the Python Virtual Environment

pip install uv
uv venv
source .venv/bin/activate
uv pip install -e .
export LD_LIBRARY_PATH=.venv/lib/python3.11/site-packages/torch/lib:$LD_LIBRARY_PATH

The project requires Python 3.11 exactly (requires-python = "==3.11.*").


Pre-trained Model Weights

Model weights are stored under extra/pretrained_models/ via Git LFS. If you need to download them manually:

  • ALIKED (aliked-n16.pth)
    https://github.com/Shiaoming/ALIKED/raw/main/models/aliked-n16.pth

  • ISC (isc_ft_v107.pth.tar)
    https://github.com/lyakaap/ISC21-Descriptor-Track-1st/releases/download/v1.0.1/isc_ft_v107.pth.tar

  • MASt3R main weights
    https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth

  • MASt3R retrieval weights
    https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth

  • MASt3R codebook
    https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl
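The manual download can be scripted (a sketch using the URLs above; by default it only prints the commands it would run, and nothing is fetched until DO_DOWNLOAD=1 is set — both the `fetch` helper and the variable name are invented here):

```shell
# Print (or, with DO_DOWNLOAD=1, execute) a wget command per weight file.
fetch() {
  url=$1
  dest="extra/pretrained_models/$(basename "$url")"
  if [ "${DO_DOWNLOAD:-0}" = "1" ]; then
    mkdir -p extra/pretrained_models
    wget -q -O "$dest" "$url"
  else
    echo "would fetch: $url -> $dest"
  fi
}
fetch https://github.com/Shiaoming/ALIKED/raw/main/models/aliked-n16.pth
fetch https://github.com/lyakaap/ISC21-Descriptor-Track-1st/releases/download/v1.0.1/isc_ft_v107.pth.tar
fetch https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth
fetch https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth
fetch https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl
```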


Verifying the Installation

Once all services are running, verify the stack is healthy:

# API health check
curl http://localhost:8000/health

# GPU worker readiness
curl http://localhost:8000/ready

# Obtain a JWT token
curl -X POST http://localhost:8000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "admin"}'

A successful /health response looks like:

{
  "status": "ok",
  "version": "2.0.0",
  "timestamp": 1714300000.0
}

Troubleshooting Installation

The ray-serve container exits immediately

Check that the NVIDIA Container Toolkit is installed and that docker run --gpus all nvidia/cuda:12.6.3-base-ubuntu22.04 nvidia-smi succeeds.

Port conflicts

If ports 8000, 5000, or 8080 are in use on your host, edit the ports: mappings in docker-compose.yaml before launching.

Airflow DB migration fails

Ensure the postgres service is healthy before running airflow-init; inspect it with docker compose logs postgres.

Git LFS quota exceeded

Download model weights manually using the URLs above and place them under extra/pretrained_models/.