User Guide

In the following sections, we will describe how to create an algorithm module for Compox.

How to create an algorithm module

The algorithm module is a Python package that contains the algorithm code and assets. The algorithm module should be structured in a specific way in order to work properly with Compox.

See also:

  • ../docs/algorithm_training_guide.md

  • ../docs/training_client_workflow.md

The algorithm should be structured as follows:

algorithm_name/
    |-- __init__.py
    |-- Runner.py
    |-- pyproject.toml
    |-- files/
    |   |-- file1
    |   `-- file2
    `-- some_internal_submodule/
        |-- __init__.py
        |-- module1.py
        `-- module2.py

The Runner.py file

The Runner.py file is a mandatory component of the algorithm module. It serves as the entry point for Compox to run the algorithm. It must define a class named Runner. The Runner class can inherit from BaseRunner (for generic behavior) or from a Runner class specific to the algorithm type (see below). Runner classes can be imported from the compox.algorithm_utils package.

Why this exists: Compox always loads and instantiates Runner as the algorithm entry point, so keeping it in a predictable location allows deployment, caching, and execution to work consistently.

Algorithm types

The algorithm type is defined in the algorithm’s pyproject.toml file. Your Runner inheritance should match the declared type, but Compox does not infer the type from the class. If you inherit from BaseRunner, set algorithm_type = "Generic" (or leave it as Undefined for development, but not for production). For example, an Image2Image algorithm receives an image as input and returns an image as output. In that case, the pyproject.toml file should contain:

[tool.compox]
algorithm_type = "Image2Image"

and the Runner should inherit from the matching Runner class:

from compox.algorithm_utils.Image2ImageRunner import Image2ImageRunner

class Runner(Image2ImageRunner):
    """
    The runner class for the denoiser algorithm.
    """

The following algorithm types are currently supported:

  • Image2Image

  • Image2Embedding

  • Image2Segmentation

  • Image2Alignment

  • Segmentation2Segmentation

  • Generic

Undefined exists as a fallback/default but should not be used for real algorithms.

Why this exists: typed runners provide schema and convenience helpers so you can focus on model logic instead of wiring input/output formats. BaseRunner is for algorithms with custom schemas or non-standard inputs/outputs.

Algorithm tags

The algorithm tags are a useful tool to categorize algorithms for frontend applications. Tags allow clients to assume that algorithms with the same tag follow the same input/output schemas.

Why this exists: clients can filter and group algorithms by capability (e.g., “denoising”), and safely assume consistent I/O across similarly tagged algorithms.
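
For example, tags are declared in the [tool.compox] section of pyproject.toml (the tag name here is illustrative):

[tool.compox]
tags = ["image-denoising"]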

The preprocess, inference and postprocess methods

The run method calls preprocess, then inference, then postprocess. Each of these methods accepts two arguments (after self): the input data for that stage and a dictionary of user arguments (args).

  • preprocess(self, input_data: dict, args: dict | None = None) typically loads data using fetch_data, prepares it, and returns the result for inference.

  • inference(self, data: Any, args: dict | None = None) runs the model or algorithm.

  • postprocess(self, data: Any, args: dict | None = None) should upload output datasets using post_data and return a list of dataset IDs.

The input_data dictionary contains identifiers provided by the user (commonly input_dataset_ids).

Why this exists: separating the pipeline makes data flow and logging explicit, enables progress reporting, and allows easier debugging.
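
A minimal sketch of how the three stages fit together in a BaseRunner subclass; the ImageSchema/SegmentationSchema names and the "image"/"mask" keys follow the dummy example later in this guide, and the thresholding in inference is purely illustrative:

from compox.algorithm_utils.BaseRunner import BaseRunner
from compox.algorithm_utils.io_schemas import ImageSchema, SegmentationSchema
import numpy as np

class Runner(BaseRunner):

    def preprocess(self, input_data: dict, args: dict | None = None) -> np.ndarray:
        # input_data typically looks like {"input_dataset_ids": ["<dataset-id>", ...]}
        data = self.fetch_data(input_data["input_dataset_ids"], ImageSchema)
        return np.asarray(data[0]["image"])

    def inference(self, data: np.ndarray, args: dict | None = None) -> np.ndarray:
        # replace this with the actual model or algorithm
        return data > data.mean()

    def postprocess(self, data: np.ndarray, args: dict | None = None) -> list[str]:
        # upload the output dataset and return the new dataset IDs
        return self.post_data([{"mask": data}], SegmentationSchema)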

The fetch_data method for BaseRunner

fetch_data retrieves datasets by IDs and validates them using a Pydantic schema. It expects a list of file ID strings.

Example of fetching data:

embeddings = self.fetch_data(input_data["input_dataset_ids"], EmbeddingSchema)

The Pydantic schemas are defined in compox/src/compox/algorithm_utils/io_schemas.py, but you are not required to use them. You can define your own schemas by inheriting from DataSchema (useful for type checking and validation).

Why this exists: schemas provide consistent validation and type hints for downstream code, while still allowing custom formats when needed.
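
A minimal sketch of a custom schema, following the same pattern as the training example at the end of this guide (the class and field names are illustrative):

import numpy as np
from compox.algorithm_utils.io_schemas import DataSchema

class MyCustomSchema(DataSchema):
    # declare the keys your datasets are expected to contain
    embedding: np.ndarray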

The fetch_data method for specific algorithm types

Runner subclasses for specific algorithm types use predefined schemas, so fetch_data does not take a schema argument. It still expects a list of file ID strings.

Example for Image2Image:

input_data = self.fetch_data(input_data["input_dataset_ids"])

This fetches datasets validated against the ImageSchema.

The post_data method for BaseRunner

post_data uploads output datasets and validates them with a Pydantic schema. It expects a list of dictionaries, one per output dataset.

Example of posting data:

output_dataset_ids = self.post_data(output, MaskSchema)
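
Here, output is a list of dicts, one per output dataset, whose keys must match the schema. For example, assuming MaskSchema expects a "mask" key (SegmentationSchema in the dummy example below is used the same way):

output = [{"mask": first_mask}, {"mask": second_mask}]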

The post_data method for specific algorithm types

For specific algorithm types, post_data uses predefined schemas, so no schema argument is needed.

Example for Image2Image:

output_dataset_ids = self.post_data(output)

The load_assets method

You can override load_assets to load model weights or other files once and cache them on the Runner instance. Use self.fetch_asset(...) to load files stored in the algorithm package. Paths are relative to the Runner module root (e.g., files/weights.pt). fetch_asset returns an io.BytesIO object that you can pass to libraries like torch.load.

Example:

state_dict_bytes = self.fetch_asset("files/vit_b.pt")
state_dict = torch.load(state_dict_bytes)

Why this exists: model weights and large resources are expensive to load, so Compox caches them on the Runner instance for reuse across requests. These attributes are locked to avoid unsafe mutation across threads.

The log_message method

Log messages to Compox:

self.log_message("This is an info message.", logging_level="INFO")

The set_progress method

Report execution progress (float between 0 and 1):

self.set_progress(0.5)

Sessions (optional)

Executions can be associated with a session_token to reuse an in‑memory cache across runs. From a Runner perspective, this cache is accessed via:

  • save_item_to_session(obj, key)

  • load_item_from_session(key)

  • remove_item_from_session(key)

Why this exists: some algorithms benefit from reusing expensive intermediates (e.g., feature caches, preprocessed inputs) across multiple executions without reloading from storage.

Notes:

  • Sessions are available only when tasks run as FastAPI background tasks. Celery mode does not support sessions.

  • Sessions are in‑memory (single process) and expire after a fixed timeout, so treat them as an optimization rather than persistent storage.

  • The client supplies session_token on execution requests; the server can also generate one when missing.
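
A minimal sketch of using the session cache from a Runner (compute_embeddings and the "embeddings" key are purely illustrative):

# in one execution: cache an expensive intermediate for the session
embeddings = self.compute_embeddings(data)  # hypothetical helper
self.save_item_to_session(embeddings, "embeddings")

# in a later execution with the same session_token: reuse the cached item
embeddings = self.load_item_from_session("embeddings")

# when the intermediate is no longer needed
self.remove_item_from_session("embeddings")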

The pyproject.toml file

The pyproject.toml file contains algorithm metadata. It must be in the algorithm root.

Mandatory fields

[project]
name = "algorithm_name"
version = "major.minor.patch"

Why this exists: Compox uses name + major version to identify an algorithm line and uses minor versions to track distinct builds.

Versioning behavior (AlgorithmDeployer)

Compox derives versioning from the [project] version string in pyproject.toml:

  • Major version = the first segment (before the first dot)

  • Minor version = the second segment (between first and second dot)

  • Patch version is currently ignored by the deployer
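
For example, version = "2.3.1" is treated as major version 2 and minor version 3; the trailing 1 is ignored.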

When an algorithm is deployed, Compox searches the algorithm store for an existing record with the same algorithm name and major version. The behavior is:

  • If found: Compox compares the newly built module ID and assets dictionary with the latest stored minor version. If either differs, it inserts a new minor version entry and increments latest_algorithm_minor_version. If both are identical, it does not insert a new minor version.

  • If not found: Compox creates a new algorithm record with latest_algorithm_minor_version initialized from the project.version minor segment, and stores the module/assets under that.

Notes:

  • The stored minor versions are not the original pyproject.toml patch version; only the major/minor segments drive versioning.

  • Re-deploying the same algorithm with identical module and assets is a no‑op for minor versions (no new entry is added).

  • If you change only non‑code assets, a new minor version is created because the assets dictionary changes.

Why this exists: this makes deployments deterministic and deduplicated; you can update assets or code without forcing a new algorithm identity while still keeping a history of builds.

Algorithm type, tags, description

[tool.compox]
algorithm_type = "AlgorithmType"
tags = ["tag1", "tag2"]
description = "This is a super cool algorithm."
removable = false
exportable = true

Supported devices

Supported devices are a list of strings: "cpu", "gpu", or "mps". The default_device must be included in supported_devices, otherwise validation raises an error.

supported_devices = ["cpu", "gpu"]
default_device = "cpu"

Additional parameters

Additional parameters are a list of objects with name, description, and a config section. You can also provide an optional displayed_name for a more human-friendly UI label. If omitted, Compox derives it automatically from name.

additional_parameters = [
  { name = "some_string_parameter", displayed_name = "Some string parameter", description = "This parameter strings.", config = { type = "string", default = "hello", adjustable = true } },
  { name = "threshold", description = "Threshold used during inference.", config = { type = "float_range", default = 0.5, min = 0.0, max = 1.0, step = 0.05, decimal_precision = 2, adjustable = true } },
]

Parameter types:

Parameter type    Configuration fields
string            type, default, adjustable
int               type, default, adjustable
float             type, default, adjustable, decimal_precision (optional)
bool              type, default, adjustable
int_range         type, default, min, max, step, adjustable
float_range       type, default, min, max, step, adjustable, decimal_precision (optional)
string_enum       type, default, options, adjustable
int_enum          type, default, options, adjustable
float_enum        type, default, options, adjustable, decimal_precision (optional)
string_list       type, default, options, adjustable
int_list          type, default, options, adjustable
float_list        type, default, options, adjustable, decimal_precision (optional)
bool_list         type, default, options, adjustable

Notes:

  • displayed_name is optional. If not provided, Compox generates one from name by replacing _ and - with spaces and capitalizing the result.

  • decimal_precision is optional and only valid for float-based parameter types.

  • decimal_precision must be greater than or equal to 0.

Training parameters

Training parameters use the same schema as additional parameters:

training_parameters = [
  { name = "epochs", displayed_name = "Epochs", description = "Training epochs.", config = { type = "int", default = 10, adjustable = true } },
]

Other fields

check_importable = false
obfuscate = true
hash_module = true  # deprecated; ignored, deduplication is always on
hash_assets = true  # deprecated; ignored, deduplication is always on
removable = false
exportable = true

Why these exist:

  • check_importable helps catch packaging mistakes early.

  • obfuscate reduces casual code exposure in stored modules.

  • (deprecated) hash_module and hash_assets are ignored. Deduplication by content hash is always enabled.

  • removable controls whether the deploy delete endpoint is allowed to remove this algorithm (defaults to false).

  • exportable controls whether the export endpoint can package this algorithm (defaults to true). If false, export returns HTTP 403.

The files directory

Optional. Store data assets your algorithm needs at runtime. Load them via self.fetch_asset(...).

Why this exists: code is zipped and cached separately from assets, so non‑Python files are stored and retrieved from the asset store by path.

The some_internal_submodule directory

Optional. Include internal modules used by your Runner.

Why this exists: any Python modules inside the algorithm directory are packaged into the module zip, so you can keep helper code alongside your Runner.

Example of a dummy algorithm

algorithm_name/
    |-- __init__.py
    |-- Runner.py
    |-- pyproject.toml
    |-- files/
    |   `-- some_heavy_model.pt
    `-- my_big_model/
        |-- __init__.py
        `-- utils.py

Runner example:

from my_big_model.utils import MyBigModel
from compox.algorithm_utils.BaseRunner import BaseRunner
from compox.algorithm_utils.io_schemas import ImageSchema, SegmentationSchema
import numpy as np
import torch

class Runner(BaseRunner):
    """
    The runner class for the foo algorithm.
    """

    def load_assets(self):
        """
        The assets to load for the foo algorithm.
        """
        some_model = MyBigModel()
        self.log_message("Loading the Foo assets.")
        state_dict_bytes = self.fetch_asset("files/some_heavy_model.pt")
        state_dict = torch.load(state_dict_bytes)
        some_model.load_state_dict(state_dict)
        self.my_big_model = some_model

    def preprocess(self, input_data: dict, args: dict | None = None) -> np.ndarray:
        self.log_message("Preprocessing the Foo input data.")
        my_data = self.fetch_data(input_data["input_dataset_ids"], ImageSchema)
        input_array = np.array(my_data[0]["image"])
        return input_array

    def inference(self, data: np.ndarray, args: dict | None = None) -> torch.Tensor:
        self.log_message("Running the Foo inference.")
        some_user_defined_args = (args or {}).get("some_user_defined_args", None)
        if some_user_defined_args is not None:
            self.log_message(f"User defined args: {some_user_defined_args}")
        output = self.my_big_model(data, some_user_defined_args)
        self.set_progress(0.5)
        self.log_message("The Foo inference is done.")
        return output

    def postprocess(self, inference_output: torch.Tensor, args: dict | None = None) -> list[str]:
        self.log_message("Postprocessing the Foo output.")
        output = inference_output.detach().numpy()
        output_dicts = [{"mask": output}]
        output_dataset_ids = self.post_data(output_dicts, SegmentationSchema)
        return output_dataset_ids

pyproject.toml example:

[project]
name = "foo"
version = "0.1.0"

[tool.compox]
algorithm_type = "Generic"
tags = ["foo", "bar"]
description = "This algorithm does foo and bar."
additional_parameters = [
  { name = "some_user_defined_args", description = "This is a user defined argument.", config = { type = "string", default = "hello", adjustable = true } },
]
check_importable = false
obfuscate = true
hash_module = true
hash_assets = true

Denoising algorithm template

This section presents a working template for an image denoising algorithm and covers the specifics needed to develop one. To see how a compox algorithm should generally be structured, please refer to the algorithms/readme.md file.

The algorithm folder is structured as follows:

template_denoising_algorithm/
    ├── __init__.py
    ├── Runner.py
    ├── pyproject.toml
    ├── image_denoising/
    │   ├── __init__.py
    │   └── denoising_utils.py
    └── README.md

The pyproject.toml file

The pyproject.toml is a file that contains the algorithm metadata. This file is used by compox to properly deploy the algorithm as a service. The pyproject.toml file should be placed in the root directory of the algorithm.

First, let’s create the pyproject.toml file. Under the [project] section, you should provide the name and version of the algorithm. The name should be unique and should not contain any spaces. The version should be in the format major.minor.patch. The algorithm name and version are used to identify the algorithm in compox, so it is important to provide a unique name and version.

[project]
name = "template_denosing_algorithm"
version = "1.0.0"

Next, you should fill out the [tool.compox] section. This section contains the metadata that compox uses to deploy the algorithm as a service. algorithm_type defines the algorithm input and output types; you may either use one of the predefined algorithm types or define your own. The predefined algorithm types are located in compox.algorithm_utils. For an image denoising algorithm, we will use the Image2Image type. This type is suitable for image denoising because both our input and output are images (or sequences of images).

[tool.compox]
algorithm_type = "Image2Image"

Each algorithm type has a set of potential tags, which are used to specify the general algorithm functionality. Multiple tags can be provided for one algorithm. For image denoising algorithms, we will use the image-denoising tag.

tags = ["image-denoising"]

The description field should contain a brief description of the algorithm.

description = "Denoises a sequence of images using the total variation denoising algorithm."

For the denoising algorithm, we will add a denoising_weight parameter that will control the denoising strength. Because we want to set a range for the denoising weight, we will use the float_range parameter type. The default field should contain the default value of the parameter. The min and max fields should contain the minimum and maximum values of the parameter. The step field should contain the step size of the parameter. The adjustable field should be set to true if the parameter should be exposed to the user to adjust.

additional_parameters = [
    {name = "denoising_weight", displayed_name = "Denoising weight", description = "The weight of the denoising term between 0 and 1. Higher values will result in more denoising, but can distort the image.", config = {type = "float_range", default = 0.1, min = 0.0, max = 1.0, step = 0.05, decimal_precision = 2, adjustable = true}}
]

For more information about the possible parameter types, see the How to create an algorithm module section.

displayed_name is optional and controls the human-friendly UI label. decimal_precision is optional and only valid for float-based parameter types.

The algorithm dependencies

The algorithm can use any libraries from the global compox environment. Additional dependencies can be provided as Python submodules. Here we will use the scikit-image and numpy libraries to handle the image data. We also implemented a simple image_denoising module that contains an __init__.py file and a denoising_utils.py file. The denoising_utils.py file contains the denoise_image function that performs the denoising of the images. The image_denoising module should be placed in the root directory of the algorithm.

from skimage.restoration import (
    denoise_tv_chambolle,
)

def denoise_image(image, weight=0.1):
    """
    Denoise the image using the total variation denoising algorithm.

    Parameters
    ----------
    image : np.ndarray
        The image to denoise.
    weight: float
        The weight parameter for the denoising algorithm.
    Returns
    -------
    np.ndarray
        The denoised image.
    """

    return denoise_tv_chambolle(image, weight=weight)

The Runner.py file

The Runner.py file is the main file of the algorithm. This file should contain the algorithm implementation. The Runner.py file should be placed in the root directory of the algorithm.

Because we specified the algorithm type as Image2Image, the Runner.py file should contain a class that inherits from the Image2ImageRunner class. The Image2ImageRunner class is located in the compox.algorithm_utils module. The Image2ImageRunner class contains the necessary methods to handle the input and output of the algorithm.

from compox.algorithm_utils.Image2ImageRunner import Image2ImageRunner

class Runner(Image2ImageRunner):
    """
    The runner class for the denoiser algorithm.
    """

    def __init__(self, task_handler, device: str = "cpu"):
        """
        The denoising runner.
        """
        super().__init__(task_handler, device)

We can implement a load_assets method to load any assets that the algorithm requires upon initialization of the Runner. The important bit is that the attributes that are loaded in the load_assets method are cached with the algorithm and do not have to be reloaded for each algorithm call. This can greatly speed up the algorithm execution. Since we do not need any assets for the denoising algorithm, we can leave the load_assets method empty.

def load_assets(self):
    """
    Here you can load the assets needed for the algorithm. This can be
    the model, the weights, etc. The assets are loaded upon the first
    call of the algorithm and are cached with the algorithm instance.
    """
    pass

Next, we can implement the inference method, where we perform the denoising of the images. The data will be passed to the inference method as a numpy array. The inference method should return a numpy array with the denoised images of the same shape as the input images. You can use the self.log_message method to log messages to the compox log. The self.set_progress method can be used to update the progress with a float value between 0 and 1.

def inference(self, data: np.ndarray, args: dict | None = None) -> np.ndarray:
    """
    Run the inference.

    Parameters
    ----------
    data : np.ndarray
        The input images.

    Returns
    -------
    np.ndarray
        The denoised images.
    """
    self.log_message("Starting inference.")
    # now we retrieve the input data
    # we will min max normalize the images
    min_val = np.min(data)
    max_val = np.max(data)
    images = (data - min_val) / (max_val - min_val)

    # here we will get the optional argument of denoising weight
    denoising_weight = (args or {}).get("denoising_weight", 0.1)

    # we can post messages to the log
    self.log_message(
        f"Starting denoising of {images.shape[0]} images with weight {denosing_weight}."
    )

    # we will denoise the images
    denoised_images = np.zeros_like(images)
    for i in range(images.shape[0]):
        denoised_images[i] = denoise_image(
            images[i], weight=denoising_weight
        )
        # this will update the progress bar
        self.set_progress(i / images.shape[0])

    # we will normalize the output
    denoised_images = (denoised_images - denoised_images.min()) / (
        denoised_images.max() - denoised_images.min()
    )
    denoised_images = denoised_images.astype(np.float32)

    # we will pass the denoised images to the postprocess method
    return denoised_images

To customize the behavior of fetching and processing the input data, and postprocessing and uploading the output data, we can implement the preprocess and postprocess methods. The preprocess method is called before the inference method and is used to fetch the input data. The postprocess method is called after the inference method and is used to process the output data. In our case, we will not implement any custom behavior for these methods. You can refer to the compox.algorithm_utils.Image2ImageRunner class for more information about these methods.
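
If you did need custom behavior, a sketch of overriding preprocess for an Image2Image runner might look like the following (this assumes fetch_data returns a list of dicts with an "image" key, as in the ImageSchema example in the How to create an algorithm module section):

def preprocess(self, input_data: dict, args: dict | None = None) -> np.ndarray:
    # fetch the input datasets; Image2Image runners validate them against the ImageSchema
    datasets = self.fetch_data(input_data["input_dataset_ids"])
    # stack the images into a single array for inference
    return np.stack([np.asarray(d["image"]) for d in datasets])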

Deploying the algorithm

To deploy the finished algorithm, use:

compox deploy-algorithms --config app_server.yaml --name template_denoising_algorithm

This deploys the algorithm to Compox. The algorithm can also be added through the Compox systray interface by clicking Add Algorithm and selecting the algorithm directory.

Segmentation algorithm template

This guide will cover the specifics needed to develop an image segmentation algorithm. To see how a compox algorithm should generally be structured, please refer to the algorithms/readme.md file.

The algorithm folder is structured as follows:

template_segmentation_algorithm/
    ├── __init__.py
    ├── Runner.py
    ├── pyproject.toml
    ├── image_segmentation/
    │   ├── __init__.py
    │   └── segmentation_utils.py
    └── README.md

The pyproject.toml file

The pyproject.toml is a file that contains the algorithm metadata. This file is used by compox to properly deploy the algorithm as a service. The pyproject.toml file should be placed in the root directory of the algorithm.

First, let’s create the pyproject.toml file. Under the [project] section, you should provide the name and version of the algorithm. The name should be unique and should not contain any spaces. The version should be in the format major.minor.patch. The algorithm name and version are used to identify the algorithm in compox, so it is important to provide a unique name and version.

[project]
name = "template_segmentation_algorithm"
version = "1.0.0"

Next, we will fill out the [tool.compox] section. This section contains the metadata that compox uses to deploy the algorithm as a service. algorithm_type defines the algorithm input and output types; you may either use one of the predefined algorithm types or define your own. The predefined algorithm types are located in compox.algorithm_utils. For an image segmentation algorithm, we will use the Image2Segmentation type. This type is suitable for image segmentation as the input is a sequence of images and the output is a sequence of segmentation masks.

[tool.compox]
algorithm_type = "Image2Segmentation"

Each algorithm type has a set of potential tags, which are used to specify the general algorithm functionality. Multiple tags can be provided for one algorithm. For image segmentation algorithms, we will use the image-segmentation tag.

tags = ["image-segmenation"]

The description field should contain a brief description of the algorithm.

description = "Performs a binary segmentation of a 3-D image using a skimage filter."

Here we will add a thresholding_algorithm parameter that will allow the user to select the thresholding algorithm to use. The optional displayed_name field provides a human-friendly UI label. The type field is set to string_enum to specify that the parameter is a string with a predefined set of values. The default field is set to otsu to specify the default value of the parameter. The options field is set to a list of strings that specify the possible values of the parameter. The adjustable field is set to true to specify that the user should be able to select the thresholding algorithm to apply.

additional_parameters = [
    {name = "thresholding_algorithm", displayed_name = "Thresholding algorithm", description = "The thresholding algorithm to use.", config = {type = "string_enum", default = "otsu", options = ["otsu", "yen", "li", "minimum", "mean", "triangle", "isodata", "local"], adjustable = true}},
]

For more information about the possible parameter types, see the How to create an algorithm module section.

If you later add float-based parameters to this template, you can also provide decimal_precision inside config to control how many decimal places the UI should display.

The algorithm dependencies

The algorithm can use any libraries from the global compox environment. Additional dependencies can be provided as python submodules. Here we will use the numpy library to handle the image data. We also implemented a simple image_segmentation module that contains an __init__.py file and a segmentation_utils.py file. The segmentation_utils.py file contains the threshold_image function that performs segmentation of an image using a selected algorithm. The image_segmentation module should be placed in the root directory of the algorithm.

import skimage.filters as skif


def threshold_image(image, thresholding_algorithm):
    """
    Threshold the image using the specified thresholding algorithm.

    Parameters
    ----------
    image : np.ndarray
        The image to threshold.
    thresholding_algorithm : str
        The thresholding algorithm to use.

    Returns
    -------
    np.ndarray
        The thresholded image.
    """
    if thresholding_algorithm == "otsu":
        threshold = skif.threshold_otsu(image)
    elif thresholding_algorithm == "yen":
        threshold = skif.threshold_yen(image)
    elif thresholding_algorithm == "li":
        threshold = skif.threshold_li(image)
    elif thresholding_algorithm == "minimum":
        threshold = skif.threshold_minimum(image)
    elif thresholding_algorithm == "mean":
        threshold = skif.threshold_mean(image)
    elif thresholding_algorithm == "triangle":
        threshold = skif.threshold_triangle(image)
    elif thresholding_algorithm == "isodata":
        threshold = skif.threshold_isodata(image)
    elif thresholding_algorithm == "local":
        threshold = skif.threshold_local(image)
    else:
        raise ValueError(
            f"Invalid thresholding algorithm: {thresholding_algorithm}"
        )

    return image > threshold

The Runner.py file

The Runner.py file is the main file of the algorithm. This file should contain the algorithm implementation. The Runner.py file should be placed in the root directory of the algorithm.

Because we specified the algorithm type as Image2Segmentation, the Runner.py file should contain a class that inherits from the Image2SegmentationRunner class. The Image2SegmentationRunner class is located in the compox.algorithm_utils module. The Image2SegmentationRunner class contains the necessary methods to handle the input and output of the algorithm.

import numpy as np
from compox.algorithm_utils.Image2SegmentationRunner import (
    Image2SegmentationRunner,
)
from image_segmentation.segmentation_utils import threshold_image


class Runner(Image2SegmentationRunner):
    """
    The runner class for the image segmentation algorithm.
    """

    def __init__(self, task_handler, device: str = "cpu") -> None:
        """
        The aligner runner.
        """
        super().__init__(task_handler, device=device)

We can implement a load_assets method to load any assets that the algorithm requires upon initialization of the Runner. The important bit is that the attributes that are loaded in the load_assets method are cached with the algorithm and do not have to be reloaded for each algorithm call. This can greatly speed up the algorithm execution. Since we do not need any assets for the segmentation algorithm, we can leave the load_assets method empty.

def load_assets(self):
    """
    Here you can load the assets needed for the algorithm. This can be
    the model, the weights, etc. The assets are loaded upon the first
    call of the algorithm and are cached with the algorithm instance.
    """
    pass

Next, we can implement the inference method, where we perform the segmentation of the images. The inference method will receive a numpy array with the images to be segmented. The inference method must return a numpy array with the segmentation masks of the same shape as the input images. The inference method can also receive a dictionary with the arguments for the algorithm. The arguments are passed to the algorithm from compox and can be used to customize the behavior of the algorithm. In our case, we will use the thresholding_algorithm argument to specify the thresholding algorithm to use. You can also report the progress of the algorithm by calling the set_progress method. The set_progress method takes a float value between 0 and 1, where 0 is the start of the algorithm and 1 is the end of the algorithm. The log_message method can be used to log messages to the compox log.

def inference(self, data: np.ndarray, args: dict | None = None) -> np.ndarray:
    """
    Run the inference.

    Parameters
    ----------
    data : np.ndarray
        The images to be segmented.
    args : dict
        The arguments for the algorithm.

    Returns
    -------
    np.ndarray
        The segmented images.
    """

    # now we retrieve the input data
    thresholding_algorithm = (args or {}).get("thresholding_algorithm", "otsu")
    # we can post messages to the log
    self.log_message(
        f"Starting inference with thresholding algorithm: {thresholding_algorithm}"
    )

    # here we will threshold the images
    mask = threshold_image(data, thresholding_algorithm)

    # we can also log progress
    self.set_progress(0.5)

    # pass the mask to the postprocess
    return mask

To customize the behavior of fetching and processing the input data, and postprocessing and uploading the output data, we can implement the preprocess and postprocess methods. The preprocess method is called before the inference method and is used to fetch the input data. The postprocess method is called after the inference method and is used to process the output data. In our case, we will not implement any custom behavior for these methods. You can refer to the compox.algorithm_utils.Image2SegmentationRunner class for more information about these methods.

Deploying the algorithm

To deploy the finished algorithm, use:

compox deploy-algorithms --config app_server.yaml --name template_segmentation_algorithm

This deploys the algorithm to Compox. The algorithm can also be added through the Compox systray interface by clicking Add Algorithm and selecting the algorithm directory.

Registration algorithm template

This section presents a working template for developing an image registration algorithm. To see how a compox algorithm should generally be structured, please refer to the algorithms/readme.md file.

The algorithm folder is structured as follows:

template_registration_algorithm/
    ├── __init__.py
    ├── Runner.py
    ├── pyproject.toml
    ├── image_registration/
    │   ├── __init__.py
    │   └── registration_utils.py
    └── README.md

The pyproject.toml file

The pyproject.toml is a file that contains the algorithm metadata. This file is used by compox to properly deploy the algorithm as a service. The pyproject.toml file should be placed in the root directory of the algorithm.

First, let’s create the pyproject.toml file. Under the [project] section, you should provide the name and version of the algorithm. The name should be unique and should not contain any spaces. The version should be in the format major.minor.patch. The algorithm name and version are used to identify the algorithm in compox, so it is important to provide a unique name and version.

[project]
name = "template_registration_algorithm"
version = "1.0.0"

Next, we will fill out the [tool.compox] section. This section contains the metadata that compox uses to deploy the algorithm as a service. algorithm_type defines the algorithm input and output types; you may either use one of the predefined algorithm types or define your own. The predefined algorithm types are located in compox.algorithm_utils. For an image registration algorithm, we will use the Image2Alignment type. This type is suitable for image registration as the input is a sequence of images and the output is a sequence of homography matrices.

[tool.compox]
algorithm_type = "Image2Alignment"

Each algorithm type has a set of potential tags, which are used to specify the general algorithm functionality. Multiple tags can be provided for one algorithm. For image registration algorithms, we will use the image-alignment tag.

tags = ["image-alignment"]

The description field should contain a brief description of the algorithm.

description = "Generates homography matrices for aligning a sequence of images."

Here we will add a max_translation parameter that defines the maximum translation as a fraction of the image size. Because we want to set a range for the parameter, we will use the float_range type. The displayed_name field provides a human-friendly UI label. The default field should contain the default value of the parameter. The min and max fields should contain the minimum and maximum values of the parameter. The step field should contain the step size of the parameter. The decimal_precision field controls how many decimal places the UI should display for float-based values. The adjustable field should be set to true if we want to expose the parameter to the user to adjust.

 {name = "max_translation", displayed_name = "Max translation", description = "Maximum translation as a fraction of the image size.", config = {type = "float_range", default = 0.25, min = 0.0, max = 1.0, step = 0.05, decimal_precision = 2, adjustable = true}}

For more information about the possible parameter types, see the How to create an algorithm module section.

The algorithm dependencies

The algorithm can use any libraries from the global compox environment. Additional dependencies can be provided as Python submodules. Here we will use the numpy library to handle the image data. We also implemented a simple image_registration module that contains an __init__.py file and a registration_utils.py file. The registration_utils.py file contains the get_random_translation function, which generates a random homography matrix whose maximum translation is defined by the max_translation parameter as a fraction of the input image size. The image_registration module should be placed in the root directory of the algorithm.

import numpy as np

def get_random_translation(image: np.ndarray, max_translation: float = 0.25):
    """
    Get a random translation matrix.

    Parameters
    ----------
    image : np.ndarray
        The image.
    max_translation : float
        The maximum translation.

    Returns
    -------
    np.ndarray
        The translation matrix.
    """

    # get the image dimensions
    height, width = image.shape[:2]
    h = np.eye(3)

    # random translation
    h[0, 2] = np.random.uniform(
        -max_translation * width, max_translation * width
    )
    h[1, 2] = np.random.uniform(
        -max_translation * height, max_translation * height
    )

    return h

The Runner.py file

The Runner.py file is the main file of the algorithm. This file should contain the algorithm implementation. The Runner.py file should be placed in the root directory of the algorithm.

Because we specified the algorithm type as Image2Alignment, the Runner.py file should contain a class that inherits from the Image2AlignmentRunner class. The Image2AlignmentRunner class is located in the compox.algorithm_utils module. The Image2AlignmentRunner class contains the necessary methods to handle the input and output of the algorithm.

import numpy as np

from compox.algorithm_utils.Image2AlignmentRunner import (
    Image2AlignmentRunner,
)
from image_registration.registration_utils import get_random_translation

class Runner(Image2AlignmentRunner):
    """
    The runner class for the image registration algorithm.
    """

    def __init__(self, task_handler, device: str = "cpu"):
        """
        The image registration runner.
        """
        super().__init__(task_handler, device)

We can implement a load_assets method to load any assets that the algorithm requires upon initialization of the Runner. The important bit is that the attributes that are loaded in the load_assets method are cached with the algorithm and do not have to be reloaded for each algorithm call. This can greatly speed up the algorithm execution. Since we do not need any assets for the image registration algorithm, we can leave the load_assets method empty.

def load_assets(self):
    """
    Here you can load the assets needed for the algorithm. This can be
    the model, the weights, etc. The assets are loaded upon the first
    call of the algorithm and are cached with the algorithm instance.
    """
    pass

Next, we can implement the inference method, where we perform the registration of the images. The data will be passed to the inference method as a numpy array. The inference method returns a list of homography matrices represented by numpy arrays. You can also report the progress of the algorithm by calling the set_progress method. The set_progress method takes a float value between 0 and 1, where 0 is the start of the algorithm and 1 is the end of the algorithm. The log_message method can be used to log messages to the compox log.

def inference(self, data: np.ndarray, args: dict | None = None) -> list[np.ndarray]:
    """
    Run the inference.

    Parameters
    ----------
    data : np.ndarray
        The input images

    Returns
    -------
    list[np.ndarray]
        The output homography matrices.
    """
    self.log_message("Starting inference.")
    # now we retrieve the input data
    max_translation = (args or {}).get("max_translation", 0.25)
    # we can post messages to the log
    self.log_message(f"Registering {data.shape[0]} images.")

    # we will generate a homography matrix for each image
    matrices = []
    for i in range(data.shape[0] - 1):
        matrix = get_random_translation(
            data[i], max_translation=max_translation
        )
        matrices.append(matrix)
        self.set_progress(i / data.shape[0])
    # we will pass the homography matrices to the output
    return matrices

To customize the behavior of fetching and processing the input data, and postprocessing and uploading the output data, we can implement the preprocess and postprocess methods. The preprocess method is called before the inference method and is used to fetch the input data. The postprocess method is called after the inference method and is used to process the output data. In our case, we will not implement any custom behavior for these methods. You can refer to the compox.algorithm_utils.Image2AlignmentRunner class for more information about these methods.

Deploying the algorithm

To deploy the finished algorithm, use:

compox deploy-algorithms --config app_server.yaml --name template_registration_algorithm

This deploys the algorithm to Compox. The algorithm can also be added through the Compox systray interface by clicking Add Algorithm and selecting the algorithm directory.

Implementing the train() method in Compox algorithm runners

This guide is for algorithm developers implementing training logic in their Runner classes.

Where train() is called

  • Training is started by POST /api/v0/train-algorithm.

  • The server creates a TrainingHandler and calls runner.run_training(...).

  • BaseRunner.run_training() sets status to RUNNING, calls self.train(...), and on success calls TrainingHandler.mark_as_completed().

Key implications:

  • If train() raises, the training job is marked FAILED.

  • If a stop request is posted, TaskHandler raises TaskStoppedException and status becomes STOPPED.


Required method: train(self, training_data, args)

In BaseRunner, train() is the method you must override. It should not return anything. The training task is considered complete when train() finishes without error.

Signature in BaseRunner:

def train(self, training_data: list[str], args: dict | None = None) -> None:
    ...

Fetching training data

You receive training_data as a list of training sample IDs. Use the TrainingHandler helpers (available through BaseRunner) to load datasets:

dataset = self.get_training_dataset(training_sample_ids)

From there, you can use these training-specific save/load helpers on the runner. All of them are implemented by TrainingHandler and surfaced via BaseRunner.

Saving / downloading to TempStore

save_training_files_to_temp_store(folder_path, files, schema, parallel=True)

  • Use when you already have in‑memory data (e.g., numpy arrays) and want to persist them to the TempStore before training.

  • Inputs:

    • folder_path: target subfolder in TempStore

    • files: list of dicts (each dict is a logical file with HDF5 keys/values)

    • schema: Pydantic DataSchema for validation

  • Output:

    • list of Path objects pointing to saved files in TempStore

download_files_to_temp_store(folder_path, file_ids, schema, batch_size=8, *keys)

  • Use when you have a flat list of file IDs from data-store.

  • Inputs:

    • file_ids: list of object IDs in data-store

    • schema: Pydantic DataSchema for validation

    • *keys: optional HDF5 keys to extract (if omitted, all keys)

  • Output:

    • list of Path objects in TempStore

  • Notes:

    • Downloads in batches to reduce memory spikes.
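
A minimal usage sketch (file_ids and InputSchema are illustrative; InputSchema follows the pattern in the example skeleton below):

# download the raw input files in batches and get back their local TempStore paths
local_paths = self.download_files_to_temp_store(
    "raw_inputs", file_ids, InputSchema, batch_size=8
)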

download_dataset_to_temp_store(dataset, schemas)

  • Use when you have training samples and want the full manifest structure preserved.

  • Inputs:

    • dataset: a TrainingDataset created from sample IDs

    • schemas: dict mapping sample keys to Pydantic schemas (e.g. {"input": InputSchema, "target": TargetSchema})

  • Output:

    • local_samples: list of samples, each sample is a list of dicts whose values are local Paths in TempStore.

  • Temp layout:

    • <temp>/<sample_id>/<file_index>/<key>/...

Loading from TempStore

load_files_from_temp_store(paths, parallel=True, *keys)

  • Use when you already have a list of TempStore paths.

  • Inputs:

    • paths: list of file paths in TempStore

    • *keys: optional HDF5 keys to extract (if omitted, all keys)

  • Output:

    • list of dicts (in‑memory data)

load_dataset_from_temp_store(local_samples)

  • Use with the output from download_dataset_to_temp_store(...).

  • Input:

    • local_samples: list-of-list-of-dict structure with TempStore paths

  • Output:

    • Same structure, but values are loaded data dicts instead of paths.

Schema validation

All save/download methods validate against Pydantic DataSchema definitions (see compox.algorithm_utils.io_schemas). The schema defines expected HDF5 keys, their types, and any validation rules.


Reporting progress and state

Use these methods during training:

self.set_progress(0.5)  # float in [0.0, 1.0]
self.set_state({"epoch": 3, "loss": 0.12})
self.log_message("Epoch 3/10", logging_level="INFO")

  • set_progress updates TrainingRecord.progress.

  • set_state overwrites the current TrainingRecord.state.

  • log_message appends to the training log.


Saving checkpoints (training outputs)

To persist a trained model or intermediate state, call:

checkpoint_id = self.save_checkpoint(
    {"my_asset.pt": model_bytes},
    properties={"stage": "intermediate", "epoch": 3, "loss": 0.12},
)

Important rules:

  • Keys in the checkpoint dict must match asset paths already defined in the algorithm. TrainingHandler.save_checkpoint() validates this against the algorithm’s assets.

  • The checkpoint is stored in algorithm-checkpoint-store.

  • Each saved checkpoint ID is appended to TrainingRecord.output_checkpoint_ids.

Training completion:

  • TrainingHandler.mark_as_completed() requires at least one checkpoint. If none were saved, the training is marked failed.


Stopping behavior

If a stop request is posted:

  • TaskHandler._check_for_stop_request() raises TaskStoppedException.

  • Training is marked STOPPED.

Recommendation: Keep your training loop responsive so stop requests can be detected quickly.


Example skeleton

from compox.algorithm_utils.BaseRunner import BaseRunner
from compox.algorithm_utils.io_schemas import DataSchema
import numpy as np

class InputSchema(DataSchema):
    image: np.ndarray

class TargetSchema(DataSchema):
    mask: np.ndarray

class Runner(BaseRunner):
    def load_assets(self):
        # load model weights defined in algorithm assets
        self.weights = self.fetch_asset("model.pt")

    def train(self, training_data: list[str], args: dict | None = None):
        # 1) Build dataset from sample IDs
        dataset = self.get_training_dataset(training_data)

        # 2) Download full dataset to TempStore using schemas for each key
        schemas = {"input": InputSchema, "target": TargetSchema}
        local_samples = self.download_dataset_to_temp_store(dataset, schemas)

        # 3) Load the dataset into memory
        in_memory = self.load_dataset_from_temp_store(local_samples)

        # 4) Optional: derive extra files and save them to TempStore
        derived = []
        for sample in in_memory:
            for file_dict in sample:
                if "input" in file_dict:
                    img = file_dict["input"]["image"]
                    norm = (img - img.min()) / (img.max() - img.min() + 1e-8)
                    derived.append({"image": norm.astype(np.float32)})
        derived_paths = self.save_training_files_to_temp_store(
            "derived", derived, InputSchema, parallel=True
        )

        # 5) Load derived files back (flat load)
        derived_loaded = self.load_files_from_temp_store(derived_paths)

        epochs = (args or {}).get("num_epochs", 10)
        for epoch in range(epochs):
            # training step...
            self.log_message(f"Epoch {epoch+1}/{epochs}")
            self.set_progress(float(epoch + 1) / epochs)
            self.set_state(
                {
                    "epoch": epoch + 1,
                    "samples": len(in_memory),
                    "derived": len(derived_loaded),
                }
            )

            # intermediate checkpoint
            self.save_checkpoint(
                {"model.pt": b"model-bytes"},
                properties={
                    "stage": "intermediate",
                    "epoch": epoch + 1,
                },
            )

        # final checkpoint (required)
        self.save_checkpoint(
            {"model.pt": b"final-model-bytes"},
            properties={"stage": "final", "epoch": epochs},
        )

Common pitfalls

  • No checkpoints saved: training will fail at completion.

  • Checkpoint keys don’t match assets: save_checkpoint() raises.

  • Long loops without progress/logs: client sees “stalled” training.

  • Mutating cached assets: assets loaded in load_assets() are protected from reassignment.