Overview
If Image Artist is configured to submit Batch Analysis jobs to an external Slurm cluster, and those jobs include Phenologic.AI building blocks, each external cluster worker node must be configured to distribute AI workloads to a Deep Learning (DL) server.
Traditionally, this is done by creating a configuration file on each cluster node at /etc/acapella/config.init, containing:
userconfig AcapellaDeepLearning.DNN.HCSDLServerConnect="http://<DL-server>:8003"
Here, <DL-server> is the domain name of the DL server that Phenologic.AI jobs will be submitted to. This can be the domain of an external DL server or of the Image Artist server itself (in which case AI jobs are sent to the DL server running as part of the Image Artist Docker stack).
This per-node configuration allows each node to specify a unique DL server, enabling load distribution across multiple DL hosts.
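As a minimal sketch of the per-node setup, the helper below writes the configuration line shown above to a file. The function name write_dl_config and the host name dl1.example.org are hypothetical and used only for illustration; the file path and the userconfig line are taken from this article. The sketch overwrites the target file, so it assumes config.init holds no other settings.

```shell
#!/bin/sh
# Sketch only: write_dl_config is a hypothetical helper, not part of
# Image Artist or Acapella.
write_dl_config() {
    dl_server="$1"                                  # DL server domain name
    config_file="${2:-/etc/acapella/config.init}"   # default path from this article

    # Create the parent directory if it does not exist yet.
    mkdir -p "$(dirname "$config_file")"

    # Overwrites the file. Assumes it contains no other settings.
    printf 'userconfig AcapellaDeepLearning.DNN.HCSDLServerConnect="http://%s:8003"\n' \
        "$dl_server" > "$config_file"
}
```

Run as root on a cluster node, a call such as write_dl_config dl1.example.org would write the default path. Passing a second argument writes to a different file, which is useful for checking the output before touching /etc.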
Alternative Workflow
Instead of configuring each node individually, you can define the DL server URL centrally using an environment variable in the Slurm submission script used by the Image Artist host:
export HCSDLSERVER_CONNECT="http://<DL-server>:8003"
This variable removes the need for /etc/acapella/config.init on each cluster node, simplifying setup and maintenance.
With this global configuration, all nodes submit Phenologic.AI jobs to a single DL server.
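The excerpt below sketches how the export might sit in a submission script. It is not the script that Image Artist uses, whose contents are not shown in this article. The DL_SERVER_HOST variable and the host name dl-server.example.org are assumptions made for illustration; only HCSDLSERVER_CONNECT and port 8003 come from the text above.

```shell
#!/bin/bash
# Hypothetical excerpt from a Slurm submission script. The actual script
# used by the Image Artist host may be structured differently.

# Placeholder host name (assumption): replace with your DL server's domain.
DL_SERVER_HOST="dl-server.example.org"

# Export the URL so that processes started later in this script inherit it.
export HCSDLSERVER_CONNECT="http://${DL_SERVER_HOST}:8003"

# Print the value to the job output to help with troubleshooting.
echo "DL server: ${HCSDLSERVER_CONNECT}"
```

If the export happens before a call to sbatch rather than inside the batch script itself, the variable reaches the job only when the submitting environment is propagated. That is sbatch's default behaviour (--export=ALL), but it is worth checking whether your site overrides it.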
When to Use This Approach
This method is ideal when:
- All cluster nodes submit jobs to the same DL server.
- You want to avoid manual configuration on each node.
- You prefer centralized control over DL server assignment.
Limitations
If your cluster uses multiple DL servers and you need to control which nodes submit to which server, the traditional per-node config.init method is more appropriate. It allows:
- Fine-grained control over job routing.
- Better load distribution across multiple DL servers.
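One way to picture this kind of routing is a small mapping from node name to DL server, which then yields the config.init line for each node. The sketch below is illustrative only: the node names (node01 to node08), the DL hosts (dl1.example.org, dl2.example.org), and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: node names, DL host names, and function names are
# hypothetical and must be adapted to the actual cluster.

# Map a cluster node name to the DL server it should use.
dl_server_for_node() {
    case "$1" in
        node0[1-4]) echo "dl1.example.org" ;;   # first group of nodes
        node0[5-8]) echo "dl2.example.org" ;;   # second group of nodes
        *)          echo "dl1.example.org" ;;   # fallback for unlisted nodes
    esac
}

# Print the config.init line for a given node.
config_line_for_node() {
    printf 'userconfig AcapellaDeepLearning.DNN.HCSDLServerConnect="http://%s:8003"\n' \
        "$(dl_server_for_node "$1")"
}
```

Calling config_line_for_node with a node name prints the line to place in that node's /etc/acapella/config.init.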
Summary
Using an environment variable in the Slurm submission script offers a streamlined alternative to per-node configuration. It reduces administrative overhead and simplifies deployment, especially in homogeneous environments where all nodes target the same DL server.