Batch Jobs

SpikeLab can submit analysis and spike sorting jobs to a Kubernetes cluster. Both workflows follow the same pattern: bundle inputs, submit a job, and retrieve results as an AnalysisWorkspace.

Prerequisites

Install the batch jobs optional dependencies:

pip install "spikelab[batch-jobs,s3]"

Ensure Kubernetes access is configured:

kubectl version --client
kubectl config current-context

Configure AWS-compatible credentials in your shell, or rely on your usual AWS credential chain.
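
For example, exporting the standard AWS SDK environment variables (your storage endpoint may require additional settings):

export AWS_ACCESS_KEY_ID=<your-access-key>
export AWS_SECRET_ACCESS_KEY=<your-secret-key>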

Setting Up a Session

All batch operations go through a RunSession. Create one from a cluster profile:

from spikelab.batch_jobs import RunSession, load_cluster_profile

profile = load_cluster_profile("nrp")
session = RunSession.from_profile(profile)

The profile controls the default namespace, S3 prefix, Docker images, credential mounts, and policy thresholds. You can also load a custom profile from a YAML file with load_profile_from_name.
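
As a sketch, assuming load_profile_from_name resolves a profile name to its YAML definition (check your installation for the exact lookup rules):

from spikelab.batch_jobs import RunSession, load_profile_from_name

# "my-cluster" is a hypothetical profile name pointing at a custom YAML profile
profile = load_profile_from_name("my-cluster")
session = RunSession.from_profile(profile)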

Analysis Jobs

Analysis jobs run a user-provided script against an AnalysisWorkspace. The workspace is uploaded to the cluster, the script modifies it, and the updated workspace is returned.

Submitting

from spikelab.batch_jobs import JobSpec, ContainerSpec, ResourceSpec

result = session.submit_workspace_job(
    workspace=ws,                    # AnalysisWorkspace or path to saved workspace
    script="my_analysis.py",         # analysis script to run
    job_spec=JobSpec(
        name_prefix="my-analysis",
        container=ContainerSpec(image="spikelab/analysis:latest"),
        resources=ResourceSpec(
            requests_cpu="2",
            requests_memory="8Gi",
        ),
    ),
)

print(result.job_name)
print(result.run_id)

The session saves the workspace to HDF5, bundles it with the analysis script, uploads the bundle to S3, and submits a Kubernetes job. Inside the container, the workspace is loaded and exposed to the script as a global variable named workspace; the script reads from and writes to it directly.

Writing an analysis script

The analysis script runs inside the container with the workspace already loaded. Write it as ordinary Python:

# my_analysis.py
# 'workspace' is available as a global variable

sd = workspace.get("recording_01", "spikedata")

rates = sd.rates(unit="Hz")
workspace.store("recording_01", "firing_rates_hz", rates)

pop_rate = sd.get_pop_rate(square_width=20, gauss_sigma=100)
workspace.store("recording_01", "pop_rate", pop_rate)

After the script finishes, the workspace is saved and uploaded automatically.

Retrieving results

Once the job completes, retrieve the updated workspace:

ws_updated = session.retrieve_result(result, local_dir="./results")

This downloads the workspace from S3 and returns an AnalysisWorkspace with all the results your script stored.
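
For example, to read back the values stored by my_analysis.py above:

rates = ws_updated.get("recording_01", "firing_rates_hz")
pop_rate = ws_updated.get("recording_01", "pop_rate")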

Sorting Jobs

Sorting jobs run the SpikeLab spike sorting pipeline on raw recording files and return the sorted results as a workspace.

Submitting

result = session.submit_sorting_job(
    recording_paths=["session1.raw.h5", "session2.raw.h5"],
    config="kilosort4",             # preset name, SortingPipelineConfig, or None
    config_overrides={"fr_min": 0.1, "snr_min": 6.0},
    job_spec=JobSpec(
        name_prefix="sorting-run",
        container=ContainerSpec(image="spikelab/sorting:latest"),
        resources=ResourceSpec(
            requests_cpu="4",
            requests_memory="16Gi",
            requests_gpu=1,
        ),
    ),
)

The session bundles the recording files and sorting configuration, uploads them to S3, and submits a job that runs sort_recording inside the container.

Retrieving results

ws_sorted = session.retrieve_result(result, local_dir="./sorted")

The returned workspace contains one namespace per recording. Each namespace holds:

  • "spikedata" — the curated SpikeData object

  • "sorting_metadata" — sorting parameters, curation history, and unit counts per stage

Any QC figures generated during sorting are downloaded to the local directory alongside the workspace.

You can then continue directly with analysis:

sd = ws_sorted.get("session1", "spikedata")
print(f"{sd.N} curated units, {sd.length / 1000:.1f} seconds")
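
The sorting metadata stored alongside each recording can be read the same way:

meta = ws_sorted.get("session1", "sorting_metadata")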

Policy Guardrails

Before submission, a preflight policy check runs automatically. It reports:

  • PASS — all checks pass

  • WARN — settings are risky but submission is allowed

  • BLOCK — submission is blocked by default

Policy thresholds (GPU limits, runtime caps, sleep detection) are configured per-profile via the policy section in the cluster profile YAML.
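
As an illustration only (the key names below are hypothetical, not the real schema), a policy section might look like:

policy:
  max_gpus: 2              # hypothetical key: cap on requested GPUs
  max_runtime_hours: 24    # hypothetical key: runtime cap before a WARN
  detect_sleep: true       # hypothetical key: flag sleep-style placeholders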

Current checks:

  • Detect disallowed placeholder commands such as sleep infinity

  • Ensure GPU request/limit consistency

  • Warn when request/limit tuning is likely inefficient

  • Warn when runtimes exceed the configured maximum

To override a blocked submission when you understand and accept the trade-offs:

result = session.submit_workspace_job(
    workspace=ws,
    script="my_analysis.py",
    job_spec=job_spec,
    allow_policy_risk=True,
)

Docker Images

Base images

Build reusable base images for CPU and GPU workloads:

docker build -f docker/analysis-base/Dockerfile.cpu -t spikelab/analysis-base:cpu .
docker build -f docker/analysis-base/Dockerfile.gpu -t spikelab/analysis-base:gpu .

Temporary images

Build and push a temporary image for a single run:

bash scripts/build_temp_image.sh gpu ghcr.io/<org>/spikelab-analysis-temp:<tag>
bash scripts/push_temp_image.sh ghcr.io/<org>/spikelab-analysis-temp:<tag>

Reference this tag in the ContainerSpec when creating your JobSpec.
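
For example, reusing the JobSpec pattern from earlier:

from spikelab.batch_jobs import JobSpec, ContainerSpec

job_spec = JobSpec(
    name_prefix="my-analysis",
    container=ContainerSpec(image="ghcr.io/<org>/spikelab-analysis-temp:<tag>"),
)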

Monitoring Jobs

Use the spikelab-batch-jobs CLI to check job status and stream logs:

spikelab-batch-jobs job-status <job-name>
spikelab-batch-jobs job-logs <job-name> --follow
spikelab-batch-jobs job-delete <job-name>

The job name is available in result.job_name after submission.