Batch Jobs
SpikeLab can submit analysis and spike sorting jobs to a Kubernetes cluster.
Both workflows follow the same pattern: bundle inputs, submit a job, and
retrieve results as an AnalysisWorkspace.
Prerequisites
Install the optional batch-jobs and s3 extras:
pip install spikelab[batch-jobs,s3]
Ensure Kubernetes access is configured:
kubectl version --client
kubectl config current-context
Configure AWS-compatible credentials in your shell (or use your normal credentials chain).
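For example, with static credentials exported in the shell (any method in the standard AWS credential chain also works):

export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_ENDPOINT_URL=...   # only if your S3-compatible store is not AWS itself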
Setting Up a Session
All batch operations go through a
RunSession. Create one from a cluster
profile:
from spikelab.batch_jobs import RunSession, load_cluster_profile
profile = load_cluster_profile("nrp")
session = RunSession.from_profile(profile)
The profile controls the default namespace, S3 prefix, Docker images,
credential mounts, and policy thresholds. You can also load a custom profile
from a YAML file with load_profile_from_name.
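As a rough illustration, a custom profile file might look like the following. The key names below are assumptions for illustration, not SpikeLab's documented schema; check a bundled profile such as "nrp" for the real layout.

# my_profile.yaml -- illustrative sketch only; real key names may differ
namespace: my-namespace
s3_prefix: s3://my-bucket/spikelab-runs
images:
  analysis: spikelab/analysis:latest
  sorting: spikelab/sorting:latest
policy:
  max_runtime_hours: 24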
Analysis Jobs
Analysis jobs run a user-provided script against an
AnalysisWorkspace. The workspace is
uploaded to the cluster, the script modifies it, and the updated workspace is
returned.
Submitting
from spikelab.batch_jobs import RunSession, JobSpec, ContainerSpec, ResourceSpec

result = session.submit_workspace_job(
    workspace=ws,              # AnalysisWorkspace or path to saved workspace
    script="my_analysis.py",   # analysis script to run
    job_spec=JobSpec(
        name_prefix="my-analysis",
        container=ContainerSpec(image="spikelab/analysis:latest"),
        resources=ResourceSpec(
            requests_cpu="2",
            requests_memory="8Gi",
        ),
    ),
)
print(result.job_name)
print(result.run_id)
The session saves the workspace to HDF5, bundles it with the analysis script,
uploads the bundle to S3, and submits a Kubernetes job. Inside the container,
the workspace is loaded and made available to the script as a global
workspace variable. The script reads from and writes to the workspace
directly.
Writing an analysis script
The analysis script runs inside the container with the workspace already loaded. Use it like a normal Python script:
# my_analysis.py
# 'workspace' is available as a global variable
sd = workspace.get("recording_01", "spikedata")
rates = sd.rates(unit="Hz")
workspace.store("recording_01", "firing_rates_hz", rates)
pop_rate = sd.get_pop_rate(square_width=20, gauss_sigma=100)
workspace.store("recording_01", "pop_rate", pop_rate)
After the script finishes, the workspace is saved and uploaded automatically.
Retrieving results
Once the job completes, retrieve the updated workspace:
ws_updated = session.retrieve_result(result, local_dir="./results")
This downloads the workspace from S3 and returns an
AnalysisWorkspace with all the results
your script stored.
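Anything the script stored is then available through the usual accessors:

rates = ws_updated.get("recording_01", "firing_rates_hz")
pop_rate = ws_updated.get("recording_01", "pop_rate")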
Sorting Jobs
Sorting jobs run the SpikeLab spike sorting pipeline on raw recording files and return the sorted results as a workspace.
Submitting
result = session.submit_sorting_job(
    recording_paths=["session1.raw.h5", "session2.raw.h5"],
    config="kilosort4",        # preset name, SortingPipelineConfig, or None
    config_overrides={"fr_min": 0.1, "snr_min": 6.0},
    job_spec=JobSpec(
        name_prefix="sorting-run",
        container=ContainerSpec(image="spikelab/sorting:latest"),
        resources=ResourceSpec(
            requests_cpu="4",
            requests_memory="16Gi",
            requests_gpu=1,
        ),
    ),
)
The session bundles the recording files and sorting configuration, uploads
them to S3, and submits a job that runs sort_recording inside the
container.
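For orientation, the in-container step is roughly equivalent to calling the pipeline locally. A minimal sketch, assuming sort_recording is importable from spikelab.sorting and accepts a recording path plus config arguments (import path and call signature are assumptions, not documented API):

# Hypothetical local equivalent of one in-container sorting step.
# Import path and signature are assumptions, not documented API.
from spikelab.sorting import sort_recording

sorted_data = sort_recording(
    "session1.raw.h5",
    config="kilosort4",
    config_overrides={"fr_min": 0.1, "snr_min": 6.0},
)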
Retrieving results
ws_sorted = session.retrieve_result(result, local_dir="./sorted")
The returned workspace contains one namespace per recording. Each namespace holds:
"spikedata"— the curatedSpikeDataobject"sorting_metadata"— sorting parameters, curation history, and unit counts per stage
Any QC figures generated during sorting are downloaded to the local directory alongside the workspace.
You can then continue directly with analysis:
sd = ws_sorted.get("session1", "spikedata")
print(f"{sd.N} curated units, {sd.length / 1000:.1f} seconds")
Policy Guardrails
Before submission, a preflight policy check runs automatically. It reports:
- PASS — checks are compliant
- WARN — settings are risky but allowed
- BLOCK — submission is blocked by default
Policy thresholds (GPU limits, runtime caps, sleep detection) are configured
per-profile via the policy section in the cluster profile YAML.
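For example, a policy section might look like the following (key names are illustrative assumptions, not the documented schema):

policy:
  max_runtime_hours: 24            # WARN beyond this
  max_gpus_per_job: 1              # BLOCK beyond this
  detect_sleep_placeholders: true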
Current checks:
- Detect disallowed batch placeholders such as sleep infinity
- Ensure GPU request/limit consistency
- Warn when request/limit tuning is likely inefficient
- Warn when runtimes exceed the configured maximum
To override a blocked submission when you understand and accept the trade-offs:
result = session.submit_workspace_job(
    workspace=ws,
    script="my_analysis.py",
    job_spec=job_spec,
    allow_policy_risk=True,
)
Docker Images
Base images
Build reusable base images for CPU and GPU workloads:
docker build -f docker/analysis-base/Dockerfile.cpu -t spikelab/analysis-base:cpu .
docker build -f docker/analysis-base/Dockerfile.gpu -t spikelab/analysis-base:gpu .
Temporary images
Build and push a temporary image for a single run:
bash scripts/build_temp_image.sh gpu ghcr.io/<org>/spikelab-analysis-temp:<tag>
bash scripts/push_temp_image.sh ghcr.io/<org>/spikelab-analysis-temp:<tag>
Reference this tag in the ContainerSpec when creating your JobSpec.
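For example, pointing a JobSpec at the freshly pushed tag:

job_spec = JobSpec(
    name_prefix="my-analysis",
    container=ContainerSpec(image="ghcr.io/<org>/spikelab-analysis-temp:<tag>"),
)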
Monitoring Jobs
Use the spikelab-batch-jobs CLI to check job status and stream logs:
spikelab-batch-jobs job-status <job-name>
spikelab-batch-jobs job-logs <job-name> --follow
spikelab-batch-jobs job-delete <job-name>
The job name is available in result.job_name after submission.
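Because submissions are ordinary Kubernetes Jobs, plain kubectl works as well:

kubectl get job <job-name>
kubectl logs -f job/<job-name>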