Spike Sorting
The spikelab.spike_sorting sub-package provides a full spike-sorting
pipeline: loading raw recordings, running a sorter backend (Kilosort2,
Kilosort4, or RT-Sort), extracting waveforms, curating units, and compiling
results into SpikeData objects.
See the Spike Sorting and Curation guide for usage examples and environment setup instructions.
Entry Points
- spikelab.spike_sorting.sort_recording(recording_files, config=None, sorter='kilosort2', intermediate_folders=None, results_folders=None, **kwargs)[source]
Run spike sorting on one or more recordings using any registered backend.
This is the primary entry point for the modular sorting pipeline.
- Parameters:
recording_files (list) – Paths to recording files or directories. Each entry is sorted independently. Directories have their contents concatenated before sorting and split back into per-file SpikeData afterward.
config (SortingPipelineConfig or None) – Pre-built configuration. When provided, **kwargs are applied as overrides via config.override(). When None, a fresh config is built from sorter + **kwargs. Preset configs are available in spikelab.spike_sorting.config (e.g. KILOSORT2).
sorter (str) – Registered sorter backend name. Only used when config is None. Available: "kilosort2", "kilosort4".
intermediate_folders (list or None) – Intermediate result directories, one per recording. Auto-generated if None.
results_folders (list or None) – Output directories, one per recording. Auto-generated if None.
**kwargs – Override individual config fields (e.g. snr_min=5.0, use_docker=True, fr_min=0.05). See spikelab.spike_sorting.config for all available parameters, grouped by: RecordingConfig, SorterConfig, WaveformConfig, CurationConfig, CompilationConfig, FigureConfig, ExecutionConfig.
- Returns:
One SpikeData per original recording file. For directory inputs, the concatenated recording is split back into per-file SpikeData objects.
- Return type:
Notes
Pickle files (sorted_spikedata_curated.pkl and optionally sorted_spikedata.pkl) are saved to each results folder.
hdf5_plugin_path (passed via config or kwargs) sets os.environ['HDF5_PLUGIN_PATH'] before any recording is loaded. This is needed for Maxwell .h5 files and applies to all backends.
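The pickle outputs named in the notes can be read back with the standard library. The sketch below relies only on the two file names given above; the helper name load_sorted_pickle and its return-None-when-missing behavior are illustrative assumptions, not part of SpikeLab. Unpickling real results also requires spikelab to be importable in the current environment.

```python
import pickle
from pathlib import Path


def load_sorted_pickle(results_folder, curated=True):
    """Load a sorting-result pickle from a results folder (hypothetical helper).

    Returns None when the requested file is absent, e.g. when the optional
    uncurated pickle was not saved.
    """
    name = "sorted_spikedata_curated.pkl" if curated else "sorted_spikedata.pkl"
    path = Path(results_folder) / name
    if not path.exists():
        return None
    with path.open("rb") as fh:
        # Only unpickle files you produced yourself: pickle can execute code.
        return pickle.load(fh)
```

Calling load_sorted_pickle(folder) on each results folder gives the curated object, and curated=False targets the optional uncurated file.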
- spikelab.spike_sorting.sort_multistream(recording, stream_ids, config=None, sorter='kilosort2', **kwargs)[source]
Sort a multi-stream recording across multiple stream IDs.
Calls sort_recording once per stream ID, routing each stream to its own intermediate and results folders. Validates that the requested stream IDs exist in the recording file before sorting.
- Parameters:
recording (str or Path) – Path to a single multi-stream recording file (e.g. MaxTwo .raw.h5) or a directory of such files. When a directory is given, all files are concatenated per stream.
stream_ids (list of str) – Stream identifiers to sort, e.g. ["well000", "well001", "well002"].
config (SortingPipelineConfig or None) – Pre-built configuration. When provided, **kwargs are applied as overrides.
sorter (str) – Registered sorter backend name (default "kilosort2"). Only used when config is None.
**kwargs – Override individual config fields. The following must not be provided: intermediate_folders and results_folders are auto-generated per stream; stream_id is set automatically per iteration.
- Returns:
{stream_id: list[SpikeData]}.
- Return type:
results (dict)
Notes
Stream ID validation uses SpikeInterface’s extractor for the recording format. Currently supports Maxwell .h5 files. For other formats, validation is skipped and invalid stream IDs will produce errors at loading time.
When recording is a directory of files, the files are concatenated per stream before sorting. Channel count and sampling frequency must match across files (a mismatch raises ValueError); mismatched channel IDs or locations produce warnings.
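The per-stream routing described above can be sketched without the library. This is an illustration of the documented behavior, not SpikeLab's implementation: sort_fn stands in for sort_recording, available_streams stands in for the stream IDs found by the extractor, and the folder layout under root is a made-up convention.

```python
from pathlib import Path


def sort_streams(recording, stream_ids, sort_fn, available_streams=None, root="sorted"):
    """Call sort_fn once per stream, each with its own folders (sketch)."""
    # Validate the requested IDs first when the available set is known.
    if available_streams is not None:
        missing = [s for s in stream_ids if s not in available_streams]
        if missing:
            raise ValueError(f"Stream IDs not found in recording: {missing}")

    results = {}
    for stream_id in stream_ids:
        # Hypothetical layout: one sub-tree per stream.
        inter = Path(root) / stream_id / "inter"
        out = Path(root) / stream_id / "results"
        results[stream_id] = sort_fn(
            [recording],
            intermediate_folders=[inter],
            results_folders=[out],
            stream_id=stream_id,
        )
    return results


def fake_sort(files, intermediate_folders, results_folders, stream_id):
    """Stub sorter that only labels each input file with its stream."""
    return [f"{stream_id}:{Path(f).name}" for f in files]


out = sort_streams(
    "plate.raw.h5",
    ["well000", "well001"],
    fake_sort,
    available_streams={"well000", "well001", "well002"},
)
```

The result is keyed by stream ID with one list per stream, which mirrors the documented {stream_id: list[SpikeData]} shape.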
Configuration
Configuration dataclass for the spike sorting pipeline.
Replaces the ~80 module-level globals in kilosort2.py with a single typed, inspectable configuration object that is passed explicitly to every pipeline function.
- class spikelab.spike_sorting.config.RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=<factory>, rec_chunks_s=<factory>, start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000)[source]
Bases: object
Parameters for recording loading and preprocessing.
- __init__(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=<factory>, rec_chunks_s=<factory>, start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000)
- class spikelab.spike_sorting.config.SorterConfig(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=False)[source]
Bases: object
Parameters for the spike sorter itself.
- __init__(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=False)
- class spikelab.spike_sorting.config.RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None)[source]
Bases: object
Parameters for the RT-Sort detection and sorting backend.
RT-Sort is an action-potential-propagation-based spike sorter using a deep learning detection model followed by codetection clustering and template matching. See van der Molen, Lim et al. 2024 (PLOS ONE, DOI: 10.1371/journal.pone.0312438) for algorithmic details.
- Parameters:
model_path (str or None) – Path to a folder containing init_dict.json and state_dict.pt for a pretrained ModelSpikeSorter. When None, the bundled model corresponding to probe is loaded.
probe (str) – Which bundled pretrained model to use when model_path is None. "mea" or "neuropixels".
device (str) – PyTorch device for inference. "cuda" or "cpu".
num_processes (int or None) – Number of worker processes for parallel detection/clustering stages. None selects an automatic value based on CPU count.
recording_window_ms (tuple or None) – (start_ms, end_ms) window of the recording to process. None processes the entire recording.
save_rt_sort_pickle (bool) – If True, serialize the final RTSort object to the sorter output folder so the trained sequences can be re-used in Phase 2 stim-aware sorting.
delete_inter (bool) – If True, delete the intermediate cache directory after sorting completes.
verbose (bool) – Print progress messages during sorting.
params (dict or None) – Override dictionary merged into the RT-Sort parameter set. Takes precedence over the preset defaults; useful for one-off tuning without editing a preset. Keys must match detect_sequences parameter names.
detection_window_s (float or None) – If set, run sequence detection on only the first detection_window_s seconds of the recording (the heavy GPU + clustering phase), then apply the resulting sequences to the full recording during sort_offline. Decouples the detection-phase memory ceiling from total recording length. None uses the full window for both phases (legacy behavior).
- __init__(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None)
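The precedence rule for params (overrides win over preset defaults) amounts to a dictionary merge. The sketch below is an assumption about the semantics, not RT-Sort's code: merge_params and allowed_keys are hypothetical, and the two preset values are copied from the RT_SORT_NEUROPIXELS preset listed in this reference.

```python
# Two values taken from the RT_SORT_NEUROPIXELS preset in this reference.
PRESET_PARAMS = {"stringent_thresh": 0.175, "loose_thresh": 0.075}


def merge_params(preset, overrides, allowed_keys):
    """Merge user overrides over preset values (illustrative helper)."""
    overrides = overrides or {}
    # Assumed validation step: reject keys outside the known parameter names.
    unknown = sorted(set(overrides) - set(allowed_keys))
    if unknown:
        raise KeyError(f"Not detect_sequences parameters: {unknown}")
    # Later entries win, so overrides take precedence over the preset.
    return {**preset, **overrides}


merged = merge_params(PRESET_PARAMS, {"loose_thresh": 0.05}, PRESET_PARAMS)
```

The preset dictionary itself is left unchanged, which matches the stated goal of one-off tuning without editing a preset.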
- class spikelab.spike_sorting.config.WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True)[source]
Bases: object
Parameters for waveform extraction and template computation.
Memory-budget note: the default extractor pre-allocates one (n_spikes, nsamples, num_channels) .npy memmap per unit before extraction begins. For high-unit-count sorters on high-density MEAs this grows to tens of GB (e.g. 400 units × 1018 channels = ~39 GB). When that exceeds host RAM, set streaming=True to use a one-unit-at-a-time path that discards each unit’s waveforms after templates and metrics are computed; peak RAM then becomes one unit’s buffer (~100 MB for MaxOne) regardless of total unit count. Waveform files are only written when save_waveform_files=True.
- __init__(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True)
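The figures in the memory-budget note can be reproduced with a short calculation. The inputs below are assumptions the note does not state: a 20 kHz sampling rate, float32 samples (4 bytes), nsamples equal to exactly ms_before + ms_after worth of samples, and every unit reaching the default max_waveforms_per_unit=300.

```python
def waveform_buffer_bytes(n_units, n_spikes, n_samples, n_channels, itemsize=4):
    """Bytes needed for one (n_spikes, n_samples, n_channels) array per unit."""
    return n_units * n_spikes * n_samples * n_channels * itemsize


fs_hz = 20_000  # assumed sampling rate
n_samples = int((2.0 + 2.0) * fs_hz / 1000)  # ms_before + ms_after -> 80 samples

total = waveform_buffer_bytes(400, 300, n_samples, 1018)
per_unit = waveform_buffer_bytes(1, 300, n_samples, 1018)
print(f"{total / 1e9:.1f} GB total, {per_unit / 1e6:.1f} MB per unit")
# prints "39.1 GB total, 97.7 MB per unit"
```

Under these assumptions the totals agree with the ~39 GB and ~100 MB quoted in the note.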
- class spikelab.spike_sorting.config.CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0)[source]
Bases: object
Parameters for unit quality-control curation.
- __init__(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0)
- class spikelab.spike_sorting.config.CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False)[source]
Bases: object
Parameters for result compilation and export.
- __init__(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False)
- class spikelab.spike_sorting.config.FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=<factory>, scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)')[source]
Bases: object
Parameters for QC figure generation.
- __init__(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=<factory>, scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)')
- class spikelab.spike_sorting.config.ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True)[source]
Bases: object
Parameters for pipeline execution control.
- __init__(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True)
- class spikelab.spike_sorting.config.SortingPipelineConfig(recording=<factory>, sorter=<factory>, rt_sort=<factory>, waveform=<factory>, curation=<factory>, compilation=<factory>, figures=<factory>, execution=<factory>)[source]
Bases: object
Complete configuration for a spike sorting pipeline run.
Groups all parameters into typed sub-configs. Passed explicitly to every pipeline function, replacing module-level globals.
- Parameters:
recording (RecordingConfig) – Recording loading and preprocessing.
sorter (SorterConfig) – Spike sorter selection and parameters.
rt_sort (RTSortConfig) – RT-Sort specific parameters (only used when sorter.sorter_name == "rt_sort").
waveform (WaveformConfig) – Waveform extraction and templates.
curation (CurationConfig) – Unit quality-control filters.
compilation (CompilationConfig) – Result export options.
figures (FigureConfig) – QC figure generation.
execution (ExecutionConfig) – Pipeline control and parallelism.
- recording: RecordingConfig
- sorter: SorterConfig
- rt_sort: RTSortConfig
- waveform: WaveformConfig
- curation: CurationConfig
- compilation: CompilationConfig
- figures: FigureConfig
- execution: ExecutionConfig
- classmethod from_kwargs(**kwargs)[source]
Build a config from flat keyword arguments.
Maps the flat parameter names used by sort_with_kilosort2() to the nested sub-config fields. Unknown keys raise TypeError.
- Parameters:
**kwargs – Flat keyword arguments matching sort_with_kilosort2() parameter names.
- Returns:
Populated configuration.
- Return type:
config (SortingPipelineConfig)
- override(**kwargs)[source]
Return a copy of this config with selected fields overridden.
Accepts the same flat keyword arguments as from_kwargs(). Unspecified fields retain their current values.
- Parameters:
**kwargs – Flat keyword arguments to override.
- Returns:
New config with overrides.
- Return type:
config (SortingPipelineConfig)
- __init__(recording=<factory>, sorter=<factory>, rt_sort=<factory>, waveform=<factory>, curation=<factory>, compilation=<factory>, figures=<factory>, execution=<factory>)
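The flat-to-nested mapping behind from_kwargs() and override() can be illustrated with a miniature config. This is a sketch of the documented behavior only (copy on override, TypeError on unknown keys); the Mini* classes are hypothetical stand-ins with a handful of fields and defaults taken from this reference, and SpikeLab's actual implementation may differ.

```python
from dataclasses import dataclass, field, fields, replace


@dataclass
class MiniSorter:
    sorter_name: str = "kilosort2"
    use_docker: bool = False


@dataclass
class MiniCuration:
    snr_min: float = 5.0
    fr_min: float = 0.05


@dataclass
class MiniPipelineConfig:
    sorter: MiniSorter = field(default_factory=MiniSorter)
    curation: MiniCuration = field(default_factory=MiniCuration)

    def override(self, **kwargs):
        """Return a copy with flat keyword arguments routed to sub-configs."""
        updates = {}
        for f in fields(self):
            sub = getattr(self, f.name)
            names = {sf.name for sf in fields(sub)}
            picked = {k: kwargs.pop(k) for k in list(kwargs) if k in names}
            updates[f.name] = replace(sub, **picked)  # fresh copy every time
        if kwargs:  # anything left over matched no sub-config field
            raise TypeError(f"Unknown config fields: {sorted(kwargs)}")
        return replace(self, **updates)

    @classmethod
    def from_kwargs(cls, **kwargs):
        """Build a config from defaults plus flat keyword arguments."""
        return cls().override(**kwargs)


base = MiniPipelineConfig()
tuned = base.override(snr_min=6.0, use_docker=True)
```

Because override() returns a new object, a shared preset can be reused across runs without one run's tuning leaking into the next.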
- spikelab.spike_sorting.config.KILOSORT2 = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True))
Default configuration for Kilosort2. Parameters are compatible with Maxwell MEA and other probe types. Hardware-specific presets can be created by overriding parameters.
- spikelab.spike_sorting.config.KILOSORT2_DOCKER = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort2', sorter_path=None, sorter_params=None, use_docker=True), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True))
Kilosort2 with Docker (no local MATLAB needed).
- spikelab.spike_sorting.config.KILOSORT4 = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort4', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True))
Default configuration for Kilosort4. Kilosort4 is pure Python (PyTorch) — no MATLAB required. Default parameters are tuned for Neuropixels probes but work for other probe types. Hardware-specific presets (e.g. for Maxwell MEAs) can be created by overriding detection/filtering parameters.
- spikelab.spike_sorting.config.KILOSORT4_DOCKER = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='kilosort4', sorter_path=None, sorter_params=None, use_docker=True), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True))
Kilosort4 with Docker.
- spikelab.spike_sorting.config.RT_SORT_MEA = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='rt_sort', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='mea', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params=None, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', '#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. 
STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True))
RT-Sort with the bundled MEA detection model. Uses the propagation-based RT-Sort algorithm (van der Molen, Lim et al. 2024, PLOS ONE) with the pretrained model tuned for Maxwell multi-electrode arrays.
- spikelab.spike_sorting.config.RT_SORT_NEUROPIXELS = SortingPipelineConfig(recording=RecordingConfig(stream_id=None, hdf5_plugin_path=None, first_n_mins=None, mea_y_max=None, gain_to_uv=None, offset_to_uv=None, rec_chunks=[], rec_chunks_s=[], start_time_s=None, end_time_s=None, freq_min=300, freq_max=6000), sorter=SorterConfig(sorter_name='rt_sort', sorter_path=None, sorter_params=None, use_docker=False), rt_sort=RTSortConfig(model_path=None, probe='neuropixels', device='cuda', num_processes=None, recording_window_ms=None, save_rt_sort_pickle=True, delete_inter=False, verbose=True, params={'stringent_thresh': 0.175, 'loose_thresh': 0.075, 'inference_scaling_numerator': 15.4, 'min_amp_dist_p': 0.1, 'max_latency_diff_spikes': 2.5, 'max_amp_median_diff_spikes': 0.45, 'max_latency_diff_sequences': 2.5, 'max_amp_median_diff_sequences': 0.45, 'max_root_amp_median_std_sequences': 2.5}, detection_window_s=None), waveform=WaveformConfig(ms_before=2.0, ms_after=2.0, pos_peak_thresh=2.0, max_waveforms_per_unit=300, compiled_ms_before=2.0, compiled_ms_after=2.0, scale_compiled_waveforms=True, std_at_peak=True, std_over_window_ms_before=0.5, std_over_window_ms_after=1.5, streaming=True, save_waveform_files=True), curation=CurationConfig(curate_first=True, curate_second=True, curation_epoch=None, fr_min=0.05, isi_viol_max=0.01, isi_violation_method='percent', snr_min=5.0, spikes_min_first=30, spikes_min_second=50, std_norm_max=1.0), compilation=CompilationConfig(compile_single_recording=True, compile_to_mat=False, compile_to_npz=True, compile_waveforms=False, save_electrodes=True, save_spike_times=True, save_raw_pkl=False, save_dl_data=False), figures=FigureConfig(create_figures=False, create_unit_figures=False, dpi=None, font_size=12, bar_x_label='Recording', bar_y_label='Number of Units', bar_label_rotation=0, bar_total_label='First Curation', bar_selected_label='Selected Curation', scatter_std_max_units_per_recording=None, scatter_recording_colors=['#f74343', '#fccd56', 
'#74fc56', '#56fcf6', '#1e1efa', '#fa1ed2'], scatter_recording_alpha=1.0, scatter_x_label='Number of Spikes', scatter_y_label='avg. STD / amplitude', scatter_x_max_buffer=300.0, scatter_y_max_buffer=0.2, templates_color_curated='#000000', templates_color_failed='#FF0000', templates_per_column=50, templates_y_spacing=50.0, templates_y_lim_buffer=10.0, templates_window_ms_before=5.0, templates_window_ms_after=5.0, templates_line_ms_before=1.0, templates_line_ms_after=4.0, templates_x_label='Time Rel. to Peak (ms)'), execution=ExecutionConfig(n_jobs=8, total_memory='16G', use_parallel_processing_for_raw_conversion=True, save_script=False, out_file='sort_with_kilosort2.out', random_seed=1, recompute_recording=False, recompute_sorting=False, reextract_waveforms=False, recurate_first=False, recurate_second=False, recompile_single_recording=False, delete_inter=True))
RT-Sort with the bundled Neuropixels detection model. Uses Neuropixels-tuned detection thresholds and merge parameters.
Backend Registry
Spike sorter backend registry.
Maps sorter names to their backend classes. Backends are imported lazily to avoid requiring all sorter dependencies at import time.
- spikelab.spike_sorting.backends.get_backend_class(sorter_name)[source]
Look up and import the backend class for a sorter name.
- Parameters:
sorter_name (str) – Registered sorter name (e.g. "kilosort2").
- Returns:
The SorterBackend subclass.
- Return type:
cls
- Raises:
ValueError – If the sorter name is not registered.
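A lazy registry of this kind is commonly built on importlib. The following is a generic sketch of the pattern, not SpikeLab's code: the registry entries point at standard-library classes so the example runs anywhere, and get_class is a hypothetical name.

```python
import importlib

# name -> (module path, class name). Standard-library targets stand in for
# real sorter backends so that nothing heavy is needed to run the sketch.
_REGISTRY = {
    "ordered": ("collections", "OrderedDict"),
    "fraction": ("fractions", "Fraction"),
}


def get_class(name):
    """Look up a registered name and import its class on demand."""
    try:
        module_path, class_name = _REGISTRY[name]
    except KeyError:
        raise ValueError(
            f"Unknown name {name!r}; registered: {sorted(_REGISTRY)}"
        ) from None
    # The import happens here, at lookup time, not when the registry is defined.
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

Storing module paths as strings is what keeps optional dependencies out of the import path until a specific backend is requested.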
- class spikelab.spike_sorting.backends.base.SorterBackend(config)[source]
Bases: ABC
Interface that each spike sorter backend must implement.
- Parameters:
config (SortingPipelineConfig) – Full pipeline configuration. Backends read their relevant sub-configs (config.recording, config.sorter, config.waveform, config.execution).
- abstract load_recording(rec_path)[source]
Load and preprocess a single recording.
Handles format-specific loading (Maxwell .h5, NWB, etc.), gain/offset scaling, and bandpass filtering.
- Parameters:
rec_path (Any) – Path to a recording file, a directory of files to concatenate, or a pre-loaded BaseRecording object.
- Returns:
A SpikeInterface BaseRecording ready for sorting (scaled, filtered, single-segment).
- Return type:
recording
- abstract sort(recording, rec_path, recording_dat_path, output_folder)[source]
Run the spike sorter on a preprocessed recording.
- Parameters:
recording – SpikeInterface BaseRecording from load_recording.
rec_path – Original recording file path (for binary conversion or metadata).
recording_dat_path (Path) – Path for the binary .dat file (used by sorters that require pre-converted input).
output_folder (Path) – Directory for sorter output files.
- Returns:
A SpikeInterface BaseSorting with detected units and spike trains.
- Return type:
sorting
- abstract extract_waveforms(recording, sorting, waveforms_folder, curation_folder, rec_path=None, rng=None)[source]
Extract per-unit waveforms and compute templates.
- Parameters:
recording – SpikeInterface BaseRecording.
sorting – SpikeInterface BaseSorting from sort.
waveforms_folder (Path) – Root directory for waveform storage.
curation_folder (Path) – Directory for initial unit list and metadata.
- Returns:
An object providing at minimum:
- sorting: the sorting object (possibly with centered spike times)
- recording: the recording object
- sampling_frequency: float
- peak_ind: int (peak sample index in template)
- chans_max_all: dict or array mapping unit_id to max-amplitude channel index
- use_pos_peak: dict or array mapping unit_id to bool (polarity)
- get_computed_template(unit_id, mode): returns (n_samples, n_channels) template array
- ms_to_samples(ms): time conversion
- root_folder: Path to waveform files
This can be the custom WaveformExtractor (Kilosort2 backend) or a wrapper around SpikeInterface’s WaveformExtractor (future backends).
- Return type:
waveform_extractor
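To make the shape of the interface concrete, here is a hedged sketch of a backend that satisfies the three abstract methods. `SorterBackendSketch` is a local stand-in that only mirrors the documented signatures, and `DummyBackend` and `run_stages` are hypothetical names; the placeholder dictionaries take the place of real SpikeInterface objects.

```python
from abc import ABC, abstractmethod


class SorterBackendSketch(ABC):
    """Local stand-in mirroring the three documented abstract methods."""

    def __init__(self, config):
        self.config = config

    @abstractmethod
    def load_recording(self, rec_path): ...

    @abstractmethod
    def sort(self, recording, rec_path, recording_dat_path, output_folder): ...

    @abstractmethod
    def extract_waveforms(self, recording, sorting, waveforms_folder,
                          curation_folder, rec_path=None, rng=None): ...


class DummyBackend(SorterBackendSketch):
    """Hypothetical backend returning placeholders instead of real objects."""

    def load_recording(self, rec_path):
        return {"path": rec_path}

    def sort(self, recording, rec_path, recording_dat_path, output_folder):
        return {"units": [], "recording": recording}

    def extract_waveforms(self, recording, sorting, waveforms_folder,
                          curation_folder, rec_path=None, rng=None):
        return {"sorting": sorting, "recording": recording}


def run_stages(backend, rec_path, dat_path, out_dir, wf_dir, cur_dir):
    """Call the three stages in the order the interface implies."""
    recording = backend.load_recording(rec_path)
    sorting = backend.sort(recording, rec_path, dat_path, out_dir)
    return backend.extract_waveforms(recording, sorting, wf_dir, cur_dir,
                                     rec_path=rec_path)
```

Because every method is abstract, instantiating the base class directly raises `TypeError`, so a backend that forgets one stage fails at construction rather than midway through a sort.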
Classified Exceptions
When a sort fails, SpikeLab can classify the failure into one of three categories so that callers can implement skip/retry/stop policies without parsing generic error messages.
Classified spike-sorting exceptions shared across runners and curation.
Failures from Kilosort2, Kilosort4, and the downstream curation/waveform
code are grouped into three categories so callers can implement retry /
skip / hard-stop policies without parsing generic Exception messages:
- BiologicalSortFailure: the recording itself cannot be sorted (too silent, all channels bad, no waveforms to compute metrics on). Recommended policy: mark the target as not-sortable, move on, do not retry.
- EnvironmentSortFailure: the host environment or container runtime is misconfigured. Recommended policy: hard stop and surface to the operator; retrying without intervention will loop.
- ResourceSortFailure: the job exhausted a machine resource (GPU memory today; disk/CPU in future). Recommended policy: retry with reduced parameters rather than skip or hard-stop.
Classifiers in _classifier inspect sorter logs and exception
chains to re-raise generic failures as one of the specific types below.
The classes are also usable directly from non-classifier paths (e.g.
curation code that already knows the exact condition).
- exception spikelab.spike_sorting._exceptions.SpikeSortingClassifiedError[source]
Bases: RuntimeError
Base class for all classified sort-pipeline failures.
Catch this when you want to treat any identified failure uniformly. Prefer catching the more specific categorical bases (BiologicalSortFailure, EnvironmentSortFailure, ResourceSortFailure) when the policy differs by category.
- exception spikelab.spike_sorting._exceptions.BiologicalSortFailure[source]
Bases: SpikeSortingClassifiedError
Failure caused by the recording itself (too little signal).
- exception spikelab.spike_sorting._exceptions.EnvironmentSortFailure[source]
Bases: SpikeSortingClassifiedError
Failure caused by host or container environment misconfiguration.
- exception spikelab.spike_sorting._exceptions.ResourceSortFailure[source]
Bases: SpikeSortingClassifiedError
Failure caused by exhausting a machine resource.
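The three categorical bases map directly onto the recommended skip / retry / hard-stop policies. The sketch below shows one way a batch driver might apply them. The exception classes are local stand-ins that reuse the documented names and hierarchy, while `run_with_policy`, the `reduced` flag, and the outcome strings are hypothetical.

```python
class SpikeSortingClassifiedError(RuntimeError):
    """Local stand-in for the documented base class."""


class BiologicalSortFailure(SpikeSortingClassifiedError):
    pass


class EnvironmentSortFailure(SpikeSortingClassifiedError):
    pass


class ResourceSortFailure(SpikeSortingClassifiedError):
    pass


def run_with_policy(sort_fn, targets):
    """Apply skip / retry / hard-stop per failure category.

    ``sort_fn(target, reduced)`` is a hypothetical callable that sorts one
    target, optionally with reduced parameters.
    """
    outcomes = {}
    for target in targets:
        try:
            outcomes[target] = sort_fn(target, reduced=False)
        except BiologicalSortFailure:
            # Not sortable: record it and move on, never retry.
            outcomes[target] = "skipped"
        except ResourceSortFailure:
            # Retry once with reduced parameters.
            try:
                outcomes[target] = sort_fn(target, reduced=True)
            except ResourceSortFailure:
                outcomes[target] = "resource_exhausted"
        # EnvironmentSortFailure is deliberately not caught: it propagates
        # and stops the whole batch so an operator can intervene.
    return outcomes
```

Leaving `EnvironmentSortFailure` uncaught is the design choice that implements the hard stop: a misconfigured host would fail every remaining target in the same way.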
- exception spikelab.spike_sorting._exceptions.InsufficientActivityError(message, *, sorter, threshold_crossings=None, units_at_failure=None, nspks_at_failure=None, log_path=None)[source]
Bases: BiologicalSortFailure
Sorting crashed because the recording has too little spiking activity.
Kilosort2, Kilosort4, and RT-Sort all fail on near-silent recordings, but in different ways:
- Kilosort2: mex kernels launch with degenerate grid/block configurations when template counts and per-batch spike counts approach zero. Pre-Blackwell GPUs tolerated these launches; newer architectures (compute capability ≥ 12) reject them with CUDA error: invalid configuration argument.
- Kilosort4: sklearn’s TruncatedSVD rejects an empty feature matrix, or KMeans fails the n_samples >= n_clusters check, when the initial spike-detection pass finds essentially no events.
- RT-Sort: detect_sequences produces zero propagation sequences when the recording lacks sufficient spiking activity for clustering. It returns None, which causes an AttributeError when sort_offline is subsequently called.
- threshold_crossings
KS2 only; count of detected threshold crossings parsed from kilosort2.log. None for KS4 / RT-Sort.
- units_at_failure
KS2 template count at the crash, or KS4 n_samples when KMeans complained. None when the log did not expose the value.
- nspks_at_failure
KS2 only; spikes-per-batch at the failing template-optimization step.
- log_path
Sorter log file carrying the full trace when located.
- sorter
Short identifier of the sorter that raised ("kilosort2", "kilosort4", "rt_sort").
- exception spikelab.spike_sorting._exceptions.NoGoodChannelsError(message, *, sorter, total_channels=None, bad_channels=None, log_path=None)[source]
Bases: BiologicalSortFailure
All channels were flagged as bad by the sorter’s good-channel check.
Distinct from InsufficientActivityError: the signal may be noisy/present but no channel passes the sorter’s minfr_goodchannels (or equivalent) firing-rate threshold.
- total_channels
Total channel count in the recording, when parsed.
- bad_channels
Channels flagged as bad.
- log_path
Sorter log file carrying the full trace when located.
- sorter
Short identifier of the sorter that raised.
- exception spikelab.spike_sorting._exceptions.SaturatedSignalError(message, *, channels_saturated=None, total_channels=None)[source]
Bases: BiologicalSortFailure
Recording appears flat or rail-saturated across all channels.
Typical causes: disconnected electrodes, loss of fluid contact, broken amplifier front-end, or a saved recording that never received real data. Distinct from InsufficientActivityError because it reflects a hardware/acquisition fault rather than biology.
The sort-time log signatures are ambiguous with near-silent biology, so this class is currently intended to be raised by dedicated pre-sort validators (e.g. per-channel variance / rail-clip checks) rather than by the post-failure classifiers. Callers that already know the condition may raise it directly.
- channels_saturated
Number of channels identified as saturated, when the caller provides this.
- total_channels
Total channel count in the recording.
- exception spikelab.spike_sorting._exceptions.EmptyWaveformMetricsError(message, *, metric_name=None)[source]
Bases: BiologicalSortFailure, ValueError
Waveform metrics (SNR, std-norm) cannot be computed.
Raised when curation requests a waveform-based metric but no precomputed values exist and raw_data on the SpikeData is empty, so there is nothing to extract waveforms from.
This is biology-adjacent: it typically means the upstream sorter produced units that have no usable waveform evidence attached, or that the pipeline skipped the waveform-extraction stage. Callers should treat it as “cannot curate this target” rather than retry.
Inherits from both BiologicalSortFailure (for category-aware handling) and ValueError (for backward compatibility with callers that historically caught ValueError from this site).
- metric_name
The metric that could not be computed.
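The dual inheritance means both older `except ValueError` handlers and newer category-aware handlers see the same exception. A minimal sketch, using local stand-in classes (the real `BiologicalSortFailure` has an additional intermediate base) and two hypothetical caller functions:

```python
class BiologicalSortFailure(RuntimeError):
    """Local stand-in for the documented category base."""


class EmptyWaveformMetricsError(BiologicalSortFailure, ValueError):
    """Stand-in mirroring the documented signature and both bases."""

    def __init__(self, message, *, metric_name=None):
        super().__init__(message)
        self.metric_name = metric_name


def legacy_caller(fn):
    """Older call site that only knows about ValueError."""
    try:
        return fn()
    except ValueError as exc:
        return f"legacy handler: {exc}"


def category_caller(fn):
    """Newer call site that handles the whole biological category."""
    try:
        return fn()
    except BiologicalSortFailure as exc:
        metric = getattr(exc, "metric_name", None)
        return f"cannot curate this target (metric={metric})"


def failing_curation():
    # Hypothetical failure site: no precomputed SNR and no raw data.
    raise EmptyWaveformMetricsError("no waveforms available", metric_name="snr")
```

Both callers handle `failing_curation` without modification, which is the backward-compatibility property the class is designed to preserve.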
- exception spikelab.spike_sorting._exceptions.HDF5PluginMissingError(message, *, configured_path=None)[source]
Bases: EnvironmentSortFailure
HDF5 filter plugin is missing or the plugin path is misconfigured.
Typical signatures in the underlying exception chain: h5py / HDF5 errors about being unable to open a compressed dataset, or the inherited HDF5_PLUGIN_PATH environment variable pointing to a non-existent directory.
Recommended remediation (operator, not the library): set HDF5_PLUGIN_PATH to a directory containing the compression plugin required by the recording’s HDF5 build before any h5py import. The exact directory and plugin name are deployment-specific.
- configured_path
The value of HDF5_PLUGIN_PATH at failure time, if known.
- exception spikelab.spike_sorting._exceptions.DockerEnvironmentError(message, *, reason)[source]
Bases: EnvironmentSortFailure
Docker daemon, client library, or image is unusable for sorting.
The reason string narrows the failure mode so callers can render better diagnostics or choose different remediations without catching sub-exceptions.
Recognized reason values:
- "daemon_down": Cannot connect to the Docker daemon.
- "client_missing": The Python docker client library is not installed in the sorting env.
- "image_pull_failed": Image pull returned an error (network, auth, or manifest-not-found).
- "permission_denied": Socket permission denied; user not in the docker group or equivalent.
- "other": Docker is broken in a way that did not match any known signature; inspect __cause__ for details.
- reason
One of the strings above.
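Because the failure mode travels in `reason` rather than in separate subclasses, a caller can branch on a plain string. The sketch below maps each recognized value to an operator hint. The exception class is a local stand-in with the documented keyword-only argument, and the hint wording and the `operator_hint` helper are assumptions for illustration, not library output.

```python
class DockerEnvironmentError(RuntimeError):
    """Local stand-in with the documented keyword-only ``reason``."""

    def __init__(self, message, *, reason):
        super().__init__(message)
        self.reason = reason


# Illustrative operator hints; the wording is an assumption.
_HINTS = {
    "daemon_down": "Start the Docker daemon, then re-run the sort.",
    "client_missing": "Install the Python docker client in the sorting env.",
    "image_pull_failed": "Check network access, registry auth, and the image tag.",
    "permission_denied": "Add the user to the docker group or equivalent.",
}


def operator_hint(exc):
    """Map ``exc.reason`` to a hint, falling back to the chained cause."""
    hint = _HINTS.get(exc.reason)
    if hint is None:
        # "other" or an unrecognized reason: surface __cause__ for diagnosis.
        return f"Unrecognized Docker failure; cause: {exc.__cause__!r}"
    return hint
```

A lookup table keeps the handling in one place and degrades gracefully if a new `reason` value appears that the caller does not yet know about.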
- exception spikelab.spike_sorting._exceptions.ModelLoadingError(message, *, sorter='rt_sort', model_path=None)[source]
Bases: EnvironmentSortFailure
Detection model could not be loaded or is unusable.
Raised when RT-Sort’s ModelSpikeSorter.load() fails, typically because PyTorch is missing, weights are corrupt, the model folder does not exist, or the architecture parameters do not match the saved state dict.
- model_path
Path that was attempted, when known.
- sorter
Short identifier of the sorter that raised.
- exception spikelab.spike_sorting._exceptions.GPUOutOfMemoryError(message, *, sorter, log_path=None)[source]
Bases: ResourceSortFailure
The sorter exhausted GPU memory.
Raised when either a PyTorch CUDA out of memory error (KS4) or a MATLAB/mex CUDA_ERROR_OUT_OF_MEMORY diagnostic (KS2) appears in the exception chain or sorter log.
Recommended remediation: reduce batch size / NT / nPCs, split the recording into shorter segments, or run on a larger-memory GPU. Retrying the same command unchanged will loop.
- sorter
Short identifier of the sorter that raised.
- log_path
Sorter log file carrying the full trace when located.
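Since retrying unchanged will loop, any retry has to change a parameter. One possible approach, sketched under assumptions: halve a batch-size parameter on each out-of-memory failure until a floor is reached. The exception class is a local stand-in with the documented signature, and `sort_with_backoff`, `batch_size`, and `min_batch_size` are hypothetical names, not spikelab parameters.

```python
class GPUOutOfMemoryError(RuntimeError):
    """Local stand-in mirroring the documented constructor."""

    def __init__(self, message, *, sorter, log_path=None):
        super().__init__(message)
        self.sorter = sorter
        self.log_path = log_path


def sort_with_backoff(sort_fn, batch_size, min_batch_size=1024):
    """Call ``sort_fn(batch_size)``, halving the batch size on GPU OOM.

    Returns ``(result, batch_size_used)``. Re-raises the last
    GPUOutOfMemoryError once halving would drop below ``min_batch_size``.
    """
    while True:
        try:
            return sort_fn(batch_size), batch_size
        except GPUOutOfMemoryError:
            next_size = batch_size // 2
            if next_size < min_batch_size:
                raise  # nothing left to reduce: surface the failure
            batch_size = next_size
```

The floor guarantees termination: each attempt strictly reduces the batch size, so the loop either succeeds or re-raises after a bounded number of tries.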
Post-Failure Classifiers
The classifier module inspects sorter logs and exception chains to produce specific SpikeSortingClassifiedError subclasses from generic failures.
- spikelab.spike_sorting._classifier.classify_ks2_failure(output_folder, exc)[source]
Return a classified exception for a Kilosort2 failure, or None.
Priority: environment → resource → biology. Environment and resource errors can appear on any recording, so they take precedence over biology signatures that would otherwise be consistent with them.
- Return type:
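The environment → resource → biology ordering can be illustrated with a small first-match classifier over log text. This is a sketch of the priority idea only, not the module's actual logic: the `_PRIORITY` table and `classify_text` are hypothetical, and the substrings are borrowed from the exception descriptions in this reference as example signatures.

```python
# Ordered (category, substrings) pairs: earlier entries win on a tie.
# The signatures are illustrative examples, not an exhaustive list.
_PRIORITY = [
    ("environment", ("Cannot connect to the Docker daemon",)),
    ("resource", ("CUDA_ERROR_OUT_OF_MEMORY", "CUDA out of memory")),
    ("biology", ("invalid configuration argument",)),
]


def classify_text(text):
    """Return the first matching category name, or None if nothing matches."""
    for category, needles in _PRIORITY:
        if any(needle in text for needle in needles):
            return category
    return None
```

With this ordering, a log that contains both an out-of-memory diagnostic and a biology-like signature is reported as a resource failure, so the target gets retried with reduced parameters instead of being written off as unsortable.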
- spikelab.spike_sorting._classifier.classify_ks4_failure(output_folder, exc)[source]
Return a classified exception for a Kilosort4 failure, or None.
Priority mirrors KS2. KS4 does not expose a distinct “all channels bad” diagnostic the same way KS2 does, so only the generic biology classifier (insufficient activity) is applied.
- Return type: