SpikeData

The SpikeData class is the primary data structure in SpikeLab. It holds spike trains for one or more units (neurons or electrodes), where each spike train is a sequence of spike times. All analysis methods assume spike times are in milliseconds. Most analysis workflows start by creating or loading a SpikeData object.

class spikelab.spikedata.SpikeData(train, *, N=None, length=None, start_time=0.0, neuron_attributes=None, metadata=None, raw_data=None, raw_time=None)[source]

Bases: object

Neuronal spike data with functionality for loading, processing, and analyzing.

train

List of numpy arrays, where each array contains the spike times for a particular neuron.

Type:

list[numpy.ndarray]

N

The number of neurons in the dataset.

Type:

int

length

The total duration of the time window in milliseconds.

Type:

float

start_time

The time origin in milliseconds (default 0.0). For event-centered data, this is typically -pre_ms. Spike times fall within [start_time, start_time + length].

Type:

float

neuron_attributes

A list of dictionaries containing information on each neuron.

Type:

list[dict] or None

metadata

A dictionary containing any additional information or metadata about the spike data.

Type:

dict

raw_data

If provided, this numpy array contains the raw time series data.

Type:

numpy.ndarray

raw_time

Either a numpy array of sample times, or a single float representing a sample rate in kHz.

Type:

numpy.ndarray or float

static from_idces_times(idces, times, N=None, **kwargs)[source]

Create a SpikeData object from lists of unit indices and spike times.

Parameters:
  • idces (list) – List of unit indices.

  • times (list) – List of spike times.

  • N (int) – Number of units (optional).

  • **kwargs – Additional keyword arguments for the SpikeData constructor.

Returns:

A new SpikeData object with the given unit indices and spike times.

Return type:

spike_data (SpikeData)

Notes

  • This method is a wrapper around the _train_from_i_t_list helper function.

  • When idces is empty and N is None, defaults to 0 units and length=0.
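As an illustration of the grouping this constructor performs, here is a minimal numpy sketch (not SpikeLab's actual `_train_from_i_t_list` implementation) that turns flat index/time lists into per-unit trains:

```python
import numpy as np

def train_from_idces_times(idces, times, N=None):
    """Group flat (unit index, spike time) arrays into per-unit spike trains."""
    idces = np.asarray(idces)
    times = np.asarray(times, dtype=float)
    if N is None:
        # Mirrors the documented default: 0 units when idces is empty.
        N = 0 if len(idces) == 0 else int(idces.max()) + 1
    # Sort by time so each train comes out in temporal order.
    order = np.argsort(times, kind="stable")
    idces, times = idces[order], times[order]
    return [times[idces == i] for i in range(N)]

trains = train_from_idces_times([0, 1, 0, 2], [5.0, 1.0, 2.0, 3.0])
# trains[0] contains [2.0, 5.0], trains[1] contains [1.0], trains[2] contains [3.0]
```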

static from_raster(raster, bin_size_ms, **kwargs)[source]

Create a SpikeData object from a spike raster matrix with shape (N, T).

Parameters:
  • raster (numpy.ndarray) – Spike raster matrix with shape (N [units], T [time bins]).

  • bin_size_ms (float) – Size of each time bin in milliseconds.

  • **kwargs – Additional keyword arguments for the SpikeData constructor.

Returns:

Object with the given spike raster.

Return type:

sd (SpikeData)

Notes

  • The generated spike times are evenly spaced within each time bin. For example, if a unit fires 3 times in a 10 ms time bin, those events go at 2.5, 5, and 7.5 ms after the start of the bin.

  • All metadata parameters of the regular constructor are accepted.

  • For event-centered rasters (where bin 0 corresponds to start_time, not t=0), pass start_time in kwargs so that spike times are correctly offset. Without it, bin 0 maps to t=0.
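The even-spacing rule in the note above can be sketched in plain numpy (an illustrative sketch, not SpikeLab's implementation): k spikes in a bin are placed at fractions 1/(k+1) … k/(k+1) of the bin width.

```python
import numpy as np

def bin_counts_to_times(counts, bin_size_ms, start_time=0.0):
    """Spread k spikes evenly inside each bin at fractions 1/(k+1) .. k/(k+1)."""
    times = []
    for j, k in enumerate(counts):
        left = start_time + j * bin_size_ms
        times.extend(left + bin_size_ms * (np.arange(1, k + 1) / (k + 1)))
    return np.array(times)

# 3 spikes in a 10 ms bin land at 2.5, 5, and 7.5 ms, matching the note above.
spikes = bin_counts_to_times([3], 10.0)
```

Passing `start_time` shifts every bin, which is how an event-centered raster (bin 0 at `-pre_ms`) gets correctly offset spike times.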

static from_events(events, N=None, **kwargs)[source]

Create a SpikeData object from a list of (unit index, time) pairs.

Parameters:
  • events (list) – List of (index, time) pairs.

  • N (int) – Number of units (default: maximum index in the events).

  • **kwargs – Additional keyword arguments for the SpikeData constructor.

Returns:

Object with the given events.

Return type:

sd (SpikeData)

Notes

  • This method is a wrapper around the from_idces_times helper function. All metadata parameters of the regular constructor are accepted.

static from_neo_spiketrains(spiketrains, **kwargs)[source]

Create a SpikeData object from a list of neo.SpikeTrain objects.

Parameters:
  • spiketrains (list) – List of neo.SpikeTrain objects.

  • **kwargs – Additional keyword arguments for the SpikeData constructor.

Returns:

Object with the given spike trains in milliseconds.

Return type:

sd (SpikeData)

static from_thresholding(data, fs_Hz=20000.0, threshold_sigma=5.0, filter=True, hysteresis=True, direction='both')[source]

Create a SpikeData object by filtering and thresholding raw electrophysiological data.

Parameters:
  • data (numpy.ndarray) – Raw data with shape (channels, time).

  • fs_Hz (float) – Sampling frequency in Hz.

  • threshold_sigma (float) – Threshold in units of per-channel standard deviation.

  • filter (dict or bool) – Filter configuration or False to disable filtering; if True, a third-order Butterworth filter with passband 300 Hz to 6 kHz is used.

  • hysteresis (bool) – Use hysteresis for thresholding.

  • direction (str) – Direction of thresholding (‘both’, ‘up’, ‘down’).

Returns:

Object with the given raw data.

Return type:

sd (SpikeData)

Notes

  • To use different filter parameters, pass a dictionary, which will be passed as keyword arguments to butter_filter().

  • If filter is False, no filtering is done.
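The core of threshold-based spike detection can be sketched for a single channel as follows (an illustrative numpy sketch under the stated semantics, not SpikeLab's implementation, and without the Butterworth filtering or hysteresis steps):

```python
import numpy as np

def threshold_crossings(x, threshold_sigma=5.0, direction="both"):
    """Return sample indices where a single-channel signal first crosses the threshold."""
    thr = threshold_sigma * x.std()  # threshold in units of the channel's std
    if direction == "up":
        above = x > thr
    elif direction == "down":
        above = x < -thr
    else:  # 'both'
        above = np.abs(x) > thr
    # Keep only the first sample of each suprathreshold run (the rising edge).
    idx = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    if above[0]:
        idx = np.insert(idx, 0, 0)
    return idx
```

Dividing the resulting sample indices by the sampling rate in kHz yields spike times in milliseconds.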

__init__(train, *, N=None, length=None, start_time=0.0, neuron_attributes=None, metadata=None, raw_data=None, raw_time=None)[source]

Initialize a SpikeData object from a list of spike trains.

Parameters:
  • train (list) – List of spike trains, each a list of spike times in milliseconds. Spike times can be negative for event-centered data (e.g. -200 to +300 around a stimulus event).

  • N (int) – Number of units (optional).

  • length (float) – Total duration of the time window in milliseconds (optional). For event-centered data with times from -200 to +300, length is 500. Defaults to the span from start_time to the latest spike time.

  • start_time (float) – Time of the first bin in milliseconds (default 0.0). For event-centered data, this is typically -pre_ms (e.g. -200.0). Spike times must fall within [start_time, start_time + length].

  • neuron_attributes (list) – List of neuron attributes (optional).

  • metadata (dict) – Dictionary of metadata (optional).

  • raw_data (numpy.ndarray) – Raw timeseries data with shape (channels, time) (optional).

  • raw_time (numpy.ndarray or float) – Raw time vector with shape (time) or sample rate in kHz (optional).

Notes

  • Arbitrary raw timeseries data, not associated with particular units, can be passed in as raw_data (an array with shape (channels, time)).

  • The raw_time argument can also be a sample rate in kHz, in which case it is generated assuming that the start of the raw data corresponds with t=0.

property times

Iterate spike times for all units in time order.

property events

Iterate (index,time) pairs for all units in time order.

idces_times()[source]

Generate matched arrays of unit indices and times for all events.

Returns:

Array of unit indices. times (numpy.ndarray): Array of times for all events.

Return type:

idces (numpy.ndarray)

Notes

  • Unlike times and events, this method is not a property, because the arrays must actually be constructed in memory.

property unit_locations: ndarray | None

Get unit locations as an (N, D) array where D is the spatial dimension.

Returns:

Array of unit locations, shape (N, D). None if any unit lacks location data.

Return type:

locations (numpy.ndarray or None)

Notes

  • Extracts from neuron_attributes ‘location’, ‘x’/’y’/’z’ (x and y required), or ‘position’ keys.

property electrodes: ndarray | None

Get electrode/channel indices for each unit as a 1D array.

Returns:

1D array of electrode indices. None if any unit lacks electrode data.

Return type:

electrodes (numpy.ndarray or None)

Notes

  • Extracts from neuron_attributes ‘electrode’, ‘channel’, or ‘ch’ keys.

frames(length, overlap=0)[source]

Split the recording into a SpikeSliceStack of fixed-length windows.

Parameters:
  • length (float) – Length of each window in milliseconds.

  • overlap (float) – Overlap between consecutive windows in milliseconds (default: 0).

Returns:

Stack of SpikeData windows, one per frame.

Return type:

stack (SpikeSliceStack)

Notes

  • Windows that would extend past the end of the recording are excluded.

  • overlap must be strictly less than length.

align_to_events(events, pre_ms, post_ms, *, kind='spike', bin_size_ms=1.0, sigma_ms=10)[source]

Align spike trains to events and return an event-aligned slice stack.

Parameters:
  • events (array-like or str) – Event times in milliseconds, or a string key into self.metadata whose value is an array of event times in ms.

  • pre_ms (float) – Window duration before each event in milliseconds.

  • post_ms (float) – Window duration after each event in milliseconds.

  • kind (str) – "spike" to return a SpikeSliceStack, or "rate" to return a RateSliceStack. Default "spike".

  • bin_size_ms (float) – Time bin width in milliseconds. Only used when kind="rate". Default 1.0.

  • sigma_ms (float) – Gaussian smoothing sigma in milliseconds for ISI-based firing rate estimation. Only used when kind="rate". Default 10.

Returns:

Event-aligned slice stack with one slice per event. Events whose window extends outside [start_time, start_time + length] are dropped with a warning.

Return type:

stack (SpikeSliceStack or RateSliceStack)

Notes

  • When events is a metadata key, the corresponding array must already be in milliseconds (as stored by load_spikedata_from_ibl).

binned(bin_size=40.0)[source]

Count the number of events in each time bin across all units.

Bins are relative to start_time: (start_time, start_time + bin_size], (start_time + bin_size, start_time + 2*bin_size], etc. A spike at exactly start_time is included in bin 0.

Parameters:

bin_size (float) – Size of the time bin in milliseconds.

Returns:

Array of the number of events in each bin.

Return type:

binned_raster (numpy.ndarray)
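The bin-assignment rule described above (left-open, right-closed bins, with a spike at exactly start_time clipped into bin 0) can be expressed as a one-line numpy sketch; this is illustrative, not SpikeLab's implementation:

```python
import numpy as np

def bin_index(t, start_time, bin_size):
    """Bin index under (start, start + bin_size], ... semantics.

    A spike at exactly start_time maps to bin 0.
    """
    idx = np.ceil((np.asarray(t, dtype=float) - start_time) / bin_size).astype(int) - 1
    return np.maximum(idx, 0)  # clip the spike at exactly start_time into bin 0

# With start_time=0 and 40 ms bins: t=0 -> bin 0, t=40 -> bin 0, t=40.1 -> bin 1.
bins = bin_index([0.0, 40.0, 40.1], 0.0, 40.0)
```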

binned_meanrate(bin_size=40, unit='kHz')[source]

Calculate the mean firing rate across the population in each time bin.

Parameters:
  • bin_size (float) – Size of the time bin in milliseconds.

  • unit (str) – Unit of the firing rate (‘Hz’ or ‘kHz’).

Returns:

Array of the mean firing rate in each bin.

Return type:

binned_meanrate (numpy.ndarray)

Notes

  • The rate is calculated as the number of events in each bin divided by the bin size and number of units.

rates(unit='kHz')[source]

Calculate the mean firing rate of each neuron over the recording.

Parameters:

unit (str) – Unit of the firing rate (‘Hz’ or ‘kHz’).

Returns:

Array of the firing rate of each neuron.

Return type:

rates (numpy.ndarray)

resampled_isi(times, sigma_ms=10.0)[source]

Calculate instantaneous firing rate of each unit at the given times.

Computes interspike intervals and interpolates their inverse.

Parameters:
  • times (numpy.ndarray) – Array of times to resample the firing rate to.

  • sigma_ms (float) – Standard deviation of the Gaussian kernel in milliseconds.

Returns:

Object with inst_Frate_data (N, T) and times; units: Hz (spikes/s).

Return type:

RateData

sliding_rate(window_size, step_size=None, sampling_rate=None, t_start=None, t_end=None, gauss_sigma=0.0, apply_square=True)[source]

Compute continuous firing rate of each unit using a sliding-window average.

For each time bin t, counts spikes in the centered window [t - W/2, t + W/2] and returns rate R(t) = N / W (spikes per time unit, e.g. kHz).

Parameters:
  • window_size (float) – Width of the sliding window in ms. Centered window [t - W/2, t + W/2].

  • step_size (float, optional) – Advance step for time bins in ms. Mutually exclusive with sampling_rate.

  • sampling_rate (float, optional) – Samples per ms; step_size = 1 / sampling_rate. Mutually exclusive with step_size.

  • t_start (float, optional) – Start of output time range in ms. Default: start_time - window_size/2.

  • t_end (float, optional) – End of output time range in ms. Default: start_time + length + window_size/2.

  • gauss_sigma (float, optional) – Gaussian smoothing sigma in ms. If 0, Gaussian smoothing is disabled.

  • apply_square (bool, optional) – If True, applies the square-window smoothing defined by window_size. If False, computes per-bin rates first and then applies optional Gaussian smoothing.

Returns:

Object with inst_Frate_data (N, T) and times; units: spikes/ms (kHz).

Return type:

RateData
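The centered-window rate R(t) = N / W described above can be computed efficiently for one unit with np.searchsorted; the following is an illustrative sketch (without the Gaussian smoothing or apply_square options), not SpikeLab's implementation:

```python
import numpy as np

def sliding_rate_one_unit(spike_times, window_size, step_size, t_start, t_end):
    """Rate R(t) = N / W in a centered window [t - W/2, t + W/2], in spikes/ms."""
    t = np.sort(np.asarray(spike_times, dtype=float))
    centers = np.arange(t_start, t_end + 1e-9, step_size)
    # Count spikes in each window via two binary searches per center.
    lo = np.searchsorted(t, centers - window_size / 2, side="left")
    hi = np.searchsorted(t, centers + window_size / 2, side="right")
    return centers, (hi - lo) / window_size

centers, rate = sliding_rate_one_unit([10, 11, 12, 13], window_size=4.0,
                                      step_size=1.0, t_start=0.0, t_end=20.0)
# At t=11 the window [9, 13] holds all 4 spikes, so the rate is 1 spike/ms (1 kHz).
```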

set_neuron_attribute(key, values, neuron_indices=None)[source]

Set an attribute across neurons in neuron_attributes.

Parameters:
  • key (str) – Name of the attribute.

  • values (single value or list) – A single value applied to all selected neurons, or a list/array with one value per entry in neuron_indices.

  • neuron_indices (list) – Neurons to update. If None, updates all.

get_neuron_attribute(key, default=None)[source]

Get an attribute across all neurons.

Parameters:
  • key (str) – Attribute name.

  • default (any) – Value if neuron lacks the attribute.

Returns:

List of values, one per neuron.

Return type:

values (list)

subset(units, by=None)[source]

Return a new SpikeData with only the selected units.

Units are selected either by their indices or by an ID stored under a given key in the neuron_attributes.

Parameters:
  • units (list) – List of unit indices to select.

  • by (str) – Key to select units by in the neuron_attributes. Index-based if None.

Returns:

New SpikeData object with the selected units.

Return type:

sd (SpikeData)

Notes

  • Units are included in the output according to their order in self.train, not the order in the unit list (which is treated as a set).

  • raw_data and raw_time are not propagated to the subset – they remain on the original SpikeData object.

  • If IDs are not unique, every neuron which matches is included in the output.

  • Neurons whose neuron_attributes entry does not have the key are always excluded.

neuron_to_channel_map(channel_attr=None)[source]

Return a mapping from neuron indices to channel indices.

Parameters:

channel_attr (str or None) – Name of the attribute in neuron_attributes that contains the channel index. If None, searches for common attribute names.

Returns:

Mapping from neuron index (int) to channel index (int). Returns an empty dict if neuron_attributes is None.

Return type:

mapping (dict)

Notes

  • If neuron_attributes is None and channel information is required, or if the specified channel_attr doesn’t exist for all neurons, a ValueError is raised.

  • If channel_attr is not specified, attempts to find channel information using common attribute names: ‘channel’, ‘channel_id’, ‘channel_index’, ‘ch’, ‘channel_idx’.

subtime(start, end, shift_to=None)[source]

Extract a subset of time points from the spike data.

Spike times are shifted so that shift_to becomes t=0 in the new SpikeData. By default shift_to=start, so subtime(100, 200) produces spikes in the range [0, 100). For event-centered slicing, pass shift_to=event_time to produce spikes from -(event - start) to +(end - event).

Parameters:
  • start (float) – Starting time value (inclusive).

  • end (float) – Ending time value (exclusive).

  • shift_to (float or None) – The time value that becomes t=0 in the output. Defaults to start (standard behavior). For event-centered output, pass the event time so that t=0 corresponds to the event.

Returns:

New SpikeData object containing only the specified time range.

Return type:

sd (SpikeData)

Notes

  • For standard data (start_time >= 0), negative start/end values are counted backwards from the end of the recording. For event-centered data (start_time < 0), negative values are treated as literal times.

  • Start and end can also be None or Ellipsis, in which case that end of the data is not truncated.

  • All metadata and neuron data are propagated.

  • The output SpikeData has start_time = start - shift_to and length = end - start.
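The time-selection and shifting behavior can be sketched on bare spike-time arrays (an illustrative sketch of the documented semantics, not SpikeLab's implementation):

```python
import numpy as np

def subtime_trains(train, start, end, shift_to=None):
    """Keep spikes in [start, end) and shift times so shift_to becomes t=0."""
    if shift_to is None:
        shift_to = start  # default: the slice starts at t=0
    return [t[(t >= start) & (t < end)] - shift_to
            for t in map(np.asarray, train)]

# Standard slice: subtime(100, 200) yields spikes in [0, 100).
standard = subtime_trains([np.array([50.0, 150.0, 250.0])], 100, 200)
# Event-centered: with shift_to at the event time 200, spikes run from -100 to +100.
centered = subtime_trains([np.array([150.0, 250.0])], 100, 300, shift_to=200)
```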

append(spikeData, offset=0)[source]

Append spike times from another SpikeData object to this one.

Parameters:
  • spikeData (SpikeData) – SpikeData object to append.

  • offset (float) – Offset in milliseconds to add to the spike times of the appended data.

Returns:

New SpikeData object with the appended data.

Return type:

sd (SpikeData)

Notes

  • The two SpikeData objects must have the same number of neurons.

  • On metadata key collision, values from self take precedence.

sparse_raster(bin_size=1.0, time_offset=0.0)[source]

Bin spike times into a sparse (units, bins) matrix.

Entry (i, j) is the number of times unit i fired in bin j. Spike times are shifted by -start_time before binning so that bin 0 corresponds to start_time.

Parameters:
  • bin_size (float) – Size of the time bin in milliseconds.

  • time_offset (float) – Additional offset added to all spike times before binning (default 0.0). Use this to place spikes at their absolute recording position, e.g. time_offset=500 to shift all spikes by 500 ms in the raster.

Returns:

Sparse array where entry (i, j) is the number of times unit i fired in bin j.

Return type:

raster (sparse.csr_matrix)

Notes

  • Bins are left-open, right-closed intervals relative to start_time.

  • A spike at exactly start_time is clipped into bin 0.

  • The number of bins is always ceil((length + time_offset) / bin_size).

raster(bin_size=1.0, time_offset=0.0)[source]

Bin spike times into a dense (units, bins) array.

Entry (i, j) is the number of times unit i fired in bin j.

Parameters:
  • bin_size (float) – Size of the time bin in milliseconds.

  • time_offset (float) – Additional offset added to spike times before binning (default 0.0).

Returns:

Dense array where entry (i, j) is the number of times unit i fired in bin j.

Return type:

raster (numpy.ndarray)

Notes

  • Bins are left-open, right-closed intervals relative to start_time.

  • A spike at exactly start_time is clipped into bin 0.
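A dense raster with these bin semantics can be built in a few lines of numpy (illustrative sketch, not SpikeLab's implementation):

```python
import numpy as np

def dense_raster(train, length, bin_size=1.0, start_time=0.0):
    """Dense (units, bins) spike-count array with (., .] bins relative to start_time."""
    n_bins = int(np.ceil(length / bin_size))
    raster = np.zeros((len(train), n_bins), dtype=int)
    for i, t in enumerate(train):
        t = np.asarray(t, dtype=float)
        idx = np.ceil((t - start_time) / bin_size).astype(int) - 1
        idx = np.clip(idx, 0, n_bins - 1)  # spike at exactly start_time -> bin 0
        np.add.at(raster[i], idx, 1)       # handles repeated bin indices correctly
    return raster

r = dense_raster([[0.0, 0.5, 1.5], [2.5]], length=3.0, bin_size=1.0)
# r.tolist() == [[2, 1, 0], [0, 0, 1]]
```

np.add.at is used instead of `raster[i, idx] += 1` because the latter silently drops repeated indices within a bin.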

channel_raster(bin_size=1.0, channel_attr=None)[source]

Create a raster aggregated by channel instead of neuron.

Parameters:
  • bin_size (float) – Size of the time bin in milliseconds.

  • channel_attr (str) – Name of the attribute in neuron_attributes that contains the channel index. If None, searches for common attribute names.

Returns:

Dense array where entry (c, j) is the total number of spikes from all neurons on channel c in bin j.

Return type:

channel_raster (numpy.ndarray)

Notes

  • Channels are determined from neuron_attributes using the same logic as neuron_to_channel_map().

  • If neuron_attributes is None or no channel information can be found, a ValueError is raised.

get_waveform_traces(unit=None, ms_before=1.0, ms_after=2.0, channels=None, bandpass=None, filter_order=3, store=True, return_channel_waveforms=False, return_avg_waveform=True)[source]

Extract raw voltage waveforms around spike times from raw data.

Parameters:
  • unit (int, slice, list, or None) – Unit index selection. int extracts a single unit (returns a single waveform array); slice/list-like/range extracts a subset (returns a list); None extracts all units (returns a list).

  • ms_before (float) – Milliseconds before each spike time (default: 1.0).

  • ms_after (float) – Milliseconds after each spike time (default: 2.0).

  • channels (int, list, or None) – Channel(s) to extract. None uses neuron_to_channel_map or all channels; int for single channel; list for multiple; [] for mapped channel.

  • bandpass (tuple or None) – Optional (lowcut_Hz, highcut_Hz) for bandpass filtering.

  • filter_order (int) – Butterworth filter order (default: 3).

  • store (bool) – If True (default), stores waveforms and avg_waveform in neuron_attributes.

  • return_channel_waveforms (bool) – If True, include a per-channel dict in the return.

  • return_avg_waveform (bool) – If False, skip computing/returning avg_waveform (it will be None).

Returns:

If unit is an int, a single 3D array shaped (num_channels, num_samples, num_spikes). Otherwise, a list of 3D arrays, one per selected unit.

meta (dict): Dictionary with keys: fs_kHz, unit_indices, channels, spike_times_ms, avg_waveforms (optional), channel_waveforms (optional).

Return type:

waveforms (numpy.ndarray or list)

interspike_intervals()[source]

Produce a list of arrays of interspike intervals per unit.

Returns:

List of arrays of interspike intervals per unit.

Return type:

isis (list)

concatenate_spike_data(sd)[source]

Combine units from another SpikeData object with this one.

Returns a new SpikeData with the units of both objects. If the other SpikeData has a different time range, it is subtimed to match this one.

Parameters:

sd (SpikeData) – SpikeData object whose units will be added.

Returns:

New SpikeData with units from both objects.

Return type:

combined (SpikeData)

Notes

  • raw_data and raw_time are carried over from self only.

  • If only one object has neuron_attributes, a RuntimeWarning is issued and the attributes are not merged.

spike_time_tilings(delt=20.0)[source]

Compute the spike time tiling coefficient matrix.

Parameters:

delt (float) – Time window in milliseconds (default: 20.0).

Returns:

Spike time tiling coefficient matrix.

Return type:

ret (PairwiseCompMatrix)

Notes

  • When numba is installed, computation is parallelised across all unit pairs using numba’s prange.

  • Reference: Cutts & Eglen. Detecting pairwise correlations in spike trains. J. Neurosci. 34:43, 14288-14303 (2014).

spike_time_tiling(i, j, delt=20.0)[source]

Calculate the spike time tiling coefficient between two units.

Parameters:
  • i (int) – Index of the first unit.

  • j (int) – Index of the second unit.

  • delt (float) – Time window in milliseconds (default: 20.0).

Returns:

Spike time tiling coefficient between the two units.

Return type:

ret (float)

Notes

  • Reference: Cutts & Eglen. Detecting pairwise correlations in spike trains. J. Neurosci. 34:43, 14288-14303 (2014).
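For reference, the tiling coefficient from Cutts & Eglen (2014) is STTC = ½[(P_A − T_B)/(1 − P_A·T_B) + (P_B − T_A)/(1 − P_B·T_A)], where P_A is the proportion of spikes in train A within ±Δt of a spike in B and T_A is the fraction of the recording tiled by ±Δt windows around spikes in A. A self-contained numpy sketch (illustrative, not SpikeLab's implementation):

```python
import numpy as np

def sttc(ta, tb, delt, length, start_time=0.0):
    """Spike time tiling coefficient (Cutts & Eglen, 2014) for two spike trains."""
    ta = np.sort(np.asarray(ta, dtype=float))
    tb = np.sort(np.asarray(tb, dtype=float))

    def tiled_fraction(t):
        # Fraction of the recording within +/- delt of any spike, merging overlaps.
        starts = np.maximum(t - delt, start_time)
        ends = np.minimum(t + delt, start_time + length)
        total, cur_s, cur_e = 0.0, starts[0], ends[0]
        for s, e in zip(starts[1:], ends[1:]):
            if s <= cur_e:
                cur_e = max(cur_e, e)
            else:
                total += cur_e - cur_s
                cur_s, cur_e = s, e
        return (total + cur_e - cur_s) / length

    def prop_within(t, other):
        # Proportion of spikes in t within +/- delt of the nearest spike in other.
        i = np.searchsorted(other, t)
        left = np.abs(t - other[np.clip(i - 1, 0, len(other) - 1)])
        right = np.abs(other[np.clip(i, 0, len(other) - 1)] - t)
        return np.mean(np.minimum(left, right) <= delt)

    pa, pb = prop_within(ta, tb), prop_within(tb, ta)
    t_a, t_b = tiled_fraction(ta), tiled_fraction(tb)
    return 0.5 * ((pa - t_b) / (1 - pa * t_b) + (pb - t_a) / (1 - pb * t_a))
```

Identical trains give a coefficient of 1, and trains with no spikes within ±Δt of each other give a small negative value reflecting the tiled fractions.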

get_pairwise_ccg(compare_func=compute_cross_correlation_with_lag, bin_size=1.0, max_lag=350, n_jobs=-1)[source]

Compute pairwise cross-correlogram matrices from binned spike arrays.

Bins the spike trains into a binary raster and computes the pairwise similarity between all unit pairs using lagged cross-correlation (default) or lagged cosine similarity.

Parameters:
  • compare_func (callable) – Comparison function from utils. Must accept (ref_signal, comp_signal, max_lag=int) and return (score, lag). Default is compute_cross_correlation_with_lag.

  • bin_size (float) – Bin size in milliseconds for the binary raster (default: 1.0).

  • max_lag (float) – Maximum lag in milliseconds to search for the peak correlation (default: 350). Converted to bins internally.

  • n_jobs (int) – Number of threads for parallel computation. -1 uses all cores (default), 1 disables parallelism, None is serial.

Returns:

Matrix of maximum correlation coefficients between all unit pairs. Diagonal is always 1.

lag_matrix (PairwiseCompMatrix): Matrix of time lags in bins at which maximum correlation occurs. Positive lag means unit j leads unit i. Diagonal is always 0.

Return type:

corr_matrix (PairwiseCompMatrix)
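The shape of a compare_func can be sketched with numpy's np.correlate; this is an illustrative stand-in with the documented (ref, comp, max_lag) -> (score, lag) contract, not SpikeLab's compute_cross_correlation_with_lag itself:

```python
import numpy as np

def cross_correlation_with_lag(ref, comp, max_lag):
    """Peak normalized cross-correlation and its lag (in bins) between two signals."""
    ref = (ref - ref.mean()) / (ref.std() * len(ref))
    comp = (comp - comp.mean()) / comp.std()
    full = np.correlate(ref, comp, mode="full")   # lags -(T-1) .. T-1
    lags = np.arange(-len(ref) + 1, len(ref))
    keep = np.abs(lags) <= max_lag                # restrict the peak search window
    best = np.argmax(full[keep])
    # With this convention, a positive lag means comp leads ref.
    return full[keep][best], lags[keep][best]
```

A signal correlated with itself yields a score of 1 at lag 0; a copy shifted by k bins yields a peak near 1 at an absolute lag of k.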

latencies(times, window_ms=100.0)[source]

Compute latencies from each time to the nearest spike per unit.

Parameters:
  • times (list) – List of times.

  • window_ms (float) – Window in milliseconds (default: 100.0).

Returns:

2D list, each row is a list of latencies from a time to the nearest spike in the train.

Return type:

latencies (list)

get_pairwise_latencies(window_ms=None, return_distributions=False)[source]

Compute pairwise nearest-spike latency distributions between all unit pairs.

For each ordered pair (i, j), and for each spike in train i, finds the closest spike in train j and records the signed latency (t_j - t_i). Both directions are computed independently.

Parameters:
  • window_ms (float or None) – If not None, discard latencies where the absolute distance exceeds this value (default: None, no filtering).

  • return_distributions (bool) – If True, also return a (U, U) numpy object array where entry [i, j] is an ndarray of all individual signed latencies from unit i to unit j (default: False).

Returns:

Matrix of mean signed latencies in milliseconds. Entry [i, j] is the average latency from each spike in unit i to the nearest spike in unit j. Diagonal is 0.

std_latency (PairwiseCompMatrix): Matrix of latency standard deviations. Entry [i, j] is the std of latencies from unit i to unit j. Diagonal is 0.

distributions (numpy.ndarray): Only returned when return_distributions is True. A (U, U) object array where [i, j] is an ndarray of all signed latencies from unit i to unit j.

Return type:

mean_latency (PairwiseCompMatrix)

latencies_to_index(i, window_ms=100.0)[source]

Compute the latency from one unit to all other units.

Parameters:
  • i (int) – Index of the unit.

  • window_ms (float) – Window in milliseconds (default: 100.0).

Returns:

2D list, each row is a list of latencies per neuron.

Return type:

latencies (list)

get_frac_active(edges, MIN_SPIKES, backbone_threshold, bin_size=1.0)[source]

Compute fraction of units active per burst and backbone identity.

Parameters:
  • edges (numpy.ndarray) – Array of shape (B, 2) containing [start, end] indices for each burst. Indices are in raster bin coordinates (bin index = time_ms / bin_size).

  • MIN_SPIKES (int) – Minimum number of spikes required for a unit to be considered active in a burst.

  • backbone_threshold (float) – Minimum fraction of bursts a unit must be active in to be considered a backbone unit (0 to 1).

  • bin_size (float) – Raster bin size in milliseconds (default 1.0). Must match the bin size used to compute edges.

Returns:

1D array where each value is the fraction of bursts in which the neuron was active.

frac_per_burst (numpy.ndarray): 1D array where each value is the fraction of neurons active in that burst.

backbone_units (numpy.ndarray): 1D array of the neuron/unit indices that are backbone units.

Return type:

frac_per_unit (numpy.ndarray)

get_frac_spikes_in_burst(edges, bin_size=1.0)[source]

Compute the fraction of each unit’s spikes that fall inside burst windows.

Parameters:
  • edges (numpy.ndarray) – Array of shape (B, 2) containing [start, end] indices for each burst. Indices are in raster bin coordinates (bin index = time_ms / bin_size).

  • bin_size (float) – Raster bin size in milliseconds (default 1.0). Must match the bin size used to compute edges.

Returns:

1D array of shape (N,) where each value is the fraction of the unit’s total spikes that fall inside any burst window. NaN for units with zero spikes.

Return type:

frac_spikes_in_burst (numpy.ndarray)

spike_shuffle(swap_per_spike=5, seed=None, bin_size=1)[source]

Shuffle the spike matrix using degree-preserving double-edge swaps.

Parameters:
  • swap_per_spike (int) – Determines total number of swaps: num_spikes * swap_per_spike (default: 5).

  • seed (int or None) – Random seed for repeatability. None means no seed is set (default: None).

  • bin_size (int) – Number of individual time steps per bin. Bins with multiple spikes are binarized to 1. A RuntimeWarning is issued when multi-spike bins are detected (default: 1).

Returns:

SpikeData object with shuffled spike train matrix.

Return type:

shuffled_spike_data (SpikeData)

Notes

  • Each neuron’s average firing rate is preserved, but the specific time bin in which it spikes is shuffled.

  • Each time bin’s population rate is preserved, but the specific units active in each time bin are shuffled.

  • Every swap involves 2 different spikes, so on average each spike is swapped 2*swap_per_spike times.

  • Reference: Okun, M. et al. Population rate dynamics and multineuron firing patterns in sensory cortex. J. Neurosci. 32, 17108-17119 (2012).
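The degree-preserving double-edge swap on a binary raster can be sketched directly (an illustrative sketch of the technique, not SpikeLab's implementation): pick two spikes in different rows and columns, and exchange their time bins only when the target cells are empty, so the matrix stays binary and both row and column sums are preserved.

```python
import numpy as np

def double_edge_swap(raster, n_swaps, rng):
    """Shuffle a binary raster with swaps that preserve row and column sums."""
    r = raster.copy()
    rows, cols = np.nonzero(r)  # coordinates of every spike
    for _ in range(n_swaps):
        a, b = rng.integers(len(rows), size=2)
        i1, t1, i2, t2 = rows[a], cols[a], rows[b], cols[b]
        # Swap only if the target cells are empty, keeping the matrix binary.
        if i1 != i2 and t1 != t2 and r[i1, t2] == 0 and r[i2, t1] == 0:
            r[i1, t1] = r[i2, t2] = 0
            r[i1, t2] = r[i2, t1] = 1
            cols[a], cols[b] = t2, t1  # track the moved spikes
    return r

rng = np.random.default_rng(0)
raster = (np.random.default_rng(1).random((5, 50)) < 0.2).astype(int)
shuffled = double_edge_swap(raster, n_swaps=int(raster.sum()) * 5, rng=rng)
```

Per-unit firing rates (row sums) and per-bin population rates (column sums) are unchanged, which is exactly the invariant the two notes above describe.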

spike_shuffle_stack(n_shuffles, seed=None, swap_per_spike=5, bin_size=1)[source]

Generate multiple shuffled copies as a SpikeSliceStack.

Each shuffle is an independent call to spike_shuffle. The resulting stack can be used with SpikeSliceStack.apply to build null distributions for statistical testing.

Parameters:
  • n_shuffles (int) – Number of shuffled datasets to generate.

  • seed (int or None) – Base random seed. Each shuffle uses seed + i for reproducibility. None means no seed.

  • swap_per_spike (int) – Forwarded to spike_shuffle (default: 5).

  • bin_size (int) – Forwarded to spike_shuffle (default: 1).

Returns:

Stack of n_shuffles shuffled SpikeData

objects. All slices share the same time bounds.

Return type:

stack (SpikeSliceStack)

subset_stack(n_subsets, units_per_subset, seed=None)[source]

Generate multiple random unit subsets as a SpikeSliceStack.

Each subset is drawn by sampling units_per_subset unit indices without replacement from the full unit set. Draws are independent across subsets (with replacement across draws), so the same unit may appear in multiple subsets.

Parameters:
  • n_subsets (int) – Number of random subsets to generate.

  • units_per_subset (int) – Number of units in each subset.

  • seed (int or None) – Random seed for reproducibility.

Returns:

Stack of n_subsets subsetted SpikeData objects. All slices share the same time bounds.

Return type:

stack (SpikeSliceStack)

Notes

  • The stack-level neuron_attributes is None because each subset contains a different set of units. Individual SpikeData objects within the stack carry their own subsetted attributes.

to_hdf5(filepath, *, style='ragged', raster_dataset='raster', raster_bin_size_ms=None, spike_times_dataset='spike_times', spike_times_index_dataset='spike_times_index', spike_times_unit='s', fs_Hz=None, group_per_unit='units', group_time_unit='s', idces_dataset='idces', times_dataset='times', times_unit='ms', raw_dataset=None, raw_time_dataset=None, raw_time_unit='ms')[source]

Export this SpikeData to an HDF5 file with flexible formatting.

Supports four storage styles: ‘raster’ (dense 2D array), ‘ragged’ (flat spike times with index array), ‘group’ (separate dataset per unit), and ‘paired’ (parallel index and time arrays).

Parameters:
  • filepath (str) – Path to the output HDF5 file.

  • style (str) – Storage format style (‘raster’, ‘ragged’, ‘group’, or ‘paired’). Defaults to ‘ragged’.

  • raster_dataset (str) – Dataset name for raster data (style=’raster’).

  • raster_bin_size_ms (float) – Bin size in milliseconds for rasterization. Required for ‘raster’ style.

  • spike_times_dataset (str) – Dataset name for flat spike times (style=’ragged’).

  • spike_times_index_dataset (str) – Dataset name for cumulative spike counts per unit (style=’ragged’).

  • spike_times_unit (str) – Time unit for spike times in ragged format (‘ms’, ‘s’, or ‘samples’).

  • fs_Hz (float) – Sampling frequency in Hz, required when converting to ‘samples’ unit.

  • group_per_unit (str) – Group name containing per-unit datasets (style=’group’).

  • group_time_unit (str) – Time unit for individual unit datasets (‘ms’, ‘s’, or ‘samples’).

  • idces_dataset (str) – Dataset name for unit indices (style=’paired’).

  • times_dataset (str) – Dataset name for spike times (style=’paired’).

  • times_unit (str) – Time unit for paired times (‘ms’, ‘s’, or ‘samples’).

  • raw_dataset (str or None) – Reserved for future raw data export.

  • raw_time_dataset (str or None) – Reserved for future raw time axis export.

  • raw_time_unit (str) – Time unit for raw data timestamps (‘ms’, ‘s’, or ‘samples’).

Return type:

None

Notes

  • All spike times are stored internally in milliseconds and converted to the requested output unit.

  • When using ‘samples’ unit, fs_Hz must be provided for proper conversion.
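The ‘ragged’ style (flat spike times plus cumulative spike counts per unit) round-trips cleanly with plain numpy; a minimal sketch of the encoding and decoding, assuming the cumulative-count convention described above:

```python
import numpy as np

# Encode: concatenate per-unit spike trains and record cumulative counts.
train = [np.array([1.0, 5.0]), np.array([]), np.array([2.0, 3.0, 9.0])]
spike_times = np.concatenate(train)
spike_times_index = np.cumsum([len(t) for t in train])  # [2, 2, 5]

# Decode: split the flat array back into per-unit trains.
decoded = np.split(spike_times, spike_times_index[:-1])
```

The empty middle unit survives the round trip because consecutive equal index values produce an empty slice.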

to_nwb(filepath, *, spike_times_dataset='spike_times', spike_times_index_dataset='spike_times_index', group='units')[source]

Export this SpikeData to a minimal NWB-compatible HDF5 file.

Stores spike times in the standard ‘/units’ group format for round-tripping with the NWB loader.

Parameters:
  • filepath (str) – Path to the output NWB file (.nwb extension recommended).

  • spike_times_dataset (str) – Name of the dataset containing flattened spike times in seconds. Standard NWB uses “spike_times”.

  • spike_times_index_dataset (str) – Name of the dataset containing cumulative spike counts per unit for indexing into spike_times. Standard NWB uses “spike_times_index”.

  • group (str) – Name of the HDF5 group to contain the spike data. Standard NWB uses “units” for the units table.

Return type:

None

Notes

  • Spike times are automatically converted from internal milliseconds to seconds as required by the NWB standard.

  • The output file contains only the essential spike timing data, not the full NWB metadata structure.

  • Compatible with both pynwb and h5py-based NWB readers.
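The ragged layout used here pairs a flat spike_times dataset with cumulative per-unit counts in spike_times_index. A minimal numpy sketch of the packing and unpacking (independent of any NWB library; the trains are made up for illustration):

```python
import numpy as np

# Hypothetical spike trains in ms, one array per unit.
train = [np.array([10.0, 55.0, 120.0]), np.array([30.0]), np.array([5.0, 90.0])]

# Pack: flat times in seconds plus cumulative spike counts per unit.
spike_times = np.concatenate(train) / 1000.0            # ms -> s, NWB convention
spike_times_index = np.cumsum([len(t) for t in train])  # [3, 4, 6]

# Unpack: slice the flat array back into per-unit trains (s -> ms).
starts = np.concatenate([[0], spike_times_index[:-1]])
unpacked = [spike_times[a:b] * 1000.0 for a, b in zip(starts, spike_times_index)]
```

Round-tripping through this layout recovers the original trains exactly, which is what makes the NWB loader/exporter pair lossless for spike timing.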

to_kilosort(folder, *, fs_Hz, spike_times_file='spike_times.npy', spike_clusters_file='spike_clusters.npy', time_unit='samples', cluster_ids=None)[source]

Export this SpikeData to a KiloSort/Phy-compatible folder.

Creates spike_times.npy and spike_clusters.npy arrays for use with Phy.

Parameters:
  • folder (str) – Output directory path. Will be created if it doesn’t exist.

  • fs_Hz (float) – Sampling frequency in Hz. Required for time unit conversion, especially when using ‘samples’.

  • spike_times_file (str) – Filename for the spike times array. Standard KiloSort uses “spike_times.npy”.

  • spike_clusters_file (str) – Filename for the cluster assignments array. Standard KiloSort uses “spike_clusters.npy”.

  • time_unit (str) – Output time unit for spike times (‘samples’, ‘ms’, or ‘s’).

  • cluster_ids (list[int] or None) – Optional list of cluster IDs to assign to each unit. Must have length equal to self.N. If None, uses sequential integers [0, 1, 2, …].

Returns:

Paths to the created files (spike_times_file, spike_clusters_file).

Return type:

paths (tuple[str, str])

Notes

  • Empty units (no spikes) are skipped in the output arrays.

  • Cluster IDs are mapped to units in order, so cluster_ids[i] corresponds to unit i in the SpikeData.

  • The ‘samples’ time unit is most common for KiloSort workflows.
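The KiloSort/Phy pair of arrays can be sketched directly in numpy: flatten all trains into one time array plus a parallel cluster-ID array, skipping empty units and converting ms to samples. This is an illustration of the on-disk format, not the method's actual implementation; the trains and sample rate are made up:

```python
import numpy as np

fs_Hz = 20000.0
# Hypothetical trains in ms; unit 1 is empty and is skipped.
train = [np.array([1.0, 2.5]), np.array([]), np.array([0.5])]

times, clusters = [], []
for unit, t in enumerate(train):
    if len(t) == 0:                # empty units are skipped in the output
        continue
    times.append(np.round(t * fs_Hz / 1000.0).astype(np.int64))  # ms -> samples
    clusters.append(np.full(len(t), unit, dtype=np.int64))

spike_times = np.concatenate(times)
spike_clusters = np.concatenate(clusters)
order = np.argsort(spike_times)    # Phy works with time-sorted arrays
spike_times, spike_clusters = spike_times[order], spike_clusters[order]
# np.save("spike_times.npy", spike_times); np.save("spike_clusters.npy", spike_clusters)
```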

get_pop_rate(square_width=20, gauss_sigma=100, raster_bin_size_ms=1.0)[source]

Compute smoothed population firing rate.

Smooths the summed spike counts using a moving-average (square) window, then a Gaussian smoothing window.

Parameters:
  • square_width (float) – Width of square smoothing window in milliseconds.

  • gauss_sigma (float) – Sigma of Gaussian smoothing window in milliseconds.

  • raster_bin_size_ms (float) – Size of raster bins in ms.

Returns:

Smoothed population spiking data in spikes per bin.

Return type:

pop_rate (numpy.ndarray)

Notes

  • square_width and gauss_sigma are specified in milliseconds and converted to bin counts internally using raster_bin_size_ms. With the default raster_bin_size_ms=1.0, 1 ms = 1 bin.

  • The returned array index corresponds to raster bin index. For event-centered data (start_time < 0), bin 0 corresponds to start_time, not t=0. To find the bin for t=0 (the event), use event_bin = int(-start_time / raster_bin_size_ms).
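One plausible implementation of this two-stage smoothing, and of the event-bin arithmetic from the note above, using scipy.ndimage (the binned counts here are synthetic; the real method builds them from the raster):

```python
import numpy as np
from scipy.ndimage import uniform_filter1d, gaussian_filter1d

raster_bin_size_ms = 1.0
start_time = -250.0             # event-centered data
# Hypothetical summed spike counts per 1-ms bin.
rng = np.random.default_rng(0)
counts = rng.poisson(0.5, size=750).astype(float)

# Square (moving-average) window, then Gaussian; widths converted ms -> bins.
square_bins = int(20 / raster_bin_size_ms)
gauss_bins = 100 / raster_bin_size_ms
pop_rate = gaussian_filter1d(uniform_filter1d(counts, square_bins), gauss_bins)

# Bin index corresponding to the event at t = 0.
event_bin = int(-start_time / raster_bin_size_ms)
```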

compute_spike_trig_pop_rate(window_ms=80, cutoff_hz=20, fs=1000, bin_size=1, cut_outer=10)[source]

Compute spike-triggered population rate (stPR).

Implementation of the stPR measure from Bimbard et al., building on Okun et al. (Nature, 2015). For each neuron i and lag tau, the leave-one-out population rate is computed excluding neuron i. The coupling curve measures how much the other neurons’ activity deviates from their temporal mean at times offset by tau from neuron i’s spikes.

Parameters:
  • window_ms (int) – Half-width of the lag window in ms (window from -window_ms to +window_ms).

  • cutoff_hz (float) – Low-pass Butterworth filter cutoff in Hz applied to the coupling curves.

  • fs (float) – Sampling rate in Hz used for filter design.

  • bin_size (float) – Bin size in ms for the spike raster.

  • cut_outer (int) – Number of outer lag bins to ignore.

Returns:

  • stPR_filtered (numpy.ndarray): Low-pass filtered coupling curves for every neuron, shape (N, 2*window_ms + 1).

  • coupling_strengths_zero_lag (numpy.ndarray): Coupling strength at lag 0, shape (N,).

  • coupling_strengths_max (numpy.ndarray): Peak coupling strength within the trimmed lag window, shape (N,).

  • delays (numpy.ndarray): Lag (in ms) at which peak coupling occurs, shape (N,). Positive means the neuron leads the population.

  • lags (numpy.ndarray): Array of lag values from -window_ms to +window_ms.

Return type:

stPR_filtered (numpy.ndarray)
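The leave-one-out population rate at the heart of the stPR is cheap to form: subtract each neuron's own binned counts from the total. A sketch of that step (synthetic counts; the lag scan and Butterworth filtering are omitted):

```python
import numpy as np

# Hypothetical (N, T) binned spike counts, 1-ms bins.
rng = np.random.default_rng(1)
counts = rng.poisson(0.2, size=(5, 200))

total = counts.sum(axis=0)
# Leave-one-out population rate for every neuron at once: (N, T).
loo_rate = total[None, :] - counts
# Deviation from each leave-one-out rate's temporal mean, as used for coupling.
loo_deviation = loo_rate - loo_rate.mean(axis=1, keepdims=True)
```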

get_bursts(thr_burst, min_burst_diff, burst_edge_mult_thresh, square_width=20, gauss_sigma=100, acc_square_width=8, acc_gauss_sigma=8, raster_bin_size_ms=1.0, peak_to_trough=True, pop_rate=None, pop_rate_acc=None, pop_rms_override=None)[source]

Detect bursts using thresholded peak finding and edge detection.

Parameters:
  • thr_burst (float) – Threshold multiplier for burst peak detection.

  • min_burst_diff (int) – Minimum time between detected bursts (in bins).

  • burst_edge_mult_thresh (float) – Threshold multiplier for burst edge detection.

  • square_width (float) – Square window width for calculating pop_rate (in milliseconds).

  • gauss_sigma (float) – Gaussian window sigma for calculating pop_rate (in milliseconds).

  • acc_square_width (float) – Square window width for calculating pop_rate_acc (in milliseconds).

  • acc_gauss_sigma (float) – Gaussian window sigma for calculating pop_rate_acc (in milliseconds).

  • raster_bin_size_ms (float) – Time bin size for calculating population rate in ms.

  • peak_to_trough (bool) – Flag to calculate bursts peak-to-trough (True) or peak-to-zero (False).

  • pop_rate (numpy.ndarray or None) – Pre-calculated smoothed population spiking data in spikes per bin.

  • pop_rate_acc (numpy.ndarray or None) – Pre-calculated accurate smoothed population spiking data in spikes per bin.

  • pop_rms_override (float or None) – RMS override for burst threshold baseline.

Returns:

  • tburst (numpy.ndarray): Time bin indices of detected bursts.

  • edges (numpy.ndarray): Burst edge indices, shape (n_bursts, 2).

  • peak_amp (numpy.ndarray): Amplitudes of bursts at the corresponding array indices.

Return type:

tburst (numpy.ndarray)

Notes

  • pop_rate and pop_rate_acc are used when provided; otherwise they are computed internally from the square window widths and Gaussian sigmas above.

  • Using the peak-to-zero calculations may result in several bursts being detected at one peak.

  • Returned time bin indices are relative to bin 0 of the raster. For event-centered data (start_time < 0), convert to event-relative ms via tburst * raster_bin_size_ms + start_time.
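The conversion described in the note above is a single line of arithmetic. A sketch with made-up burst bins:

```python
import numpy as np

raster_bin_size_ms = 1.0
start_time = -250.0                    # event-centered recording
tburst = np.array([100, 260, 400])     # hypothetical burst peak bin indices

# Bin 0 corresponds to start_time, so shift by it to get event-relative ms.
tburst_ms = tburst * raster_bin_size_ms + start_time
```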

burst_sensitivity(thr_values, dist_values, burst_edge_mult_thresh, square_width=20, gauss_sigma=100, acc_square_width=8, acc_gauss_sigma=8, raster_bin_size_ms=1.0, peak_to_trough=True, pop_rate=None, pop_rate_acc=None, pop_rms_override=None)[source]

Sweep burst detection parameters and return burst counts.

Calls get_bursts for every combination of thr_values and dist_values, holding burst_edge_mult_thresh constant.

Parameters:
  • thr_values (array-like) – 1-D array of thr_burst values to sweep.

  • dist_values (array-like) – 1-D array of min_burst_diff values (in bins) to sweep.

  • burst_edge_mult_thresh (float) – Held constant during the sweep.

  • square_width (float) – Square window width for pop_rate (in milliseconds).

  • gauss_sigma (float) – Gaussian window sigma for pop_rate (in milliseconds).

  • acc_square_width (float) – Square window width for pop_rate_acc (in milliseconds).

  • acc_gauss_sigma (float) – Gaussian window sigma for pop_rate_acc (in milliseconds).

  • raster_bin_size_ms (float) – Time bin size for population rate in ms.

  • peak_to_trough (bool) – Peak-to-trough (True) or peak-to-zero (False) burst detection.

  • pop_rate (numpy.ndarray or None) – Pre-computed smoothed population rate.

  • pop_rate_acc (numpy.ndarray or None) – Pre-computed accurate smoothed population rate.

  • pop_rms_override (float or None) – RMS override for burst threshold baseline.

Returns:

Integer array of shape (len(thr_values), len(dist_values)) with the number of detected bursts for each parameter combination.

Return type:

burst_counts (numpy.ndarray)

Notes

  • Either thr_values or dist_values can have length 1 to focus the sensitivity analysis on a single parameter.

  • Pre-computing pop_rate and pop_rate_acc and passing them in avoids redundant smoothing inside the loop.
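The sweep itself is a simple grid loop over the two parameter arrays. A sketch with a hypothetical stand-in detector (the real loop calls get_bursts and counts the returned burst bins):

```python
import numpy as np

def count_bursts(thr, dist):
    """Hypothetical stand-in for len(get_bursts(thr, dist, ...)[0])."""
    return max(0, 10 - int(thr) - dist // 100)

thr_values = np.array([2.0, 3.0, 4.0])
dist_values = np.array([100, 200])

burst_counts = np.array(
    [[count_bursts(thr, d) for d in dist_values] for thr in thr_values]
)
# burst_counts.shape == (len(thr_values), len(dist_values))
```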

fit_gplvm(bin_size_ms=50.0, movement_variance=1.0, tuning_lengthscale=10.0, n_latent_bin=100, n_iter=20, n_time_per_chunk=10000, random_seed=3, model_class=None, **model_kwargs)[source]

Fit a Gaussian Process Latent Variable Model to binned spike counts.

Bins the spike data into a spike count matrix, fits a GPLVM model via expectation-maximisation, decodes latent states, and returns the results together with a unit reordering based on tuning peaks.

Parameters:
  • bin_size_ms (float) – Bin width in milliseconds for spike count matrix.

  • movement_variance (float) – Movement variance hyperparameter for the GPLVM transition kernel.

  • tuning_lengthscale (float) – Lengthscale hyperparameter for the tuning curve kernel.

  • n_latent_bin (int) – Number of latent bins (discretisation of the latent space).

  • n_iter (int) – Number of EM iterations.

  • n_time_per_chunk (int) – Number of time bins per chunk for chunked inference (controls memory usage).

  • random_seed (int) – Random seed for JAX PRNG key.

  • model_class (class or None) – Model class to use. Defaults to PoissonGPLVMJump1D from poor_man_gplvm.

  • **model_kwargs – Additional keyword arguments passed to the model constructor (e.g. p_move_to_jump, basis_type).

Returns:

Dictionary with keys: "decode_res" (decoded latent state dictionary), "log_marginal_l" (array of log marginal likelihoods per EM iteration), "reorder_indices" (unit reordering indices based on tuning curve peaks), "model" (the fitted model object), "binned_spike_counts" (the (T, N) binned spike count matrix), and "bin_size_ms" (the bin width used).

Return type:

result (dict)

Notes

  • Requires poor_man_gplvm and jax. Install with pip install poor-man-gplvm jax jaxlib.

  • The binned spike count matrix has shape (T, N) where T is the number of time bins and N is the number of units.

  • To compute metrics from the fitted model, see the GPLVM utility functions in spikedata.utils.
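The binning step that produces the (T, N) spike count matrix can be sketched with np.histogram (trains and duration are made up; this is an illustration of the shape convention, not the method's exact code):

```python
import numpy as np

bin_size_ms = 50.0
length = 500.0
train = [np.array([10.0, 60.0, 410.0]), np.array([55.0, 58.0])]  # hypothetical

n_bins = int(np.ceil(length / bin_size_ms))
edges = np.arange(n_bins + 1) * bin_size_ms
binned_spike_counts = np.stack(
    [np.histogram(t, bins=edges)[0] for t in train], axis=1
)  # shape (T, N) = (10, 2)
```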

plot(**kwargs)[source]

Assemble a multi-panel column figure from this SpikeData object.

Thin wrapper around plot_utils.plot_recording(self, **kwargs). See plot_recording for the full list of parameters.

Parameters:

**kwargs – All keyword arguments are forwarded to plot_recording.

Returns:

The assembled figure.

Return type:

fig (matplotlib.Figure)

plot_spatial_network(ax, matrix, edge_threshold=None, top_pct=None, node_size_range=(2, 20), node_cmap='viridis', node_linewidth=0.2, edge_color='red', edge_linewidth=0.6, edge_alpha_range=(0.15, 1.0), scale_bar_um=500, font_size=None)[source]

Plot units at their MEA positions with pairwise edges.

Unit positions are extracted from neuron_attributes (requires 'x' and 'y' keys). Thin wrapper around plot_utils.plot_spatial_network.

Parameters:
  • ax (matplotlib.axes.Axes) – Target axes.

  • matrix (numpy.ndarray) – Symmetric (N, N) pairwise matrix (e.g. correlation). Diagonal values are ignored.

  • edge_threshold (float or None) – Minimum matrix value to draw an edge.

  • top_pct (float or None) – Percentage of top edges to draw.

  • node_size_range (tuple) – (min_size, max_size) in points squared for scatter markers.

  • node_cmap (str) – Matplotlib colourmap for node colour.

  • node_linewidth (float) – Outline width of node markers.

  • edge_color (str) – Colour for network edges.

  • edge_linewidth (float) – Line width for network edges.

  • edge_alpha_range (tuple) – (min_alpha, max_alpha) for edge transparency.

  • scale_bar_um (float) – Scale bar length in micrometres (0 to omit).

  • font_size (int or None) – Font size for scale bar label.

Returns:

The scatter artist, useful for adding a colorbar.

Return type:

scatter (matplotlib.collections.PathCollection)

plot_aligned_pop_rate(events=None, pre_ms=250, post_ms=500, ax=None, pop_rate=None, pop_rate_params=None, color=None, label=None, linewidth=1.5, show_individual=False, individual_alpha=0.15, individual_linewidth=0.5, burst_edges=None, edge_percentile=None, xlabel='Time from event (ms)', ylabel='Pop. rate', font_size=None)[source]

Plot the average population rate aligned to events.

Cuts the population rate around each event and plots the mean trace. When events is None, burst peaks are auto-detected via self.get_bursts().

Parameters:
  • events (array-like or None) – Event times in ms. If None, burst peaks are auto-detected.

  • pre_ms (float) – Window duration before each event in ms.

  • post_ms (float) – Window duration after each event in ms.

  • ax (matplotlib.axes.Axes or None) – Target axes. If None, a new figure is created.

  • pop_rate (numpy.ndarray or None) – Pre-computed population rate (1-ms bins, full recording). If None, computed via self.get_pop_rate().

  • pop_rate_params (dict or None) – Keyword arguments forwarded to self.get_pop_rate() when pop_rate is None. Defaults: square_width=5, gauss_sigma=5.

  • color (str or None) – Colour for the mean trace. If None, uses the first colour from the default colour cycle.

  • label (str or None) – Legend label for the mean trace.

  • linewidth (float) – Line width for the mean trace.

  • show_individual (bool) – If True, plot each individual event’s pop-rate trace as a thin line behind the average.

  • individual_alpha (float) – Alpha for individual traces.

  • individual_linewidth (float) – Line width for individual traces.

  • burst_edges (numpy.ndarray or None) – Per-event [start_ms, end_ms] boundaries, shape (B, 2).

  • edge_percentile (float or None) – Percentile (0-100) controlling how conservatively the edge markers are placed. When set, vertical dashed lines are drawn at the resulting positions.

  • xlabel (str) – X-axis label.

  • ylabel (str) – Y-axis label.

  • font_size (int or None) – Font size for labels and ticks. If None, uses current rcParams.

Returns:

The mean population rate across events, shape (pre_ms + post_ms,).

Return type:

avg_rate (numpy.ndarray)

Notes

  • To compare multiple conditions, call this method once per condition on the same ax. Each call draws its own trace with its own color and label; add ax.legend() after the last call.
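The cut-and-average step underlying this plot is straightforward numpy: slice a window around each event (skipping events too close to the recording edges) and take the mean. A sketch with a synthetic rate trace and hypothetical event times:

```python
import numpy as np

pre_ms, post_ms = 250, 500
pop_rate = np.sin(np.arange(3000) / 50.0) + 1.0  # hypothetical 1-ms-bin rate
events = np.array([800.0, 1500.0, 2200.0])       # event times in ms

windows = []
for e in events:
    i = int(round(e))
    if i - pre_ms >= 0 and i + post_ms <= len(pop_rate):  # skip edge events
        windows.append(pop_rate[i - pre_ms:i + post_ms])
avg_rate = np.mean(windows, axis=0)   # shape (pre_ms + post_ms,)
```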

curate_by_min_spikes(min_spikes=30)[source]

Remove units with fewer than min_spikes spikes.

See spikelab.spikedata.curation.curate_by_min_spikes for full documentation.

curate_by_firing_rate(min_rate_hz=0.05)[source]

Remove units whose firing rate is below min_rate_hz.

See spikelab.spikedata.curation.curate_by_firing_rate for full documentation.

curate_by_isi_violations(max_violation=0.01, threshold_ms=1.5, min_isi_ms=0.0, method='percent')[source]

Remove units with excessive ISI violations.

See spikelab.spikedata.curation.curate_by_isi_violations for full documentation.

curate_by_snr(min_snr=5.0, ms_before=1.0, ms_after=2.0)[source]

Remove units whose SNR is below min_snr.

See spikelab.spikedata.curation.curate_by_snr for full documentation.

curate_by_std_norm(max_std_norm=1.0, at_peak=True, window_ms_before=0.5, window_ms_after=1.5, ms_before=1.0, ms_after=2.0)[source]

Remove units whose normalized waveform STD exceeds max_std_norm.

See spikelab.spikedata.curation.curate_by_std_norm for full documentation.

compute_waveform_metrics(ms_before=1.0, ms_after=2.0, at_peak=True, window_ms_before=0.5, window_ms_after=1.5)[source]

Compute average waveforms, SNR, and normalized STD for all units.

Stores results in neuron_attributes. See spikelab.spikedata.curation.compute_waveform_metrics for full documentation.

curate(**kwargs)[source]

Apply multiple curation criteria in sequence (intersection).

See spikelab.spikedata.curation.curate for full documentation and supported keyword arguments.

curate_by_merge_duplicates(**kwargs)[source]

Remove duplicate units by merging nearby pairs with similar waveforms.

See spikelab.spikedata.curation.curate_by_merge_duplicates for full documentation and supported keyword arguments.

static build_curation_history(sd_original, sd_curated, results, parameters=None)[source]

Translate curation results into a serializable history dict.

See spikelab.spikedata.curation.build_curation_history for full documentation.

split_epochs()[source]

Split a concatenated SpikeData into per-epoch SpikeData objects.

Uses metadata["rec_chunks_ms"] (list of (start_ms, end_ms) tuples) to slice this SpikeData via subtime. Each resulting SpikeData receives the corresponding epoch template from neuron_attributes["epoch_templates"] as its "template" attribute.

Parameters:

None. Epoch boundaries and names are read from metadata.

Returns:

One SpikeData per epoch, time-shifted so each starts at t=0.

Return type:

epochs (list[SpikeData])

Notes

  • Requires metadata["rec_chunks_ms"]. Raises ValueError if not present.

  • metadata["rec_chunk_names"] (optional) is stored as metadata["source_file"] on each output SpikeData.

compare_sorter(other, comparison_type='spike_times', delta_ms=0.4, f_rel_to_trough=(20, 40), max_lag=5, n_jobs=None)[source]

Compare this sorter output against another SpikeData object.

Implements the same comparison methodology as SpikeInterface’s SymmetricSortingComparison (Buccino et al., eLife 2020): greedy spike matching within a temporal window followed by Jaccard agreement scoring. Use best_match_assignment() on the returned agreement matrix to obtain the optimal unit mapping via the Hungarian algorithm (equivalent to SpikeInterface’s get_best_unit_match).

Parameters:
  • other (SpikeData) – SpikeData from a different sorter to compare against.

  • comparison_type (Literal['spike_times', 'waveforms']) – "spike_times" for spike-train agreement or "waveforms" for template footprint similarity.

  • delta_ms (float) – Maximum temporal distance (ms) for a spike match (spike_times only).

  • f_rel_to_trough (tuple[int, int]) – (pre, post) sample window around the trough for footprint construction (waveforms only).

  • max_lag (int) – Maximum lag in samples for footprint cosine similarity search (waveforms only).

  • n_jobs (int or None) – Number of parallel workers. None or 1 for serial execution, -1 for all cores.

Returns:

Comparison output dictionary containing:
  • labels_1 / labels_2: unit indices

  • metadata: comparison settings

  • For spike_times: agreement, frac_1, frac_2

  • For waveforms: similarity

Return type:

result (dict)

References

Buccino et al., “SpikeInterface, a unified framework for spike sorting”, eLife (2020). https://doi.org/10.7554/eLife.61834
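The greedy matching plus Jaccard agreement for a single unit pair can be sketched in a few lines (toy trains; the real method does this over all unit pairs and optionally in parallel):

```python
import numpy as np

delta_ms = 0.4
t1 = np.array([10.0, 20.0, 30.0])        # unit from sorter 1 (ms)
t2 = np.array([10.1, 20.5, 29.9, 50.0])  # unit from sorter 2 (ms)

# Greedy matching: each spike in t1 claims at most one unmatched spike in t2
# within the delta_ms window.
used = np.zeros(len(t2), dtype=bool)
matches = 0
for s in t1:
    candidates = np.where(~used & (np.abs(t2 - s) <= delta_ms))[0]
    if len(candidates):
        used[candidates[0]] = True
        matches += 1

# Jaccard-style agreement: matches / (n1 + n2 - matches).
agreement = matches / (len(t1) + len(t2) - matches)
```

Here the spike at 20.0 ms finds no partner (20.5 ms is 0.5 ms away, beyond delta_ms), so 2 of 3 spikes match and the agreement is 2 / (3 + 4 - 2) = 0.4.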

static best_match_assignment(score_matrix, minimize=False)[source]

Compute optimal unit assignment from a pairwise score matrix.

Uses the Hungarian algorithm (scipy.optimize.linear_sum_assignment) to find the assignment of rows (sorter 1 units) to columns (sorter 2 units) that maximizes (or minimizes) the total score. When the matrix is non-square, unmatched units are reported separately. This is equivalent to SpikeInterface’s get_best_unit_match step.

Parameters:
  • score_matrix (np.ndarray) – Pairwise score matrix of shape (M, N), e.g. the agreement or similarity matrix returned by compare_sorter().

  • minimize (bool) – If True, find the assignment that minimizes the total score (useful for distance matrices). Default is False (maximize).

Returns:

Dictionary containing:
  • row_indices: matched row indices (length = min(M, N))

  • col_indices: matched column indices (length = min(M, N))

  • scores: score for each matched pair

  • total_score: sum of matched scores

  • unmatched_rows: row indices with no match (if M > N)

  • unmatched_cols: col indices with no match (if N > M)

  • row_order: full row permutation array (length M) — matched rows first, then unmatched. Apply to any (M, …) array via array[row_order].

  • col_order: full column permutation array (length N) — matched cols first, then unmatched. Apply to any (…, N) array via array[:, col_order].

  • reordered_matrix: score_matrix reordered so that matched pairs lie along the diagonal (equivalent to score_matrix[np.ix_(row_order, col_order)])

Return type:

result (dict)
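The core of this method is scipy.optimize.linear_sum_assignment on the negated score matrix (negated because scipy minimizes). A sketch of the matching and the row reordering for a non-square matrix (scores are made up; the real method also builds col_order and the unmatched-column bookkeeping):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical 3x2 agreement matrix (sorter 1 has one extra unit).
score = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.3, 0.2]])

rows, cols = linear_sum_assignment(-score)          # negate to maximize
unmatched_rows = np.setdiff1d(np.arange(score.shape[0]), rows)

row_order = np.concatenate([rows, unmatched_rows])  # matched rows first
reordered = score[np.ix_(row_order, cols)]          # matched pairs on diagonal
```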