SpikeData
The SpikeData class is the primary data structure in SpikeLab. It holds
spike trains for one or more units (neurons or electrodes), where each spike
train is a sequence of spike times. All analysis methods assume spike times are
in milliseconds. Most analysis workflows start by creating or loading a
SpikeData object.
- class spikelab.spikedata.SpikeData(train, *, N=None, length=None, start_time=0.0, neuron_attributes=None, metadata=None, raw_data=None, raw_time=None)[source]
Bases: object
Neuronal spike data with functionality for loading, processing, and analyzing.
- train
List of numpy arrays, where each array contains the spike times for a particular neuron.
- Type:
list of numpy.ndarray
- start_time
The time origin in milliseconds (default 0.0). For event-centered data, this is typically -pre_ms. Spike times fall within [start_time, start_time + length].
- Type:
float
- neuron_attributes
A list of dictionaries containing information on each neuron.
- Type:
list
- metadata
A dictionary containing any additional information or metadata about the spike data.
- Type:
dict
- raw_data
If provided, this numpy array contains the raw time series data.
- Type:
numpy.ndarray
- raw_time
Either a numpy array of sample times, or a single float representing a sample rate in kHz.
- Type:
numpy.ndarray or float
- static from_idces_times(idces, times, N=None, **kwargs)[source]
Create a SpikeData object from lists of unit indices and spike times.
- Parameters:
- Returns:
- A new SpikeData object with the given
unit indices and spike times.
- Return type:
spike_data (SpikeData)
Notes
This method is a wrapper around the _train_from_i_t_list helper function.
When idces is empty and N is None, defaults to 0 units and length=0.
- static from_raster(raster, bin_size_ms, **kwargs)[source]
Create a SpikeData object from a spike raster matrix with shape (N, T).
- Parameters:
raster (numpy.ndarray) – Spike raster matrix with shape (N [units], T [time bins]).
bin_size_ms (float) – Size of each time bin in milliseconds.
**kwargs – Additional keyword arguments for the SpikeData constructor.
- Returns:
Object with the given spike raster.
- Return type:
sd (SpikeData)
Notes
The generated spike times are evenly spaced within each time bin. For example, if a unit fires 3 times in a 10 ms time bin, those events go at 2.5, 5, and 7.5 ms after the start of the bin.
All metadata parameters of the regular constructor are accepted.
For event-centered rasters (where bin 0 corresponds to start_time, not t=0), pass start_time in kwargs so that spike times are correctly offset. Without it, bin 0 maps to t=0.
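The even-spacing rule can be reproduced with plain numpy. The helper below (spikes_from_raster_row, a hypothetical name, not part of SpikeLab) is a minimal sketch of the documented placement:

```python
import numpy as np

def spikes_from_raster_row(counts, bin_size_ms, start_time=0.0):
    """Place k spikes evenly inside each bin, at bin_size * (i+1)/(k+1)."""
    times = []
    for j, k in enumerate(counts):
        bin_start = start_time + j * bin_size_ms
        for i in range(int(k)):
            times.append(bin_start + bin_size_ms * (i + 1) / (k + 1))
    return np.array(times)

# 3 spikes in the first 10 ms bin land at 2.5, 5, and 7.5 ms.
print(spikes_from_raster_row([3, 0, 1], bin_size_ms=10.0))
```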
- static from_events(events, N=None, **kwargs)[source]
Create a SpikeData object from a list of (unit index, time) pairs.
- Parameters:
- Returns:
Object with the given events.
- Return type:
sd (SpikeData)
Notes
This method is a wrapper around the from_idces_times helper function. All metadata parameters of the regular constructor are accepted.
- static from_neo_spiketrains(spiketrains, **kwargs)[source]
Create a SpikeData object from a list of neo.SpikeTrain objects.
- static from_thresholding(data, fs_Hz=20000.0, threshold_sigma=5.0, filter=True, hysteresis=True, direction='both')[source]
Create a SpikeData object by filtering and thresholding raw electrophysiological data.
- Parameters:
data (numpy.ndarray) – Raw data with shape (channels, time).
fs_Hz (float) – Sampling frequency in Hz.
threshold_sigma (float) – Threshold in units of per-channel standard deviation.
filter (dict or bool) – Filter configuration or False to disable filtering; if True, a third-order Butterworth filter with passband 300 Hz to 6 kHz is used.
hysteresis (bool) – Use hysteresis for thresholding.
direction (str) – Direction of thresholding (‘both’, ‘up’, ‘down’).
- Returns:
Object with the given raw data.
- Return type:
sd (SpikeData)
Notes
To use different filter parameters, pass a dictionary, which will be passed as keyword arguments to butter_filter().
If filter is False, no filtering is done.
- __init__(train, *, N=None, length=None, start_time=0.0, neuron_attributes=None, metadata=None, raw_data=None, raw_time=None)[source]
Initialize a SpikeData object from a list of spike trains.
- Parameters:
train (list) – List of spike trains, each a list of spike times in milliseconds. Spike times can be negative for event-centered data (e.g. -200 to +300 around a stimulus event).
N (int) – Number of units (optional).
length (float) – Total duration of the time window in milliseconds (optional). For event-centered data with times from -200 to +300, length is 500. Defaults to the span from start_time to the latest spike time.
start_time (float) – Time of the first bin in milliseconds (default 0.0). For event-centered data, this is typically -pre_ms (e.g. -200.0). Spike times must fall within [start_time, start_time + length].
neuron_attributes (list) – List of neuron attributes (optional).
metadata (dict) – Dictionary of metadata (optional).
raw_data (numpy.ndarray) – Raw timeseries data with shape (channels, time) (optional).
raw_time (numpy.ndarray or float) – Raw time vector with shape (time) or sample rate in kHz (optional).
Notes
Arbitrary raw timeseries data, not associated with particular units, can be passed in as raw_data (an array with shape (channels, time)).
The raw_time argument can also be a sample rate in kHz, in which case it is generated assuming that the start of the raw data corresponds with t=0.
- property times
Iterate spike times for all units in time order.
- property events
Iterate (index, time) pairs for all units in time order.
- idces_times()[source]
Generate matched arrays of unit indices and times for all events.
- Returns:
Array of unit indices. times (numpy.ndarray): Array of times for all events.
- Return type:
idces (numpy.ndarray)
Notes
Unlike times and events, this method is not a property because the lists must actually be constructed in memory.
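A minimal numpy sketch of what idces_times produces (illustrative only, not SpikeLab's implementation):

```python
import numpy as np

# Hypothetical spike trains (ms), one array per unit.
train = [np.array([1.0, 5.0]), np.array([2.0]), np.array([0.5, 6.0])]

# Matched index/time arrays for all events, ordered by time.
idces = np.concatenate([np.full(len(t), i) for i, t in enumerate(train)])
times = np.concatenate(train)
order = np.argsort(times, kind="stable")
idces, times = idces[order], times[order]

print(idces)  # unit index of each event
print(times)  # event times in ms, ascending
```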
- property unit_locations: ndarray | None
Get unit locations as an (U, D) array where D is the spatial dimension.
- Returns:
- Array of unit locations, shape
(N, D). None if any unit lacks location data.
- Return type:
locations (numpy.ndarray or None)
Notes
Extracts from neuron_attributes ‘location’, ‘x’/’y’/’z’ (x and y required), or ‘position’ keys.
- property electrodes: ndarray | None
Get electrode/channel indices for each unit as a 1D array.
- Returns:
- 1D array of electrode
indices. None if any unit lacks electrode data.
- Return type:
electrodes (numpy.ndarray or None)
Notes
Extracts from neuron_attributes ‘electrode’, ‘channel’, or ‘ch’ keys.
- frames(length, overlap=0)[source]
Split the recording into a SpikeSliceStack of fixed-length windows.
- Parameters:
- Returns:
- Stack of SpikeData windows, one per
frame.
- Return type:
stack (SpikeSliceStack)
Notes
Windows that would extend past the end of the recording are excluded.
overlap must be strictly less than length.
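Assuming consecutive frames share overlap milliseconds (step = length - overlap), the windowing rule can be sketched as follows; frame_windows is a hypothetical helper, not part of the API:

```python
def frame_windows(total_length, length, overlap=0.0, start_time=0.0):
    """Start/end times of fixed-length frames; windows that would extend
    past the end of the recording are excluded (requires overlap < length)."""
    assert overlap < length
    step = length - overlap
    windows, t = [], start_time
    while t + length <= start_time + total_length:
        windows.append((t, t + length))
        t += step
    return windows

# 100 ms recording, 40 ms frames with 10 ms overlap -> steps of 30 ms.
print(frame_windows(100.0, 40.0, overlap=10.0))
```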
- align_to_events(events, pre_ms, post_ms, *, kind='spike', bin_size_ms=1.0, sigma_ms=10)[source]
Align spike trains to events and return an event-aligned slice stack.
- Parameters:
events (array-like or str) – Event times in milliseconds, or a string key into self.metadata whose value is an array of event times in ms.
pre_ms (float) – Window duration before each event in milliseconds.
post_ms (float) – Window duration after each event in milliseconds.
kind (str) – "spike" to return a SpikeSliceStack, or "rate" to return a RateSliceStack. Default "spike".
bin_size_ms (float) – Time bin width in milliseconds. Only used when kind="rate". Default 1.0.
sigma_ms (float) – Gaussian smoothing sigma in milliseconds for ISI-based firing rate estimation. Only used when kind="rate". Default 10.
- Returns:
- Event-aligned slice stack with one slice per event. Events whose window extends outside [start_time, start_time + length] are dropped with a warning.
- Return type:
stack (SpikeSliceStack or RateSliceStack)
Notes
When events is a metadata key, the corresponding array must already be in milliseconds (as stored by load_spikedata_from_ibl).
- binned(bin_size=40.0)[source]
Count the number of events in each time bin across all units.
Bins are relative to start_time: (start_time, start_time + bin_size], (start_time + bin_size, start_time + 2*bin_size], etc. A spike at exactly start_time is included in bin 0.
- Parameters:
bin_size (float) – Size of the time bin in milliseconds.
- Returns:
- Array of the number of events in
each bin.
- Return type:
binned_raster (numpy.ndarray)
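The bin convention can be illustrated with a small numpy sketch (binned_counts is a hypothetical helper, not SpikeLab's implementation):

```python
import numpy as np

def binned_counts(train, bin_size, start_time, length):
    """Total event counts per bin across units, using the documented
    convention: bins are (start, start + bin_size], and a spike exactly
    at start_time is clipped into bin 0."""
    n_bins = int(np.ceil(length / bin_size))
    counts = np.zeros(n_bins, dtype=int)
    for t in np.concatenate(train):
        idx = int(np.ceil((t - start_time) / bin_size)) - 1
        counts[max(idx, 0)] += 1  # clip t == start_time into bin 0
    return counts

train = [np.array([0.0, 40.0, 41.0]), np.array([80.0])]
# 40.0 falls in bin 0 (right-closed), 41.0 and 80.0 in bin 1.
print(binned_counts(train, bin_size=40.0, start_time=0.0, length=120.0))
```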
- binned_meanrate(bin_size=40, unit='kHz')[source]
Calculate the mean firing rate across the population in each time bin.
- Parameters:
- Returns:
- Array of the mean firing rate
in each bin.
- Return type:
binned_meanrate (numpy.ndarray)
Notes
The rate is calculated as the number of events in each bin divided by the bin size and number of units.
- rates(unit='kHz')[source]
Calculate the mean firing rate of each neuron over the recording.
- Parameters:
unit (str) – Unit of the firing rate (‘Hz’ or ‘kHz’).
- Returns:
Array of the firing rate of each neuron.
- Return type:
rates (numpy.ndarray)
- resampled_isi(times, sigma_ms=10.0)[source]
Calculate instantaneous firing rate of each unit at the given times.
Computes interspike intervals and interpolates their inverse.
- Parameters:
times (numpy.ndarray) – Array of times to resample the firing rate to.
sigma_ms (float) – Standard deviation of the Gaussian kernel in milliseconds.
- Returns:
- Object with inst_Frate_data (N, T) and times;
units: Hz (spikes/s).
- Return type:
- sliding_rate(window_size, step_size=None, sampling_rate=None, t_start=None, t_end=None, gauss_sigma=0.0, apply_square=True)[source]
Compute continuous firing rate of each unit using a sliding-window average.
For each time bin t, counts spikes in the centered window [t - W/2, t + W/2] and returns rate R(t) = N / W (spikes per time unit, e.g. kHz).
- Parameters:
window_size (float) – Width of the sliding window in ms. Centered window [t - W/2, t + W/2].
step_size (float, optional) – Advance step for time bins in ms. Mutually exclusive with sampling_rate.
sampling_rate (float, optional) – Samples per ms; step_size = 1 / sampling_rate. Mutually exclusive with step_size.
t_start (float, optional) – Start of output time range in ms. Default: start_time - window_size/2.
t_end (float, optional) – End of output time range in ms. Default: start_time + length + window_size/2.
gauss_sigma (float, optional) – Gaussian smoothing sigma in ms. If 0, Gaussian smoothing is disabled.
apply_square (bool, optional) – If True, applies the square-window smoothing defined by window_size. If False, computes per-bin rates first and then applies optional Gaussian smoothing.
- Returns:
- Object with inst_Frate_data (N, T) and times;
units: spikes/ms (kHz).
- Return type:
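A minimal numpy sketch of the documented sliding-window rule (sliding_rate_unit is a hypothetical single-unit helper; the boundary handling at exactly +/-W/2 is an assumption of this sketch):

```python
import numpy as np

def sliding_rate_unit(spikes, window_size, step_size, t_start, t_end):
    """For each time bin t, R(t) = (# spikes in [t - W/2, t + W/2]) / W,
    in spikes per ms (kHz). Sketch of the documented rule only."""
    spikes = np.sort(np.asarray(spikes))
    ts = np.arange(t_start, t_end + 1e-9, step_size)
    lo = np.searchsorted(spikes, ts - window_size / 2, side="left")
    hi = np.searchsorted(spikes, ts + window_size / 2, side="right")
    return ts, (hi - lo) / window_size

ts, r = sliding_rate_unit([10.0, 12.0, 30.0], window_size=20.0,
                          step_size=10.0, t_start=0.0, t_end=40.0)
print(ts)  # bin centers in ms
print(r)   # rate in kHz at each bin center
```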
- set_neuron_attribute(key, values, neuron_indices=None)[source]
Set an attribute across neurons in neuron_attributes.
- subset(units, by=None)[source]
Return a new SpikeData with only the selected units.
Units are selected either by their indices or by an ID stored under a given key in the neuron_attributes.
- Parameters:
- Returns:
New SpikeData object with the selected units.
- Return type:
sd (SpikeData)
Notes
Units are included in the output according to their order in self.train, not the order in the unit list (which is treated as a set).
raw_data and raw_time are not propagated to the subset – they remain on the original SpikeData object.
If IDs are not unique, every neuron which matches is included in the output.
Neurons whose neuron_attributes entry does not have the key are always excluded.
- neuron_to_channel_map(channel_attr=None)[source]
Return a mapping from neuron indices to channel indices.
- Parameters:
channel_attr (str or None) – Name of the attribute in neuron_attributes that contains the channel index. If None, searches for common attribute names.
- Returns:
- Mapping from neuron index (int) to channel index
(int). Returns an empty dict if neuron_attributes is None.
- Return type:
mapping (dict)
Notes
If neuron_attributes is None and channel information is required, or if the specified channel_attr doesn’t exist for all neurons, a ValueError is raised.
If channel_attr is not specified, attempts to find channel information using common attribute names: ‘channel’, ‘channel_id’, ‘channel_index’, ‘ch’, ‘channel_idx’.
- subtime(start, end, shift_to=None)[source]
Extract a subset of time points from the spike data.
Spike times are shifted so that shift_to becomes t=0 in the new SpikeData. By default shift_to=start, so subtime(100, 200) produces spikes in the range [0, 100). For event-centered slicing, pass shift_to=event_time to produce spikes from -(event - start) to +(end - event).
- Parameters:
- Returns:
- New SpikeData object containing only the
specified time range.
- Return type:
sd (SpikeData)
Notes
For standard data (start_time >= 0), negative start/end values are counted backwards from the end of the recording. For event-centered data (start_time < 0), negative values are treated as literal times.
Start and end can also be None or Ellipsis, in which case that end of the data is not truncated.
All metadata and neuron data are propagated.
The output SpikeData has start_time = start - shift_to and length = end - start.
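The shifting semantics can be sketched with plain numpy (subtime_train is a hypothetical helper operating on raw spike-time arrays, keeping spikes in the half-open window [start, end) as in the subtime(100, 200) example above):

```python
import numpy as np

def subtime_train(train, start, end, shift_to=None):
    """Keep spikes in [start, end) and shift them so shift_to maps to t=0."""
    if shift_to is None:
        shift_to = start  # default: the window start maps to t=0
    return [t[(t >= start) & (t < end)] - shift_to for t in train]

train = [np.array([90.0, 120.0, 250.0])]
# Default: subtime(100, 200) -> spikes in [0, 100)
print(subtime_train(train, 100.0, 200.0))
# Event-centered: an event at 150 ms becomes t=0
print(subtime_train(train, 100.0, 200.0, 150.0))
```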
- append(spikeData, offset=0)[source]
Append spike times from another SpikeData object to this one.
- Parameters:
- Returns:
New SpikeData object with the appended data.
- Return type:
sd (SpikeData)
Notes
The two SpikeData objects must have the same number of neurons.
On metadata key collision, values from self take precedence.
- sparse_raster(bin_size=1.0, time_offset=0.0)[source]
Bin spike times into a sparse (units, bins) matrix.
Entry (i, j) is the number of times unit i fired in bin j. Spike times are shifted by -start_time before binning so that bin 0 corresponds to start_time.
- Parameters:
- Returns:
- Sparse array where entry (i, j) is
the number of times unit i fired in bin j.
- Return type:
raster (sparse.csr_matrix)
Notes
Bins are left-open, right-closed intervals relative to start_time.
A spike at exactly start_time is clipped into bin 0.
The number of bins is always ceil((length + time_offset) / bin_size).
- raster(bin_size=1.0, time_offset=0.0)[source]
Bin spike times into a dense (units, bins) array.
Entry (i, j) is the number of times unit i fired in bin j.
- Parameters:
- Returns:
- Dense array where entry (i, j) is the
number of times unit i fired in bin j.
- Return type:
raster (numpy.ndarray)
Notes
Bins are left-open, right-closed intervals relative to start_time.
A spike at exactly start_time is clipped into bin 0.
- channel_raster(bin_size=1.0, channel_attr=None)[source]
Create a raster aggregated by channel instead of neuron.
- Parameters:
- Returns:
- Dense array where entry (c, j)
is the total number of spikes from all neurons on channel c in bin j.
- Return type:
channel_raster (numpy.ndarray)
Notes
Channels are determined from neuron_attributes using the same logic as neuron_to_channel_map().
If neuron_attributes is None or no channel information can be found, a ValueError is raised.
- get_waveform_traces(unit=None, ms_before=1.0, ms_after=2.0, channels=None, bandpass=None, filter_order=3, store=True, return_channel_waveforms=False, return_avg_waveform=True)[source]
Extract raw voltage waveforms around spike times from raw data.
- Parameters:
unit (int, slice, list, or None) – Unit index selection. int extracts a single unit (returns a single waveform array); slice/list-like/range extracts a subset (returns a list); None extracts all units (returns a list).
ms_before (float) – Milliseconds before each spike time (default: 1.0).
ms_after (float) – Milliseconds after each spike time (default: 2.0).
channels (int, list, or None) – Channel(s) to extract. None uses neuron_to_channel_map or all channels; int for single channel; list for multiple; [] for mapped channel.
bandpass (tuple or None) – Optional (lowcut_Hz, highcut_Hz) for bandpass filtering.
filter_order (int) – Butterworth filter order (default: 3).
store (bool) – If True (default), stores waveforms and avg_waveform in neuron_attributes.
return_channel_waveforms (bool) – If True, include a per-channel dict in the return.
return_avg_waveform (bool) – If False, skip computing/returning avg_waveform (it will be None).
- Returns:
- If unit is an int, a single
3D array shaped (num_channels, num_samples, num_spikes). Otherwise, a list of 3D arrays, one per selected unit.
- meta (dict): Dictionary with keys: fs_kHz, unit_indices,
channels, spike_times_ms, avg_waveforms (optional), channel_waveforms (optional).
- Return type:
waveforms (numpy.ndarray or list)
- interspike_intervals()[source]
Produce a list of arrays of interspike intervals per unit.
- Returns:
List of arrays of interspike intervals per unit.
- Return type:
isis (list)
- concatenate_spike_data(sd)[source]
Combine units from another SpikeData object with this one.
Returns a new SpikeData with the units of both objects. If the other SpikeData has a different time range, it is subtimed to match this one.
- Parameters:
sd (SpikeData) – SpikeData object whose units will be added.
- Returns:
New SpikeData with units from both objects.
- Return type:
combined (SpikeData)
Notes
raw_data and raw_time are carried over from self only.
If only one object has neuron_attributes, a RuntimeWarning is issued and the attributes from both are not merged.
- spike_time_tilings(delt=20.0)[source]
Compute the spike time tiling coefficient matrix.
- Parameters:
delt (float) – Time window in milliseconds (default: 20.0).
- Returns:
Spike time tiling coefficient matrix.
- Return type:
ret (PairwiseCompMatrix)
Notes
When numba is installed, computation is parallelised across all unit pairs using numba’s prange.
Reference: Cutts & Eglen. Detecting pairwise correlations in spike trains. J. Neurosci. 34:43, 14288-14303 (2014).
- spike_time_tiling(i, j, delt=20.0)[source]
Calculate the spike time tiling coefficient between two units.
- Parameters:
- Returns:
Spike time tiling coefficient between the two units.
- Return type:
ret (float)
Notes
Reference: Cutts & Eglen. Detecting pairwise correlations in spike trains. J. Neurosci. 34:43, 14288-14303 (2014).
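A plain-numpy sketch of the tiling coefficient from the Cutts & Eglen reference (illustrative, not SpikeLab's optimized implementation): STTC = 1/2 [(P_A - T_B)/(1 - P_A T_B) + (P_B - T_A)/(1 - P_B T_A)], where T_X is the fraction of the recording within +/-delt of X's spikes and P_X is the fraction of X's spikes within +/-delt of any spike of the other train.

```python
import numpy as np

def sttc(a, b, delt, length):
    """Spike time tiling coefficient between two sorted spike trains (ms)."""
    a, b = np.sort(a), np.sort(b)

    def tiled_fraction(x):
        # Merge overlapping [t - delt, t + delt] intervals, sum their length.
        total, cur_lo, cur_hi = 0.0, None, None
        for t in x:
            lo, hi = max(t - delt, 0.0), min(t + delt, length)
            if cur_hi is None or lo > cur_hi:
                if cur_hi is not None:
                    total += cur_hi - cur_lo
                cur_lo, cur_hi = lo, hi
            else:
                cur_hi = max(cur_hi, hi)
        if cur_hi is not None:
            total += cur_hi - cur_lo
        return total / length

    def prop_within(x, y):
        # Fraction of spikes in x with a spike of y within +/-delt.
        i = np.searchsorted(y, x)
        left = np.where(i > 0, x - y[np.maximum(i - 1, 0)], np.inf)
        right = np.where(i < len(y), y[np.minimum(i, len(y) - 1)] - x, np.inf)
        return np.mean(np.minimum(left, right) <= delt)

    ta, tb = tiled_fraction(a), tiled_fraction(b)
    pa, pb = prop_within(a, b), prop_within(b, a)
    return 0.5 * ((pa - tb) / (1 - pa * tb) + (pb - ta) / (1 - pb * ta))

a = np.array([100.0, 500.0, 900.0])
print(sttc(a, a.copy(), delt=20.0, length=1000.0))  # identical trains -> 1.0
```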
- get_pairwise_ccg(compare_func=<function compute_cross_correlation_with_lag>, bin_size=1.0, max_lag=350, n_jobs=-1)[source]
Compute pairwise cross-correlogram matrices from binned spike arrays.
Bins the spike trains into a binary raster and computes the pairwise similarity between all unit pairs using lagged cross-correlation (default) or lagged cosine similarity.
- Parameters:
compare_func (callable) – Comparison function from utils. Must accept (ref_signal, comp_signal, max_lag=int) and return (score, lag). Default is compute_cross_correlation_with_lag.
bin_size (float) – Bin size in milliseconds for the binary raster (default: 1.0).
max_lag (float) – Maximum lag in milliseconds to search for the peak correlation (default: 350). Converted to bins internally.
n_jobs (int) – Number of threads for parallel computation. -1 uses all cores (default), 1 disables parallelism, None is serial.
- Returns:
- Matrix of maximum correlation
coefficients between all unit pairs. Diagonal is always 1.
- lag_matrix (PairwiseCompMatrix): Matrix of time lags in bins at
which maximum correlation occurs. Positive lag means unit j leads unit i. Diagonal is always 0.
- Return type:
corr_matrix (PairwiseCompMatrix)
- latencies(times, window_ms=100.0)[source]
Compute latencies from each time to the nearest spike per unit.
- get_pairwise_latencies(window_ms=None, return_distributions=False)[source]
Compute pairwise nearest-spike latency distributions between all unit pairs.
For each ordered pair (i, j), and for each spike in train i, finds the closest spike in train j and records the signed latency (t_j - t_i). Both directions are computed independently.
- Parameters:
window_ms (float or None) – If not None, discard latencies where the absolute distance exceeds this value (default: None, no filtering).
return_distributions (bool) – If True, also return a (U, U) numpy object array where entry [i, j] is an ndarray of all individual signed latencies from unit i to unit j (default: False).
- Returns:
- Matrix of mean signed
latencies in milliseconds. Entry [i, j] is the average latency from each spike in unit i to the nearest spike in unit j. Diagonal is 0.
- std_latency (PairwiseCompMatrix): Matrix of latency standard
deviations. Entry [i, j] is the std of latencies from unit i to unit j. Diagonal is 0.
- distributions (numpy.ndarray): Only returned when
return_distributions is True. A (U, U) object array where [i, j] is an ndarray of all signed latencies from unit i to unit j.
- Return type:
mean_latency (PairwiseCompMatrix)
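The per-spike latency definition can be sketched with numpy's searchsorted (signed_latencies is a hypothetical helper for one ordered pair):

```python
import numpy as np

def signed_latencies(ti, tj):
    """For each spike in train i, the signed latency (t_j - t_i) to the
    nearest spike in train j (sketch of the documented definition)."""
    tj = np.sort(tj)
    pos = np.searchsorted(tj, ti)
    cand_left = np.where(pos > 0, tj[np.maximum(pos - 1, 0)], np.inf)
    cand_right = np.where(pos < len(tj), tj[np.minimum(pos, len(tj) - 1)], np.inf)
    lat_left, lat_right = cand_left - ti, cand_right - ti
    # keep whichever neighbor is closer in absolute distance
    return np.where(np.abs(lat_left) <= np.abs(lat_right), lat_left, lat_right)

ti = np.array([10.0, 50.0])
tj = np.array([12.0, 45.0, 60.0])
print(signed_latencies(ti, tj))  # [ 2. -5.]
```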
- latencies_to_index(i, window_ms=100.0)[source]
Compute the latency from one unit to all other units.
- get_frac_active(edges, MIN_SPIKES, backbone_threshold, bin_size=1.0)[source]
Compute fraction of units active per burst and backbone identity.
- Parameters:
edges (numpy.ndarray) – Array of shape (B, 2) containing [start, end] indices for each burst. Indices are in raster bin coordinates (bin index = time_ms / bin_size).
MIN_SPIKES (int) – Minimum number of spikes required for a unit to be considered active in a burst.
backbone_threshold (float) – Minimum fraction of bursts a unit must be active in to be considered a backbone unit (0 to 1).
bin_size (float) – Raster bin size in milliseconds (default 1.0). Must match the bin size used to compute
edges.
- Returns:
- 1D array where each value is the
fraction of bursts in which the neuron was active.
- frac_per_burst (numpy.ndarray): 1D array where each value is the
fraction of neurons active in that burst.
- backbone_units (numpy.ndarray): 1D array of the neuron/unit
indices that are backbone units.
- Return type:
frac_per_unit (numpy.ndarray)
- get_frac_spikes_in_burst(edges, bin_size=1.0)[source]
Compute the fraction of each unit’s spikes that fall inside burst windows.
- Parameters:
edges (numpy.ndarray) – Array of shape (B, 2) containing [start, end] indices for each burst. Indices are in raster bin coordinates (bin index = time_ms / bin_size).
bin_size (float) – Raster bin size in milliseconds (default 1.0). Must match the bin size used to compute
edges.
- Returns:
- 1D array of shape (N,)
where each value is the fraction of the unit’s total spikes that fall inside any burst window. NaN for units with zero spikes.
- Return type:
frac_spikes_in_burst (numpy.ndarray)
- spike_shuffle(swap_per_spike=5, seed=None, bin_size=1)[source]
Shuffle the spike matrix using degree-preserving double-edge swaps.
- Parameters:
swap_per_spike (int) – Determines total number of swaps: num_spikes * swap_per_spike (default: 5).
seed (int or None) – Random seed for repeatability. None means no seed is set (default: None).
bin_size (int) – Number of individual time steps per bin. Bins with multiple spikes are binarized to 1. A RuntimeWarning is issued when multi-spike bins are detected (default: 1).
- Returns:
- SpikeData object with shuffled
spike train matrix.
- Return type:
shuffled_spike_data (SpikeData)
Notes
Each neuron’s average firing rate is preserved, but the specific time bin in which it spikes is shuffled.
Each time bin’s population rate is preserved, but the specific units active in each time bin are shuffled.
Every swap involves 2 different spikes, so on average each spike is swapped 2*swap_per_spike times.
Reference: Okun, M. et al. Population rate dynamics and multineuron firing patterns in sensory cortex. J. Neurosci. 32, 17108-17119 (2012).
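A minimal sketch of a degree-preserving double-edge swap on a binary raster (illustrative only; SpikeLab's implementation details may differ):

```python
import numpy as np

def double_edge_swap(raster, swap_per_spike=5, seed=0):
    """Pick two spikes (i1, t1), (i2, t2) and move them to (i1, t2),
    (i2, t1) when both targets are empty. Every row sum (per-unit rate)
    and column sum (population rate per bin) is preserved."""
    rng = np.random.default_rng(seed)
    m = raster.copy()
    n_swaps = int(m.sum()) * swap_per_spike
    for _ in range(n_swaps):
        rows, cols = np.nonzero(m)
        k1, k2 = rng.integers(len(rows), size=2)
        i1, t1, i2, t2 = rows[k1], cols[k1], rows[k2], cols[k2]
        if i1 != i2 and t1 != t2 and m[i1, t2] == 0 and m[i2, t1] == 0:
            m[i1, t1] = m[i2, t2] = 0
            m[i1, t2] = m[i2, t1] = 1
    return m

r = (np.random.default_rng(1).random((5, 50)) < 0.2).astype(int)
s = double_edge_swap(r)
# Row and column sums are invariant under the swaps.
print(np.array_equal(s.sum(1), r.sum(1)), np.array_equal(s.sum(0), r.sum(0)))
```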
- spike_shuffle_stack(n_shuffles, seed=None, swap_per_spike=5, bin_size=1)[source]
Generate multiple shuffled copies as a SpikeSliceStack.
Each shuffle is an independent call to spike_shuffle. The resulting stack can be used with SpikeSliceStack.apply to build null distributions for statistical testing.
- Parameters:
- Returns:
- Stack of n_shuffles shuffled SpikeData
objects. All slices share the same time bounds.
- Return type:
stack (SpikeSliceStack)
- subset_stack(n_subsets, units_per_subset, seed=None)[source]
Generate multiple random unit subsets as a SpikeSliceStack.
Each subset is drawn by sampling units_per_subset unit indices without replacement from the full unit set. Draws are independent across subsets (with replacement across draws), so the same unit may appear in multiple subsets.
- Parameters:
- Returns:
- Stack of n_subsets subsetted SpikeData
objects. All slices share the same time bounds.
- Return type:
stack (SpikeSliceStack)
Notes
The stack-level neuron_attributes is None because each subset contains a different set of units. Individual SpikeData objects within the stack carry their own subsetted attributes.
- to_hdf5(filepath, *, style='ragged', raster_dataset='raster', raster_bin_size_ms=None, spike_times_dataset='spike_times', spike_times_index_dataset='spike_times_index', spike_times_unit='s', fs_Hz=None, group_per_unit='units', group_time_unit='s', idces_dataset='idces', times_dataset='times', times_unit='ms', raw_dataset=None, raw_time_dataset=None, raw_time_unit='ms')[source]
Export this SpikeData to an HDF5 file with flexible formatting.
Supports four storage styles: ‘raster’ (dense 2D array), ‘ragged’ (flat spike times with index array), ‘group’ (separate dataset per unit), and ‘paired’ (parallel index and time arrays).
- Parameters:
filepath (str) – Path to the output HDF5 file.
style (str) – Storage format style (‘raster’, ‘ragged’, ‘group’, or ‘paired’). Defaults to ‘ragged’.
raster_dataset (str) – Dataset name for raster data (style=’raster’).
raster_bin_size_ms (float) – Bin size in milliseconds for rasterization. Required for ‘raster’ style.
spike_times_dataset (str) – Dataset name for flat spike times (style=’ragged’).
spike_times_index_dataset (str) – Dataset name for cumulative spike counts per unit (style=’ragged’).
spike_times_unit (str) – Time unit for spike times in ragged format (‘ms’, ‘s’, or ‘samples’).
fs_Hz (float) – Sampling frequency in Hz, required when converting to ‘samples’ unit.
group_per_unit (str) – Group name containing per-unit datasets (style=’group’).
group_time_unit (str) – Time unit for individual unit datasets (‘ms’, ‘s’, or ‘samples’).
idces_dataset (str) – Dataset name for unit indices (style=’paired’).
times_dataset (str) – Dataset name for spike times (style=’paired’).
times_unit (str) – Time unit for paired times (‘ms’, ‘s’, or ‘samples’).
raw_dataset (str or None) – Reserved for future raw data export.
raw_time_dataset (str or None) – Reserved for future raw time axis export.
raw_time_unit (str) – Time unit for raw data timestamps (‘ms’, ‘s’, or ‘samples’).
- Return type:
Notes
All spike times are stored internally in milliseconds and converted to the requested output unit.
When using ‘samples’ unit, fs_Hz must be provided for proper conversion.
- to_nwb(filepath, *, spike_times_dataset='spike_times', spike_times_index_dataset='spike_times_index', group='units')[source]
Export this SpikeData to a minimal NWB-compatible HDF5 file.
Stores spike times in the standard ‘/units’ group format for round-tripping with the NWB loader.
- Parameters:
filepath (str) – Path to the output NWB file (.nwb extension recommended).
spike_times_dataset (str) – Name of the dataset containing flattened spike times in seconds. Standard NWB uses “spike_times”.
spike_times_index_dataset (str) – Name of the dataset containing cumulative spike counts per unit for indexing into spike_times. Standard NWB uses “spike_times_index”.
group (str) – Name of the HDF5 group to contain the spike data. Standard NWB uses “units” for the units table.
- Return type:
Notes
Spike times are automatically converted from internal milliseconds to seconds as required by the NWB standard.
The output file contains only the essential spike timing data, not the full NWB metadata structure.
Compatible with both pynwb and h5py-based NWB readers.
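The ragged layout and the ms-to-seconds conversion can be sketched with plain numpy (this shows only the contents of the two datasets, not actual HDF5 I/O):

```python
import numpy as np

# Hypothetical spike trains in ms; the NWB-style layout stores a flat
# array of spike times in seconds plus cumulative per-unit counts.
train_ms = [np.array([10.0, 250.0]), np.array([5.0]), np.array([])]

spike_times = np.concatenate(train_ms) / 1000.0            # ms -> s
spike_times_index = np.cumsum([len(t) for t in train_ms])  # cumulative counts

# Unit 0's spikes are recovered as spike_times[0:spike_times_index[0]].
print(spike_times)        # flat times in seconds: 0.01, 0.25, 0.005
print(spike_times_index)  # [2 3 3]
```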
- to_kilosort(folder, *, fs_Hz, spike_times_file='spike_times.npy', spike_clusters_file='spike_clusters.npy', time_unit='samples', cluster_ids=None)[source]
Export this SpikeData to a KiloSort/Phy-compatible folder.
Creates spike_times.npy and spike_clusters.npy arrays for use with Phy.
- Parameters:
folder (str) – Output directory path. Will be created if it doesn’t exist.
fs_Hz (float) – Sampling frequency in Hz. Required for time unit conversion, especially when using ‘samples’.
spike_times_file (str) – Filename for the spike times array. Standard KiloSort uses “spike_times.npy”.
spike_clusters_file (str) – Filename for the cluster assignments array. Standard KiloSort uses “spike_clusters.npy”.
time_unit (str) – Output time unit for spike times (‘samples’, ‘ms’, or ‘s’).
cluster_ids (list[int] or None) – Optional list of cluster IDs to assign to each unit. Must have length equal to self.N. If None, uses sequential integers [0, 1, 2, …].
- Returns:
- Paths to the created
(spike_times_file, spike_clusters_file).
- Return type:
Notes
Empty units (no spikes) are skipped in the output arrays.
Cluster IDs are mapped to units in order, so cluster_ids[i] corresponds to unit i in the SpikeData.
The ‘samples’ time unit is most common for KiloSort workflows.
- get_pop_rate(square_width=20, gauss_sigma=100, raster_bin_size_ms=1.0)[source]
Compute smoothed population firing rate.
Smooths the summed spike counts using a moving-average (square) window, then a Gaussian smoothing window.
- Parameters:
- Returns:
- Smoothed population spiking data in
spikes per bin.
- Return type:
pop_rate (numpy.ndarray)
Notes
square_width and gauss_sigma are specified in milliseconds and converted to bin counts internally using raster_bin_size_ms. With the default raster_bin_size_ms=1.0, 1 ms = 1 bin.
The returned array index corresponds to raster bin index. For event-centered data (start_time < 0), bin 0 corresponds to start_time, not t=0. To find the bin for t=0 (the event), use event_bin = int(-start_time / raster_bin_size_ms).
- compute_spike_trig_pop_rate(window_ms=80, cutoff_hz=20, fs=1000, bin_size=1, cut_outer=10)[source]
Compute spike-triggered population rate (stPR).
Implementation of the stPR measure from Bimbard et al., building on Okun et al. (Nature, 2015). For each neuron i and lag tau, the leave-one-out population rate is computed excluding neuron i. The coupling curve measures how much the other neurons’ activity deviates from their temporal mean at times offset by tau from neuron i’s spikes.
- Parameters:
window_ms (int) – Half-width of the lag window in ms (window from -window_ms to +window_ms).
cutoff_hz (float) – Low-pass Butterworth filter cutoff in Hz applied to the coupling curves.
fs (float) – Sampling rate in Hz used for filter design.
bin_size (float) – Bin size in ms for the spike raster.
cut_outer (int) – Number of outer lag bins to ignore.
- Returns:
- Low-pass filtered coupling
curves for every neuron, shape (N, 2*window_ms + 1).
- coupling_strengths_zero_lag (numpy.ndarray): Coupling strength
at lag 0, shape (N,).
- coupling_strengths_max (numpy.ndarray): Peak coupling strength
within the trimmed lag window, shape (N,).
- delays (numpy.ndarray): Lag (in ms) at which peak coupling
occurs, shape (N,). Positive means neuron leads population.
- lags (numpy.ndarray): Array of lag values from -window_ms to
+window_ms.
- Return type:
stPR_filtered (numpy.ndarray)
- get_bursts(thr_burst, min_burst_diff, burst_edge_mult_thresh, square_width=20, gauss_sigma=100, acc_square_width=8, acc_gauss_sigma=8, raster_bin_size_ms=1.0, peak_to_trough=True, pop_rate=None, pop_rate_acc=None, pop_rms_override=None)[source]
Detect bursts using thresholded peak finding and edge detection.
- Parameters:
thr_burst (float) – Threshold multiplier for burst peak detection.
min_burst_diff (int) – Minimum time between detected bursts (in bins).
burst_edge_mult_thresh (float) – Threshold multiplier for burst edge detection.
square_width (float) – Square window width for calculating pop_rate (in milliseconds).
gauss_sigma (float) – Gaussian window sigma for calculating pop_rate (in milliseconds).
acc_square_width (float) – Square window width for calculating pop_rate_acc (in milliseconds).
acc_gauss_sigma (float) – Gaussian window sigma for calculating pop_rate_acc (in milliseconds).
raster_bin_size_ms (float) – Time bin size for calculating population rate in ms.
peak_to_trough (bool) – Flag to calculate bursts peak-to-trough (True) or peak-to-zero (False).
pop_rate (numpy.ndarray or None) – Pre-calculated smoothed population spiking data in spikes per bin.
pop_rate_acc (numpy.ndarray or None) – Pre-calculated accurate smoothed population spiking data in spikes per bin.
pop_rms_override (float or None) – RMS override for burst threshold baseline.
- Returns:
- tburst (numpy.ndarray): Time bin indices of detected bursts.
- edges (numpy.ndarray): Burst edge indices, shape (N, 2).
- peak_amp (numpy.ndarray): Amplitudes of bursts at corresponding array indices.
- Return type:
tburst (numpy.ndarray)
Notes
Will use pop_rate and pop_rate_acc if provided; otherwise they are calculated using the given square widths and Gaussian sigmas.
Using the peak-to-zero calculations may result in several bursts being detected at one peak.
Returned time bin indices are relative to bin 0 of the raster. For event-centered data (start_time < 0), convert to event-relative ms via tburst * raster_bin_size_ms + start_time.
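The peak-detection part of this method can be sketched with scipy.signal.find_peaks on a toy smoothed population rate. This is an illustration of the thresholding idea only, not the library code; the real method also performs edge detection, and the variable names below mirror the documented parameters:

```python
import numpy as np
from scipy.signal import find_peaks

# Toy smoothed population rate: two burst-like bumps plus noise (1-ms bins).
t = np.arange(2000)
pop_rate = (np.exp(-0.5 * ((t - 500) / 30) ** 2)
            + np.exp(-0.5 * ((t - 1400) / 30) ** 2))
pop_rate += 0.02 * np.random.default_rng(1).standard_normal(t.size)

# thr_burst scales an RMS baseline; min_burst_diff enforces peak spacing.
thr_burst, min_burst_diff = 3.0, 100
rms = np.sqrt(np.mean(pop_rate ** 2))
tburst, props = find_peaks(pop_rate, height=thr_burst * rms,
                           distance=min_burst_diff)
peak_amp = props["peak_heights"]
```

Both bumps clear the threshold, so `tburst` contains two peak indices, one near each bump center.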
- burst_sensitivity(thr_values, dist_values, burst_edge_mult_thresh, square_width=20, gauss_sigma=100, acc_square_width=8, acc_gauss_sigma=8, raster_bin_size_ms=1.0, peak_to_trough=True, pop_rate=None, pop_rate_acc=None, pop_rms_override=None)[source]
Sweep burst detection parameters and return burst counts.
Calls get_bursts for every combination of thr_values and dist_values, holding burst_edge_mult_thresh constant.
- Parameters:
thr_values (array-like) – 1-D array of thr_burst values to sweep.
dist_values (array-like) – 1-D array of min_burst_diff values (in bins) to sweep.
burst_edge_mult_thresh (float) – Held constant during the sweep.
square_width (int) – Square window width for pop_rate (in bins).
gauss_sigma (int) – Gaussian window sigma for pop_rate (in bins).
acc_square_width (int) – Square window width for pop_rate_acc (in bins).
acc_gauss_sigma (int) – Gaussian window sigma for pop_rate_acc (in bins).
raster_bin_size_ms (float) – Time bin size for population rate in ms.
peak_to_trough (bool) – Peak-to-trough (True) or peak-to-zero (False) burst detection.
pop_rate (numpy.ndarray or None) – Pre-computed smoothed population rate.
pop_rate_acc (numpy.ndarray or None) – Pre-computed accurate smoothed population rate.
pop_rms_override (float or None) – RMS override for burst threshold baseline.
- Returns:
Integer array of shape (len(thr_values), len(dist_values)) with the number of detected bursts for each parameter combination.
- Return type:
burst_counts (numpy.ndarray)
Notes
Either thr_values or dist_values can have length 1 to focus the sensitivity analysis on a single parameter.
Pre-computing pop_rate and pop_rate_acc and passing them in avoids redundant smoothing inside the loop.
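The sweep itself amounts to a nested loop over the two parameter grids. A self-contained sketch, with a toy peak counter standing in for get_bursts (illustrative only; the real detector also applies edge detection and noise handling):

```python
import numpy as np
from scipy.signal import find_peaks

def count_bursts(pop_rate, thr, dist):
    """Count peaks above thr * RMS separated by at least dist bins."""
    rms = np.sqrt(np.mean(pop_rate ** 2))
    peaks, _ = find_peaks(pop_rate, height=thr * rms, distance=dist)
    return len(peaks)

# Toy population rate with two clean bumps.
t = np.arange(2000)
pop_rate = (np.exp(-0.5 * ((t - 500) / 30) ** 2)
            + np.exp(-0.5 * ((t - 1400) / 30) ** 2))

thr_values, dist_values = [2.0, 5.0], [50, 100]
burst_counts = np.array([[count_bursts(pop_rate, thr, d)
                          for d in dist_values] for thr in thr_values])
# shape (len(thr_values), len(dist_values))
```

The low threshold finds both bumps at either spacing; the high threshold (above the unit peak amplitude) finds none.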
- fit_gplvm(bin_size_ms=50.0, movement_variance=1.0, tuning_lengthscale=10.0, n_latent_bin=100, n_iter=20, n_time_per_chunk=10000, random_seed=3, model_class=None, **model_kwargs)[source]
Fit a Gaussian Process Latent Variable Model to binned spike counts.
Bins the spike data into a spike count matrix, fits a GPLVM model via expectation-maximisation, decodes latent states, and returns the results together with a unit reordering based on tuning peaks.
- Parameters:
bin_size_ms (float) – Bin width in milliseconds for spike count matrix.
movement_variance (float) – Movement variance hyperparameter for the GPLVM transition kernel.
tuning_lengthscale (float) – Lengthscale hyperparameter for the tuning curve kernel.
n_latent_bin (int) – Number of latent bins (discretisation of the latent space).
n_iter (int) – Number of EM iterations.
n_time_per_chunk (int) – Number of time bins per chunk for chunked inference (controls memory usage).
random_seed (int) – Random seed for JAX PRNG key.
model_class (class or None) – Model class to use. Defaults to PoissonGPLVMJump1D from poor_man_gplvm.
**model_kwargs – Additional keyword arguments passed to the model constructor (e.g. p_move_to_jump, basis_type).
- Returns:
Dictionary with keys: "decode_res" (decoded latent state dictionary), "log_marginal_l" (array of log marginal likelihoods per EM iteration), "reorder_indices" (unit reordering indices based on tuning curve peaks), "model" (the fitted model object), "binned_spike_counts" (the (T, N) binned spike count matrix), and "bin_size_ms" (the bin width used).
- Return type:
result (dict)
Notes
Requires poor_man_gplvm and jax. Install with pip install poor-man-gplvm jax jaxlib.
The binned spike count matrix has shape (T, N), where T is the number of time bins and N is the number of units.
To compute metrics from the fitted model, see the GPLVM utility functions in spikedata.utils.
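The (T, N) binning step can be reproduced with numpy.histogram. A toy sketch (not the library's internal code; spike trains and durations are invented):

```python
import numpy as np

# Toy spike trains in ms for N=3 units.
train = [np.array([10.0, 55.0, 120.0]), np.array([60.0, 61.0]), np.array([])]
bin_size_ms, length_ms = 50.0, 150.0

# Bin edges covering [0, length_ms] in steps of bin_size_ms.
edges = np.arange(0.0, length_ms + bin_size_ms, bin_size_ms)
binned_spike_counts = np.stack(
    [np.histogram(ts, bins=edges)[0] for ts in train], axis=1)  # (T, N)
```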
- plot(**kwargs)[source]
Assemble a multi-panel column figure from this SpikeData object.
Thin wrapper around
plot_utils.plot_recording(self, **kwargs). Seeplot_recordingfor the full list of parameters.- Parameters:
**kwargs – All keyword arguments are forwarded to
plot_recording.- Returns:
The assembled figure.
- Return type:
fig (matplotlib.Figure)
- plot_spatial_network(ax, matrix, edge_threshold=None, top_pct=None, node_size_range=(2, 20), node_cmap='viridis', node_linewidth=0.2, edge_color='red', edge_linewidth=0.6, edge_alpha_range=(0.15, 1.0), scale_bar_um=500, font_size=None)[source]
Plot units at their MEA positions with pairwise edges.
Unit positions are extracted from neuron_attributes (requires 'x' and 'y' keys). Thin wrapper around plot_utils.plot_spatial_network.
- Parameters:
ax (matplotlib.axes.Axes) – Target axes.
matrix (numpy.ndarray) – Symmetric (N, N) pairwise matrix (e.g. correlation). Diagonal values are ignored.
edge_threshold (float or None) – Minimum matrix value to draw an edge.
top_pct (float or None) – Percentage of top edges to draw.
node_size_range (tuple) – (min_size, max_size) in points squared for scatter markers.
node_cmap (str) – Matplotlib colourmap for node colour.
node_linewidth (float) – Outline width of node markers.
edge_color (str) – Colour for network edges.
edge_linewidth (float) – Line width for network edges.
edge_alpha_range (tuple) – (min_alpha, max_alpha) for edge transparency.
scale_bar_um (float) – Scale bar length in micrometres (0 to omit).
font_size (int or None) – Font size for scale bar label.
- Returns:
The scatter artist, useful for adding a colorbar.
- Return type:
scatter (matplotlib.collections.PathCollection)
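Edge selection from the pairwise matrix amounts to thresholding the upper triangle. An illustrative numpy sketch of that step (not the plotting code itself):

```python
import numpy as np

corr = np.array([[1.0, 0.8, 0.1],
                 [0.8, 1.0, 0.5],
                 [0.1, 0.5, 1.0]])   # toy symmetric (N, N) matrix

# Unique off-diagonal pairs; the diagonal never enters the edge set.
rows, cols = np.triu_indices_from(corr, k=1)
vals = corr[rows, cols]

edge_threshold = 0.4
keep = vals >= edge_threshold
edges = list(zip(rows[keep], cols[keep], vals[keep]))
```

Here two edges survive: (0, 1) at 0.8 and (1, 2) at 0.5. The top_pct alternative would instead keep the highest-valued fraction of `vals`.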
- plot_aligned_pop_rate(events=None, pre_ms=250, post_ms=500, ax=None, pop_rate=None, pop_rate_params=None, color=None, label=None, linewidth=1.5, show_individual=False, individual_alpha=0.15, individual_linewidth=0.5, burst_edges=None, edge_percentile=None, xlabel='Time from event (ms)', ylabel='Pop. rate', font_size=None)[source]
Plot the average population rate aligned to events.
Cuts the population rate around each event and plots the mean trace. When events is None, burst peaks are auto-detected via self.get_bursts().
- Parameters:
events (array-like or None) – Event times in ms. If None, burst peaks are auto-detected.
pre_ms (float) – Window duration before each event in ms.
post_ms (float) – Window duration after each event in ms.
ax (matplotlib.axes.Axes or None) – Target axes. If None, a new figure is created.
pop_rate (numpy.ndarray or None) – Pre-computed population rate (1-ms bins, full recording). If None, computed via self.get_pop_rate().
pop_rate_params (dict or None) – Keyword arguments forwarded to self.get_pop_rate() when pop_rate is None. Defaults: square_width=5, gauss_sigma=5.
color (str or None) – Colour for the mean trace. If None, uses the first colour from the default colour cycle.
label (str or None) – Legend label for the mean trace.
linewidth (float) – Line width for the mean trace.
show_individual (bool) – If True, plot each individual event’s pop-rate trace as a thin line behind the average.
individual_alpha (float) – Alpha for individual traces.
individual_linewidth (float) – Line width for individual traces.
burst_edges (numpy.ndarray or None) – Per-event [start_ms, end_ms] boundaries, shape (B, 2).
edge_percentile (float or None) – Percentile (0-100) controlling how conservatively the edge markers are placed. When set, vertical dashed lines are drawn at the resulting positions.
xlabel (str) – X-axis label.
ylabel (str) – Y-axis label.
font_size (int or None) – Font size for labels and ticks. If None, uses current rcParams.
- Returns:
The mean population rate across events, shape (pre_ms + post_ms,).
- Return type:
avg_rate (numpy.ndarray)
Notes
To compare multiple conditions, call this method once per condition on the same ax. Each call draws its own trace with its own color and label; add ax.legend() after the last call.
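The alignment logic is windowed slicing of the population rate around each event. A numpy sketch with synthetic events (toy data; the method itself also handles plotting and edge markers):

```python
import numpy as np

pop_rate = np.zeros(1000)          # toy population rate, 1-ms bins
pop_rate[200:210] = 1.0            # activity around two "events"
pop_rate[600:610] = 1.0
events = np.array([205, 605])      # event times in ms
pre_ms, post_ms = 50, 100

# One window per event, then average across events.
windows = np.stack([pop_rate[e - pre_ms : e + post_ms] for e in events])
avg_rate = windows.mean(axis=0)    # shape (pre_ms + post_ms,)
time_axis = np.arange(-pre_ms, post_ms)   # x-axis: time from event (ms)
```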
- curate_by_min_spikes(min_spikes=30)[source]
Remove units with fewer than min_spikes spikes.
See spikelab.spikedata.curation.curate_by_min_spikes for full documentation.
- curate_by_firing_rate(min_rate_hz=0.05)[source]
Remove units whose firing rate is below min_rate_hz.
See spikelab.spikedata.curation.curate_by_firing_rate for full documentation.
- curate_by_isi_violations(max_violation=0.01, threshold_ms=1.5, min_isi_ms=0.0, method='percent')[source]
Remove units with excessive ISI violations.
See spikelab.spikedata.curation.curate_by_isi_violations for full documentation.
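The 'percent' method presumably boils down to the fraction of inter-spike intervals shorter than the refractory threshold. A toy sketch of that criterion (illustrative, not the curation module's code):

```python
import numpy as np

# Regular 10-ms spiking plus one refractory-violating pair at 5.0/5.5 ms.
spike_times = np.sort(np.r_[np.arange(0.0, 1000.0, 10.0), [5.0, 5.5]])

threshold_ms = 1.5
isis = np.diff(spike_times)
violation_frac = np.mean(isis < threshold_ms)  # 1 violating ISI out of 101
keep_unit = violation_frac <= 0.01             # max_violation criterion
```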
- curate_by_snr(min_snr=5.0, ms_before=1.0, ms_after=2.0)[source]
Remove units whose SNR is below min_snr.
See spikelab.spikedata.curation.curate_by_snr for full documentation.
- curate_by_std_norm(max_std_norm=1.0, at_peak=True, window_ms_before=0.5, window_ms_after=1.5, ms_before=1.0, ms_after=2.0)[source]
Remove units whose normalized waveform STD exceeds max_std_norm.
See spikelab.spikedata.curation.curate_by_std_norm for full documentation.
- compute_waveform_metrics(ms_before=1.0, ms_after=2.0, at_peak=True, window_ms_before=0.5, window_ms_after=1.5)[source]
Compute average waveforms, SNR, and normalized STD for all units.
Stores results in neuron_attributes. See spikelab.spikedata.curation.compute_waveform_metrics for full documentation.
- curate(**kwargs)[source]
Apply multiple curation criteria in sequence (intersection).
See spikelab.spikedata.curation.curate for full documentation and supported keyword arguments.
- curate_by_merge_duplicates(**kwargs)[source]
Remove duplicate units by merging nearby pairs with similar waveforms.
See spikelab.spikedata.curation.curate_by_merge_duplicates for full documentation and supported keyword arguments.
- static build_curation_history(sd_original, sd_curated, results, parameters=None)[source]
Translate curation results into a serializable history dict.
See spikelab.spikedata.curation.build_curation_history for full documentation.
- split_epochs()[source]
Split a concatenated SpikeData into per-epoch SpikeData objects.
Uses metadata["rec_chunks_ms"] (list of (start_ms, end_ms) tuples) to slice this SpikeData via subtime. Each resulting SpikeData receives the corresponding epoch template from neuron_attributes["epoch_templates"] as its "template" attribute.
- Parameters:
None. Epoch boundaries and names are read from metadata.
- Returns:
One SpikeData per epoch, time-shifted so each starts at t=0.
- Return type:
list of SpikeData
Notes
Requires metadata["rec_chunks_ms"]. Raises ValueError if not present.
metadata["rec_chunk_names"] (optional) is stored as metadata["source_file"] on each output SpikeData.
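The slicing behaviour can be sketched in plain numpy (toy trains and boundaries; the real method goes through subtime and carries attributes and metadata along):

```python
import numpy as np

train = [np.array([5.0, 150.0, 420.0]), np.array([80.0, 310.0])]
rec_chunks_ms = [(0.0, 200.0), (200.0, 500.0)]

epoch_trains = []
for start, end in rec_chunks_ms:
    # Keep spikes in [start, end) and shift so each epoch starts at t=0.
    epoch_trains.append([ts[(ts >= start) & (ts < end)] - start
                         for ts in train])
```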
- compare_sorter(other, comparison_type='spike_times', delta_ms=0.4, f_rel_to_trough=(20, 40), max_lag=5, n_jobs=None)[source]
Compare this sorter output against another SpikeData object.
Implements the same comparison methodology as SpikeInterface’s SymmetricSortingComparison (Buccino et al., eLife 2020): greedy spike matching within a temporal window followed by Jaccard agreement scoring. Use best_match_assignment() on the returned agreement matrix to obtain the optimal unit mapping via the Hungarian algorithm (equivalent to SpikeInterface’s get_best_unit_match).
- Parameters:
other (SpikeData) – SpikeData from a different sorter to compare against.
comparison_type (Literal['spike_times', 'waveforms']) – "spike_times" for spike-train agreement or "waveforms" for template footprint similarity.
delta_ms (float) – Maximum temporal distance (ms) for a spike match (spike_times only).
f_rel_to_trough (tuple[int, int]) – (pre, post) sample window around the trough for footprint construction (waveforms only).
max_lag (int) – Maximum lag in samples for footprint cosine similarity search (waveforms only).
n_jobs (int or None) – Number of parallel workers. None or 1 for serial execution, -1 for all cores.
- Returns:
- Comparison output dictionary containing:
labels_1 / labels_2: unit indices
metadata: comparison settings
For spike_times: agreement, frac_1, frac_2
For waveforms: similarity
- Return type:
result (dict)
References
Buccino et al., “SpikeInterface, a unified framework for spike sorting”, eLife (2020). https://doi.org/10.7554/eLife.61834
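The greedy matching and agreement scoring can be sketched for a single unit pair. This is a simplified illustration (matched spikes over the union of both trains), not the library’s parallelised implementation:

```python
import numpy as np

def agreement(times_a, times_b, delta_ms=0.4):
    """Greedy two-pointer matching; Jaccard-style agreement score."""
    a, b = np.sort(times_a), np.sort(times_b)
    i = j = matched = 0
    while i < len(a) and j < len(b):
        d = a[i] - b[j]
        if abs(d) <= delta_ms:
            matched += 1   # within delta_ms: consume one spike from each train
            i += 1
            j += 1
        elif d > 0:
            j += 1         # b[j] too early; advance b
        else:
            i += 1         # a[i] too early; advance a
    return matched / (len(a) + len(b) - matched)

score = agreement(np.array([10.0, 20.0, 30.0]),
                  np.array([10.1, 20.2, 50.0]))  # 2 matches -> 2/4 = 0.5
```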
- static best_match_assignment(score_matrix, minimize=False)[source]
Compute optimal unit assignment from a pairwise score matrix.
Uses the Hungarian algorithm (scipy.optimize.linear_sum_assignment) to find the assignment of rows (sorter 1 units) to columns (sorter 2 units) that maximizes (or minimizes) the total score. When the matrix is non-square, unmatched units are reported separately. This is equivalent to SpikeInterface’s get_best_unit_match step.
- Parameters:
score_matrix (np.ndarray) – Pairwise score matrix of shape (M, N), e.g. the agreement or similarity matrix returned by compare_sorter().
minimize (bool) – If True, find the assignment that minimizes the total score (useful for distance matrices). Default is False (maximize).
- Returns:
- Dictionary containing:
row_indices: matched row indices (length = min(M, N))
col_indices: matched column indices (length = min(M, N))
scores: score for each matched pair
total_score: sum of matched scores
unmatched_rows: row indices with no match (if M > N)
unmatched_cols: col indices with no match (if N > M)
row_order: full row permutation array (length M); matched rows first, then unmatched. Apply to any (M, …) array via array[row_order].
col_order: full column permutation array (length N); matched cols first, then unmatched. Apply to any (…, N) array via array[:, col_order].
reordered_matrix: score_matrix reordered so that matched pairs lie along the diagonal (equivalent to score_matrix[np.ix_(row_order, col_order)])
- Return type:
result (dict)
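A minimal usage sketch of the underlying assignment on a toy non-square agreement matrix (only the core matched/unmatched outputs are reproduced here):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

score_matrix = np.array([[0.9, 0.1, 0.0],
                         [0.2, 0.8, 0.3]])   # toy (M=2, N=3) agreement matrix

# Hungarian algorithm: maximize the total matched score.
row_indices, col_indices = linear_sum_assignment(score_matrix, maximize=True)
scores = score_matrix[row_indices, col_indices]
total_score = scores.sum()

# With N > M, one column has no matching row.
unmatched_cols = np.setdiff1d(np.arange(score_matrix.shape[1]), col_indices)
```

Here unit 0 pairs with column 0 (0.9) and unit 1 with column 1 (0.8), leaving column 2 unmatched.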