Utilities

General-purpose utility functions used across SpikeLab, including array manipulation helpers, time conversion, and validation functions.

spikelab.spikedata.utils.get_sttc(tA, tB, delt=20.0, length=None, start_time=0.0)[source]

Calculate the spike time tiling coefficient between two spike trains.

Parameters:
  • tA (list) – List of spike times for the first spike train.

  • tB (list) – List of spike times for the second spike train.

  • delt (float) – Time window in milliseconds (default: 20.0).

  • length (float or None) – Total duration in milliseconds. If None, inferred from the latest spike time after shifting, which may underestimate the true recording duration if the last spike does not fall near the end. Pass the actual recording length for unbiased STTC.

  • start_time (float) – Time origin of the spike trains (default 0.0). Spike times are shifted by -start_time before computation so that the STTC edge corrections work correctly for event-centered data with negative spike times.

Returns:

Spike time tiling coefficient between the two spike trains.

Return type:

sttc (float)

Notes

Formula: STTC = ((PA - TB) / (1 - PA * TB) + (PB - TA) / (1 - PB * TA)) / 2

[1] Cutts & Eglen. Detecting pairwise correlations in spike trains: An objective comparison of methods and application to the study of retinal waves. Journal of Neuroscience 34:43, 14288-14303 (2014).
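The formula above can be illustrated with a minimal NumPy sketch of the STTC from Cutts & Eglen: TA and TB are the fractions of the recording tiled by ±delt windows around each train's spikes, and PA and PB are the proportions of one train's spikes falling within delt of the other. This is an independent illustration, not SpikeLab's implementation, and it assumes spike times lie in [0, length]:

```python
import numpy as np

def sttc_sketch(tA, tB, delt=20.0, length=None):
    """Minimal STTC sketch following Cutts & Eglen (not SpikeLab's code)."""
    tA, tB = np.sort(np.asarray(tA, float)), np.sort(np.asarray(tB, float))
    if length is None:
        length = max(tA[-1], tB[-1])

    def tiled_fraction(t):
        # Fraction of [0, length] covered by +/- delt windows around the
        # spikes, merging overlapping windows before summing.
        starts = np.clip(t - delt, 0.0, length)
        ends = np.clip(t + delt, 0.0, length)
        covered, cur_s, cur_e = 0.0, starts[0], ends[0]
        for s, e in zip(starts[1:], ends[1:]):
            if s <= cur_e:
                cur_e = max(cur_e, e)
            else:
                covered += cur_e - cur_s
                cur_s, cur_e = s, e
        return (covered + cur_e - cur_s) / length

    def prop_within(t, other):
        # Proportion of spikes in t within +/- delt of any spike in other.
        return np.mean([np.abs(other - x).min() <= delt for x in t])

    TA, TB = tiled_fraction(tA), tiled_fraction(tB)
    PA, PB = prop_within(tA, tB), prop_within(tB, tA)
    return 0.5 * ((PA - TB) / (1 - PA * TB) + (PB - TA) / (1 - PB * TA))
```

Two identical trains yield an STTC of exactly 1, since PA = PB = 1 makes both terms collapse to (1 - T)/(1 - T).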

spikelab.spikedata.utils.swap(ar, idxs, rng)[source]

Attempt one double-edge swap in a binary spike raster while preserving per-row and per-column sums.

Parameters:
  • ar (np.ndarray) – Binary spike raster.

  • idxs (tuple) – Tuple of numpy arrays containing the indices of the spikes.

  • rng (np.random.Generator) – Random number generator for reproducibility.

Returns:

True if a swap was performed.

Return type:

success (bool)

Notes

Both ar and idxs are mutated in-place for performance.

The swap chooses two existing spike positions (i0, j0) and (i1, j1) and, if the off-diagonal positions (i0, j1) and (i1, j0) are both empty and the indices are distinct, swaps them so that spikes move to those positions.

spikelab.spikedata.utils.randomize(ar, swap_per_spike=5, seed=None)[source]

Randomize a binary spike raster using degree-preserving double-edge swaps.

Parameters:
  • ar (array_like) – Binary matrix shaped (neurons, time) or (time, neurons). Values should be 0/1.

  • swap_per_spike (int) – Target number of successful swaps per spike.

  • seed (int or None) – Random seed number. Set for repeatability during experiments.

Returns:

Randomized binary matrix with the same shape and row/column sums.

Return type:

randomized_raster (np.ndarray)

Notes

Shuffling preserves each neuron's average firing rate while redistributing which time bins it spikes in; each time bin's population rate is likewise preserved while the identities of the active units change. Each successful swap moves two spikes, so on average every spike is moved 2 * swap_per_spike times.

Okun, M. et al. Population rate dynamics and multineuron firing patterns in sensory cortex. J. Neurosci. 32, 17108-17119 (2012).
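The swap/randomize pair can be sketched in a few lines: draw two spikes at random and, if the legality condition from swap holds, move them to the off-diagonal cells. This is a simplified illustration of the degree-preserving scheme, not SpikeLab's implementation (it omits the shared idxs bookkeeping and adds an attempt cap to guarantee termination):

```python
import numpy as np

def randomize_sketch(ar, swap_per_spike=5, seed=None):
    """Degree-preserving raster shuffle sketch (not SpikeLab's code)."""
    ar = np.array(ar, copy=True)
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(ar)
    n_spikes = len(rows)
    if n_spikes == 0:
        return ar
    target = swap_per_spike * n_spikes
    done = attempts = 0
    while done < target and attempts < 100 * target:
        attempts += 1
        a, b = rng.integers(0, n_spikes, size=2)
        i0, j0, i1, j1 = rows[a], cols[a], rows[b], cols[b]
        # A swap is legal only when the off-diagonal cells (i0, j1) and
        # (i1, j0) are empty and the row/column indices differ; moving the
        # two spikes there then preserves every row and column sum.
        if i0 != i1 and j0 != j1 and ar[i0, j1] == 0 and ar[i1, j0] == 0:
            ar[i0, j0] = ar[i1, j1] = 0
            ar[i0, j1] = ar[i1, j0] = 1
            cols[a], cols[b] = j1, j0
            done += 1
    return ar
```

Because every individual swap preserves the marginals, the output's row and column sums always match the input's, regardless of how many swaps succeeded.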

spikelab.spikedata.utils.trough_between(i0, i1, pop_rate)[source]

Find the minimum value (trough) between two indices in a population rate array.

Parameters:
  • i0 (int) – Time bin index of the first burst.

  • i1 (int) – Time bin index of the second burst.

  • pop_rate (np.ndarray) – Smoothed population spiking data in spikes per bin.

Returns:

Time bin index of the minimum value (trough) between the peaks. None if the indices are adjacent.

Return type:

trough_idx (int or None)
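The behavior described above amounts to an argmin over the open interval between the two burst indices; a minimal sketch (an illustration, not SpikeLab's implementation):

```python
import numpy as np

def trough_between_sketch(i0, i1, pop_rate):
    # Index of the minimum strictly between the two burst indices;
    # None when no bins lie between them.
    if i1 - i0 < 2:
        return None
    return i0 + 1 + int(np.argmin(pop_rate[i0 + 1:i1]))
```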

spikelab.spikedata.utils.ensure_h5py()[source]

Raise ImportError if h5py is not installed.

spikelab.spikedata.utils.times_from_ms(times_ms, unit, fs_Hz)[source]

Convert times from milliseconds to the requested unit.

Return type:

Union[ndarray, float, int]

spikelab.spikedata.utils.to_ms(values, unit, fs_Hz)[source]

Convert a vector of times to milliseconds.

Return type:

ndarray
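The intent of these two converters can be sketched as follows. The accepted unit names ("ms", "s", "samples") are an assumption for illustration; the documentation above does not list them, so check the source for the actual values:

```python
import numpy as np

def to_ms_sketch(values, unit, fs_Hz=None):
    """Convert times in the given unit to milliseconds.

    Unit names here ("ms", "s", "samples") are hypothetical.
    """
    values = np.asarray(values, float)
    if unit == "ms":
        return values
    if unit == "s":
        return values * 1000.0
    if unit == "samples":
        # Sample counts need the sampling rate to become times.
        return values / fs_Hz * 1000.0
    raise ValueError(f"unknown unit: {unit}")
```

times_from_ms would apply the inverse mapping, milliseconds to the requested unit.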

spikelab.spikedata.utils.extract_waveforms(raw_data, spike_times_ms, fs_kHz, ms_before=1.0, ms_after=2.0, channel_indices=None, bandpass=None, filter_order=3)[source]

Extract waveform snippets from raw data at specified spike times.

Parameters:
  • raw_data (np.ndarray) – Raw voltage data with shape (num_channels, num_samples).

  • spike_times_ms (np.ndarray) – Array of spike times in milliseconds.

  • fs_kHz (float) – Sampling rate in kHz.

  • ms_before (float) – Milliseconds before each spike time.

  • ms_after (float) – Milliseconds after each spike time.

  • channel_indices (list of int or None) – Channel indices to extract. If None, extracts all.

  • bandpass (tuple or None) – Optional (lowcut_Hz, highcut_Hz) for bandpass filtering.

  • filter_order (int) – Butterworth filter order (default: 3).

Returns:

3D array of shape (num_channels, num_samples, num_spikes). Empty if no valid spikes.

Return type:

waveforms (np.ndarray)
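The core of snippet extraction, ignoring channel selection and bandpass filtering, can be sketched as below. This is a simplified illustration of the windowing logic, not SpikeLab's implementation; it assumes spike times map to sample indices by rounding spike_time_ms * fs_kHz:

```python
import numpy as np

def extract_waveforms_sketch(raw_data, spike_times_ms, fs_kHz,
                             ms_before=1.0, ms_after=2.0):
    """Cut (num_channels, num_samples, num_spikes) snippets around spikes."""
    before = int(round(ms_before * fs_kHz))
    after = int(round(ms_after * fs_kHz))
    centers = np.round(np.asarray(spike_times_ms) * fs_kHz).astype(int)
    # Keep only spikes whose full window fits inside the recording.
    valid = centers[(centers - before >= 0) &
                    (centers + after <= raw_data.shape[1])]
    if len(valid) == 0:
        return np.empty((raw_data.shape[0], before + after, 0),
                        dtype=raw_data.dtype)
    return np.stack([raw_data[:, c - before:c + after] for c in valid],
                    axis=-1)
```

Spikes too close to either edge of the recording are dropped, which is why the function can return an empty array.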

spikelab.spikedata.utils.check_neuron_attributes(neuron_attributes, n_neurons=None)[source]

Check a list of dictionaries for use as neuron_attributes to verify that keys and values are consistent.

Parameters:
  • neuron_attributes (list of dict) – List of dictionaries containing neuron attributes.

  • n_neurons (int or None) – Expected number of neurons. If provided, validates the list length.

Returns:

A list of dictionaries where all dictionaries have valid keys and values.

Return type:

result (list of dict)

Notes

If some dictionaries are missing keys that others have, a ValueError is raised indicating which neuron entries have inconsistent keys.

spikelab.spikedata.utils.get_channels_for_unit(unit_idx, channels, neuron_to_channel, n_channels_total)[source]

Determine which channels to extract for a given unit.

Parameters:
  • unit_idx (int) – Index of the unit.

  • channels (int, list of int, or None) – Channel specification. None uses neuron_to_channel mapping or all channels; int for single channel; list for multiple; empty list for mapped channel.

  • neuron_to_channel (dict) – Mapping from unit indices to channel indices.

  • n_channels_total (int) – Total number of channels in the raw data.

Returns:

Channel indices to extract.

Return type:

result (list of int)

Raises:

ValueError – If channels argument is invalid type.

spikelab.spikedata.utils.compute_avg_waveform(waveforms, channel_indices, dtype)[source]

Compute the average waveform from extracted waveforms.

Parameters:
  • waveforms (np.ndarray) – 3D array of shape (num_channels, num_samples, num_spikes).

  • channel_indices (list of int) – List of channel indices used for extraction.

  • dtype (np.dtype) – Data type for the output array if waveforms is empty.

Returns:

2D array of shape (num_channels, num_samples) containing the average waveform.

Return type:

avg (np.ndarray)

spikelab.spikedata.utils.get_valid_spike_times(spike_times_ms, fs_kHz, ms_before, ms_after, n_time_samples)[source]

Filter spike times to only those within valid bounds of the raw data.

Parameters:
  • spike_times_ms (np.ndarray) – Array of spike times in milliseconds.

  • fs_kHz (float) – Sampling rate in kHz.

  • ms_before (float) – Milliseconds before each spike time.

  • ms_after (float) – Milliseconds after each spike time.

  • n_time_samples (int) – Total number of time samples in the raw data.

Returns:

Array of valid spike times in milliseconds.

Return type:

valid (np.ndarray)

spikelab.spikedata.utils.waveforms_by_channel(waveforms, channel_indices)[source]

Convert a waveform stack into a per-channel dict.

Parameters:
  • waveforms (np.ndarray) – 3D array shaped (num_channels, num_samples, num_spikes).

  • channel_indices (list of int) – List of channel indices corresponding to waveforms axis 0.

Returns:

Mapping of channel index to 2D array shaped (num_samples, num_spikes).

Return type:

result (dict)
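The conversion is a one-line mapping from axis 0 of the waveform stack to channel keys; a minimal sketch (an illustration, not SpikeLab's implementation):

```python
import numpy as np

def waveforms_by_channel_sketch(waveforms, channel_indices):
    # Each channel index keys its (num_samples, num_spikes) slice of the stack.
    return {ch: waveforms[i] for i, ch in enumerate(channel_indices)}
```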

spikelab.spikedata.utils.extract_unit_waveforms(unit_idx, spike_times_ms, raw_data, fs_kHz, ms_before, ms_after, channels, neuron_to_channel, bandpass=None, filter_order=3, return_channel_waveforms=False, return_avg_waveform=True)[source]

Extract waveforms and compute statistics for a single unit.

This function orchestrates the full waveform extraction pipeline: resolves channels, extracts raw voltage snippets around each spike time, computes the mean waveform, and filters spike times to valid extraction windows.

Parameters:
  • unit_idx (int) – Index of the unit being extracted.

  • spike_times_ms (np.ndarray) – Array of spike times in milliseconds for this unit.

  • raw_data (np.ndarray) – Raw voltage data with shape (num_channels, num_samples).

  • fs_kHz (float) – Sampling rate in kHz.

  • ms_before (float) – Milliseconds before each spike time.

  • ms_after (float) – Milliseconds after each spike time.

  • channels (int, list of int, or None) – Channel specification. None uses neuron_to_channel mapping or all channels; int for single channel; list for multiple; empty list for mapped channel.

  • neuron_to_channel (dict) – Mapping from unit indices to channel indices.

  • bandpass (tuple or None) – Optional (lowcut_Hz, highcut_Hz) for bandpass filtering.

  • filter_order (int) – Butterworth filter order (default: 3).

  • return_channel_waveforms (bool) – If True, include per-channel waveforms in the metadata dict.

  • return_avg_waveform (bool) – If True, compute and include the average waveform in the metadata dict.

Returns:

3D array of shape (num_channels, num_samples, num_spikes).

meta (dict): Per-unit metadata containing channels, spike_times_ms, avg_waveform, and optionally channel_waveforms.

Return type:

waveforms (np.ndarray)

spikelab.spikedata.utils.consecutive_durations(signal, threshold, mode='above', min_dur=1)[source]

Compute the lengths of consecutive runs in a 1-D signal that satisfy a threshold condition.

Scans signal for contiguous stretches of bins that are above (>=) or below (<) threshold, returns an array of their durations, and optionally filters out runs shorter than min_dur.

Parameters:
  • signal (array_like) – 1-D numeric array (e.g. continuity probability time series from a GPLVM).

  • threshold (float) – Threshold value for the condition.

  • mode (str) – "above" keeps runs where signal >= threshold; "below" keeps runs where signal < threshold.

  • min_dur (int) – Minimum run length to keep. Runs shorter than this are discarded.

Returns:

1-D integer array of run lengths that satisfy the condition and are at least min_dur bins long. May be empty.

Return type:

durations (np.ndarray)
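The run-length scan described above can be implemented with a padded boolean mask and a diff to locate run boundaries; a sketch of that approach (an illustration, not necessarily SpikeLab's implementation):

```python
import numpy as np

def consecutive_durations_sketch(signal, threshold, mode="above", min_dur=1):
    """Run lengths of bins meeting the threshold condition."""
    signal = np.asarray(signal, float)
    mask = signal >= threshold if mode == "above" else signal < threshold
    # Pad with zeros so runs touching either end still produce boundaries;
    # diff == 1 marks run starts, diff == -1 marks run ends.
    edges = np.diff(np.concatenate(([0], mask.astype(int), [0])))
    starts = np.where(edges == 1)[0]
    ends = np.where(edges == -1)[0]
    durations = ends - starts
    return durations[durations >= min_dur]
```

For example, a signal with runs of length 2 and 3 above threshold returns both by default, and only the length-3 run when min_dur=3.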

spikelab.spikedata.utils.gplvm_state_entropy(posterior_latent_marg)[source]

Compute Shannon entropy of the latent state distribution at each time bin.

Parameters:

posterior_latent_marg (np.ndarray) – Marginal posterior over latent states with shape (T, K) where T is the number of time bins and K is the number of latent states. Typically obtained from SpikeData.fit_gplvm()["decode_res"]["posterior_latent_marg"].

Returns:

1-D array of shape (T,) with the Shannon entropy (in nats) for each time bin.

Return type:

entropy (np.ndarray)
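The per-bin entropy is the standard -Σ p log p over the K states, with zero-probability entries contributing zero; a minimal sketch (an illustration, not SpikeLab's implementation):

```python
import numpy as np

def state_entropy_sketch(posterior_latent_marg):
    """Shannon entropy (nats) per time bin of a (T, K) posterior."""
    p = np.asarray(posterior_latent_marg, float)
    # Treat 0 * log(0) as 0 so one-hot distributions give zero entropy.
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log(p), 0.0)
    return -terms.sum(axis=1)
```

A uniform posterior over K states gives log(K) nats; a one-hot posterior gives 0.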

spikelab.spikedata.utils.gplvm_continuity_prob(decode_res)[source]

Extract the continuity (non-jump) probability time series from a GPLVM decode result.

The continuity probability at each time bin is the marginal posterior probability that the dynamics remained continuous (i.e. did not jump) between the previous and current time bin.

Parameters:

decode_res (dict) – Decoded latent state dictionary as returned by SpikeData.fit_gplvm()["decode_res"]. Must contain the key "posterior_dynamics_marg" with shape (T, D) where the first column (index 0) holds the continuity probability.

Returns:

1-D array of shape (T,) with the continuity probability at each time bin.

Return type:

continuity_prob (np.ndarray)

spikelab.spikedata.utils.gplvm_average_state_probability(posterior_latent_marg)[source]

Compute the average probability of each latent state across all time bins.

Parameters:

posterior_latent_marg (np.ndarray) – Marginal posterior over latent states with shape (T, K) where T is the number of time bins and K is the number of latent states. Typically obtained from SpikeData.fit_gplvm()["decode_res"]["posterior_latent_marg"].

Returns:

1-D array of shape (K,) with the mean probability of each latent state, averaged over all time bins.

Return type:

avg_prob (np.ndarray)

spikelab.spikedata.utils.shuffle_z_score(observed, shuffle_distribution)[source]

Z-score an observed value against a shuffle null distribution.

Parameters:
  • observed (scalar or np.ndarray) – The metric computed on the real data.

  • shuffle_distribution (np.ndarray) – Shape (N, ...) array of the same metric computed on N shuffled datasets (e.g. from SpikeSliceStack.apply on a shuffle stack built by SpikeData.spike_shuffle_stack).

Returns:

Z-score (observed - mean) / std computed along axis 0. Same shape as observed.

Return type:

z (np.ndarray)

Notes

  • Intended for determining whether an observed metric is significantly different from what degree-preserving shuffled data produces.

  • Elements where the shuffle standard deviation is zero will be NaN.
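The computation, including the zero-variance NaN behavior noted above, can be sketched as follows (an illustration, not SpikeLab's implementation; it uses NumPy's default population standard deviation):

```python
import numpy as np

def shuffle_z_score_sketch(observed, shuffle_distribution):
    """Z-score observed against the shuffle null along axis 0."""
    null = np.asarray(shuffle_distribution, float)
    mean, std = null.mean(axis=0), null.std(axis=0)
    # Zero-variance nulls yield NaN rather than +/-inf.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(std > 0, (observed - mean) / std, np.nan)
```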

spikelab.spikedata.utils.shuffle_percentile(observed, shuffle_distribution)[source]

Compute the percentile rank of an observed value within a shuffle distribution.

Parameters:
  • observed (scalar or np.ndarray) – The metric computed on the real data.

  • shuffle_distribution (np.ndarray) – Shape (N, ...) array of the same metric computed on N shuffled datasets.

Returns:

Fraction of shuffle values ≤ observed, computed along axis 0. Values in [0, 1]. Same shape as observed.

Return type:

pct (np.ndarray)

Notes

  • Non-parametric alternative to shuffle_z_score; gives the rank of the observed value within the null distribution without assuming normality.
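The percentile rank reduces to a single broadcasting comparison; a minimal sketch (an illustration, not SpikeLab's implementation):

```python
import numpy as np

def shuffle_percentile_sketch(observed, shuffle_distribution):
    """Fraction of shuffle values <= observed, along axis 0."""
    null = np.asarray(shuffle_distribution, float)
    return (null <= observed).mean(axis=0)
```

An observed value above 95% of the shuffle values yields a percentile of 0.95 or more, which is the usual one-sided significance criterion.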

spikelab.spikedata.utils.slice_trend(values, times=None)[source]

Fit a linear trend to a metric computed across ordered slices.

Parameters:
  • values (np.ndarray) – Shape (S,) array of metric values, one per slice, in temporal order.

  • times (np.ndarray | None) – Shape (S,) array of slice midpoints in milliseconds. If None, integer indices 0 .. S-1 are used.

Returns:

Linear regression slope. Units are metric change per millisecond (if times is provided) or per slice index.

p_value (float): Two-sided p-value for the null hypothesis that the slope is zero.

Return type:

slope (float)

Notes

  • Intended for detecting systematic drift of a metric over the course of a recording. Apply to the output of SpikeSliceStack.apply on a frames stack built by SpikeData.frames. A significant positive or negative slope indicates non-stationarity.

  • Uses scipy.stats.linregress.
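The slope half of this fit can be sketched with NumPy alone; SpikeLab's version additionally reports the p-value via scipy.stats.linregress, which this illustration omits:

```python
import numpy as np

def slice_trend_slope_sketch(values, times=None):
    """Least-squares slope of a metric across ordered slices (no p-value)."""
    values = np.asarray(values, float)
    x = (np.arange(len(values), dtype=float)
         if times is None else np.asarray(times, float))
    # Degree-1 polynomial fit; index 0 of the coefficients is the slope.
    return float(np.polyfit(x, values, 1)[0])
```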

spikelab.spikedata.utils.slice_stability(values)[source]

Compute the coefficient of variation of a metric across slices.

Parameters:

values (np.ndarray) – Shape (S,) or (S, ...) array of metric values from SpikeSliceStack.apply.

Returns:

Coefficient of variation std / |mean| computed along axis 0. Scalar when input is (S,).

Return type:

cv (np.ndarray or float)

Notes

  • Intended for summarising how much a metric varies across slices (frames, trials, or shuffles). Low CV indicates a stable metric; high CV indicates instability or sensitivity to the slicing.

  • Elements where the mean is zero will be NaN.
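The CV, including the zero-mean NaN behavior noted above, can be sketched as follows (an illustration, not SpikeLab's implementation):

```python
import numpy as np

def slice_stability_sketch(values):
    """Coefficient of variation std / |mean| along axis 0."""
    values = np.asarray(values, float)
    mean = values.mean(axis=0)
    # Division by a zero mean produces NaN rather than raising.
    with np.errstate(divide="ignore", invalid="ignore"):
        return values.std(axis=0) / np.abs(mean)
```

A constant metric across slices gives CV = 0; values [1, 3] give std 1 over mean 2, i.e. CV = 0.5.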