Workspace Persistence

This guide covers saving and loading analysis results with the AnalysisWorkspace, so that expensive computations do not need to be repeated across sessions.

After working through this guide you will know how to:

  • Create a workspace and store analysis results in named namespaces.

  • Save a workspace to disk and reload it later.

  • Browse the contents of a workspace with list_namespaces, list_keys, and describe.

  • Use LazyAnalysisWorkspace for large datasets that do not fit in RAM.

Creating a Workspace

An AnalysisWorkspace is a two-level dictionary: items are addressed by a (namespace, key) pair. A typical convention is to use the experimental condition as the namespace and the analysis result name as the key.

from spikelab import AnalysisWorkspace

ws = AnalysisWorkspace(name="my_experiment")

# Store a SpikeData object under namespace "D0", key "spikedata"
# (sd, sttc, and tburst are results from earlier analysis steps)
ws.store("D0", "spikedata", sd, note="Baseline recording, 120 units")

# Store a pairwise matrix in the same namespace
ws.store("D0", "sttc_matrix", sttc, note="STTC, delt=20 ms")

# Store a numpy array under a shared namespace
ws.store("all", "burst_times", tburst, note="Burst peak times across all conditions")

The note parameter is optional free-text metadata that helps you remember what each item contains.

Supported types include SpikeData, RateData, RateSliceStack, SpikeSliceStack, PairwiseCompMatrix, PairwiseCompMatrixStack, numpy.ndarray, and dict (with serializable leaf values such as scalars, strings, arrays, or nested supported types).
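
For example, a dict of analysis parameters with scalar, string, and array leaves can be stored alongside the results it describes (the key and parameter names here are illustrative, not part of the API):

import numpy as np

# Illustrative parameter dict with scalar, string, and array leaves
params = {
    "delt_ms": 20,
    "condition": "baseline",
    "bin_edges": np.arange(0, 1000, 10),
}
ws.store("D0", "analysis_params", params, note="Parameters for the STTC run")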

Use describe to get a summary of every item in the workspace:

summary = ws.describe()

# summary is a nested dict: {namespace: {key: info_dict}}
for ns, keys in summary.items():
    for key, info in keys.items():
        print(f"  {ns}/{key}: {info['type']}, {info.get('shape', '')}")

Saving and Loading

Call save to write the workspace to disk. This produces two files: an HDF5 file (*.h5) containing all data, and a JSON sidecar with workspace metadata.

# Save to disk — creates workspace.h5 and workspace.json
ws.save("results/workspace")
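
If you want to verify the write, both files should now exist side by side, following the naming scheme described above:

from pathlib import Path

assert Path("results/workspace.h5").exists()    # all data
assert Path("results/workspace.json").exists()  # workspace metadata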

To reload the workspace in a later session:

ws = AnalysisWorkspace.load("results/workspace")

# All items are available immediately
sd = ws.get("D0", "spikedata")
sttc = ws.get("D0", "sttc_matrix")

If you only need a single item and want to avoid loading the entire workspace into memory, use load_item():

sttc = AnalysisWorkspace.load_item("results/workspace", "D0", "sttc_matrix")

Listing and Retrieving

Several methods let you browse the workspace contents without loading the actual data:

# List all top-level namespaces
namespaces = ws.list_namespaces()
print(namespaces)   # ['D0', 'D3', 'D10', 'D30', 'D50', 'all']

# List keys within a specific namespace
keys = ws.list_keys("D0")
print(keys)         # ['spikedata', 'sttc_matrix', 'fr_corr_matrix', ...]

# List keys across all namespaces
all_keys = ws.list_keys()   # returns {ns: [keys]} dict
for ns, ks in all_keys.items():
    print(f"  {ns}: {len(ks)} items")

# Get summary info for a single item (without loading the object)
info = ws.get_info("D0", "sttc_matrix")
print(info)
# {'type': 'PairwiseCompMatrix', 'shape': (120, 120), 'created_at': ..., 'note': '...'}

To retrieve the actual object:

obj = ws.get("D0", "sttc_matrix")   # returns the object, or None if not found
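
Because get returns None rather than raising, add a small guard when a key might be missing:

# Fail loudly instead of passing None along to later analysis steps
obj = ws.get("D0", "fr_corr_matrix")
if obj is None:
    raise KeyError("D0/fr_corr_matrix not found in workspace")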

Other management operations:

# Rename a key
ws.rename("D0", "sttc_matrix", "sttc_delt20")

# Add or update a note
ws.add_note("D0", "sttc_delt20", "STTC with delt=20 ms, 120 units")

# Delete a single item
ws.delete("D0", "sttc_delt20")

# Delete an entire namespace
ws.delete("scratch")

Lazy Loading

For large datasets where loading everything into RAM is impractical, use LazyAnalysisWorkspace. It exposes the same API as the regular workspace but keeps data on disk: each store call immediately writes the object to a temporary HDF5 file and releases it from memory, and each get call reads the object back from disk.

from spikelab.workspace.workspace import LazyAnalysisWorkspace

ws = LazyAnalysisWorkspace(name="large_experiment")

# Store writes to disk immediately — the object is not kept in RAM
ws.store("D0", "spikedata", sd, note="Baseline recording")
ws.store("D0", "burst_rss", rss, note="Burst-aligned RateSliceStack")

# get reads from disk on every call
rss = ws.get("D0", "burst_rss")

# save copies the backing file to the target path
ws.save("results/workspace")

# load returns a LazyAnalysisWorkspace; items stay on disk until accessed
ws = LazyAnalysisWorkspace.load("results/workspace")

All other operations (list_namespaces, list_keys, describe, and get_info) work from an in-memory index and do not trigger disk reads.
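
This makes it cheap to survey a large lazy workspace before deciding which items to pull into memory:

# Walk the index only; no HDF5 reads are triggered
for ns in ws.list_namespaces():
    for key in ws.list_keys(ns):
        info = ws.get_info(ns, key)
        print(f"{ns}/{key}: {info['type']}")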

A typical workflow is to use the lazy workspace during long-running compute scripts (where intermediate results accumulate and would exhaust RAM) and switch to the regular workspace for interactive exploration:

# During computation — use lazy to keep RAM under control
ws = LazyAnalysisWorkspace(name="compute_session")

for condition in conditions:
    sd = load_my_data(condition)
    rss = sd.align_to_events(tburst, pre_ms=250, post_ms=500, kind="rate")
    corr_stack, _ = rss.get_slice_to_slice_unit_corr_from_stack()

    ws.store(condition, "spikedata", sd)
    ws.store(condition, "burst_rss", rss)
    ws.store(condition, "burst_corr", corr_stack)

ws.save("results/workspace")

# Later, for interactive exploration — load everything into RAM
ws = AnalysisWorkspace.load("results/workspace")
sd = ws.get("D0", "spikedata")

Merging Workspaces

When analyses are run separately (e.g. by different scripts or on different machines), you can combine their results into a single workspace with merge_from():

ws_main = AnalysisWorkspace(name="combined")

ws_a = AnalysisWorkspace.load("results/analysis_a/workspace")
ws_b = AnalysisWorkspace.load("results/analysis_b/workspace")

result = ws_main.merge_from(ws_a)
print(f"Merged {result['merged']} items from A")

result = ws_main.merge_from(ws_b)
print(f"Merged {result['merged']} items from B, skipped {result['skipped']}")

ws_main.save("results/combined/workspace")

By default, existing keys are kept and incoming duplicates are skipped. Pass overwrite=True to replace existing items with the incoming values instead.
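
For example, if analysis B was re-run and its results should replace anything already merged:

# Overwrite duplicates with the incoming versions instead of skipping them
result = ws_main.merge_from(ws_b, overwrite=True)
print(f"Merged {result['merged']} items from B")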