=====================
Workspace Persistence
=====================

This guide covers saving and loading analysis results with the
:class:`~spikelab.AnalysisWorkspace`, so that expensive computations do not
need to be repeated across sessions.

After working through this guide you will know how to:

- Create a workspace and store analysis results in named namespaces.
- Save a workspace to disk and reload it later.
- Browse the contents of a workspace with ``list_namespaces``, ``list_keys``,
  and ``describe``.
- Use :class:`~spikelab.workspace.workspace.LazyAnalysisWorkspace` for large
  datasets that do not fit in RAM.

Creating a Workspace
--------------------

An :class:`~spikelab.AnalysisWorkspace` is a two-level dictionary: items are
addressed by a ``(namespace, key)`` pair. A typical convention is to use the
experimental condition as the namespace and the analysis result name as the
key.

.. code-block:: python

    from spikelab import AnalysisWorkspace

    ws = AnalysisWorkspace(name="my_experiment")

    # Store a SpikeData object under namespace "D0", key "spikedata"
    ws.store("D0", "spikedata", sd, note="Baseline recording, 120 units")

    # Store a pairwise matrix in the same namespace
    ws.store("D0", "sttc_matrix", sttc, note="STTC, delt=20 ms")

    # Store a numpy array under a shared namespace
    ws.store("all", "burst_times", tburst, note="Burst peak times across all conditions")

The ``note`` parameter is optional free-text metadata that helps you remember
what each item contains. Supported types include :class:`~spikelab.SpikeData`,
:class:`~spikelab.RateData`, :class:`~spikelab.RateSliceStack`,
:class:`~spikelab.SpikeSliceStack`, :class:`~spikelab.PairwiseCompMatrix`,
:class:`~spikelab.PairwiseCompMatrixStack`, ``numpy.ndarray``, and ``dict``
(with serializable leaf values such as scalars, strings, arrays, or nested
supported types).

Use ``describe`` to get a summary of every item in the workspace:

.. code-block:: python

    summary = ws.describe()

    # summary is a nested dict: {namespace: {key: info_dict}}
    for ns, keys in summary.items():
        for key, info in keys.items():
            print(f"  {ns}/{key}: {info['type']}, {info.get('shape', '')}")

Saving and Loading
------------------

Call ``save`` to write the workspace to disk. This produces two files: an
HDF5 file (``*.h5``) containing all data, and a JSON sidecar with workspace
metadata.

.. code-block:: python

    # Save to disk: creates workspace.h5 and workspace.json
    ws.save("results/workspace")

To reload the workspace in a later session:

.. code-block:: python

    ws = AnalysisWorkspace.load("results/workspace")

    # All items are available immediately
    sd = ws.get("D0", "spikedata")
    sttc = ws.get("D0", "sttc_matrix")

If you only need a single item and want to avoid loading the entire workspace
into memory, use :meth:`~spikelab.AnalysisWorkspace.load_item`:

.. code-block:: python

    sttc = AnalysisWorkspace.load_item("results/workspace", "D0", "sttc_matrix")

Listing and Retrieving
----------------------

Several methods let you browse the workspace contents without loading the
actual data:

.. code-block:: python

    # List all top-level namespaces
    namespaces = ws.list_namespaces()
    print(namespaces)  # ['D0', 'D3', 'D10', 'D30', 'D50', 'all']

    # List keys within a specific namespace
    keys = ws.list_keys("D0")
    print(keys)  # ['spikedata', 'sttc_matrix', 'fr_corr_matrix', ...]
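
    # Illustrative guard pattern (a sketch, not part of the API
    # walkthrough): skip an expensive recomputation when the result is
    # already stored. "fr_corr_matrix" is a hypothetical key here.
    if "fr_corr_matrix" not in ws.list_keys("D0"):
        print("fr_corr_matrix missing; compute and store it for D0")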
    # List keys across all namespaces
    all_keys = ws.list_keys()  # returns {ns: [keys]} dict
    for ns, ks in all_keys.items():
        print(f"  {ns}: {len(ks)} items")

    # Get summary info for a single item (without loading the object)
    info = ws.get_info("D0", "sttc_matrix")
    print(info)
    # {'type': 'PairwiseCompMatrix', 'shape': (120, 120), 'created_at': ..., 'note': '...'}

To retrieve the actual object:

.. code-block:: python

    obj = ws.get("D0", "sttc_matrix")  # returns the object, or None if not found

Other management operations:

.. code-block:: python

    # Rename a key
    ws.rename("D0", "sttc_matrix", "sttc_delt20")

    # Add or update a note
    ws.add_note("D0", "sttc_delt20", "STTC with delt=20 ms, 120 units")

    # Delete a single item
    ws.delete("D0", "sttc_delt20")

    # Delete an entire namespace
    ws.delete("scratch")

Lazy Loading
------------

For large datasets where loading everything into RAM is impractical, use
:class:`~spikelab.workspace.workspace.LazyAnalysisWorkspace`. It has the same
API as the regular workspace, but each ``store`` call immediately writes the
object to a temporary HDF5 file and releases it from memory, and each ``get``
call reads the object back from disk.

.. code-block:: python

    from spikelab.workspace.workspace import LazyAnalysisWorkspace

    ws = LazyAnalysisWorkspace(name="large_experiment")

    # store writes to disk immediately; the object is not kept in RAM
    ws.store("D0", "spikedata", sd, note="Baseline recording")
    ws.store("D0", "burst_rss", rss, note="Burst-aligned RateSliceStack")

    # get reads from disk on every call
    rss = ws.get("D0", "burst_rss")

    # save copies the backing file to the target path
    ws.save("results/workspace")

    # load also returns a LazyAnalysisWorkspace when the file is large
    ws = LazyAnalysisWorkspace.load("results/workspace")

All other operations -- ``list_namespaces``, ``list_keys``, ``describe``,
``get_info`` -- work from an in-memory index and do not trigger disk reads.

A typical workflow is to use the lazy workspace during long-running compute
scripts (where intermediate results accumulate and would exhaust RAM) and
switch to the regular workspace for interactive exploration:

.. code-block:: python

    # During computation, use the lazy workspace to keep RAM under control
    ws = LazyAnalysisWorkspace(name="compute_session")

    for condition in conditions:
        sd = load_my_data(condition)
        rss = sd.align_to_events(tburst, pre_ms=250, post_ms=500, kind="rate")
        corr_stack, _ = rss.get_slice_to_slice_unit_corr_from_stack()

        ws.store(condition, "spikedata", sd)
        ws.store(condition, "burst_rss", rss)
        ws.store(condition, "burst_corr", corr_stack)

    ws.save("results/workspace")

    # Later, for interactive exploration, load everything into RAM
    ws = AnalysisWorkspace.load("results/workspace")
    sd = ws.get("D0", "spikedata")

Merging Workspaces
------------------

When analyses are run separately (e.g. by different scripts or on different
machines), you can combine their results into a single workspace with
:meth:`~spikelab.workspace.workspace.AnalysisWorkspace.merge_from`:

.. code-block:: python

    ws_main = AnalysisWorkspace(name="combined")

    ws_a = AnalysisWorkspace.load("results/analysis_a/workspace")
    ws_b = AnalysisWorkspace.load("results/analysis_b/workspace")

    result = ws_main.merge_from(ws_a)
    print(f"Merged {result['merged']} items from A")

    result = ws_main.merge_from(ws_b)
    print(f"Merged {result['merged']} items from B, skipped {result['skipped']}")

    ws_main.save("results/combined/workspace")
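After merging, the usual browsing calls work on the combined workspace. A
quick sanity check, as a sketch reusing ``list_namespaces`` and ``list_keys``
from above:

.. code-block:: python

    # Confirm that namespaces from both sources are present,
    # and count the total number of items after the merge
    print(ws_main.list_namespaces())
    n_items = sum(len(ks) for ks in ws_main.list_keys().values())
    print(f"{n_items} items in the combined workspace")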
By default, existing keys are kept and incoming duplicates are skipped. Pass
``overwrite=True`` to replace existing items with the incoming values
instead.
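For example, to let a fresh re-run of analysis B take precedence over what is
already stored, a minimal sketch using the ``overwrite=True`` flag described
above:

.. code-block:: python

    # Incoming items from ws_b now replace any existing duplicates
    result = ws_main.merge_from(ws_b, overwrite=True)
    print(f"Merged {result['merged']} items from B (duplicates overwritten)")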