Project Setup

Brittany Scheid edited this page Aug 23, 2022 · 12 revisions

Requirements

  • anaconda >= 3.0
  • Python >= 3.5
  • Matlab >= 2018a

1. Environment Setup

  1. Download or clone the toolbox from a command window: git clone git@github.com:b-schd/RNS_processing_toolbox.git
    If you are not downloading new raw data and are not interacting with Pennsieve, you do not need the Python functionality and can skip the next two steps.
  2. Create a new conda environment using the rns_processing_toolbox.yml file: conda env create -f rns_processing_toolbox.yml
  3. Activate the new conda environment: conda activate rns_processing_toolbox

2. Preparing config.json

Before running any toolbox pipelines, you must fill out the config.json file with path and subject directory information as follows:

  • paths:

    • RNS_RAW_Folder: absolute path to the folder housing all the raw RNS System data downloaded from NeuroPace (not required for Matlab functionality). If you are using Box Drive to mount the Box folder to your file system, specify the direct path to the NeuroPace Box folder.
    • RNS_DATA_Folder: absolute path to the folder that will hold the parsed, partially de-identified patient data files.
  • boxKeys (optional; only fill in if downloading data from the NeuroPace Box folder using the SDK. NOT necessary if using Box Drive)

    • CLIENT_ID: OAuth 2.0 client ID for the Box app
    • CLIENT_SECRET: OAuth 2.0 client secret
    • ACCESS_TOKEN: Box app access token (or developer token)
    • Folder_ID: ID of the Box folder holding all NeuroPace patient folders (can be found in the URL).
  • institution: institution name as used in the NeuroPace folder names (e.g., UPenn)

  • patients:

    • ID: internal ID
    • PDMS_ID: NeuroPace-assigned PDMS ID number
    • Initials: Initials used in NeuroPace file names
    • pnsv_dataset (optional): If using Pennsieve integration, Pennsieve dataset ID
    • pnsv_package (optional): If using Pennsieve integration, Pennsieve package ID
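
As a rough illustration of the fields listed above, the sketch below builds a config dictionary and checks that the non-optional entries are filled in. The key names come from the list; the nesting (top-level paths, boxKeys, institution, and a patients list with one entry per patient) and the check_config helper are assumptions made for this example and are not part of the toolbox, so compare against the config.json shipped with the repository before relying on it.

```python
import json

# Patient keys that the list above does not mark as optional.
REQUIRED_PATIENT_KEYS = ("ID", "PDMS_ID", "Initials")


def check_config(config, need_raw=True):
    """Return a list of problems found in a config dict (hypothetical helper).

    need_raw=False skips RNS_RAW_Folder, which the documentation says is
    not required for Matlab functionality.
    """
    problems = []
    paths = config.get("paths", {})
    required_paths = ["RNS_DATA_Folder"]
    if need_raw:
        required_paths.append("RNS_RAW_Folder")
    for key in required_paths:
        if not paths.get(key):
            problems.append(f"paths.{key} is missing or empty")
    if not config.get("institution"):
        problems.append("institution is missing or empty")
    patients = config.get("patients", [])
    if not patients:
        problems.append("patients is missing or empty")
    for i, patient in enumerate(patients):
        for key in REQUIRED_PATIENT_KEYS:
            if not patient.get(key):
                problems.append(f"patients[{i}].{key} is missing or empty")
    return problems


# Placeholder values only; the nesting shown here is an assumption.
example_config = {
    "paths": {
        "RNS_RAW_Folder": "/absolute/path/to/raw",
        "RNS_DATA_Folder": "/absolute/path/to/data",
    },
    "boxKeys": {
        "CLIENT_ID": "",
        "CLIENT_SECRET": "",
        "ACCESS_TOKEN": "",
        "Folder_ID": "",
    },
    "institution": "UPenn",
    "patients": [
        {
            "ID": "ptID1",
            "PDMS_ID": "0000",
            "Initials": "AB",
            "pnsv_dataset": "",
            "pnsv_package": "",
        }
    ],
}

print(json.dumps(example_config, indent=2))
print(check_config(example_config))
```

An empty list from check_config means every non-optional field has a value; it does not verify that the paths exist or that the IDs are correct.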

3. Downloading the NeuroPace data from box.com

  • Option 1- Direct Download: If you have a small amount of data, you can download it directly from box.com to the RNS_RAW_Folder path.

  • Option 2- Use Box Drive: Download Box Drive to your computer and sign in to your Box account. Set RNS_RAW_Folder to the location of the mounted NeuroPace Box folder (e.g. '/Users/mycomputer/Library/CloudStorage/Box-Box/UPenn IRB 12345 - RNS System Data EXTERNAL #PHI')

  • Option 3- Use the Box SDK: Alternatively, the RNS toolbox can use the Box SDK to download new NeuroPace data programmatically. You will first have to create a Box app, using the following steps:

    1. Go to developer.box.com, log in to your Box account (or create one), and create a new app.
    2. Find the client ID, client secret, and access token (or developer token if you keep your app in developer mode) under your app's "Development" tab, and add them to config.json.
    3. In your online box.com account, navigate to the NeuroPace folder that contains each patient folder. The folder ID can be found at the end of the URL; paste it into config.json.
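
Step 3 can be done by hand, but as a small illustration, the sketch below pulls the trailing ID out of a folder URL. It assumes the URL ends in a numeric folder ID (for example https://app.box.com/folder/123456789); the folder_id_from_url helper and the example URL are hypothetical and are not part of the toolbox.

```python
from urllib.parse import urlparse


def folder_id_from_url(url):
    """Return the last path segment of a Box folder URL (hypothetical helper).

    Assumes the folder ID is the final, all-digit segment of the URL path.
    Query strings and trailing slashes are ignored.
    """
    path = urlparse(url).path.rstrip("/")
    folder_id = path.rsplit("/", 1)[-1]
    if not folder_id.isdigit():
        raise ValueError(f"No numeric folder ID at the end of: {url}")
    return folder_id


# Made-up URL for illustration only.
print(folder_id_from_url("https://app.box.com/folder/123456789"))
```

The returned string is the value to store under Folder_ID in config.json.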

Parsing the NeuroPace data for analysis

Before using this toolbox, patient data downloaded from NeuroPace must be repackaged into a format that allows for easy manipulation in a Matlab or Python environment. Repackaged data will be located in the RNS_DATA_Folder specified in config.json. If this is your first time running the toolbox, or if you need to update the data files, follow the parsing steps below:

  1. All raw patient data from NeuroPace should be placed in the folder specified by RNS_RAW_Folder in the config.json file. The NeuroPace folder for each patient should contain an ECoG_Catalog.csv file and a Data folder containing the recorded device files (.dat). Additionally, you may have a Histograms folder containing hourly and daily event counts, and an EpisodeDurations folder containing .csv files detailing episode durations.

  2. cd into RNS_processing_toolbox/rns_py_tools, then run python process_raw.py. This will create a partially de-identified (but not time-shifted) version of all RNS files, and will aggregate all .dat files into a single object for future analysis. After running process_raw, your RNS_DATA_FOLDER will be structured as follows:

RNS_DATA_FOLDER/
   ptID1/
      Device_Data.mat       # Includes AllData (N_data x 4 int16 matrix), and EventIdx (N_event x 2 matrix)
      Ecog_Catalog.csv
      Annotations/
         Device_Stim.mat    # Includes StimStartStopTimes, StimStartStopIndex, StimStats (times in UTC)
      Histograms/           # The following folders may not be included, depending on your data sharing agreement
         Daily_Histograms.csv
         Hourly_Histograms.csv
      EpisodeDurations/
         ptID1_EpisodeDurations_t1234_t5678.csv
         ...
   ptID2/
   ...
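
To confirm that parsing produced the layout shown above, a check along the following lines may help. It is a minimal sketch, not part of the toolbox: the missing_outputs helper is hypothetical, the expected file names are copied from the tree above, and the optional Histograms and EpisodeDurations folders are deliberately not checked because they depend on your data sharing agreement.

```python
from pathlib import Path

# Files that the tree above shows for every patient folder.
EXPECTED_FILES = (
    "Device_Data.mat",
    "Ecog_Catalog.csv",
    "Annotations/Device_Stim.mat",
)


def missing_outputs(data_folder, patient_ids):
    """Map each patient ID to the expected files that are absent.

    Patients with a complete folder are left out of the result, so an
    empty dict means every listed patient has all expected files.
    """
    root = Path(data_folder)
    report = {}
    for patient_id in patient_ids:
        missing = [
            rel for rel in EXPECTED_FILES
            if not (root / patient_id / rel).is_file()
        ]
        if missing:
            report[patient_id] = missing
    return report
```

For example, missing_outputs("/absolute/path/to/data", ["ptID1", "ptID2"]) would return an empty dictionary once both patients have been parsed, where the path stands in for your RNS_DATA_Folder and the IDs for the patient IDs in config.json.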