deriving study name from lowercase storage collection
vsoch committed Oct 26, 2017
1 parent 5e83001 commit e0b87e7
Showing 4 changed files with 6 additions and 15 deletions.
9 changes: 1 addition & 8 deletions docs/config.md
@@ -39,13 +39,6 @@ ANONYMIZE_PIXELS=False

**Important** pixel scrubbing is not yet implemented, so this variable will currently only check the header, alert you about the image, and skip it. Regardless of the setting you choose for `ANONYMIZE_PIXELS`, the header will always be checked. If you have pixel scrubbing turned on (and it's implemented), the images will be scrubbed and included. If you have scrubbing turned on (and it's not implemented), it will just yell at you and skip them. The same thing happens if it's off, just to alert you that such images exist.

-```
-# The default study to use
-SOM_STUDY="test"
-```
-
-The `SOM_STUDY` is part of the Stanford DASHER API to specify a study, and the default should be set before you start the application. If the study needs to vary between calls, please [post an issue](https://www.github.com/pydicom/sendit) and it can be added to be done at runtime.
-
Next, you likely want a custom filter applied to whitelist (accept no matter what), greylist (not accept, but in the future know how to clean the data) and blacklist (not accept). Currently, the deid software applies a [default filter](https://github.com/pydicom/deid/blob/development/deid/data/deid.dicom) to filter out images with known burned in pixels. If you want to add a custom file, currently it must live with the repository, and is referenced by the name of the file after the `deid`. You can specify this string in the config file:

```
@@ -85,7 +78,7 @@ GOOGLE_STORAGE_COLLECTION=None # define here or in your secrets
GOOGLE_PROJECT_NAME="project-name" # not the id, usually the end of the url in Google Cloud
```

-Note that the storage collection is set to None, and this should be the id of the study (eg, the IRB). For Google Storage, this collection corresponds with a Bucket. For BigQuery, it corresponds with a database (and a table of dicom). If this is set to None, it will not upload.
+Note that the storage collection is set to None, and this should be the id of the study (eg, the IRB). For Google Storage, this collection corresponds with a Bucket. For BigQuery, it corresponds with a database (and a table of dicom). If this is set to None, it will not upload. Also note that we derive the study name to use with Dasher from this bucket. It's simply the lowercase version of it. This means that a `GOOGLE_STORAGE_COLLECTION` of `IRB12345` maps to a study name `irb12345`.

Note that this approach isn't suited for having more than one study - when that is the case, the study will likely be registered with the batch. Importantly, for the above, there must be a `GOOGLE_APPLICATION_CREDENTIALS` filepath exported in the environment, or it should be run on a Google Cloud Instance (unlikely in the near future).

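The `ANONYMIZE_PIXELS` paragraph above boils down to a header check on each image. A minimal sketch of that logic using pydicom (a hypothetical helper for illustration, not the actual sendit code):

```python
# Hypothetical illustration of the header check described above;
# the real sendit logic lives elsewhere in the repository.
from pydicom import dcmread

def keep_image(dicom_file):
    """Return True if an image is safe to include, False to skip it."""
    dicom = dcmread(dicom_file)
    # DICOM (0028,0301) Burned In Annotation: "YES" means PHI may be burned
    # into the pixels, so the image is skipped until scrubbing is implemented.
    if getattr(dicom, "BurnedInAnnotation", "") == "YES":
        print("%s has burned in annotation, skipping." % dicom_file)
        return False
    return True
```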
1 change: 0 additions & 1 deletion sendit/apps/main/tasks/finish.py
@@ -79,7 +79,6 @@ def upload_storage(batch_ids=None):
    from sendit.settings import (GOOGLE_CLOUD_STORAGE,
                                 SEND_TO_GOOGLE,
                                 GOOGLE_PROJECT_NAME,
-                                GOOGLE_PROJECT_ID_HEADER,
                                 GOOGLE_STORAGE_COLLECTION)

    if batch_ids is None:
2 changes: 1 addition & 1 deletion sendit/apps/main/tasks/get.py
@@ -244,7 +244,7 @@ def get_identifiers(bid,study=None,run_replace_identifiers=True):

    # Process all dicoms at once, one call to the API
    dicom_files = batch.get_image_paths()
-    batch.change_images_status('PROCESSING')
+    batch.status = "PROCESSING"
    batch.save() # redundant

    try:
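The change above swaps a per-image status update for a single batch-level one. A rough sketch of the difference, assuming Django-style `Batch` and `Image` models (the fields and helper shown here are assumptions for illustration, not the actual sendit models):

```python
# Assumed, simplified models for illustration only.
from django.db import models

class Batch(models.Model):
    status = models.CharField(max_length=32, default="QUEUE")

class Image(models.Model):
    batch = models.ForeignKey(Batch, on_delete=models.CASCADE)
    status = models.CharField(max_length=32, default="QUEUE")

def change_images_status(batch, status):
    # Old approach: one save per image in the batch.
    for image in batch.image_set.all():
        image.status = status
        image.save()

# New approach in get_identifiers: one field update, one save.
# batch.status = "PROCESSING"
# batch.save()
```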
9 changes: 4 additions & 5 deletions sendit/settings/config.py
@@ -15,9 +15,6 @@
# If True, scrub pixel data for images identified by header "Burned in Annotation" = "NO"
ANONYMIZE_PIXELS=False # currently not supported

-# The default study to use
-SOM_STUDY="test"
-
# An additional specification for white, black, and greylisting data
# If None, only the default (for burned pixel filtering) is used
# Currently, these live with the deid software, eg:
@@ -53,5 +50,7 @@

# Google Cloud Storage Bucket (must be created)
GOOGLE_CLOUD_STORAGE='radiology'
-GOOGLE_STORAGE_COLLECTION=None # define here or in your secrets
-GOOGLE_PROJECT_NAME=None # define here or in your secretsy
+GOOGLE_STORAGE_COLLECTION='' # must be defined before SOM_STUDY
+GOOGLE_PROJECT_NAME=None
+
+SOM_STUDY = GOOGLE_STORAGE_COLLECTION.lower()
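Since `SOM_STUDY` is now computed when settings load, the collection must be a non-empty string before this line runs. A small sketch of the derivation with a guard (the guard is an assumption for illustration, not part of this commit):

```python
# Example only: GOOGLE_STORAGE_COLLECTION should hold the study id (eg, the IRB).
GOOGLE_STORAGE_COLLECTION = 'IRB12345'

if not GOOGLE_STORAGE_COLLECTION:
    raise ValueError("GOOGLE_STORAGE_COLLECTION must be set before SOM_STUDY is derived.")

# The study name for Dasher is just the lowercased collection name.
SOM_STUDY = GOOGLE_STORAGE_COLLECTION.lower()   # 'irb12345'
```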
