Commit

Merge pull request #92 from Emory-HITI/dev
Make the split into chunks as a configurable parameter
pradeeban authored Jan 13, 2021
2 parents f5fa14c + 9c6eb6b commit 602bfed
Showing 3 changed files with 6 additions and 2 deletions.
3 changes: 2 additions & 1 deletion modules/png-extraction/ImageExtractor.py
@@ -40,6 +40,7 @@
processes = niffler['UseProcesses'] #how many processes to use.
email = niffler['YourEmail']
send_email = niffler['SendEmail']
split = niffler['SplitIntoChunks']

png_destination = output_directory + '/extracted-images/'
failed = output_directory +'/failed-dicom/'
@@ -259,7 +260,7 @@ def fix_mismatch(with_VRs=['PN', 'DS', 'IS']):
filelist=glob.glob(file_path, recursive=True) #this searches the folders at the depth we request and finds all dicoms
pickle.dump(filelist,open(pickle_file,'wb'))

-file_chunks = np.array_split(filelist,100)
+file_chunks = np.array_split(filelist,split)
logging.info('Number of dicom files: ' + str(len(filelist)))
logging.info('Number of chunks is 100 with size ' + str(len(file_chunks[0])) )

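The second log line in this hunk still reports a fixed count of 100, although the number of chunks now comes from the configuration. A minimal sketch of how the message could be derived from the chunks themselves follows; `chunk_summary`, the example value of `split`, and the placeholder paths are hypothetical and not part of this commit.

```python
import logging

import numpy as np


def chunk_summary(file_chunks):
    """Build the log message from the actual chunks rather than a fixed count."""
    first_size = len(file_chunks[0]) if len(file_chunks) > 0 else 0
    return "Number of chunks is %d with size %d" % (len(file_chunks), first_size)


split = 4  # stands in for niffler['SplitIntoChunks'] (example value)
filelist = ["/data/img_%d.dcm" % i for i in range(25)]  # placeholder paths

# np.array_split tolerates uneven division: 25 files into 4 chunks gives 7, 6, 6, 6.
file_chunks = np.array_split(filelist, split)
logging.info("Number of dicom files: " + str(len(filelist)))
logging.info(chunk_summary(file_chunks))
```

If `split` is larger than the number of files, `np.array_split` returns some empty chunks, so downstream code may want to skip those.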
2 changes: 2 additions & 0 deletions modules/png-extraction/README.md
@@ -25,6 +25,8 @@ Find the config.json file in the folder and modify accordingly *for each* Niffle

* *Depth*: How far in the folder hierarchy from the DICOMHome are the DICOM images. For example, a patient/study/series/instances.dcm hierarchy indicates a depth of 3. If the DICOM files are in the DICOMHome itself with no folder hierarchy, the depth will be 0.

* *SplitIntoChunks*: How many chunks do you want to split the metadata extraction process into? The default is 1. Leave it as it is for most extractions. For extremely large batches, increase it accordingly. A single chunk works for 10,000 files, so you can set it to 2 if you have 20,000 files, for example.

* *SendEmail*: Do you want to send an email notification when the extraction completes? The default is true. You may disable this if you do not want to receive an email upon completion.

* *YourEmail*: Replace "test@test.test" with a valid email if you would like to receive an email notification. If the SendEmail property is disabled, you can leave this as is.
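
The sizing guidance for *SplitIntoChunks* (one chunk per 10,000 files) can be sketched as a small calculation. `suggested_chunks` is a hypothetical helper, not part of Niffler, and the 10,000-file figure is taken from the description above as a rule of thumb rather than a hard limit.

```python
import math


def suggested_chunks(num_files, files_per_chunk=10_000):
    """Return a SplitIntoChunks value following the one-chunk-per-10,000-files rule."""
    if files_per_chunk < 1:
        raise ValueError("files_per_chunk must be a positive integer")
    # Round up so a partial batch still gets its own chunk; never go below 1.
    return max(1, math.ceil(num_files / files_per_chunk))


print(suggested_chunks(20_000))
```

The result would be entered by hand as the `SplitIntoChunks` value in config.json.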
3 changes: 2 additions & 1 deletion modules/png-extraction/config.json
@@ -2,9 +2,10 @@
"DICOMHome": "/Users/pradeeban/Downloads",
"OutputDirectory": "/Users/pradeeban/Downloads/root",
"Depth": 0,
"SplitIntoChunks": 1,
"PrintImages": true,
"CommonHeadersOnly": false,
"UseProcesses": 0,
"SendEmail": true,
"YourEmail": "test@test.test"
-}
+}
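
ImageExtractor.py reads the new key with `niffler['SplitIntoChunks']`, so a config.json written before this commit would raise a `KeyError`. A minimal sketch of a more tolerant read, assuming a fallback of 1 (the default the README states), follows; `read_split` is a hypothetical helper and not part of this commit.

```python
import json


def read_split(config_text, default=1):
    """Read SplitIntoChunks from config JSON text, falling back to a default."""
    niffler = json.loads(config_text)
    split = niffler.get("SplitIntoChunks", default)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(split, bool) or not isinstance(split, int) or split < 1:
        raise ValueError("SplitIntoChunks must be a positive integer")
    return split


old_config = '{"Depth": 0, "UseProcesses": 0}'  # no SplitIntoChunks key
new_config = '{"Depth": 0, "SplitIntoChunks": 2, "UseProcesses": 0}'
print(read_split(old_config), read_split(new_config))
```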
