Data Management Plan for dealing with sequencing data.
See also: https://github.com/sr320/LabDocs/wiki/Data-Management
Phasing out
We send material to a variety of facilities. When samples are shipped to a facility, the corresponding information is entered in the Google Docs spreadsheet Next Gen Seq Library Database.
-
As a sequencing facility provides data, the files are downloaded to our local NAS (owl), into the root nightingales directory: http://owl.fish.washington.edu/nightingales/
-
The Nightingales Google Spreadsheet is updated.
-
Update the Nightingales Google Fusion Table with new information from the Nightingales Google Spreadsheet. This is accomplished by:
- deleting all rows in the Nightingales Google Fusion Table (Edit > Delete all rows)
- importing data from the Nightingales Google Spreadsheet (File > Import more rows...)
-
The Nightingales Google Spreadsheet is backed up on a regular basis (?) by downloading it as a tab-delimited file and pushing it to the LabDocs repository under the file name Nightingales.tsv
-
The nightingales directory on owl is backed up to Amazon Glacier. This is accessible by .......?
- Requirements: Directory with new sequence data, MD5 file from sequencing facility, copy of owluploader.R, destination directory on owl.
-
Copy the owluploader.R script to the directory that contains the new data and the MD5 file provided by the facility. The script needs write access to this directory to create temporary and log files.
-
Edit owluploader.R. Change the string on line 12 to be the directory for the new data, line 13 to the facility MD5 file name, and line 14 to the destination directory on owl.
-
Open a terminal and run "Rscript owluploader.R".
-
When the script completes, inspect the MD5Mismatch.txt file for files that did not copy correctly. Also inspect the checksum.md5 and readme.MD files to make sure the new files were added properly.
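
The checksum comparison behind this last step can also be repeated by hand as a spot check. The following is a minimal sketch of that check, not the code in owluploader.R. It assumes the facility MD5 file uses md5sum-style lines ("<hash>  <filename>", with an optional "*" before the name); the function names md5_of, parse_md5_file, and find_mismatches are hypothetical and chosen only for illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5_file(md5_path):
    """Parse md5sum-style lines into a {filename: hash} dict.

    Assumption: each line is "<hash>  <filename>" or "<hash> *<filename>".
    """
    expected = {}
    for line in Path(md5_path).read_text().splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        checksum, name = parts
        expected[name.strip().lstrip("*")] = checksum.lower()
    return expected


def find_mismatches(data_dir, md5_path):
    """Return the listed names that are missing or whose MD5 differs."""
    data_dir = Path(data_dir)
    mismatches = []
    for name, checksum in sorted(parse_md5_file(md5_path).items()):
        target = data_dir / name
        if not target.is_file() or md5_of(target) != checksum:
            mismatches.append(name)
    return mismatches
```

Calling find_mismatches on the copied directory with the facility MD5 file returns a list of file names; an empty list means every listed file is present and matches its checksum. Files are read in chunks so that large sequence files do not have to fit in memory.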