updated notebooks with information about project progress for pre release
yaaminiv committed Nov 5, 2016
1 parent 64153e2 commit 121d738
Showing 4 changed files with 125 additions and 21 deletions.
@@ -22,8 +22,54 @@
"\n",
"The Uniprot database was downloaded on 2016-10-28 from this url: http://www.uniprot.org/uniprot/?query=&fil=reviewed%3Ayes&columns=id%2Centry%20name%2Creviewed%2Cprotein%20names%2Cgenes%2Corganism%2Clength%2Cgo(biological%20process)%2Cgo-id%2Ccomment(PATHWAY)%2Cdatabase(UniPathway)%2Cdatabase(CDD)%2Cdatabase(Pfam). \n",
"\n",
-"It includes Uniprot codes, protein names, gene names, organism information, sequence length, gene ontology, UniPathway, CDD and Pfam data.\n",
-"\n",
+"It includes Uniprot codes, protein names, gene names, organism information, sequence length, gene ontology, UniPathway, CDD and Pfam data. Before I begin, I will set my working directory in a way that will allow me to work from a remote machine if needed."
+]
+},
+{
+"cell_type": "code",
+"execution_count": 2,
+"metadata": {
+"collapsed": false
+},
+"outputs": [
+{
+"data": {
+"text/plain": [
+"'/Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016/notebooks'"
+]
+},
+"execution_count": 2,
+"metadata": {},
+"output_type": "execute_result"
+}
+],
+"source": [
+"pwd"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 5,
+"metadata": {
+"collapsed": false
+},
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"/Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016\n"
+]
+}
+],
+"source": [
+"cd /Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
"The first step involves acquiring the .fasta file and setting it as my database. To do this, I will use the following code:\n",
"\n",
"1. use `makeblastdb` to create the database that we need\n",
@@ -232345,6 +232391,15 @@
"-outfmt 6"
]
},
+{
+"cell_type": "markdown",
+"metadata": {
+"collapsed": true
+},
+"source": [
+"It seems like my blastx is not running to completion because of my jupyter notebook websockets timing out. I will try running this analysis in the command line, and hope to have an interpretation of this blastx by 2016-11-07."
+]
+},
{
"cell_type": "code",
"execution_count": null,
@@ -125,16 +125,18 @@
"2. `-i` indicates which index to use\n",
"3. `-o` tells the program where to write the output\n",
"4. `--single` allows me to process single-end reads\n",
-"5. `-l` estimated average fragment length\n",
+"5. `-l` estimated average fragment length from [FastQC output](https://github.com/yaaminiv/yaaminiv-fish546-2016/blob/master/notebooks/2016-10-19-oly-gonad-OA-part-1-FASTQC-results.ipynb)\n",
"6. `-s` estimated standard deviation of fragment length\n",
"7. fastq file to be used\n",
"\n",
-"**1. filtered_106A_Female_Mix_GATCAG_L004_R1.fastq**"
+"**1. filtered_106A_Female_Mix_GATCAG_L004_R1.fastq**\n",
+"\n",
+"### NOTE: As of 2016-11-04, I stopped here because I'm trying to figure out how to find `-s` for my .fastq files. I will complete this analysis by 2016-11-07."
]
},
{
"cell_type": "code",
-"execution_count": 1,
+"execution_count": 3,
"metadata": {
"collapsed": false
},
@@ -144,6 +146,8 @@
"output_type": "stream",
"text": [
"\r\n",
+"Error: Missing read files\r\n",
+"Error: cannot supply mean/sd without supplying both -l and -s\r\n",
"Error: fragment length mean and sd must be supplied for single-end reads using -l and -s\r\n",
"\r\n",
"Usage: kallisto quant [arguments] FASTQ-files\r\n",
@@ -174,8 +178,8 @@
"-i /Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016/data/kallisto-index-OlyO-v6 \\\n",
"-o /Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016/analyses \\\n",
"--single \\\n",
-"-l \\\n",
-"-s \\\n",
+"-l 76 \\\n",
+"-s \\ #I'm still trying to figure out where to get this number from!\n",
"/Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016/data/filtered_106A_Female_Mix_GATCAG_L004_R1.fastq"
]
},
59 changes: 50 additions & 9 deletions notebooks/2016-10-28-oly-gonad-OA-part-2-BLAST.ipynb
@@ -22,8 +22,54 @@
"\n",
"The Uniprot database was downloaded on 2016-10-28 from this url: http://www.uniprot.org/uniprot/?query=&fil=reviewed%3Ayes&columns=id%2Centry%20name%2Creviewed%2Cprotein%20names%2Cgenes%2Corganism%2Clength%2Cgo(biological%20process)%2Cgo-id%2Ccomment(PATHWAY)%2Cdatabase(UniPathway)%2Cdatabase(CDD)%2Cdatabase(Pfam). \n",
"\n",
-"It includes Uniprot codes, protein names, gene names, organism information, sequence length, gene ontology, UniPathway, CDD and Pfam data.\n",
-"\n",
+"It includes Uniprot codes, protein names, gene names, organism information, sequence length, gene ontology, UniPathway, CDD and Pfam data. Before I begin, I will set my working directory in a way that will allow me to work from a remote machine if needed."
+]
+},
+{
+"cell_type": "code",
+"execution_count": 2,
+"metadata": {
+"collapsed": false
+},
+"outputs": [
+{
+"data": {
+"text/plain": [
+"'/Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016/notebooks'"
+]
+},
+"execution_count": 2,
+"metadata": {},
+"output_type": "execute_result"
+}
+],
+"source": [
+"pwd"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 5,
+"metadata": {
+"collapsed": false
+},
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"/Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016\n"
+]
+}
+],
+"source": [
+"cd /Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
"The first step involves acquiring the .fasta file and setting it as my database. To do this, I will use the following code:\n",
"\n",
"1. use `makeblastdb` to create the database that we need\n",
@@ -232346,17 +232392,12 @@
]
},
{
-"cell_type": "code",
-"execution_count": null,
+"cell_type": "markdown",
"metadata": {
"collapsed": true
},
-"outputs": [],
"source": [
-"!/Applications/ncbi-blast-2.5.0+/bin/blastx \\\n",
-"-query /Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016/data/OlyO_v6_transcriptome.fa \\\n",
-"-db /Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016/data/uniprot-all \\\n",
-"-outfmt 6"
+"It seems like my blastx is not running to completion because of my jupyter notebook websockets timing out. I will try running this analysis in the command line, and hope to have an interpretation of this blastx by 2016-11-07."
]
},
{
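The note added in this file says the blastx run stalls when the Jupyter websocket times out. One way to sidestep that, sketched here under assumptions, is to launch the same command from a terminal with `nohup` and write the results to a file with `-out`. The paths and the `-query`, `-db`, and `-outfmt 6` arguments come from the removed cell; the variable names `REPO`, `BLASTX`, `OUT`, `LOG`, and `RUN` and the two output file names are hypothetical. By default the script only prints the command; setting `RUN=1` starts it.

```shell
#!/bin/sh
# Sketch only: run blastx from a terminal so it does not depend on the notebook connection.
set -eu

REPO="${REPO:-/Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016}"
BLASTX="${BLASTX:-/Applications/ncbi-blast-2.5.0+/bin/blastx}"
OUT="${OUT:-$REPO/analyses/blastx-uniprot.tab}"   # hypothetical output file name
LOG="${LOG:-$REPO/analyses/blastx-uniprot.log}"   # hypothetical log file name

# Same query, database, and tabular output format as the removed notebook cell,
# plus -out so the hits go to a file instead of the terminal.
set -- "$BLASTX" \
  -query "$REPO/data/OlyO_v6_transcriptome.fa" \
  -db "$REPO/data/uniprot-all" \
  -outfmt 6 \
  -out "$OUT"

if [ "${RUN:-0}" = 1 ]; then
  # nohup keeps the job alive after the terminal or browser session closes.
  nohup "$@" > "$LOG" 2>&1 &
  echo "blastx started in the background (PID $!)"
else
  echo "dry run:" "$@"
fi
```

Printing the command first makes it easy to check the paths on a remote machine before committing to a long run.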
14 changes: 9 additions & 5 deletions notebooks/2016-11-04-oly-gonad-OA-part3-kallisto.ipynb
@@ -125,16 +125,18 @@
"2. `-i` indicates which index to use\n",
"3. `-o` tells the program where to write the output\n",
"4. `--single` allows me to process single-end reads\n",
-"5. `-l` estimated average fragment length\n",
+"5. `-l` estimated average fragment length from [FastQC output](https://github.com/yaaminiv/yaaminiv-fish546-2016/blob/master/notebooks/2016-10-19-oly-gonad-OA-part-1-FASTQC-results.ipynb)\n",
"6. `-s` estimated standard deviation of fragment length\n",
"7. fastq file to be used\n",
"\n",
-"**1. filtered_106A_Female_Mix_GATCAG_L004_R1.fastq**"
+"**1. filtered_106A_Female_Mix_GATCAG_L004_R1.fastq**\n",
+"\n",
+"### NOTE: As of 2016-11-04, I stopped here because I'm trying to figure out how to find `-s` for my .fastq files. I will complete this analysis by 2016-11-07."
]
},
{
"cell_type": "code",
-"execution_count": 1,
+"execution_count": 3,
"metadata": {
"collapsed": false
},
@@ -144,6 +146,8 @@
"output_type": "stream",
"text": [
"\r\n",
+"Error: Missing read files\r\n",
+"Error: cannot supply mean/sd without supplying both -l and -s\r\n",
"Error: fragment length mean and sd must be supplied for single-end reads using -l and -s\r\n",
"\r\n",
"Usage: kallisto quant [arguments] FASTQ-files\r\n",
@@ -174,8 +178,8 @@
"-i /Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016/data/kallisto-index-OlyO-v6 \\\n",
"-o /Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016/analyses \\\n",
"--single \\\n",
-"-l \\\n",
-"-s \\\n",
+"-l 76 \\\n",
+"-s \\ #I'm still trying to figure out where to get this number from!\n",
"/Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016/data/filtered_106A_Female_Mix_GATCAG_L004_R1.fastq"
]
},
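The "Missing read files" error recorded in this notebook is consistent with the comment placed after the backslash on the `-s` line: a backslash continues a shell line only when it is the last character, so the FASTQ path on the following line probably never reaches `kallisto`. A minimal sketch of the same call with the comment moved onto its own line follows. It assumes `kallisto` is on the `PATH`; `REPO`, `FRAG_MEAN`, `FRAG_SD`, and `RUN` are hypothetical names, and `FRAG_SD` deliberately has no default because the notebook has not settled on a value for `-s`.

```shell
#!/bin/sh
# Sketch only: same kallisto flags as the notebook cell, with no comment on a continued line.
set -eu

REPO="${REPO:-/Users/yaaminivenkataraman/Documents/School/Year1/FISH-546/yaaminiv-fish546-2016}"
FRAG_MEAN="${FRAG_MEAN:-76}"   # value passed to -l in this commit
FRAG_SD="${FRAG_SD:-}"         # hypothetical variable; the notebook leaves -s undetermined

# The standard deviation is still an open question, so the note lives here
# rather than after a line-continuation backslash.
set -- kallisto quant \
  -i "$REPO/data/kallisto-index-OlyO-v6" \
  -o "$REPO/analyses" \
  --single \
  -l "$FRAG_MEAN" \
  -s "${FRAG_SD:-FRAG_SD_NOT_SET}" \
  "$REPO/data/filtered_106A_Female_Mix_GATCAG_L004_R1.fastq"

if [ -n "$FRAG_SD" ] && [ "${RUN:-0}" = 1 ]; then
  "$@"
else
  # Without a value for -s (or without RUN=1) only show what would be run.
  echo "dry run:" "$@"
fi
```

Running it as `FRAG_SD=<value> RUN=1 sh script.sh` would execute the quantification once a standard deviation is known; until then it only echoes the command.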
