Spencer Goodman week05 homework: slightly late, but accurate to the 12th decimal pt #126

Open: wants to merge 1 commit into base: master
135 changes: 127 additions & 8 deletions weeks/week03/homework.ipynb
@@ -23,7 +23,7 @@
"cell_type": "raw",
"metadata": {},
"source": [
" "
" this is a one tailed test because we expect the difference to be in a specific direction (in this case higher level in A then B alleles)"
]
},
{
@@ -36,7 +36,9 @@
{
"cell_type": "raw",
"metadata": {},
"source": []
"source": [
"in this case because the data appears to be non-parametric a test such as the wilcoxon sum ranked test would be most appropriate"
]
},
{
"cell_type": "markdown",
@@ -46,9 +48,60 @@
]
},
{
"cell_type": "raw",
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"\n",
"\tShapiro-Wilk normality test\n",
"\n",
"data: data_b$normal - data_c$normal\n",
"W = 0.97567, p-value = 0.9281\n"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"\n",
"\tWilcoxon rank sum test\n",
"\n",
"data: data_b$normal and data_c$normal\n",
"W = 6, p-value = 0.06494\n",
"alternative hypothesis: true location shift is not equal to 0\n"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data <- read.table(file = \"adamts_B6.txt\", header = TRUE)\n",
"\n",
"data_b <- subset(data, subset = nxf1 == \"B\")\n",
"data_c <- subset(data, subset = nxf1 == \"C\")\n",
"\n",
"shapiro.test(data_b$normal-data_c$normal)\n",
"\n",
"wilcox.test(x = data_b$normal, y = data_c$normal)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
"source": [
"Since the p value is larger then 0.05, we conclude that there is no sigificant difference between the alleles. "
]
},
{
"cell_type": "markdown",
@@ -60,9 +113,71 @@
]
},
{
"cell_type": "raw",
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Warning message:\n",
"In data_b$normal.f2 - data_c$normal.f2: longer object length is not a multiple of shorter object length"
]
},
{
"data": {
"text/plain": [
"\n",
"\tShapiro-Wilk normality test\n",
"\n",
"data: data_b$normal.f2 - data_c$normal.f2\n",
"W = 0.81146, p-value = 0.002215\n"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"\n",
"\tWelch Two Sample t-test\n",
"\n",
"data: data_b$normal.f2 and data_c$normal.f2\n",
"t = -2.0766, df = 13.225, p-value = 0.02894\n",
"alternative hypothesis: true difference in means is less than 0\n",
"95 percent confidence interval:\n",
" -Inf -0.0005380758\n",
"sample estimates:\n",
" mean of x mean of y \n",
"0.002613228 0.006241535 \n"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data <- read.table(file = \"adamts_balbF2.txt\", header = TRUE)\n",
"\n",
"data_b <- subset(data, subset = nxf1.f2 == \"B\")\n",
"data_c <- subset(data, subset = nxf1.f2 == \"C\")\n",
"\n",
"shapiro.test(data_b$normal.f2-data_c$normal.f2)\n",
"t.test(data_b$normal.f2, data_c$normal.f2, alternative = \"less\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
"source": [
"I conclude that since the data is normally distributed, a t-test is appropriate in this case. Furthermore, since we expect the B allele to be higher, a one-tailed test is also approriate. Given these conditions, I reject the null hypothesis and conclude that the B allele is significantly higher than C."
]
},
{
"cell_type": "markdown",
@@ -74,7 +189,9 @@
{
"cell_type": "raw",
"metadata": {},
"source": []
"source": [
"I conclude that Nfx1 has an influcence on Gene 8 in a strain-specific fashion. The evidence is not overwhelming of an effect, then. However, the p value for B6 is very close to significant. Further experiments are validated."
]
},
{
"cell_type": "markdown",
@@ -86,7 +203,9 @@
{
"cell_type": "raw",
"metadata": {},
"source": []
"source": [
"Parametric tests are more senstitive. This is why non-parametric tests use ranks."
]
},
{
"cell_type": "markdown",