Scope of Samples to Examine #1099
-
Alright, so I'm restarting much of my pipeline from the start, as I realized a flaw in my previous analysis (I was grouping Day 0 and Day 2 crabs at different temperature treatments, even though Day 0 samples were taken prior to exposure to different temperatures). The plan is as before: use kallisto to create a matrix of counts for each temperature group, then use DESeq2 to perform a differential expression analysis, then get the GO terms and use GO-MWU to perform a gene enrichment analysis.

The good news: this gives me a chance to recalibrate which libraries I want to include in my analysis of DEGs at different temperature treatments! I have 3 possibilities, as follows:

Option 1: Balanced Design
Option 2: Imbalanced Design
Option 3: Imbalanced Design - This time, it's imbalanced-er

Any recommendations on the best approach to take here?
-
Are these individual crab libraries? And in some instances, is the same crab
sampled more than once? This would be a confounding factor. Can you provide
a diagram of all libraries in play and how they relate?
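
One way to check for the confounding factor raised here is to list every crab whose libraries fall into more than one temperature group. The following is a minimal sketch, not the lab's actual workflow: the library IDs and the rows themselves are hypothetical placeholders, and only the crab A case (ambient on Day 0, elevated on Day 2) mirrors an example from the thread.

```python
from collections import defaultdict

# Hypothetical library records: (library_id, crab_id, day, temperature).
# These rows are illustrative only; they are not the real sample sheet.
libraries = [
    ("lib_01", "A", 0, "ambient"),
    ("lib_02", "A", 2, "elevated"),
    ("lib_03", "B", 2, "ambient"),
    ("lib_04", "C", 2, "lowered"),
]

def crabs_in_multiple_groups(records):
    """Return {crab_id: sorted temperature groups} for crabs seen in more than one group."""
    groups = defaultdict(set)
    for _library, crab, _day, temperature in records:
        groups[crab].add(temperature)
    return {crab: sorted(temps) for crab, temps in groups.items() if len(temps) > 1}

repeated = crabs_in_multiple_groups(libraries)
print(repeated)
```

Any crab reported by this check contributes non-independent libraries to two groups, which is the situation Option 3 would create.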
On Tue, Feb 2, 2021 at 7:28 PM afcoyle ***@***.***> wrote:
Alright, so I'm restarting much of my pipeline from the start, as I
realized a flaw in my previous analysis (I was grouping Day 0 and Day 2
crabs at different temperature treatments, even though Day 0 samples were
taken prior to exposure to different temperatures). The plan is as before:
use kallisto to create a matrix of counts for each temperature group, then
use DESeq2 to perform a differential expression analysis, then get the GO
terms and use GO-MWU to perform a gene enrichment analysis.
The good news: This gives me a chance to recalibrate what libraries I want
to include in my analysis of DEGs at different temperature treatments! I
have 3 possibilities, as follows:
*Option 1: Balanced Design*
Temperature Num of samples
Elevated 3
Ambient 3
Lowered 3
*Option 2: Imbalanced Design*
This option disregards a balanced design, and samples all possible
infected crab from their respective treatment group
Temperature Num of samples
Elevated 4
Ambient 10
Lowered 3
*Option 3: Imbalanced Design - This time, it's imbalanced-er*
This option adds some bonus ambient-temperature crab by including Day 0
crab that were part of the ambient and low-temperature treatment groups
(and, since they were Day 0, had been held at ambient temperatures). This
may be a plus or a minus, but this means that several individuals would be
present in multiple temperature treatments (ex: Crab A is counted as an
ambient library on Day 0, but an elevated-temperature library on Day 2)
Temperature Num of samples
Elevated 4
Ambient 14
Lowered 3
Any recommendations on the best approach to take here?
--
Steven B. Roberts, Associate Professor
-
They're a mix of pooled and individual libraries (did my best to balance them out). And yep, absolutely - here are tables for the 3 possible sample situations. They are a bit large!

I can also post this as a separate question, but figured I'd ask in case it's a quick, easy answer: it looks like there are both raw .fastq files and fasta-trimmed .fastq files. Which should I be using as my initial data source?

Anyway, here are the tables!

Option 1: Balanced Design
Option 2: Unbalanced Design
Option 3: Mega-Unbalanced Design

In case you missed that, those were crabs G/H/I/E, sample numbers 173, 72, 127, 151
-
I would say go for all of them in terms of data exploration; however, none seems logical as an experimental design for a manuscript.
-
Gotcha, I'll start on that. Can you elaborate a bit more: is there some configuration of the existing samples that would be manuscript-worthy, or would more sequencing be necessary? Also, for reference, the final table includes all existing libraries of infected crabs (both individual and pooled).
-
I see something worthy...
-
I think I largely follow - the comparison is between Day 0 and Day 2 libraries for crabs G/H/I. Are the ambient-temperature crabs there to be used as a baseline for ambient-temperature expression, or what's their purpose? And that table does show all existing crab libraries! I can definitely work on prepping more for sequencing though; there are plenty of Day 2 elevated-temperature samples.
-
The comparison is crabs A/B/C versus G/H/I, focusing on only two time points, to determine how temperature influences expression.
Surely we have more pooled libraries? https://d.pr/i/MsOLkt
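
The proposed comparison can be laid out as a small sample table to confirm that every treatment-by-day cell holds the same number of libraries. This is only a sketch under stated assumptions: the thread does not spell out the assignment, so treating A/B/C as the ambient group and G/H/I as the elevated group, and Day 0 and Day 2 as the two time points, is inferred from the surrounding discussion rather than confirmed.

```python
from collections import Counter
from itertools import product

# ASSUMPTION: A/B/C are ambient crabs and G/H/I are elevated-temperature crabs.
GROUPS = {"ambient": ["A", "B", "C"], "elevated": ["G", "H", "I"]}
# ASSUMPTION: the two time points are Day 0 and Day 2.
DAYS = [0, 2]

def build_sample_table(groups, days):
    """Return one row per expected library: crab, day, and treatment group."""
    rows = []
    for treatment, crabs in groups.items():
        for crab, day in product(crabs, days):
            rows.append({"crab": crab, "day": day, "treatment": treatment})
    return rows

def cell_counts(rows):
    """Count libraries in each (treatment, day) cell of the design."""
    return Counter((row["treatment"], row["day"]) for row in rows)

table = build_sample_table(GROUPS, DAYS)
for cell, n in sorted(cell_counts(table).items()):
    print(cell, n)
```

Because each crab appears at both time points in this layout, crab identity would likely need to be accounted for as a repeated-measures factor in whatever model is fit downstream.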
-
Ahh okay, gotcha - I understand! As for the libraries: all libraries not included in that table are either uninfected libraries, which I assume are irrelevant for examining Hematodinium response, or pooled by temperature (such as all those 329... samples sequenced by Sam), which don't seem useful for looking at temperature response. Here's a photo from the spreadsheet that tracks the sample treatments specifically
-
The untapped value here is the individual samples. What would be cool is looking at gene expression patterns over time in crabs and then comparing the time series to see similarities and differences. Something like WGCNA would be worth doing.
-
Yep, absolutely! Here is that table. Again, this only includes individual libraries of infected crab, since uninfected crab don't seem relevant for what we're examining:
If we do end up doing further sequencing to get more time series of infected crab, it may be interesting to look at the initial crab collection data and see if any physical characteristics are linked to Hematodinium gene expression. Since a common hypothesis is that crab are infected post-molt, looking at shell condition or maturity (as determined by chela height) could be a promising way to look at different stages of Hematodinium infection.
Untapped value here is the individual samples.
Can you provide a table where each row is a crab and provide relevant metadata?
What would be cool is looking at gene expression patterns over time in crabs and then comparing the time series to see similarities and differences. Something like WGCNA would be worth doing.
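
A table with one row per crab can be derived from a per-library sheet by grouping on crab ID. The sketch below assumes a hypothetical record layout (`library`, `crab`, `day`, `treatment`) and made-up library names; the real sample sheet may use different columns and would carry more metadata.

```python
# Hypothetical per-library records; names and values are placeholders,
# not the actual sample sheet. "treatment" is the group the crab was assigned to.
records = [
    {"library": "lib_a0", "crab": "A", "day": 0, "treatment": "ambient"},
    {"library": "lib_a2", "crab": "A", "day": 2, "treatment": "ambient"},
    {"library": "lib_g2", "crab": "G", "day": 2, "treatment": "elevated"},
    {"library": "lib_g0", "crab": "G", "day": 0, "treatment": "elevated"},
]

def one_row_per_crab(rows):
    """Collapse per-library rows into one row per crab, with days in ascending order."""
    crabs = {}
    for rec in sorted(rows, key=lambda r: (r["crab"], r["day"])):
        row = crabs.setdefault(
            rec["crab"],
            {"crab": rec["crab"], "treatment": rec["treatment"], "days": [], "libraries": []},
        )
        row["days"].append(rec["day"])
        row["libraries"].append(rec["library"])
    return list(crabs.values())

for row in one_row_per_crab(records):
    print(row)
```

Listing the sampled days per crab makes it easy to see which individuals have a full time series and could feed a time-course analysis.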