We'll use SparCC, described in this paper from the Alm lab, to identify co-occurrence networks among the taxonomic groups of bacteria in this cohort.
For now, there is a single gzip-compressed tar archive, sparcc_fam_gen.tar.gz. It includes the following files:
- var_pass1pct_family_sparcc_corr_JA112018.txt
- var_pass1pct_family_sparcc_corr_JA112018_pvals_2side.txt
- var_pass1pct_family_sparcc_cov_JA112018.txt
- var_pass1pct_genus_sparcc_corr_JA112018.txt
- var_pass1pct_genus_sparcc_corr_JA112018_pvals_2side.txt
- var_pass1pct_genus_sparcc_cov_JA112018.txt
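
As a minimal sketch, the archive can be unpacked with Python's standard library. The helper name `extract_archive` and the destination directory `sparcc_fam_gen_out` are made up for illustration and are not part of this repository. The sketch also assumes the archive is trusted, since it extracts every member without filtering.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.gz archive and return the sorted names of its files.

    Hypothetical helper for illustration only.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Collect regular files only (skip directory entries).
        names = sorted(m.name for m in tar.getmembers() if m.isfile())
        tar.extractall(dest)
    return names


if __name__ == "__main__":
    # Assumes sparcc_fam_gen.tar.gz is in the current working directory.
    for name in extract_archive("sparcc_fam_gen.tar.gz", "sparcc_fam_gen_out"):
        print(name)
```

Running the script should print the six file names listed above, possibly prefixed by a directory name depending on how the archive was built.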
The description of each file, in the same order, is as follows:
- A tab-delimited table of correlation values between different families.
- A tab-delimited table of 2-sided p-values based on random data label shuffling (1000 permutations) for the family correlations.
- A tab-delimited table of covariance values between different families. These are less informative and less trustworthy than the correlations.
- A tab-delimited table of correlation values between different genera.
- A tab-delimited table of 2-sided p-values based on random data label shuffling (1000 permutations) for the genus correlations.
- A tab-delimited table of covariance values between different genera. These are less informative and less trustworthy than the correlations.
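
One way to turn a correlation table and its p-value table into a list of candidate network edges is sketched below, using only the standard library. It assumes each file is a square matrix whose first row and first column hold the taxon names; check this against the actual files before relying on it. The function names and the thresholds (`alpha=0.05`, `min_abs_corr=0.3`) are hypothetical choices for illustration, not values used in this analysis, and no multiple-testing correction is applied.

```python
import csv


def read_matrix(path):
    """Read a tab-delimited square table into {row_name: {col_name: float}}.

    Assumes the first row and first column contain taxon names.
    """
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    columns = rows[0][1:]
    return {
        row[0]: {col: float(val) for col, val in zip(columns, row[1:])}
        for row in rows[1:]
    }


def significant_pairs(corr, pvals, alpha=0.05, min_abs_corr=0.3):
    """Return (taxon_a, taxon_b, correlation, p_value) for each unordered pair
    whose p-value is at most alpha and whose |correlation| is at least
    min_abs_corr. Both thresholds are illustrative defaults."""
    taxa = sorted(corr)
    edges = []
    for i, a in enumerate(taxa):
        for b in taxa[i + 1:]:
            r, p = corr[a][b], pvals[a][b]
            if p <= alpha and abs(r) >= min_abs_corr:
                edges.append((a, b, r, p))
    return edges


if __name__ == "__main__":
    # Paths assume the files were extracted into the current directory.
    corr = read_matrix("var_pass1pct_family_sparcc_corr_JA112018.txt")
    pvals = read_matrix("var_pass1pct_family_sparcc_corr_JA112018_pvals_2side.txt")
    for a, b, r, p in significant_pairs(corr, pvals):
        print(f"{a}\t{b}\t{r:.3f}\t{p:.3g}")
```

The same two functions apply to the genus files by swapping the file names. Because the p-values come from 1000 permutations, they are resolved only to steps of about 0.001, which is worth keeping in mind when choosing a cutoff.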
The equivalent files for single variants are too large for GitHub, which is flagging their size, so I will share them via Dropbox.