-
Notifications
You must be signed in to change notification settings - Fork 44
Multiclass Bayesian analysis
Hao Cheng edited this page Jun 9, 2021
·
6 revisions
Multi-class Bayesian Alphabet model is used when multiple genome data are included in the model.
Below we will split the dataset "genofile" into two classes assuming genomic regions composed of marker 1500 and marker 5011000 represent 2 genetic architectures (class 1 and class 2). This is used to demonstrate the multi-class anlaysis using JWAS. Users will assign markers to multiple groups given additional biological information.
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets
genofile = dataset("genotypes.csv")
geno_data = CSV.read(genofile,DataFrame);
number_snp = size(geno_data,2) - 1 # 1st column is ID
ID = geno_data[!,:ID]; # get IDs
geno_class1 = geno_data[!, 2:Int(number_snp/2 + 1)]; # allocate the marker m1 to m500 to class 1
geno_class2 = geno_data[!, Int(number_snp/2 + 2):number_snp + 1 ]; # allocate the marker m501 to m1000 to class 2
# Step 1: Load packages
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets
# Step 2: Read data
phenofile = dataset("phenotypes.csv")
pedfile = dataset("pedigree.csv")
phenotypes = CSV.read(phenofile,DataFrame,delim = ',',header=true,missingstrings=["NA"])
pedigree = get_pedigree(pedfile,separator=",",header=true);
genotypes1 = get_genotypes(geno_class1,rowID = ID, method="BayesB"); # read the genotype data in class 1
genotypes2 = get_genotypes(geno_class2,rowID = ID, method="BayesC"); # read the genotype data in class 2
first(phenotypes,5)
# Step 3: Build Model Equations
model_equation ="y1 = intercept + x1 + x2 + x2*x3 + ID + dam + genotypes1 + genotypes2
y2 = intercept + x1 + x2 + ID + genotypes1 + genotypes2
y3 = intercept + x1 + ID + genotypes1 + genotypes2";
model = build_model(model_equation);
# Step 4: Set Factors or Covariates
set_covariate(model,"x1");
# Step 5: Set Random or Fixed Effects
set_random(model,"x2");
set_random(model,"ID dam",pedigree);
# Step 6: Run Analysis
out=runMCMC(model,phenotypes);
# Step 7: Check Accuruacy
results = innerjoin(out["EBV_y3"], phenotypes, on = :ID)
accuruacy = cor(results[!,:EBV],results[!,:bv3])