Skip to content

single trait and multiple trait GBLUP by providing the relationship matrix directly

Hao Cheng edited this page May 19, 2021 · 8 revisions

To fit single-trait and multiple-trait GBLUP by providing the relationship matrix directly, users just need to simply provide the genomic relationship matrix in the get_genotypes function as

genotypes  = get_genotypes("GRM.csv",separator=',',method="GBLUP",header=false);

The format of "GRM.csv" is as follows,

a1,1.0581,-0.0311,0.0287
a2,-0.0311,1.0132,0.0798
a3,0.0287,0.0798,1.0021

This is a genomic relationship matrix for 3 individuals including a1, a2, and a3. The 1st column is for the individual IDs. Note that the header can NOT be included in the file and the argument header=false should be used in the get_genotypes() function.

Single-trait analysis

# Step 1: Load packages
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets

# Step 2: Read data 
phenofile  = dataset("phenotypes.csv")
pedfile    = dataset("pedigree.csv")
genofile   = dataset("GRM.csv")
phenotypes = CSV.read(phenofile,DataFrame,delim = ',',header=true,missingstrings=["NA"])
pedigree   = get_pedigree(pedfile,separator=",",header=true);
genotypes  = get_genotypes(genofile,separator=',',method="GBLUP",header=false);
first(phenotypes,5)

# Step 3: Build Model Equations

model_equation  ="y1 = intercept + x1 + x2 + x2*x3 + ID + dam + genotypes"
model = build_model(model_equation);

# Step 4: Set Factors or Covariates
set_covariate(model,"x1");

# Step 5: Set Random or Fixed Effects
set_random(model,"x2");
set_random(model,"ID dam",pedigree);

# Step 6: Run Analysis
out=runMCMC(model,phenotypes);

# Step 7: Check Accuruacy
results    = innerjoin(out["EBV_y1"], phenotypes, on = :ID) 
accuruacy  = cor(results[!,:EBV],results[!,:bv1])

Multi-trait analysis

# Step 1: Load packages
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets

# Step 2: Read data 
phenofile  = dataset("phenotypes.csv")
pedfile    = dataset("pedigree.csv")
genofile   = dataset("GRM.csv")
phenotypes = CSV.read(phenofile,DataFrame,delim = ',',header=true,missingstrings=["NA"])
pedigree   = get_pedigree(pedfile,separator=",",header=true);
genotypes  = get_genotypes(genofile,separator=',',method="GBLUP",header=false);
first(phenotypes,5)

# Step 3: Build Model Equations

model_equation  ="y1 = intercept + x1 + x2 + x2*x3 + ID + dam + genotypes
                  y2 = intercept + x1 + x2 + ID + genotypes
                  y3 = intercept + x1 + ID + genotypes";
model = build_model(model_equation);

# Step 4: Set Factors or Covariates
set_covariate(model,"x1");

# Step 5: Set Random or Fixed Effects
set_random(model,"x2");
set_random(model,"ID dam",pedigree);

# Step 6: Run Analysis
out=runMCMC(model,phenotypes);

# Step 7: Check Accuruacy
results    = innerjoin(out["EBV_y3"], phenotypes, on = :ID) 
accuruacy  = cor(results[!,:EBV],results[!,:bv3])
Clone this wiki locally