-
Notifications
You must be signed in to change notification settings - Fork 44
Censored Trait Analysis
- Single-trait Linear Mixed Model (Genomic data)
- Multiple censored traits Linear Mixed Model (Genomic data)
- Single-trait/Multi-trait Linear Additive Genetic Model
Censored traits should be identified in the argument
censored_trait
in build_model().
Lower bound should be named as "traitname_l", and upper bound should be named as "traitname_u".
If you have observations without upper or lower bounds, you can have them in your phenotype file as
lower_bound, upper_bound
120,Inf
-Inf,130
If you have observations without upper and lower bounds, you can have them in your phenotype file as
lower_bound, upper_bound
-Inf,Inf
or
lower_bound, upper_bound
missing,missing
The MCMC samples of unobserved continuous liability are in the output folder.
Below we will convert a continuous trait "y1" to a censored trait by setting a phenotypic value (e.g.,y11) to a range [lower_bound, upper_bound] as follows,
- [y11 - 0.5, y11 + 0.5] if 0.0 < y11 <1.0
- [2.0, Inf] if y11 > 2.0
- [-Inf, -3.0] if y11 < -3.0
- [y11, y11] otherwise
This is only used to demonstrate the censored trait analysis using JWAS.
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets,Random
Random.seed!(1)
phenofile = dataset("phenotypes.csv")
phenotypes = CSV.read(phenofile,DataFrame,delim = ',',header=true,missingstrings=["NA"])
#y1 - censored trait
phenotypes[!, :lower_bound1] .= round.(phenotypes[!, :y1],digits=2);
phenotypes[!, :upper_bound1] .= round.(phenotypes[!, :y1],digits=2);
phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :upper_bound1] .= phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :y1] .+ 0.5;
phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :lower_bound1] .= phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :y1] .- 0.5;
phenotypes[phenotypes[!, :y1] .> 2.0,:upper_bound1] .= Inf;
phenotypes[phenotypes[!, :y1] .> 2.0,:lower_bound1] .= 2.0;
phenotypes[phenotypes[!, :y1] .< -3.0,:upper_bound1].= -3.0;
phenotypes[phenotypes[!, :y1] .< -3.0,:lower_bound1].= -Inf;
phenotypes=phenotypes[:, Not(:y1)] #drop :y1 column
# Step 0: Load packages
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets,Random
Random.seed!(1)
# Step 1: Rename the lower bound as "traitname_l", and upper bound as "traitname_u"
rename!(phenotypes, :lower_bound1 => :y1_l, :upper_bound1 => :y1_u)
# Step 2: Read data
pedfile = dataset("pedigree.csv")
pedigree = get_pedigree(pedfile,separator=",",header=true);
genofile = dataset("genotypes.csv")
genotypes = get_genotypes(genofile,separator=',',method="BayesC");
# Step 3: Build Model Equations
model_equation ="y1 = intercept + x1 + x2 + x2*x3 + ID + dam + genotypes"
model = build_model(model_equation,
censored_trait=["y1"])
# Step 4: Set Factors or Covariates
set_covariate(model,"x1");
# Step 5: Set Random or Fixed Effects
set_random(model,"x2");
set_random(model,"ID dam",pedigree);
# Step 6: Run Analysis
out=runMCMC(model,phenotypes,chain_length=5000);
# Step 7: Check Accuruacy
results = innerjoin(out["EBV_y1"], phenotypes, on = :ID)
accuruacy = cor(results[!,:EBV],results[!,:bv1])
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets,Random
Random.seed!(1)
phenofile = dataset("phenotypes.csv")
phenotypes = CSV.read(phenofile,DataFrame,delim = ',',header=true,missingstrings=["NA"])
#y1 - censored trait
phenotypes[!, :lower_bound1] .= round.(phenotypes[!, :y1],digits=2);
phenotypes[!, :upper_bound1] .= round.(phenotypes[!, :y1],digits=2);
phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :upper_bound1] .= phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :y1] .+ 0.5;
phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :lower_bound1] .= phenotypes[0.0 .< phenotypes[!, :y1] .< 1.0, :y1] .- 0.5;
phenotypes[phenotypes[!, :y1] .> 2.0,:upper_bound1] .= Inf;
phenotypes[phenotypes[!, :y1] .> 2.0,:lower_bound1] .= 2.0;
phenotypes[phenotypes[!, :y1] .< -3.0,:upper_bound1].= -3.0;
phenotypes[phenotypes[!, :y1] .< -3.0,:lower_bound1].= -Inf;
phenotypes=phenotypes[:, Not(:y1)] #drop :y1 column
#y2 - censored trait
phenotypes[!, :lower_bound2] .= round.(phenotypes[!, :y2],digits=2);
phenotypes[!, :upper_bound2] .= round.(phenotypes[!, :y2],digits=2);
phenotypes[0.0 .< phenotypes[!, :y2] .< 1.0, :upper_bound2] .= phenotypes[0.0 .< phenotypes[!, :y2] .< 1.0, :y2] .+ 0.5;
phenotypes[0.0 .< phenotypes[!, :y2] .< 1.0, :lower_bound2] .= phenotypes[0.0 .< phenotypes[!, :y2] .< 1.0, :y2] .- 0.5;
phenotypes[phenotypes[!, :y2] .> 2.0,:upper_bound2] .= Inf;
phenotypes[phenotypes[!, :y2] .> 2.0,:lower_bound2] .= 2.0;
phenotypes[phenotypes[!, :y2] .< -3.0,:upper_bound2].= -3.0;
phenotypes[phenotypes[!, :y2] .< -3.0,:lower_bound2].= -Inf;
phenotypes=phenotypes[:, Not(:y2)] #drop :y2 column
# Step 0: Load packages
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets,Random
Random.seed!(1)
# Step 1: Rename the lower bound as "traitname_l", and upper bound as "traitname_u"
rename!(phenotypes, :lower_bound1 => :y1_l, :upper_bound1 => :y1_u)
rename!(phenotypes, :lower_bound2 => :y2_l, :upper_bound2 => :y2_u)
# Step 2: Read data
pedfile = dataset("pedigree.csv")
pedigree = get_pedigree(pedfile,separator=",",header=true);
genofile = dataset("genotypes.csv")
genotypes = get_genotypes(genofile,separator=',',method="BayesC");
# Step 3: Build Model Equations
model_equation ="y1 = intercept + x1 + x2 + x2*x3 + ID + dam + genotypes
y2 = intercept + x1 + x2 + ID + genotypes"
model = build_model(model_equation,
censored_trait=["y1","y2"])
# Step 4: Set Factors or Covariates
set_covariate(model,"x1");
# Step 5: Set Random or Fixed Effects
set_random(model,"x2");
set_random(model,"ID dam",pedigree);
# Step 6: Run Analysis
out=runMCMC(model,phenotypes,chain_length=5000);
# Step 7: Check Accuruacy
results = innerjoin(out["EBV_y1"], phenotypes, on = :ID)
accuruacy = cor(results[!,:EBV],results[!,:bv1])
results = innerjoin(out["EBV_y2"], phenotypes, on = :ID)
accuruacy = cor(results[!,:EBV],results[!,:bv2])
To run analysis without genomic data, just remove "genotypes" in the model_equation from the script above.