Error in user YAML: (<unknown>): mapping values are not allowed in this context at line 2 column 17

---
layout: page
title: R-Package: SSDFGP
permalink: /SSDFGP/test/
---

R-Package: SSDFGP

Sample Size Determination for Genomic Prediction

This R package provides a simple function to generate an operating curve that can be used for determining reasonable sample size for genomic prediction.

Installation

### RUN THIS IN R ###
devtools::install_github("oumarkme/SSDFGP")
library(SSDFGP)

Contact authors via the GitHub issue page or e-mails in the authorship list below if you have any questions.
Source and binary files are also available (HERE) if you cannot install the package directly from GitHub.
This package is tested on R versions 3.6.3 and 4.1.1.

The `RErs.det()` function

The RErs.det() function is developed for generating the operating curve function helping users to determine the training set size for genomic prediction.

Usage:

RErs.det(geno, nt = NULL, n_iter = NULL, multi.threads = TRUE)

Input:

geno: A numeric data frame. Genotype information (-1, 0, 1 or principle components). (row: sample, column: variants/PCs)
nt: A numeric vector carried training set sizes for r-score simulation. Also known as $n_t$ in the article, which varies from $n_{min}$ to $n_{max}$ by increment of $\delta$ ($n_{min} \leq n_t \leq n_{max}$). This function will evenly determine 10 breakpoints by default (nt = NULL). Note that the range of the operating curve will be limited by the given nt.
n_iter: A number. Time of iteration of simulating r-scores for each given $n_t$. n_iter = nt by default (n_iter = NULL)
multi.threads: TRUE/FALSE. When the computer has more than 4 threads, this function will use 75% of total computing power by default.

Output

$OC.fig: Operating curve figure. Points which RErs($n_t$) equal 0.95 and 0.99 are annotated.
$GC.fig$ : Fitted growth curve and simulated points.
$parameter: Estimated growth curve parameter ($\alpha$, $\beta$, and $\gamma$).
$OC.fit$ : The fitted values (RErs$(n_t)$) of the operating curve model. ($1 \leq n_t \leq n_c$).

** Curves are plotted by the ggplot2 package thus you may easily annotate them with ggplot2 commands afterward.

Example

Here we use rice 44k data as an example. The raw dataset is available at ricediversity.org and published by Zhao et al. (2011). Load the principal component matrix of the genotype data from the TSDFGS package (this should be installed while installing the SSDFGP package).

# install.packages("TSDFGS")
library(TSDFGS)
data(TSDFGS)

Run the RErs.det()function setting $n_t$ ranging from 25 to 225 by increacement of 25.

RErs.det(geno, nt = seq(25, 225, by = 25))

Determine optimal training set

After deciding on the training set size, you may determine the optimal training set by genotype information using the optTrain() function from the TSDFGS package. (Article available: Ou and Liao, 2019)

# install.packages("TSDFGS")
library(TSDFGS)
optTrain(geno, 1:nrow(geno))

Authorship

Po-Ya Wu (Po-Ya.Wu@hhu.de)
- Article's first author.
- Institute for Quantitative Genetics and Genomics of Plants, Heinrich Heine University, Düsseldorf, Germany
Jen-Hsiang Ou (jen-hsiang.ou@imbim.uu.se)
- Package maintainer
- Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
Chen-Tuo Liao (ctliao@ntu.edu.tw)
- Project administration, supervisor
- Department of Agronomy, National Taiwan University, Taipei, Taiwan

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

index.md

index.md

R-Package: SSDFGP

Installation

The `RErs.det()` function

Example

Determine optimal training set

Authorship

Files

index.md

Latest commit

History

index.md

File metadata and controls

R-Package: SSDFGP

Installation

The RErs.det() function

Example

Determine optimal training set

Authorship

The `RErs.det()` function