% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/findPara.R
\name{EBPRS}
\alias{EBPRS}
\title{Main function}
\usage{
EBPRS(train, test, N1, N0, robust = F)
}
\arguments{
\item{train}{training dataset}

\item{test}{testing dataset: a list containing fam, bed and bim (generated from plink files; plink2R::read_plink is recommended). If test is missing, the function uses all SNPs in the training dataset by default.}

\item{N1}{number of cases}

\item{N0}{number of controls}

\item{robust}{logical (T/F) indicating whether robust estimation is needed.}
}
\value{
A list containing
data.frame (result): combining the summary statistics and estimated effect sizes (eff)

estimated effect sizes (eff)

estimated mu (muHat)

estimated sigma2 (sigmaHat2)

estimated proportion of non-associated SNPs (pi0)

estimated variance of effect sizes of associated SNPs (sigma02)
}
\description{
Clean the dataset, extract information from the raw data and calculate effect sizes.
(Please note that the training and testing datasets must meet some requirements.)
}
\details{
The raw training data should be a
data.frame including CHROM, POS, A1, A2, OR, P, SNP.


The CHROM column and the SNP column are used for indexing.
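
As an illustration of the expected layout, the following minimal sketch builds a toy training data.frame with the required columns. All values (positions, alleles, odds ratios, p-values and SNP identifiers) are made up for illustration, and the column types shown are an assumption rather than a documented requirement.

\preformatted{
# Toy training data with the required columns (all values are hypothetical).
train <- data.frame(
  CHROM = c(1, 1, 2),
  POS   = c(10583, 10611, 20012),
  A1    = c("A", "C", "G"),
  A2    = c("G", "T", "A"),
  OR    = c(1.05, 0.97, 1.12),
  P     = c(0.03, 0.41, 0.002),
  SNP   = c("rs0001", "rs0002", "rs0003"),
  stringsAsFactors = FALSE
)

# Confirm that no required column is missing before calling EBPRS.
required <- c("CHROM", "POS", "A1", "A2", "OR", "P", "SNP")
stopifnot(length(setdiff(required, names(train))) == 0)
}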

An example training dataset can be acquired using data("traindat")

The "test" object can be generated with read_plink("test_plink_file") from the plink2R package,
which reads plink binary files (bfiles) into R.

test is a list containing fam (6 columns with information on samples), bim (6 columns with information on SNPs) and bed (genotype matrix coded as 0, 1, 2).

Note that for real data, we usually use beta0 = m/20 as the default setting for the EM algorithm,
which is accurate enough in most cases and has little influence on the prediction performance.
If more accurate parameter estimation is required, we provide a robust estimation (set robust=T)
that integrates our data-driven, bootstrap-based parameter tuning method. This
finds the best parameter for robust estimation, but takes more time.
}
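\examples{
\dontrun{
# A minimal sketch of a typical call, not a tested workflow.
# Assumptions: data("traindat") loads an object named traindat, the file
# prefix "test_plink_file" and the sample sizes N1 and N0 are placeholders,
# and the returned list uses the element names given in the Value section.
data("traindat")
test <- plink2R::read_plink("test_plink_file")  # list with bed, bim, fam

# Default estimation.
fit <- EBPRS(train = traindat, test = test, N1 = 1000, N0 = 1000)
str(fit$result)  # summary statistics combined with estimated effect sizes

# Robust estimation with bootstrap-based tuning (slower).
fit_robust <- EBPRS(train = traindat, test = test, N1 = 1000, N0 = 1000,
                    robust = TRUE)
}
}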
\references{
Song, S., Jiang, W., Hou, L. and Zhao, H. (2019). Leveraging effect size distributions to improve polygenic risk scores derived from genome-wide association studies. \emph{PLoS Computational Biology}.
}
\seealso{
\url{https://github.com/gabraham/plink2R}
}
\author{
Shuang Song, Wei Jiang, Lin Hou and Hongyu Zhao
}
