% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ml_regression_aft_survival_regression.R
\name{ml_aft_survival_regression}
\alias{ml_aft_survival_regression}
\alias{ml_survival_regression}
\title{Spark ML -- Survival Regression}
\usage{
ml_aft_survival_regression(x, formula = NULL, censor_col = "censor",
  quantile_probabilities = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95,
  0.99), fit_intercept = TRUE, max_iter = 100L, tol = 1e-06,
  aggregation_depth = 2, quantiles_col = NULL,
  features_col = "features", label_col = "label",
  prediction_col = "prediction",
  uid = random_string("aft_survival_regression_"), ...)

ml_survival_regression(x, formula = NULL, censor_col = "censor",
  quantile_probabilities = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95,
  0.99), fit_intercept = TRUE, max_iter = 100L, tol = 1e-06,
  aggregation_depth = 2, quantiles_col = NULL,
  features_col = "features", label_col = "label",
  prediction_col = "prediction",
  uid = random_string("aft_survival_regression_"), response = NULL,
  features = NULL, ...)
}
\arguments{
\item{x}{A \code{spark_connection}, \code{ml_pipeline}, or a \code{tbl_spark}.}

\item{formula}{Used when \code{x} is a \code{tbl_spark}. R formula as a character string or a formula. This is used to transform the input dataframe before fitting; see \link{ft_r_formula} for details.}

\item{censor_col}{Censor column name. Values in this column should be 0 or 1. A value of 1 means the event has occurred (i.e. the observation is uncensored); otherwise the observation is censored.}

\item{quantile_probabilities}{Quantile probabilities array. Values of the quantile probabilities array should be in the range (0, 1) and the array should be non-empty.}

\item{fit_intercept}{Boolean; should the model be fit with an intercept term?}

\item{max_iter}{The maximum number of iterations to use.}

\item{tol}{Param for the convergence tolerance for iterative algorithms.}

\item{aggregation_depth}{(Spark 2.1.0+) Suggested depth for treeAggregate (>= 2).}

\item{quantiles_col}{Quantiles column name. If set, this column will contain the quantiles corresponding to \code{quantile_probabilities}.}

\item{features_col}{Features column name, as a length-one character vector. The column should be a single vector column of numeric values. Usually this column is output by \code{\link{ft_r_formula}}.}

\item{label_col}{Label column name. The column should be a numeric column. Usually this column is output by \code{\link{ft_r_formula}}.}

\item{prediction_col}{Prediction column name.}

\item{uid}{A character string used to uniquely identify the ML estimator.}

\item{...}{Optional arguments; see Details.}

\item{response}{(Deprecated) The name of the response column (as a length-one character vector).}

\item{features}{(Deprecated) The name of features (terms) to use for the model fit.}
}
\value{
The object returned depends on the class of \code{x}.

\itemize{
  \item \code{spark_connection}: When \code{x} is a \code{spark_connection}, the function returns an instance of a \code{ml_estimator} object. The object contains a pointer to
  a Spark \code{Predictor} object and can be used to compose
  \code{Pipeline} objects.

  \item \code{ml_pipeline}: When \code{x} is a \code{ml_pipeline}, the function returns a \code{ml_pipeline} with
  the predictor appended to the pipeline.

  \item \code{tbl_spark}: When \code{x} is a \code{tbl_spark}, a predictor is constructed then
  immediately fit with the input \code{tbl_spark}, returning a prediction model.

  \item \code{tbl_spark}, with \code{formula} specified: When \code{formula}
    is specified, the input \code{tbl_spark} is first transformed using a
    \code{RFormula} transformer before being fit by
    the predictor. The object returned in this case is a \code{ml_model} which is a
    wrapper of a \code{ml_pipeline_model}.
}
}
\description{
Fit a parametric survival regression model, the accelerated failure time (AFT) model (see \href{https://en.wikipedia.org/wiki/Accelerated_failure_time_model}{Accelerated failure time model (Wikipedia)}), based on the Weibull distribution of the survival time.
}
\details{
When \code{x} is a \code{tbl_spark} and \code{formula} (alternatively, \code{response} and \code{features}) is specified, the function returns a \code{ml_model} object wrapping a \code{ml_pipeline_model} which contains data pre-processing transformers, the ML predictor, and, for classification models, a post-processing transformer that converts predictions into class labels. For classification, an optional argument \code{predicted_label_col} (defaults to \code{"predicted_label"}) can be used to specify the name of the predicted label column. In addition to the fitted \code{ml_pipeline_model}, \code{ml_model} objects also contain a \code{ml_pipeline} object where the ML predictor stage is an estimator ready to be fit against data. This is utilized by \code{\link{ml_save}} with \code{type = "pipeline"} to facilitate model refresh workflows.

\code{ml_survival_regression()} is an alias for \code{ml_aft_survival_regression()} for backwards compatibility.
}
\examples{
\dontrun{

library(survival)
library(sparklyr)

sc <- spark_connect(master = "local")
ovarian_tbl <- sdf_copy_to(sc, ovarian, name = "ovarian_tbl", overwrite = TRUE)

partitions <- ovarian_tbl \%>\%
  sdf_random_split(training = 0.7, test = 0.3, seed = 1111)

ovarian_training <- partitions$training
ovarian_test <- partitions$test

sur_reg <- ovarian_training \%>\%
  ml_aft_survival_regression(futime ~ ecog_ps + rx + age + resid_ds, censor_col = "fustat")

pred <- ml_predict(sur_reg, ovarian_test)
pred
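
# A hedged sketch, not part of the original example: request survival-time
# quantiles alongside the point prediction. The output column name quantiles
# and the three probabilities below are arbitrary illustrative choices.
sur_reg_q <- ovarian_training \%>\%
  ml_aft_survival_regression(
    futime ~ ecog_ps + rx + age + resid_ds,
    censor_col = "fustat",
    quantile_probabilities = c(0.25, 0.5, 0.75),
    quantiles_col = "quantiles"
  )

ml_predict(sur_reg_q, ovarian_test)

# A second sketch, assuming the same connection and tables as above: passing a
# spark_connection wrapped in ml_pipeline() yields an unfitted pipeline, which
# is then fit explicitly. ft_r_formula() is used here to produce the default
# features and label columns that the estimator expects.
aft_pipeline <- ml_pipeline(sc) \%>\%
  ft_r_formula(futime ~ ecog_ps + rx + age + resid_ds) \%>\%
  ml_aft_survival_regression(censor_col = "fustat")

aft_pipeline_model <- ml_fit(aft_pipeline, ovarian_training)
ml_transform(aft_pipeline_model, ovarian_test)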
}

}
\seealso{
See \url{http://spark.apache.org/docs/latest/ml-classification-regression.html} for
  more information on the set of supervised learning algorithms.

Other ml algorithms: \code{\link{ml_decision_tree_classifier}},
  \code{\link{ml_gbt_classifier}},
  \code{\link{ml_generalized_linear_regression}},
  \code{\link{ml_isotonic_regression}},
  \code{\link{ml_linear_regression}},
  \code{\link{ml_linear_svc}},
  \code{\link{ml_logistic_regression}},
  \code{\link{ml_multilayer_perceptron_classifier}},
  \code{\link{ml_naive_bayes}},
  \code{\link{ml_one_vs_rest}},
  \code{\link{ml_random_forest_classifier}}
}
\concept{ml algorithms}
