\name{tdmModCreateCVindex}
\alias{tdmModCreateCVindex}
\title{Create and return a training-validation-set index vector.}
\usage{
  tdmModCreateCVindex(dset, response.variables, opts,
    stratified = FALSE)
}
\arguments{
  \item{dset}{the data frame for which the index vector \code{cvi} is needed}

  \item{response.variables}{the response variable(s) of \code{dset}.
  A warning is issued if \code{length(response.variables)>1}; only the
  first response variable is used for determining strata size.}

  \item{opts}{a list from which the following entries are used:
  \itemize{
    \item \code{TST.kind}: one of \code{"cv"}, \code{"rand"}, \code{"col"}
    \item \code{TST.NFOLD}: number of CV folds (only relevant if
    \code{TST.kind=="cv"})
    \item \code{TST.COL}: column of \code{dset} containing the (0/1/<0)
    index (only relevant if \code{TST.kind=="col"}), or NULL if no such
    column exists
    \item \code{TST.valiFrac}: fraction of records to set aside for
    validation (only relevant if \code{TST.kind=="rand"})
    \item \code{TST.trnFrac}: [\code{1-opts$TST.valiFrac}] fraction of
    records to use for training (only relevant if \code{TST.kind=="rand"})
  }}

  \item{stratified}{[FALSE] if TRUE, do stratified sampling for
  \code{TST.kind="rand"} with at least one training record for
  each level of the response variable (classification)}
}
\value{
  \code{cvi}, the training-validation-set index vector: 0 marks
  training records, values >0 mark validation records, and all
  records with \code{cvi<0} (e.g. from column TST.COL) are
  disregarded.
}
\description{
  Depending on the value of member \code{TST.kind} in list \code{opts},
  the returned index \code{cvi} is
  \enumerate{
    \item \code{TST.kind="cv"}: a random cross-validation index
    P([111...222...333...]), or
    \item \code{TST.kind="rand"}: a random index P([00...11...-1-1...])
    for training (0), validation (1) and disregard (-1) cases, or
    \item \code{TST.kind="col"}: the column \code{dset[,opts$TST.COL]},
    which contains the training (0), validation (1) and disregard (-1)
    set division (all records with a value <0 in column TST.COL are
    disregarded).
  }
  Here P(.) denotes a random permutation of the sequence. \cr
  The disregard set is optional, i.e. \code{cvi} may contain only 0
  and 1, if desired. \cr
  Special case \code{TST.kind="cv"} and \code{TST.NFOLD=1}: make
  \emph{every} record a training record, i.e. index [000...]. \cr
  In case \code{TST.kind="rand"} and \code{stratified=TRUE} a
  \emph{stratified} sample is drawn, where the strata in the training
  set reflect the relative frequency of each level of the \emph{first}
  response variable and are ensured to be at least of size 1.
}
\note{
  Currently, stratified sampling in case \code{TST.kind="rand"}
  works correctly only for \emph{one} response variable. If there
  is more than one, the right fraction of validation records is
  taken, but the strata are drawn w.r.t. the first response
  variable. (For multiple response variables we would have to
  return a list of cvi's or call \code{tdmModCreateCVindex} anew
  for each response variable.)
}
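\examples{
## A minimal base-R sketch of the index layouts described above. It is
## NOT the package's implementation: it does not call tdmModCreateCVindex,
## and the names n, nfold, valiFrac, resp, cvi.cv, cvi.rand, cvi.all and
## cvi.strat are hypothetical, chosen only for illustration.
set.seed(42)                       # reproducible permutations
n <- 10                            # assumed number of records

## TST.kind="cv" with TST.NFOLD=3: random permutation of fold labels 1..3
nfold <- 3
cvi.cv <- sample(rep(seq_len(nfold), length.out = n))
table(cvi.cv)                      # fold sizes differ by at most one

## Special case TST.kind="cv" and TST.NFOLD=1: every record is training
cvi.all <- rep(0, n)

## TST.kind="rand" with TST.valiFrac=0.3: permuted 0 (train) / 1 (vali)
valiFrac <- 0.3
nVali <- round(valiFrac * n)
cvi.rand <- sample(c(rep(0, n - nVali), rep(1, nVali)))
sum(cvi.rand == 1)                 # number of validation records

## Stratified variant (assumed scheme): per level of the response, mark
## about (1-valiFrac) of its records as training, but at least one.
resp <- factor(rep(c("a", "b"), times = c(8, 2)))
cvi.strat <- rep(1, n)
for (lev in levels(resp)) {
  idx <- which(resp == lev)
  nTrn <- max(1, round((1 - valiFrac) * length(idx)))
  ## index via sample.int so that a stratum of size 1 is handled safely
  cvi.strat[idx[sample.int(length(idx), nTrn)]] <- 0
}
tapply(cvi.strat == 0, resp, sum)  # training records per level
}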

