% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/deriveVars.R
\name{deriveVars}
\alias{deriveVars}
\title{Derive variables by transformation.}
\usage{
deriveVars(data, transformtype = c("L", "M", "D", "HF", "HR", "T", "B"),
  allsplines = FALSE, algorithm = "maxent", write = FALSE,
  dir = NULL, quiet = FALSE)
}
\arguments{
\item{data}{Data frame containing the response variable in the first column
and explanatory variables in subsequent columns. The response variable
should represent either presence and background (coded as 1/NA) or presence
and absence (coded as 1/0). The explanatory variable data should be
complete (no NAs). See \code{\link{readData}}.}

\item{transformtype}{Specifies the types of transformations to be
performed. Default is the full set of the following transformation types: L
(linear), M (monotonous), D (deviation), HF (forward hinge), HR (reverse
hinge), T (threshold), and B (binary).}

\item{allsplines}{Logical. Keep all spline transformations created, rather
than pre-selecting particular splines based on fraction of total variation
explained.}

\item{algorithm}{Character string matching either "maxent" or "LR", which
determines the type of model used for spline pre-selection. See Details.}

\item{write}{Logical. Write the transformation functions to an .Rdata file?
Default is \code{FALSE}.}

\item{dir}{Directory for file writing if \code{write = TRUE}. Defaults to the
working directory.}

\item{quiet}{Logical. Suppress progress messages from spline pre-selection?}
}
\value{
List of 2: \enumerate{ \item dvdata: List containing first the
  response variable, followed by data frames of derived variables produced for
  each explanatory variable. This item is recommended as input for
  \code{dvdata} in \code{\link{selectDVforEV}}. \item transformations: List
  containing first the response variable, followed by all the transformation
  functions used to produce the derived variables. }
}
\description{
\code{deriveVars} produces derived variables (DVs) from explanatory variables
(EVs) by transformation, and returns the DVs together with the transformation
functions used to produce them. The available transformation types are as
follows, described in Halvorsen et al. (2015): L, M, D, HF, HR, T (for
continuous EVs), and B (for categorical EVs). For spline transformation types
(HF, HR, T), a subset of possible DVs is pre-selected by the criteria
described under Details.
}
\details{
The linear transformation "L" is a simple rescaling to the range [0, 1].
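
As a minimal sketch in base R, assuming a plain min-max rescaling (the vector
\code{ev} is hypothetical, and this is an illustration rather than the
package's internal code), the "L" transformation can be pictured as:

\preformatted{
# Hypothetical continuous explanatory variable
ev <- c(12, 15, 20, 32)

# Rescale so that the minimum maps to 0 and the maximum maps to 1
ev_L <- (ev - min(ev)) / (max(ev) - min(ev))
ev_L  # values: 0, 0.15, 0.4, 1
}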

The monotonous transformation "M" is a zero-skew transformation
(Oekland et al. 2001).

The deviation transformation "D" is performed around an optimum EV value that
is found by looking at frequency of presence (see \code{\link{plotFOP}}).
Three deviation transformations are created with different steepness and
curvature around the optimum.

For spline transformations ("HF", "HR", and "T"), DVs are created around 20
different break points (knots) which span the range of the EV. Only DVs which
satisfy all of the following criteria are retained: \enumerate{
\item 3 <= knot <= 18 (DVs with knots at the extremes of the EV are never
retained).
\item A chi-square test of the single-variable model from the given DV
compared to the null model gives a p-value < 0.05.
\item The single-variable model from the given DV shows a local maximum in
fraction of variation explained (D^2, sensu Guisan & Zimmermann, 2000)
compared to DVs from the neighboring 4 knots.
}
The models used in this pre-selection procedure may be maxent models
(algorithm = "maxent") or standard logistic regression models
(algorithm = "LR").
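
To make the shape of the spline DVs concrete, the sketch below builds one
forward hinge, one reverse hinge, and one threshold variable around a single
knot. It assumes the conventional Maxent-style definitions of these features;
the vector \code{x} and the value of \code{knot} are hypothetical, and the
exact forms used internally by \code{deriveVars} may differ.

\preformatted{
# Hypothetical explanatory variable and a single hypothetical knot
x <- c(0, 2, 4, 6, 8, 10)
knot <- 4

# Forward hinge: 0 below the knot, then rising linearly to 1 at max(x)
hf <- pmax(0, (x - knot) / (max(x) - knot))

# Reverse hinge: 1 at min(x), falling linearly to 0 at the knot, 0 above it
hr <- pmax(0, (knot - x) / (knot - min(x)))

# Threshold: step from 0 to 1 at the knot
th <- as.numeric(x >= knot)
}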

For categorical variables, 1 binary derived variable (type "B") is created
for each category.
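
A minimal sketch of this one-indicator-per-category idea in base R follows.
The factor \code{ev} and its levels are hypothetical, and the column names
shown are simply the factor levels, not the DV names that \code{deriveVars}
assigns.

\preformatted{
# Hypothetical categorical explanatory variable
ev <- factor(c("bog", "forest", "bog", "heath"))

# One 0/1 indicator column per category
dvs <- sapply(levels(ev), function(lev) as.numeric(ev == lev))
colnames(dvs)  # the three levels: bog, forest, heath
}

Each row of \code{dvs} contains exactly one 1, marking the category of that
observation.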

The maximum entropy algorithm ("maxent"), which is implemented in MIAmaxent
as an infinitely-weighted logistic regression with presences added to the
background, is conventionally used with presence-only occurrence data. In
contrast, standard logistic regression (algorithm = "LR") is conventionally
used with presence-absence occurrence data.

Explanatory variables should be uniquely named. Underscores ('_') and colons
(':') are reserved to denote derived variables and interaction terms,
respectively, and \code{deriveVars} will replace these, along with other
special characters, with periods ('.').
}
\examples{
toydata_dvs <- deriveVars(toydata_sp1po, c("L", "M", "D", "HF", "HR", "T", "B"))
str(toydata_dvs$dvdata)
summary(toydata_dvs$transformations)

\dontrun{
# From vignette:
grasslandDVs <- deriveVars(grasslandPO,
                           transformtype = c("L","M","D","HF","HR","T","B"))
summary(grasslandDVs$dvdata)
head(summary(grasslandDVs$transformations))
length(grasslandDVs$transformations)
plot(grasslandPO$terslpdg, grasslandDVs$dvdata$terslpdg$terslpdg_D2, pch=20,
     ylab="terslpdg_D2")
plot(grasslandPO$terslpdg, grasslandDVs$dvdata$terslpdg$terslpdg_M, pch=20,
     ylab="terslpdg_M")
}

}
\references{
Guisan, A., & Zimmermann, N. E. (2000). Predictive habitat
  distribution models in ecology. Ecological Modelling, 135(2-3), 147-186.

Halvorsen, R., Mazzoni, S., Bryn, A., & Bakkestuen, V. (2015).
  Opportunities for improved distribution modelling practice via a strict
  maximum likelihood interpretation of MaxEnt. Ecography, 38(2), 172-183.

Oekland, R.H., Oekland, T. & Rydgren, K. (2001).
  Vegetation-environment relationships of boreal spruce swamp forests in
  Oestmarka Nature Reserve, SE Norway. Sommerfeltia, 29, 1-190.
}
