% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/createMrBayesTipDatingNexus.R
\name{createMrBayesTipDatingNexus}
\alias{createMrBayesTipDatingNexus}
\alias{tipdating}
\title{Construct a Fully Formatted NEXUS Script for Performing Tip-Dating Analyses With MrBayes}
\usage{
createMrBayesTipDatingNexus(tipTimes, outgroupTaxa = NULL,
  treeConstraints = NULL, ageCalibrationType, whichAppearance = "first",
  treeAgeOffset, minTreeAge = NULL, collapseUniform = TRUE,
  anchorTaxon = TRUE, newFile = NULL, origNexusFile = NULL,
  parseOriginalNexus = TRUE, createEmptyMorphMat = TRUE, runName = NULL,
  morphModel = "strong", ngen = "100000000", doNotRun = FALSE,
  cleanNames = TRUE, printExecute = TRUE)
}
\arguments{
\item{tipTimes}{This input may be either a timeList object (i.e. a list of length 2, 
composed of a table of interval upper and lower time boundaries (i.e. earlier and latter bounds), and 
a table of first and last intervals for taxa) or a matrix with rownames
for taxa as you want listed in the MrBayes block, with either one, two
or four columns containing ages (respectively) for point occurrences with
precise dates (for a single column), uncertainty bounds on a point occurrence
(for two columns), or uncertainty bounds on the first and
last occurrence (for four columns). Note that precise first and last occurrence
dates should not be entered as a two column matrix, as this will instead be interpreted
as uncertainty bounds on a single occurrence. Instead, either select which you want to
use for tip-dates and give a 1-column matrix, or repeat (and collate) the columns, so that
the first and last appearances have uncertainty bounds of zero.}

\item{outgroupTaxa}{A vector of type 'character', containing taxon names designating the outgroup.
All taxa not listed in the outgroup will be constrained to be a monophyletic ingroup, for the sake of rooting
the resulting dated tree.
Either \code{treeConstraints} or \code{outgroupTaxa} must be defined, but \emph{not both}. 
If the outgroup-ingroup split is not present on the supplied \code{treeConstraints}, add that split to \code{treeConstraints} manually.}

\item{treeConstraints}{An object of class \code{phylo}, 
from which (if \code{treeConstraints} is supplied) the set of topological constraints is derived,
as described for argument \code{tree} of function \code{createMrBayesConstraints}. 
Either \code{treeConstraints} or \code{outgroupTaxa} must be defined, but \emph{not both}.
If the outgroup-ingroup split is not present on the supplied \code{treeConstraints}, add that split to \code{treeConstraints} manually.}

\item{ageCalibrationType}{This argument decides how age calibrations are defined, 
and currently allows for four options: \code{"fixedDateEarlier"} which fixes tip
ages at the earlier (lower) bound for the selected age of appearance (see argument
\code{whichAppearance} for how that selection is made), \code{"fixedDateLatter"}
which fixes the date to the latter (upper) bound of the selected age of appearance,
\code{"fixedDateRandom"} which fixes tips to a date that is randomly drawn from a
uniform distribution bounded by the upper and lower bounds on the selected age of
appearance, or (the recommended option) \code{"uniformRange"} which places a uniform
prior on the age of the tip, bounded by the latest and earliest (upper and lower)
bounds on the selected age.}

\item{whichAppearance}{Which appearance date of the taxa should be used:
their \code{'first'} or their \code{'last'} appearance date? The default
option is to use the 'first' appearance date. Note that use of the last
appearance date means that tips will be constrained to occur before their
last occurrence, and thus could occur long after their first occurrence (!).
In addition, \code{createMrBayesTipDatingNexus} allows for two
options for this argument that are in addition to those offered by
\code{\link{createMrBayesTipCalibrations}}. Both of these options will duplicate 
the taxa in the inputs multiple times, modifying their OTU labels, thus allowing
multiple occurrences of long-lived morphotaxa to be listed as multiple OTUs
arrayed across their stratigraphic duration. If 
\code{whichAppearance = "firstLast"}, taxa will be duplicated so each taxon is
listed as occurring twice: once at their first appearance, and a second time at
their last appearance. Note that if a taxon first and last appears in the same interval,
and \code{ageCalibrationType = "uniformRange"}, then
the resulting posterior trees may place the OTU assigned to the last occurrence before the
first occurrence in temporal order (but the assignment, in that case, was entirely
arbitrary). When \code{whichAppearance = "rangeThrough"}, each taxon will be
duplicated into as many OTUs as there are
intervals that the taxon ranges through (in a timeList format, see other
paleotree functions), with the corresponding age uncertainties for those intervals.
If the input tipTimes is not a list of length = 2, however, the function will 
return an error under this option.}

\item{treeAgeOffset}{A user-supplied parameter controlling the offset 
between the minimum tree age and the expected tree age. The mean tree age for the
offset exponential prior on tree age will be set to the minimum tree age, 
plus this offset value. Thus, an offset of 10 million years would equate to a prior
assuming that the expected tree age is around 10 million years before the minimum age.}

\item{minTreeAge}{If \code{NULL} (the default), then \code{minTreeAge} will
be set to the oldest date among the tip ages used (or the oldest bound on a tip age),
with the ages used being determined by the user's other choices. Otherwise,
the user can supply their own minimum tree age, which must be greater than
the oldest tip age used.}

\item{collapseUniform}{MrBayes won't accept uniform age priors where the maximum and
minimum age are identical (i.e. it is actually a fixed age). Thus, if this argument
is \code{TRUE} (the default), this function
will treat any taxon ages where the maximum and minimum are identical as a fixed age, and
will override setting \code{ageCalibrationType = "uniformRange"} for those dates.
All taxa with their ages set to fixed by the behavior of \code{anchorTaxon} or \code{collapseUniform}
are returned as a list within a commented line of the returned MrBayes block.}

\item{anchorTaxon}{This argument may be a logical (default is \code{TRUE}), or a character string of length = 1.
This argument has no effect if \code{ageCalibrationType} is not set to "uniformRange", but the argument may still be evaluated.
If \code{ageCalibrationType = "uniformRange"}, MrBayes will do a tip-dating analysis with uniform age uncertainties on 
all taxa (if such uncertainties exist; see \code{collapseUniform}). However, MrBayes does not record how each tree sits on an absolute time-scale,
so if the placement of \emph{every} tip is uncertain, lining up multiple dated trees sampled from the posterior (where each tip's true age might
differ) could be a nightmare to back-calculate, if not impossible. Thus, if \code{ageCalibrationType = "uniformRange"}, and there are no tip taxa given
fixed dates due to \code{collapseUniform} (i.e. all of the tip ages have a range of uncertainty on them), then a particular taxon
will be selected and given a fixed date equal to its earliest appearance time for its respective \code{whichAppearance}. This taxon can either be indicated by
the user or instead the first taxon listed in \code{tipTimes} will be arbitrarily selected. All taxa with their ages set
to fixed by the behavior of \code{anchorTaxon} or \code{collapseUniform} are returned as a list within a commented line of the returned MrBayes block.}

\item{newFile}{Filename (possibly with path) as a character string
leading to a file which will be overwritten with the generated NEXUS script.
If \code{NULL}, the NEXUS script is output to the console.}

\item{origNexusFile}{Filename (possibly with path) as a character
string leading to a NEXUS text file, presumably containing a matrix
of character data formatted for \emph{MrBayes}. If supplied
(it does not need to be supplied), the listed file is read as a text file, and
concatenated with the \emph{MrBayes} script produced by this function, so as to
reproduce the original NEXUS matrix for executing in MrBayes. 
Note that the taxa in this NEXUS file are only checked against the user
input \code{tipTimes} and \code{treeConstraints} if \code{parseOriginalNexus} is \code{TRUE}; otherwise, it is up to the user to
ensure the taxa are the same across the three data sources.}

\item{parseOriginalNexus}{If \code{TRUE} (the default), the original NEXUS file is parsed and 
the taxon names listed within the matrix are compared against the other inputs
for matching (completely, across all inputs that include taxon names). 
Thus, it is up to the user to ensure the same
taxa are found in all inputs. However, some NEXUS files may not parse correctly
(particularly if character data for taxa stretches across more than a single line in the matrix).
This may necessitate setting this argument to \code{FALSE}, which will instead do a straight scan
of the NEXUS matrix without parsing it, and without checking the taxon names against other inputs.
Some options for \code{whichAppearance} will not be available, however.}

\item{createEmptyMorphMat}{If \code{origNexusFile} is not specified (implying there is no
pre-existing morphological character matrix for this dataset), then an 'empty' NEXUS-formatted matrix will be
appended to the set of \emph{MrBayes} commands if this argument is \code{TRUE} (the default). This
'empty' matrix will have each taxon in \code{tipTimes} coded for a single missing character
(i.e., '?'). This allows tip-dating analyses with hard topological constraints, and ages
determined entirely by the fossilized birth-death prior, with no impact from a
presupposed morphological clock (thus a 'clock-less analysis').}

\item{runName}{The name of the run, used for naming the log files. 
If not set, the name will be taken from the name given for outputting
the NEXUS script (\code{newFile}). If \code{newFile} is not given, and
\code{runName} is not set by the user, the default run name will be "new_run_paleotree".}

\item{morphModel}{This argument can be used to switch between two end-member models of 
morphological evolution in MrBayes, here named 'strong' and 'relaxed', for the 'strong assumptions'
and 'relaxed assumptions' models described by Bapst, Schreiber and Carlson (Systematic Biology).
The default is a model which makes very 'strong' assumptions about the process of morphological evolution,
while the 'relaxed' alternative allows for considerably more heterogeneity in the rate
of morphological evolution across characters, and in the forward and reverse transition
rates between states. Note that in both cases, the character data is assumed to be filtered
to only parsimony-informative characters, without autapomorphies.}

\item{ngen}{Number of generations to set the MCMCMC to run for.
Default (\code{ngen = 100000000}) is very high.}

\item{doNotRun}{If \code{TRUE}, the commands that cause a script to automatically begin running in 
\emph{MrBayes} will be left out. Useful for troubleshooting initial runs of scripts for non-fatal errors and
warnings (such as ignored constraints). Default for this argument is \code{FALSE}.}

\item{cleanNames}{If \code{TRUE} (the default), then special characters
(currently, only forward slashes: '/') are removed from
taxon names before construction of the NEXUS file.}

\item{printExecute}{If \code{TRUE} (the default) and if output is directed to a \code{newFile}
(i.e. a \code{newFile} is specified), a line for pasting into MrBayes for executing the newly created file
will be messaged to the terminal.}
}
\value{
If argument \code{newFile} is \code{NULL}, then the text of the 
generated NEXUS script is output to the console as a series of character strings.
}
\description{
This function is meant to expedite the creation of NEXUS files formatted
for performing tip-dating analyses in the popular phylogenetics software \emph{MrBayes},
particularly clock-less tip-dating analyses executed with 'empty' morphological matrices
(i.e. where all taxa are coded for a single missing character), although a pre-existing
morphological matrix can also be input by the user (see argument \code{origNexusFile}).
Under some options, this pre-existing matrix may be edited by this function.
The resulting full NEXUS script is output as a set of character strings, either
printed to the R console or written to a file, overwriting that file's prior contents.
}
\details{
Users must supply a data set of tip ages (in various formats),
which are used to construct age calibration commands for the tip taxa 
(via paleotree function \code{\link{createMrBayesTipCalibrations}}). 
The user must also supply some topological constraint. This may be
a set of taxa designated as the outgroup, which is then converted into a command constraining
the monophyly of the ingroup taxa, which are presumed to be all taxa \emph{not} listed in the outgroup. 
Alternatively, a user may supply a tree which is then converted into a series of hard topological
constraints (via function \code{\link{createMrBayesConstraints}}). The two types of topological constraints
cannot be applied together. Many of the options available with \code{\link{createMrBayesTipCalibrations}} are available with this function,
allowing users to choose between fixed calibrations or uniform priors that approximate stratigraphic uncertainty.
In addition, the user may also supply a path to a text file
presumed to be a NEXUS file containing character data formatted for use with \emph{MrBayes}.


The taxa listed in \code{tipTimes} must match the taxa in 
\code{treeConstraints}, if such is supplied. If supplied, the taxa in \code{outgroupTaxa}
must be contained within this same set of taxa. These all must have matches
in the set of taxa in \code{origNexusFile}, if provided and
if \code{parseOriginalNexus} is \code{TRUE}.

Note that because the same set of taxa must be contained in all inputs, 
relationships are constrained as 'hard' constraints, rather than 'partial' constraints,
which would allow some taxa to float across a partially fixed topology. 
See the documentation for \code{\link{createMrBayesConstraints}},
for more details.
}
\note{
This function allows a user to take an undated phylogenetic tree in R, and a set of age estimates
for the taxa on that tree, and produce a posterior sample of dated trees using the MCMCMC in \emph{MrBayes},
while treating an 'empty' morphological matrix as an uninformative set of missing characters.
This 'clock-less tip-dating' approach is essentially an alternative to the \emph{cal3} method in paleotree, 
sharing the same fundamental theoretical model (a version of the fossilized birth-death model), but
with a better algorithm that considers the whole tree simultaneously, rather than evaluating each node
individually, from the root up to the tips (as \emph{cal3} does it, and which may cause artifacts). 
That said, \emph{cal3} still has a few advantages: tip-dating as of April 2017 still only treats OTUs as
point observations, contained in a single time-point, while cal3 can consider taxa as having durations with
first and last occurrences. This means it may be more straightforward to assess the extent of budding cladogenesis
patterns of ancestor-descendant relationships in \emph{cal3}, than in tip-dating.
}
\examples{

# let's do some examples

# load retiolitid dataset
data(retiolitinae)

# let's try making a NEXUS file!

# Use a uniform prior, with a 10 million year offset for
	# the expected tree age from the earliest first appearance
# set average tree age to be 10 Ma earlier than first FAD

outgroupRetio<-"Rotaretiolites" # sister to all other included taxa

# the following will create a NEXUS file with an 'empty' morph matrix
	# with the only topological constraint on ingroup monophyly
	# Probably shouldn't do this: leaves too much to the FBD prior
 
# with doNotRun set to TRUE for troubleshooting

createMrBayesTipDatingNexus(tipTimes=retioRanges,
		outgroupTaxa=outgroupRetio,treeConstraints=NULL,
		ageCalibrationType="uniformRange",whichAppearance="first",
		treeAgeOffset=10,	newFile=NULL,	
		origNexusFile=NULL,createEmptyMorphMat=TRUE,
		runName="retio_dating",doNotRun=TRUE)
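
# a hedged sketch of the tree age prior arithmetic, as described for
	# arguments minTreeAge and treeAgeOffset (this is not the function's own code)
# the tip age bounds below are hypothetical, for illustration only
hypoTipAges<-cbind(older=c(430,428,425),younger=c(427,424,420))
# with minTreeAge=NULL, the minimum tree age is the oldest tip age used
hypoMinTreeAge<-max(hypoTipAges)
# the mean of the offset exponential prior is that minimum plus the offset
	# (an offset of 10, matching the treeAgeOffset=10 used in the call above)
hypoMeanTreeAge<-hypoMinTreeAge+10
hypoMeanTreeAge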

# let's try it with a tree for topological constraints
     # this requires setting outgroupTaxa to NULL
# let's also set doNotRun to FALSE

createMrBayesTipDatingNexus(tipTimes=retioRanges,
		outgroupTaxa=NULL,treeConstraints=retioTree,
		ageCalibrationType="uniformRange",whichAppearance="first",
		treeAgeOffset=10,	newFile=NULL,	
		origNexusFile=NULL,createEmptyMorphMat=TRUE,
		runName="retio_dating",doNotRun=FALSE)

# the above is essentially cal3 with a better algorithm,
		# and no need for a priori rate estimates
# just need a tree and age estimates for the tips!
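
# a rough sketch (not the function's internal code) of the situation that
	# collapseUniform guards against: uniform priors with identical bounds
# the taxon names and ages here are hypothetical, for illustration only
hypoBounds<-cbind(older=c(taxonA=430,taxonB=428,taxonC=425),
		younger=c(taxonA=427,taxonB=428,taxonC=420))
# taxa whose two bounds are identical would be treated as having fixed ages
hypoFixedTaxa<-rownames(hypoBounds)[hypoBounds[,"older"]==hypoBounds[,"younger"]]
hypoFixedTaxa
# only if no taxa were fixed this way would anchorTaxon come into play
length(hypoFixedTaxa)==0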

#############################################################################
# some more variations for testing purposes

# no morph matrix supplied or generated
	# you'll need to manually append to an existing NEXUS file
createMrBayesTipDatingNexus(tipTimes=retioRanges,
		outgroupTaxa=NULL,treeConstraints=retioTree,
		ageCalibrationType="uniformRange",whichAppearance="first",
		treeAgeOffset=10,	newFile=NULL,	
		origNexusFile=NULL,createEmptyMorphMat=FALSE,
		runName="retio_dating",doNotRun=TRUE)
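
# a small sketch of the matrix formats described for argument tipTimes,
	# using hypothetical taxa with precisely known appearance dates
hypoFADs<-c(taxonA=430,taxonB=428,taxonC=425)
hypoLADs<-c(taxonA=427,taxonB=424,taxonC=420)
# a two-column matrix of these dates would be read as uncertainty bounds
	# on a single occurrence, which is not what is intended here
# instead, either give a one-column matrix of the dates to be used...
hypoOneCol<-cbind(hypoFADs)
# ...or repeat each column, so both appearances have zero uncertainty
hypoFourCol<-cbind(hypoFADs,hypoFADs,hypoLADs,hypoLADs)
# row names carry the taxon names, as needed for the MrBayes block
rownames(hypoFourCol)
dim(hypoOneCol)
dim(hypoFourCol)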

\dontrun{

# let's actually try writing an example with topological constraints
	# to file and see what happens

# here's my super secret MrBayes directory
file<-"D:\\\\dave\\\\workspace\\\\mrbayes\\\\exampleRetio.nex"

createMrBayesTipDatingNexus(tipTimes=retioRanges,
		outgroupTaxa=NULL,treeConstraints=retioTree,
		ageCalibrationType="uniformRange",whichAppearance="first",
		treeAgeOffset=10,	newFile=file,	
		origNexusFile=NULL,createEmptyMorphMat=TRUE,
		runName="retio_dating",doNotRun=FALSE)

}

}
\references{
The basic fundamentals of tip-dating, and tip-dating with the fossilized
birth-death model are introduced in these two papers: 

Ronquist, F., S. Klopfstein, L. Vilhelmsen, S. Schulmeister, D. L. Murray,
and A. P. Rasnitsyn. 2012. A Total-Evidence Approach to Dating with Fossils,
Applied to the Early Radiation of the Hymenoptera. \emph{Systematic Biology} 61(6):973-999.

Zhang, C., T. Stadler, S. Klopfstein, T. A. Heath, and F. Ronquist. 2016. 
Total-Evidence Dating under the Fossilized Birth-Death Process.
\emph{Systematic Biology} 65(2):228-249. 

For recommended best practices in tip-dating analyses, please see:

Matzke, N. J., and A. Wright. 2016. Inferring node dates from tip dates
in fossil Canidae: the importance of tree priors. \emph{Biology Letters} 12(8).

The rationale behind the two alternative morphological models are described in more detail here:

Bapst, D. W., H. A. Schreiber, and S. J. Carlson. In press. Combined analysis of extant Rhynchonellida
(Brachiopoda) using morphological and molecular data. \emph{Systematic Biology} doi: 10.1093/sysbio/syx049
}
\seealso{
\code{\link{createMrBayesConstraints}}, \code{\link{createMrBayesTipCalibrations}}, \code{\link{cal3}}
}
\author{
David W. Bapst. This code was produced as part of a project 
funded by National Science Foundation grant EAR-1147537 to S. J. Carlson.

The basic \emph{MrBayes} commands utilized in the output script are a collection
of best practices taken from studying NEXUS files supplied by April Wright,
William Gearty, Graham Slater, Davey Wright, and
guided by the recommendations of Matzke and Wright, 2016 in Biol. Lett.
}
