% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dmst.zones.R
\name{dmst.zones}
\alias{dmst.zones}
\title{Determine zones using the dynamic minimum spanning tree scan test of Assuncao et al. (2006)}
\usage{
dmst.zones(coords, cases, pop, w, ex = sum(cases)/sum(pop) * pop,
  ubpop = 0.5, ubd = 1, lonlat = FALSE, parallel = FALSE,
  maxonly = FALSE)
}
\arguments{
\item{coords}{An \eqn{n \times 2} matrix of centroid coordinates for the regions.}

\item{cases}{The number of cases observed in each region.}

\item{pop}{The population size associated with each region.}

\item{w}{A binary spatial adjacency matrix.}

\item{ex}{The expected number of cases for each region.  The default is calculated under the constant risk hypothesis.}

\item{ubpop}{The upper bound on the proportion of the total population that a cluster may contain.}

\item{ubd}{The upper bound for the radius of a cluster, expressed as a proportion in (0, 1] of the maximum intercentroid distance between any two locations in \code{coords}.  See Details.}

\item{lonlat}{A logical value indicating how intercentroid distances are calculated.  The default is \code{FALSE}, which specifies that Euclidean distance should be used.  If \code{lonlat} is \code{TRUE}, then the great circle distance is used.}

\item{parallel}{A logical value indicating whether the test should be parallelized using the \code{parallel::mclapply} function.  Default is \code{FALSE}.  If \code{TRUE}, no progress will be reported.}

\item{maxonly}{A logical value indicating whether to return only the maximum test statistic across all candidate zones.  Default is \code{FALSE}.}
}
\value{
Returns a list of zones to consider for clustering that includes the location id of each zone and the associated test statistic, number of cases, expected number of cases, and the population in the zone. If \code{maxonly = TRUE}, then only the maximum test statistic across all of these zones is returned.
}
\description{
\code{dmst.zones} determines the zones that produce the largest test statistic using a greedy algorithm.  Specifically, taking each region in turn as the starting zone, new (connected) regions are added to the current zone in the order that results in the largest likelihood ratio test statistic.  This is used to implement the dynamic minimum spanning tree (dmst) scan test of Assuncao et al. (2006).
}
\details{
The test is performed using the spatial scan test based on the Poisson test statistic and a fixed number of cases.  The first cluster listed is the most likely cluster.  If no significant clusters are found, then the most likely cluster is returned (along with a warning).

Every zone considered must have a total population less than \code{ubpop * sum(pop)}.  Additionally, the maximum intercentroid distance for the regions within a zone must be no more than \code{ubd} times the maximum intercentroid distance across all regions.
}
\examples{
data(nydf)
data(nyw)
coords = as.matrix(nydf[,c("longitude", "latitude")])
# find zone with max statistic starting from each individual region
max_zones = dmst.zones(coords, cases = floor(nydf$cases),
                       pop = nydf$pop, w = nyw, ubpop = 0.25,
                       ubd = 0.25, lonlat = TRUE)
head(max_zones)
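# A minimal sketch (not part of the original example) of the maxonly
# argument described above: with maxonly = TRUE, only the maximum test
# statistic across all candidate zones is returned instead of the list
# of zones. The name max_stat is illustrative only.
max_stat = dmst.zones(coords, cases = floor(nydf$cases),
                      pop = nydf$pop, w = nyw, ubpop = 0.25,
                      ubd = 0.25, lonlat = TRUE, maxonly = TRUE)
max_stat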
}
\references{
Assuncao, R.M., Costa, M.A., Tavares, A. and Neto, S.J.F. (2006). Fast detection of arbitrarily shaped disease clusters, Statistics in Medicine, 25, 723-742.
}
\author{
Joshua French
}
