% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utils.R
\name{density_inversion}
\alias{density_inversion}
\title{KDE-based Inverse CDF Sampling for Synthetic Data}
\usage{
density_inversion(x, n, KDEn = 100, seed = NULL)
}
\arguments{
\item{x}{Numeric vector containing the observed data. Should be numeric or integer.
Vector only (no datasets at this stage).}

\item{n}{Integer scalar, the number of synthetic samples to generate.}

\item{KDEn}{Integer scalar passed to [stats::density()] as the `n` grid size
for the KDE; larger values yield a finer grid for integration (default
`100`).}

\item{seed}{Optional integer to set the random seed for reproducibility.}
}
\value{
A numeric vector of length `n` containing the synthetic values.
}
\description{
Generate synthetic samples that try to follow an observed distribution
by inverting a smooth kernel density estimate (KDE).
}
\details{
This utility estimates a univariate density for the input vector, numerically
integrates it to obtain a smooth cumulative distribution function (CDF), and
then samples synthetic values by inverting that CDF for uniformly distributed
probabilities. The result is a set of values that follow the shape of the
observed data without reproducing exact observations.


The KDE is fit using [stats::density()] with bandwidth rule `bw = "nrd0"`.
The CDF is computed via trapezoidal integration of the KDE on its
grid and normalized to \[0, 1\]. An inverse-CDF function is obtained via
linear interpolation ([stats::approxfun()]), which is then evaluated at
`n` independent uniforms from \[0, 1\].

This approach produces smooth synthetic samples that closely match the 
observed density.
}
\examples{
sim_iris <- density_inversion(iris$Sepal.Length, n = 200, seed = 1234)

hist(iris$Sepal.Length)
hist(sim_iris)

# cleanup
rm(sim_iris)
}
\seealso{
[stats::density()], [stats::approxfun()], [stats::runif()].
}
