\name{fold_in}
\alias{fold_in}
\title{Ex-post folding-in of textmatrices into an existing latent semantic space}
\description{
Additional documents can be mapped into a pre-existing
latent semantic space without changing the factor
structure of that space. Use this when new documents
must not influence the factors that were calculated
from the original text collection.
}
\usage{
fold_in( docvecs, LSAspace )
}
\arguments{
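   \item{docvecs}{a textmatrix containing the documents to be folded in, with the same terms (rows) as the textmatrix used to create the space.}
   \item{LSAspace}{a latent semantic space generated by \code{\link{lsa}()}.}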
}
\details{

To keep additional documents from influencing the factor distribution
calculated previously from a particular text basis, they can be folded in
after the singular value decomposition performed by \code{lsa()}.

Background Information:
For folding in, a pseudo document vector \eqn{\hat{m}}{mi} is calculated for each
new document as shown in equations (1) and (2) (cf. Berry et al., 1995):

(1) \eqn{\hat{d} = v^T T_k S_k^{-1}}{di = t(v) Tk Sk^(-1)}

(2) \eqn{\hat{m} = T_k S_k \hat{d}^T}{mi = Tk Sk t(di)}

The document vector \eqn{v} in equation (1) is identical to an additional
column of an input textmatrix \eqn{M} with the term frequencies of the
document to be folded in. \eqn{T_k}{Tk} and \eqn{S_k}{Sk} are the truncated matrices
from the SVD applied through \code{lsa()} to a given text
collection to construct the latent semantic space. The intermediate result
\eqn{\hat{d}}{di} is a row vector with one entry per latent dimension. The resulting vector
\eqn{\hat{m}}{mi} from equation (2) is identical to an additional column in the
textmatrix representation of the latent semantic space (as produced by
\code{as.textmatrix()}).

Be careful when using weighting schemes: the new documents should
normally be weighted with the global weights of the training textmatrix,
not with global weights recalculated from the new data.

}
\value{
  \item{textmatrix}{a textmatrix representation of the additional documents in the latent semantic space.}
}
\author{ Fridolin Wild \email{f.wild@open.ac.uk} }
\seealso{ \code{\link{textmatrix}}, \code{\link{lsa}}, \code{\link{as.textmatrix}} }
\examples{

# create a first textmatrix with some files
td = tempfile()
dir.create(td)
write( c("dog", "cat", "mouse"), file=paste(td, "D1", sep="/") )
write( c("hamster", "mouse", "sushi"), file=paste(td, "D2", sep="/") )
write( c("dog", "monster", "monster"), file=paste(td, "D3", sep="/") )
matrix1 = textmatrix(td, minWordLength=1)
unlink(td, recursive=TRUE)

# create a second textmatrix with some more files
td = tempfile()
dir.create(td)
write( c("cat", "mouse", "mouse"), file=paste(td, "A1", sep="/") )
write( c("nothing", "mouse", "monster"), file=paste(td, "A2", sep="/") )
write( c("cat", "monster", "monster"), file=paste(td, "A3", sep="/") )
matrix2 = textmatrix(td, vocabulary=rownames(matrix1), minWordLength=1)
unlink(td, recursive=TRUE)

# create an LSA space from matrix1
space1 = lsa(matrix1, dims=dimcalc_share())
as.textmatrix(space1)

# fold matrix2 into the space generated by matrix1
fold_in( matrix2, space1)
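
# A minimal sketch (not part of the package API): recompute the fold-in by
# hand from equations (1) and (2) in the Details section and compare it with
# fold_in(). The names v, Sk, d_hat and m_hat are illustrative only. Sk is
# diagonal and therefore symmetric, so tcrossprod(X, Sk) equals X times Sk.
# The two results are expected to agree up to numerical tolerance, assuming
# fold_in() follows the equations as stated.
v = unclass(matrix2)                            # new documents, one per column
Sk = diag(space1$sk, nrow=length(space1$sk))    # also safe for a single dimension
d_hat = tcrossprod(crossprod(v, space1$tk), solve(Sk))  # eq. (1), one row per document
m_hat = tcrossprod(tcrossprod(space1$tk, Sk), d_hat)    # eq. (2), one column per document
all.equal(as.vector(m_hat), as.vector(fold_in(matrix2, space1)))

# A sketch of the weighting advice from the Details section: apply the local
# weights to the new documents, but reuse the global weights of the training
# textmatrix. The names gw1, space_w and folded_w are illustrative only.
gw1 = gw_idf(matrix1)                           # global weights from the training data
space_w = lsa(lw_logtf(matrix1) * gw1, dims=dimcalc_share())
folded_w = fold_in(lw_logtf(matrix2) * gw1, space_w)
folded_w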

}
\keyword{algebra}
